clawtap/tests/e2e-spec.feature
kuannnn 0fcf66fc22 feat: ClawTap v0.2.0
Interactive Prompts:
- Unified InteractivePrompt type across all 3 adapters (Claude/Codex/Gemini)
- InteractivePromptOverlay component with options, text input, countdown
- Gemini + Codex pane monitors detect tool confirmation, ask user, plan approval
- respondInteractivePrompt routing: permission → respondPermission, options → _selectOption
- Claude AskUserQuestion nested questions[0] structure parsing

Cross-AI Review:
- Client-generated reviewId, removed pendingReview state
- FloatingReviewPanel uses CSS display:none instead of unmount (keeps hooks alive)
- Child review sessions default to YOLO/bypass permission mode
- Send back to parent, send to existing/new review, tab switching, end review
- Collapsed review cards with read-only panel for ended reviews
- Full reconnect support: active + ended reviews restore correctly

AskUserQuestion Tool Card UI:
- Dedicated renderer replaces raw JSON display
- Options shown with selected (green) / unselected (gray) indicators
- Free text answers shown in quoted format with green border
- Collapsed summary: question → answer
- Shared parseAskQuestionInput utility (client + server)
- Historical tool results attached via _result on tool_use blocks

Adapter Fixes:
- Session→adapter mapping persisted in SQLite (survives server restart)
- SESSION_CREATED deferred for pendingRekey adapters (Codex/Gemini)
- session-rekeyed handler sends complete SESSION_CREATED with adapter + cwd
- Gemini: auto-accept folder trust, privacy notice, IDE nudge, YOLO * prompt
- Claude: auto-accept bypass permissions confirmation (v2.1.85+)
- Port fallback (EADDRINUSE → try +1), statusLine shell script wrapper

Other:
- Desktop Enter sends / Shift+Enter newline; Mobile Enter newline
- Strip CLAWTAP_REF marker from session list
- Active sessions tab shows adapter badge
- Rename CLAUDE_UI_PASSWORD → CLAWTAP_PASSWORD

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 14:46:00 +08:00

# =============================================================================
# ClawTap: E2E Test Specification
# =============================================================================
#
# GLOBAL CONFIG:
# Server URL: http://localhost:${PORT:-3456}
# Password: value of CLAWTAP_PASSWORD env var
# Browser: agent-browser with mobile viewport (e.g. "iPhone 14")
#
# STEP DEFINITIONS:
# "I open a new chat"
# → Navigate to sessions view
# → Pick any project (or use New Project to select a directory)
# → Tap "New Chat" within that project
#
# "I have an active chat session"
# → Same as above, but a message has already been sent
#
# SCREENSHOT MARKERS: # [Screenshot: name.png]
# → agent-browser should capture a screenshot at that point
# → saved to tests/screenshots/<name>.png
#
# TIMELINE MARKERS: # T0, T1, T2...
# → sequential state transitions within a scenario
# → each T-point should be visually verified via screenshot
# → wait for the described condition before capturing
#
# E2E DEFINITION:
# E2E = "a user opens the mobile app and can see/do X".
# If it requires checking server logs, API response shapes, file descriptors,
# or settings.json — it's NOT E2E. Those are in the Appendix.
# =============================================================================
# =============================================================================
# CORE USER FLOWS
# =============================================================================
Feature: Authentication
Background:
Given the server is running
And the browser is open to the app URL
Scenario: Login with correct password
When I navigate to the app
Then I should see the login page with a password field
# [Screenshot: login-page.png]
When I enter the correct password
And I tap "Login"
Then I should be redirected to the sessions/projects view
And the token should be stored in localStorage
# [Screenshot: after-login.png]
Scenario: Login with wrong password
When I enter an incorrect password
And I tap "Login"
Then I should see an error message "Invalid password"
And I should remain on the login page
Scenario: Rate limiting after repeated failed attempts
When I enter an incorrect password multiple times (server limit: 5 per minute)
Then I should see "Too many login attempts. Try again later."
And further login attempts should be rejected
Scenario: Token persistence across page reload
Given I am logged in
When I reload the page
Then I should still be on the sessions view (not login)
Scenario: Logout clears session and returns to login
Given I am logged in and viewing the projects list
When I tap "Logout"
Then I should be redirected to the login page
And the token should be removed from localStorage
When I reload the page
Then I should see the login page (not sessions)
Feature: Session List & Project Navigation
Background:
Given I am logged in
Scenario: View projects list
Then I should see a list of projects grouped by working directory
And each project should show a session count badge
And projects should be sorted by most recent activity
# [Screenshot: projects-list.png]
Scenario: Navigate into a project
When I tap on a project
Then I should see the sessions within that project
And each session should show the first prompt (truncated)
And each session should show a relative timestamp (e.g. "3h ago")
And a back button should appear in the header
# [Screenshot: sessions-list.png]
Scenario: Navigate back to projects
Given I am viewing sessions within a project
When I tap the back button
Then I should return to the projects list
Scenario: Start new chat within a project
Given I am viewing sessions within a project
When I tap "New Chat"
Then a new empty chat should open
And the header should show the project name
And "Send a message to start" should appear in the chat area
And the send button should be disabled
Scenario: Start new chat with directory browser
When I tap "New Project"
Then a directory browser modal should appear
And it should show directories under the home folder
# [Screenshot: directory-browser.png]
When I select a directory
Then a new chat should open with that directory as cwd
Scenario: Navigate directories in directory browser
Given the directory browser is open at the home folder
When I tap a folder (e.g. "Documents")
Then the browser should show the contents of that folder
And the breadcrumb should update to show "~ / Users / name / Documents"
When I tap "~" in the breadcrumb
Then the browser should return to the home folder
Scenario: Tab bar with Projects and Active tabs
When I view the projects list
Then I should see a tab bar with "Projects" and "Active (N)" tabs
And "Projects" should be selected by default
When I tap "Active (N)"
Then the Active tab should be selected (highlighted with accent color)
Scenario: Active tab shows running sessions
Given I have 2 active chat sessions (tmux running)
When I tap the "Active" tab
Then I should see 2 session rows, each with:
| Element | Description |
| Green dot | Filled green circle (bg-green-500, 8px) |
| First prompt | Truncated first user message (or session UUID) |
| Time | Relative timestamp from lastActivity (e.g. "3m") |
| Project name | Short directory name (e.g. "code-tap") |
| Permission mode | "Normal", "YOLO", "Auto-edit", or "Plan" |
| Client count | 👤N shown only when clients are connected |
When I tap an active session row
Then I should enter the chat view for that session
Scenario: Active tab auto-refreshes every 3 seconds
Given I am viewing the Active tab
When I wait 3 seconds
Then the active sessions list should refresh automatically
And the session count in "Active (N)" should update
Scenario: Active tab empty state
Given no sessions are currently running
When I tap the "Active" tab
Then I should see "No active sessions"
And a "Refresh" button should be visible
Scenario: Green dot on active sessions in project drill-down
Given I have an active chat session in the "code-tap" project
When I drill into the "code-tap" project from the Projects tab
Then the active session should show a green dot before its title
And non-active (historical) sessions should NOT have a green dot
Scenario: Session ends and disappears from Active tab
Given I have an active session visible in the Active tab
When the Claude CLI session terminates
Then the session should disappear from the Active tab on next refresh
And the green dot should disappear from the project drill-down
Scenario: Session list loads quickly (getSessions optimization)
# Regression: previously parsed ALL session headers before sorting
Given there are 100+ session files
When I load the sessions list
Then the list should appear within 3 seconds
And sessions should be sorted by most recently modified
Feature: Chat — Send & Receive Messages
Background:
Given I am logged in
And I open a new chat
Scenario: Empty chat view shows correct initial state
When I open a new chat
Then "Send a message to start" should appear in the center
And the StatusBar should show model name and permission mode
And the input placeholder should say "Send a message..."
And the send button should be disabled
And the image upload button should be visible
Scenario: Send button disabled when input is empty
Given I am in an empty chat
Then the send button should be disabled (grayed out)
When I type any text in the input field
Then the send button should become enabled
When I clear all text from the input field
Then the send button should be disabled again
Scenario: Send a message and receive response
# --- Timeline: Message Lifecycle ---
# T0: Input ready
Then I should see the input field with placeholder "Send a message..."
# [Screenshot: T0-empty-chat.png]
When I type "Say hello in one sentence"
And I tap the send button
# T1: User message appears immediately (optimistic)
Then I should see my message in a blue bubble on the right
And the input field should be cleared
And a streaming indicator should appear ("Working...")
# [Screenshot: T1-user-message-sent.png]
# T2: Streaming text preview appears
Then within 10 seconds I should see a streaming text preview
And the indicator should change to "Responding..."
# [Screenshot: T2-streaming-preview.png]
# T3: Thinking indicator (if model thinks)
# Note: may or may not appear depending on query complexity
# If visible: shows spinner + verb (e.g. "Thinking…")
# [Screenshot: T3-thinking-indicator.png — if visible]
# T4: Final message appears
Then within 30 seconds the streaming indicator should disappear
And I should see the assistant's response in a dark bubble on the left
And the response should contain rendered markdown
# [Screenshot: T4-response-complete.png]
Scenario: Streaming text preview shows live response
When I send a message and Claude begins responding
Then a streaming text preview should appear below the thinking indicator
And the preview text should update in real-time as Claude writes
And the preview should be line-clamped (max 3 lines visible)
When the response completes
Then the streaming preview should disappear
And the full response should appear as a message bubble
Scenario: Markdown rendering in responses
When I send "Show me a code block in Python and a bullet list"
Then the response should contain:
| Element | Rendered As |
| Code block | Syntax-highlighted block (oneDark) |
| Bullet list | Proper list with bullet markers |
| Inline code | Monospace with background |
Scenario: Session ID assigned after first message
When I send my first message
Then the header should update to show the project name
And the header should show the CLI UUID (truncated, e.g. "625c60d0-aedb...")
And a copy icon should appear next to the CLI UUID
And the session ID should be in CLI UUID format (e.g. "d6d56787-bfaf-4312...")
When I tap the copy icon
Then the full CLI UUID should be copied to clipboard
Scenario: Chat auto-scrolls to latest message
Given I have a long conversation that fills the screen
When a new assistant message arrives
Then the chat should auto-scroll to show the latest message
Scenario: Scroll position preserved when user scrolls up
Given I have manually scrolled up to read earlier messages
When a new message arrives
Then my scroll position should NOT jump to the bottom
# =============================================================================
# TOOL & PERMISSION FLOWS
# =============================================================================
Feature: Tool Calls — Display & Status
Background:
Given I am logged in
And I open a new chat
Scenario: Tool execution lifecycle
# --- Timeline: Tool Status Transitions ---
When I send "Read the file package.json"
# T1: Tool card appears with "running" status
Then a tool card should appear with name "Read"
And the card should show the file path "package.json"
And the status should show a spinning loader
# [Screenshot: T1-tool-running.png]
# T2: Tool completes
Then within 15 seconds the tool status should change to a green checkmark
# [Screenshot: T2-tool-success.png]
# T3: Response with tool result
Then the assistant should reference the file contents in its response
Scenario: Edit tool with diff preview
When I send "Add a comment '// entry point' to the top of src/index.js"
Then a tool card for "Edit" should appear
When the tool completes
And I tap the tool card to expand it
Then I should see a diff view with red (removed) and green (added) lines
# [Screenshot: tool-edit-diff.png]
Scenario: View full diff in full-screen viewer
Given an Edit tool card has completed
When I tap the tool card to expand it
Then I should see a diff preview with red/green lines
When I tap "View full diff"
Then a full-screen DiffViewer modal should open
And it should show the file path in the header
And line numbers should be visible
And removed lines should have red background
And added lines should have green background
# [Screenshot: diff-viewer-fullscreen.png]
When I tap the X button
Then the modal should close
Scenario: Multiple tools in sequence
When I send "Read src/index.js and src/utils.js, then describe them"
Then multiple tool cards should appear in order
And each should transition from running to success independently
# [Screenshot: multiple-tools.png]
Scenario: Subagent group display
When I send a request that spawns an Agent tool with subtasks
Then I should see a collapsible "Agent" card
And it should show completion count (e.g. "1 of 3 agents running")
When I expand the agent card
Then I should see the individual subtool cards inside
# [Screenshot: subagent-group.png]
Scenario: Tool error display
When a tool execution fails
Then the tool card should show a red X icon
And the status should be "error"
# [Screenshot: tool-error.png]
Scenario: Known tools show specific descriptions
When a "Read" tool card appears with file_path "src/index.ts"
Then the card summary should show "src/index.ts"
When a "WebFetch" tool card appears with url "https://example.com"
Then the card summary should show "https://example.com"
When a "WebSearch" tool card appears with query "react hooks"
Then the card summary should show "react hooks"
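# Illustrative sketch (not part of the original suite): the three known-tool
# checks above could be folded into one Scenario Outline. The step wording and
# the <input_key> column are assumptions; adapt them to the real step definitions.
Scenario Outline: Known tool summary shows its primary input value
When a "<tool>" tool card appears with <input_key> "<value>"
Then the card summary should show "<value>"
Examples:
| tool | input_key | value |
| Read | file_path | src/index.ts |
| WebFetch | url | https://example.com |
| WebSearch | query | react hooks |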
Scenario: Unknown tools show first input value
When a custom MCP tool "mcp__myserver__query" appears
And its input contains {"query": "SELECT * FROM users", "limit": 10}
Then the card summary should show "SELECT * FROM users"
# Fallback: first non-empty string value from input
Feature: Permission System
Background:
Given I am logged in
And permission mode is set to "Normal"
And I open a new chat
Scenario: Permission overlay shows 3 vertically stacked options
When I send "Create a file called test.txt with hello world"
# --- Timeline ---
# T0: User message sent
# T+1s: Tool card appears with "Write" tool (Loading spinner)
# T+2s: Permission overlay slides up from bottom
Then a permission overlay should appear from the bottom
And it should show the tool name and input details
And it should display 3 vertically stacked buttons:
| Position | Button | Style |
| Top | Allow | Green/primary |
| Middle | Allow all for this command | Secondary |
| Bottom | Deny | Ghost |
And a countdown timer should show (starting at 120s)
# [Screenshot: T2-permission-overlay-3-options.png]
Scenario: Allow — tool executes and completes
Given a permission overlay is showing for Write /tmp/test.txt
# T0: User taps "Allow"
When I tap "Allow"
# T+0.5s: CLI selects "Yes", overlay dismisses on mobile
Then the permission overlay should dismiss
# T+1s: CLI executes the tool
# T+2s: PostToolUse fires → tool card transitions to Complete
And the tool card should show Complete (green checkmark)
And the file /tmp/test.txt should exist
And Claude's response should appear
# [Screenshot: T0-permission-allow-tap.png]
# [Screenshot: T2-permission-allow-complete.png]
# Verify: tool card did NOT revert to Loading after Allow
Scenario: Allow all — auto-approve future same-type tools
Given a permission overlay is showing for Write /tmp/test2.txt
# T0: User taps "Allow all for this command"
When I tap "Allow all for this command"
# T+0.5s: CLI selects "Yes, allow all...", switches to accept-edits mode
Then the permission overlay should dismiss
And the file /tmp/test2.txt should exist
# [Screenshot: T0-allow-all-tap.png]
# T+5s: Claude uses another Write tool
When Claude uses another Write tool
Then the tool should auto-approve WITHOUT showing a permission overlay
And the tool card should transition directly to Complete
# [Screenshot: T5-auto-approve-no-overlay.png]
# Note: "Allow all" activates CLI's per-command auto-approve for this session
# It does NOT change the mobile permission mode selector (still shows "Normal")
Scenario: Deny — tool rejected, file not created
Given a permission overlay is showing for Write /tmp/deny.txt
# T0: User taps "Deny"
When I tap "Deny"
# T+0.5s: CLI shows "User rejected"
Then the permission overlay should dismiss
And /tmp/deny.txt should NOT exist
And Claude should acknowledge the denial in its response
# [Screenshot: T0-deny-tap.png]
# [Screenshot: T1-deny-result.png]
Scenario: Permission overlay timeout — auto-dismiss after 2 minutes
Given a permission overlay is showing
When I wait without responding for 2 minutes
Then the mobile overlay should auto-dismiss
# Note: Timeout dismisses the MOBILE overlay only.
# The CLI terminal prompt remains active. Desktop user can still answer.
# If neither answers, CLI eventually times out on its own.
Scenario: Permission for Agent subtool — overlay appears inline
Given Claude is running an Agent
And the Agent uses a Bash subtool that requires permission
# T0: Agent card appears (Loading)
# T+1s: Bash subtool starts → PreToolUse fires
# T+2s: PermissionRequest fires → permission overlay appears
Then the permission overlay should show the Bash command
# [Screenshot: T2-agent-subtool-permission.png]
When I tap "Allow"
# T+3s: Subtool executes → PostToolUse fires → subtool Complete
Then the subtool card should transition to Complete
And the Agent card should remain in Loading (Agent still working)
# [Screenshot: T3-agent-subtool-approved.png]
Scenario: Permission approved — no state corruption on other tools
Given Claude uses Read (auto-approved), then Write (needs permission), then Bash
# T0: Read tool starts → auto-approved → Complete
# T+1s: Write tool starts → permission overlay appears
# T+3s: User taps Allow → Write completes
# T+4s: Bash tool starts
Then Read tool card should remain Complete throughout
And Write tool card should go Loading → Permission → Complete
And Bash tool card should show Loading → Complete
And no tool card should revert status at any point
# [Screenshot: T0-read-complete.png]
# [Screenshot: T1-write-permission.png]
# [Screenshot: T3-write-complete.png]
# [Screenshot: T4-bash-loading.png]
Scenario: AskUserQuestion — select option
When Claude uses AskUserQuestion with options ["Yes", "No", "Maybe"]
Then a question panel should appear (not the permission overlay)
And it should show the question text
And it should list 3 selectable option buttons
And it should have an "Other..." button
# [Screenshot: ask-question-panel.png]
When I tap "Yes"
Then the panel should change to show "Question answered"
And the response "Yes" should be sent to Claude
Scenario: AskUserQuestion — free-form response
Given a question panel is showing
When I tap "Other..."
Then a text input should appear with placeholder "Type your answer..."
When I type "Custom answer" and press Enter
Then the panel should change to show "Question answered"
And the response "Custom answer" should be sent to Claude
Feature: Permission Mode Switching (Mid-Session)
Background:
Given I am logged in
And I have an active chat session
Scenario: Cycle permission modes in StatusBar
# --- Timeline: Mode Cycling ---
# T0: Default mode
Then the StatusBar should show "Normal" as the permission mode
# [Screenshot: T0-mode-normal.png]
When I tap the permission mode label
Then it should cycle to "Auto-edit"
# [Screenshot: T1-mode-auto-edit.png]
When I tap again
Then it should cycle to "Plan"
# [Screenshot: T2-mode-plan.png]
When I tap again
Then it should cycle to "YOLO"
# [Screenshot: T3-mode-yolo.png]
When I tap again
Then it should cycle back to "Normal"
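# Illustrative sketch (not part of the original suite): the cycle order above
# written as a table, one tap per row. The "Given the StatusBar shows" step is
# a hypothetical wording and assumes the mode can be preset before the tap.
Scenario Outline: Each tap advances the permission mode by one step
Given the StatusBar shows "<current>" as the permission mode
When I tap the permission mode label
Then it should cycle to "<next>"
Examples:
| current | next |
| Normal | Auto-edit |
| Auto-edit | Plan |
| Plan | YOLO |
| YOLO | Normal |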
Scenario: YOLO mode auto-allows all tools
Given I set permission mode to "YOLO"
When I send "Create a file called yolo-test.txt"
Then the tool should execute without showing a permission overlay
And the tool card should go directly from running to success
Scenario: Switch to YOLO while permission overlay is showing
Given permission mode is "Normal"
And a permission overlay is currently showing
When I tap the mode label to switch to "YOLO"
Then the permission overlay should dismiss immediately
And the pending tool should proceed (allowed)
# [Screenshot: mode-switch-dismisses-overlay.png]
Scenario: Plan mode handled by CLI natively
Given I set permission mode to "Plan"
When I send "Create a file called plan-test.txt"
Then Claude CLI should handle plan mode restrictions natively
And the permission request should pass through to the terminal
# Note: ClawTap no longer auto-denies in plan mode; the CLI enforces its own restrictions
Scenario: Auto-edit mode allows edits but asks for Bash
Given I set permission mode to "Auto-edit"
When Claude tries to use the "Edit" tool
Then it should auto-allow without showing permission overlay
When Claude tries to use the "Bash" tool
Then a permission overlay should appear
Scenario: Mode persists for resumed sessions
# Regression: resumeSession previously hardcoded --dangerously-skip-permissions
Given I set permission mode to "Normal"
And I close and reopen the app
When I open the same session
Then the mode should still be "Normal"
And permission requests should still appear
Feature: Interrupt / Abort
Background:
Given I am logged in
And I open a new chat
Scenario: Interrupt during streaming response
# --- Timeline: Interrupt Flow ---
When I send "Write a very long essay about the history of computing"
# T1: Streaming begins
Then I should see the streaming indicator
And a stop button should appear (replacing the send button)
# [Screenshot: T1-streaming-with-stop.png]
# T2: User taps stop
When I tap the stop button
# T3: Immediate UI feedback
Then the streaming indicator should disappear immediately
And any running tool cards should show the interrupted icon (ban/circle)
# [Screenshot: T3-interrupted-immediate.png]
# T4: Interrupt marker appears from server
Then within 5 seconds an interrupt marker should appear:
" Interrupted · What should Claude do instead?"
And the input placeholder should change to "What should Claude do instead?"
# [Screenshot: T4-interrupt-marker.png]
Scenario: Interrupt during tool execution
When I send "Run ls -la in the project directory"
And the tool card shows "running"
When I tap the stop button
Then the tool card should immediately show the interrupted icon (not success)
# Regression: previously showed success icon before abort processed
# [Screenshot: tool-interrupted.png]
Scenario: Send follow-up after interrupt
Given I interrupted the previous response
When I type "Instead, just say hi" and send
Then a new user message should appear
And Claude should respond with a new message
And the interrupt marker should remain in history
Scenario: Interrupt detection in session history
# Regression: previously didn't detect interrupts when loading old sessions
Given a session has interrupt markers in its JSONL history
When I open that session
Then the interrupt markers should render as " Interrupted..." (not as user messages)
# =============================================================================
# ADVANCED FEATURES
# =============================================================================
Feature: StatusBar — Model & Context
Background:
Given I am logged in
And I have an active chat session
Scenario: Cycle models in StatusBar
Then the StatusBar should show the current model (default: "Opus 1M")
When I tap the model label
Then it should cycle to the next model
And the new model should be persisted for the next message
Scenario: Context usage display from statusline
# --- Timeline: Context % Updates ---
# T0: No context data yet
Then the StatusBar should NOT show a context progress bar
# [Screenshot: T0-no-context.png]
When I send a message and receive a response
# T1: Statusline hook fires with context data
Then the StatusBar should show a context usage percentage
And a progress bar should appear (green if <50%)
# [Screenshot: T1-context-shown.png]
When I send several more messages
# T2: Context grows
Then the percentage should increase
And the bar color should change:
| Percentage | Color |
| 0-50% | Green |
| 50-80% | Yellow |
| 80-100% | Red |
# [Screenshot: T2-context-growing.png]
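# Illustrative sketch (not part of the original suite): spot checks for the
# color bands above. The sample percentages deliberately avoid the 50% and 80%
# boundaries, because the table does not say which band those exact values fall
# into. The "statusline reports" step is a hypothetical way to inject a value.
Scenario Outline: Context bar color follows the usage band
Given the statusline reports context usage of <percent>%
Then the StatusBar progress bar should be <color>
Examples:
| percent | color |
| 10 | Green |
| 65 | Yellow |
| 90 | Red |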
Scenario: Compacting status in UI
Given I have an active chat session with high context usage
When Claude compacts the conversation context
Then the mobile UI should show "Compacting context..." as the thinking status
# Note: No explicit "compaction done" event — the thinking status is replaced
# when the next event arrives (tool-start, text-delta, etc.)
Feature: Image Upload
Background:
Given I am logged in
And I open a new chat
Scenario: Upload and send image with message
When I tap the image upload button
And I select an image file
Then a thumbnail preview should appear near the input
# [Screenshot: image-thumbnail-preview.png]
When I type "What is in this image?" and send
Then the message should be sent with the image reference
And Claude should respond about the image content
Scenario: Remove image before sending
Given I have selected an image (thumbnail visible)
When I tap the remove button on the thumbnail
Then the thumbnail should disappear
And I should be able to send a text-only message
Scenario: Paste image from clipboard
Given I am in a chat session
When I paste an image from the clipboard (Ctrl+V / Cmd+V)
Then a thumbnail preview should appear near the input
And I should be able to send the message with the pasted image
Feature: Plan Mode UI
Background:
Given I am logged in
And I open a new chat
Scenario: EnterPlanMode shows plan card inline
When Claude enters plan mode (EnterPlanMode tool)
Then a plan card should appear inline in the chat messages
And the card should show truncated plan text (max 500 chars)
And "Approve", "Reject", and "Approve (YOLO)" buttons should be visible
And the input area should remain active for feedback
# [Screenshot: plan-mode-card.png]
When I tap "Approve"
Then an approval message should be sent to Claude
And Claude should proceed with implementation
Scenario: Reject plan with feedback
Given a plan card is showing
When I tap "Reject"
Then a text input should appear for feedback
When I type "Use a different approach" and submit
Then a rejection message with feedback should be sent
Scenario: ExitPlanMode shows collapsible plan document
When Claude exits plan mode with a plan document
Then a collapsed plan card should appear with the plan title
And it should show a "View" button
When I tap "View"
Then a full-screen plan viewer should open with rendered markdown
And a close (X) button should appear in the top corner
# [Screenshot: plan-fullscreen.png]
When I tap the X button
Then the viewer should close and return to chat
Scenario: Approve plan with YOLO mode
Given a plan card is showing with Approve buttons
When I tap "Approve (YOLO)"
Then permission mode should switch to "YOLO" (bypassPermissions)
And the plan should be approved
And subsequent tools should auto-execute without permission prompts
Scenario: Send feedback during plan review
Given a plan card is showing
When I type feedback text in the input field
And I tap the send feedback button
Then the feedback should be sent as a message to Claude
And Claude should incorporate the feedback
Feature: Message Queuing
Background:
Given I am logged in
And I have an active chat session
Scenario: Queue message during streaming
# --- Timeline: Queue Lifecycle ---
When I send "Tell me about React"
# T0: Streaming
Then the stop button should be visible
When I type "Also tell me about Vue" and tap send
# T1: Message queued
Then a queued message bubble should appear
And it should show a "Queued" badge
And it should show the queued text
And "Edit" and "Cancel" buttons should appear
# [Screenshot: T1-queued-message.png]
# T2: First response completes
When the first response finishes
# T3: Queued message auto-sends
Then the queued message should automatically send
And it should appear as a regular user message
# [Screenshot: T3-queue-drained.png]
Scenario: Edit queued message
Given a message is queued
When I tap "Edit"
Then the queued text should appear in the input field
And the queued message bubble should disappear
Scenario: Cancel queued message
Given a message is queued
When I tap "Cancel"
Then the queued message should disappear
And nothing should send when the current response completes
Feature: Task Progress (TodoWrite)
Background:
Given I am logged in
And I have an active chat session
Scenario: View task progress
When Claude uses TodoWrite with a list of tasks
Then a task progress card should appear
And it should show a completion percentage
And each task should show its status:
| Status | Icon |
| pending | Empty circle |
| in_progress | Filled circle |
| completed | Checkmark (strikethrough) |
# [Screenshot: task-progress.png]
Feature: Shimmer Input (Ultra-Think Keywords)
Background:
Given I am logged in and in a chat
Scenario: Shimmer animation on ultra-think keywords (keyword-only)
When I type "ultrathink"
Then ONLY the word "ultrathink" should have a shimmer animation effect
And the rest of the input should render as normal text
# [Screenshot: shimmer-keyword-only.png]
When I clear the text and type "please megathink about this"
Then ONLY the word "megathink" should shimmer
And "please " and " about this" should be normal text
When I clear and type "think harder now"
Then ONLY "think harder" should shimmer
And " now" should be normal text
When I clear and type "normal message"
Then there should be no shimmer effect on any text
When I clear and type "ultrathink and also megathink"
Then both "ultrathink" and "megathink" should shimmer
And " and also " should be normal text
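# Illustrative sketch (not part of the original suite): the single-keyword
# cases above as a Scenario Outline. The step wording is an assumption. The
# no-keyword and two-keyword cases need different assertions and are left in
# the scenario above.
Scenario Outline: Only the ultra-think keyword shimmers
When I clear the input and type "<text>"
Then ONLY "<shimmering>" should shimmer
And the rest of the input should render as normal text
Examples:
| text | shimmering |
| ultrathink | ultrathink |
| please megathink about this | megathink |
| think harder now | think harder |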
# =============================================================================
# RESILIENCE & PERSISTENCE
# =============================================================================
Feature: WebSocket Connection & Keepalive
Background:
Given I am logged in
Scenario: Connection lifecycle
# --- Timeline: Connection States ---
When I open the app
# T0: Connecting
Then the WebSocket should be in "connecting" state
# T1: Connected
Then within 2 seconds it should transition to "connected"
Scenario: Reconnection on disconnect
Given I have an active chat session with messages
When the WebSocket connection drops
# T0: Disconnected
Then "Reconnecting..." should appear in the header
# [Screenshot: T0-reconnecting.png]
# T1: Auto-reconnect (exponential backoff: 1s, 2s, 4s...)
Then within 5 seconds the connection should be re-established
# T2: Session history reloaded
Then all previous messages should still be visible
And the conversation should be in the correct state
# [Screenshot: T2-reconnected-with-history.png]
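# Illustrative sketch of the backoff schedule named in T1 (1s, 2s, 4s...).
# The doubling factor follows the timeline above; the 30-second cap and the
# name `backoff_delays` are assumptions, not values taken from the client code.

```python
from itertools import islice


def backoff_delays(base=1.0, factor=2.0, cap=30.0):
    """Yield reconnect delays in seconds: base, base*factor, ... capped at `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor


# First three delays of the schedule.
first_three = list(islice(backoff_delays(), 3))
```

# If each retry starts as soon as its delay elapses, attempts happen roughly 1s
# and 3s after the drop, so two attempts fit inside the 5-second window above.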
Scenario: Reconnect to active session
Given Claude is currently streaming a response
When the WebSocket disconnects and reconnects
Then the session should resume
And new messages should continue appearing
Scenario: WS connection survives long thinking period (60+ seconds)
Given mobile is connected to a session in ChatView
When Claude is thinking for 60+ seconds (e.g., deep analysis with high effort)
# T0: User sends message
# T+5s: Thinking indicator appears
# T+30s: Server ping/pong keeps connection alive
# T+60s: Another ping/pong cycle
# T+65s: Claude starts responding
Then the WebSocket connection should remain open throughout
And the thinking indicator should update continuously
And when the response arrives, it should stream normally
# [Screenshot: T65-response-after-long-think.png]
# Previously: WS disconnected after ~30s idle, losing all real-time updates
Scenario: WS connection survives Agent execution (60+ seconds)
Given mobile is connected to a session in ChatView
When Claude runs an Agent that executes 10+ subtools over 90 seconds
# T0: Agent starts
# T+30s: ping/pong keeps connection alive
# T+60s: ping/pong again
# T+90s: Agent completes
Then the WS connection should remain open for the entire duration
And tool cards should update in real-time (Loading → Complete)
And the final response should stream to mobile
# [Screenshot: T90-agent-complete-after-keepalive.png]
Feature: Session Resume & History
Background:
Given I am logged in
Scenario: Resume an old session with full history
Given I previously had a conversation with multiple messages
When I open that session from the session list
# T0: Loading
Then I should see the session loading
# T1: History loaded
Then all previous messages should appear in order
And user messages should be on the right (blue)
And assistant messages should be on the left (dark)
And interrupt markers should render correctly
And plan cards should render correctly
# [Screenshot: T1-session-history.png]
# T2: Ready for new input
Then the input field should be ready for a new message
When I send a new message
Then it should continue the conversation
Scenario: Session reconnect preserves scroll position
Given I am viewing a long conversation
And I have scrolled up in the message list
When the WebSocket reconnects
Then my scroll position should be preserved
And new messages should not force-scroll to bottom
Feature: Session Persistence
Background:
Given I am logged in
Scenario: Session survives client disconnect
Given I have an active chat session from mobile
When I close the mobile browser tab
And I wait 60 seconds
Then the tmux window should still be running
When I reopen the mobile app and navigate to the session
Then the session should still be active
And all messages should be preserved
And I should be able to send new messages
Feature: Offline Detection & Mascot
The app detects server unreachability via health polling and shows
an offline screen with a sleeping mascot cat.
Background:
Given I am logged in
Scenario: App shows loading mascot during initial connection
When I open the app for the first time
# T0: App starts, health check pending
Then a loading animation should appear with a floating cat mascot (idle state)
And a "Connecting..." label should pulse below the mascot
# [Screenshot: T0-loading-mascot-idle.png]
# T1: Health check passes
Then the mascot should disappear
And the sessions view should load
# [Screenshot: T1-sessions-loaded.png]
Scenario: Offline view appears when server is unreachable
Given the server has stopped (e.g., codetap process killed)
# T0: Health polling fails (2 consecutive failures, 15s interval)
# T+30s: Offline view appears
Then an OfflineView should replace the current view
And it should show a sleeping cat mascot (sleep state)
And it should show the "codetap" command for restarting
And a "Retry" button should be visible
# [Screenshot: T30-offline-view-sleeping-cat.png]
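# Illustrative sketch of the "2 consecutive failures" rule from T0. The class
# name `OfflineDetector` and its interface are hypothetical; only the threshold
# of 2 comes from this scenario.

```python
class OfflineDetector:
    """Tracks health-check results and reports when the offline view is due."""

    def __init__(self, failure_threshold=2):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def record(self, healthy):
        """Record one poll result; return True once the threshold is reached."""
        if healthy:
            # Any successful poll resets the streak.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.failure_threshold
```

# With a 15s polling interval, two failed polls in a row take about 30 seconds,
# which matches the T+30s point in the timeline.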
Scenario: Retry button reconnects to server
Given the offline view is showing
And the server has been restarted
When I tap "Retry"
Then the app should attempt to reconnect
And the loading mascot should appear (idle state)
And within 5 seconds the sessions view should load
# [Screenshot: retry-reconnected.png]
Scenario: ChatView shows "Reconnecting..." during temporary disconnect
Given I am in ChatView with an active session
When the server becomes temporarily unreachable
Then "Reconnecting..." should appear in the ChatView header
And the chat messages should remain visible (not replaced by offline view)
When the server becomes reachable again
Then "Reconnecting..." should disappear
And the session should resume normally
# [Screenshot: chatview-reconnecting-header.png]
Scenario: Browser offline event triggers reconnection attempt
Given I am using the app on a mobile device
When the device loses internet connection (airplane mode)
Then the app should detect the offline state
And "Reconnecting..." should appear
When internet is restored
Then the app should automatically reconnect
# =============================================================================
# STREAMING & MONITORING
# =============================================================================
Feature: Streaming Text Pipeline
PaneMonitor captures tmux pane content, extracts response text,
and streams it to mobile as TEXT_DELTA events via WebSocket.
Background:
Given a Claude session is active
And mobile is connected to the session in ChatView
Scenario: Response text streams incrementally to mobile
When Claude generates a 500+ word response
# T0: User message sent
# T+2s: Thinking indicator appears
# T+5s: Claude starts writing response
# T+5.5s: First streaming text visible on mobile (~50 chars)
# T+6.0s: More text appears (~200 chars)
# T+6.5s: More text (~350 chars)
# ...continues every ~500ms...
# T+12s: Response complete → streaming stops
Then text should appear incrementally on mobile (not all at once)
And each update should show more text than the previous
And the final complete message should match the streamed content
# [Screenshot: T5-streaming-start.png] (first ~50 chars visible)
# [Screenshot: T7-streaming-mid.png] (200+ chars visible)
# [Screenshot: T10-streaming-near-end.png] (full text forming)
# [Screenshot: T12-streaming-complete.png] (final state)
Scenario: Streaming works correctly after context compaction
Given Claude is in a long session with high context usage
When context compaction occurs (StatusBar shows "Compacting context...")
And the next user message triggers a response
Then the new response should stream fresh content
And old response text should NOT leak into the new stream
And the streaming preview should show only the latest response
# [Screenshot: post-compaction-streaming.png]
Scenario: Streaming shows only the latest response (multi-prompt session)
Given the user has sent 5+ prompts with responses
When the user sends a new message and Claude responds
Then the streaming preview should show ONLY the new response text
And it should NOT include text from earlier responses
# Previously: PaneMonitor found first ⏺ marker instead of last, mixing old responses
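# Illustrative sketch of the fix described above: search for the LAST response
# marker rather than the first. The function name and the simplified pane format
# are assumptions; the real PaneMonitor likely does more cleanup.

```python
MARKER = "⏺"


def latest_response(pane_text):
    """Return the text after the last response marker, or "" if none exists."""
    idx = pane_text.rfind(MARKER)  # rfind picks the most recent marker
    if idx == -1:
        return ""
    return pane_text[idx + len(MARKER):].strip()
```

# Using find() instead of rfind() here would reproduce the old behavior, where
# text from earlier responses leaked into the streaming preview.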
Feature: SubagentStop — Streaming Preservation
When a subagent (Agent tool) completes, PaneMonitor stays active
and the parent session's response continues to stream.
Background:
Given a Claude session is active
And mobile is connected in ChatView
Scenario: Response streams after Agent subtools complete
When Claude is running an Agent with subtools (Read, Bash, etc.)
# T0: Agent card appears (Loading)
# T+2s: Subtools execute, sub-tool cards show Loading → Complete
# T+5s: Agent subtools finish, Agent generates final response text
# T+6s: Response text starts streaming to mobile
# T+8s: More text appears
# T+10s: Response complete
Then after the Agent's subtools finish, response text should still stream to mobile
And the text should appear incrementally (not all at once when turn ends)
And tool cards should remain in their final status (not reset)
# [Screenshot: T5-subtools-done-streaming-starts.png]
# [Screenshot: T8-streaming-after-subagent.png]
# [Screenshot: T10-response-complete.png]
# Previously: SubagentStop killed the streaming monitor, text appeared all at once
Scenario: No premature turn completion after Agent subtools finish
When Claude runs an Agent that completes its subtools
Then the mobile should NOT show "turn complete" state prematurely
And subsequent tool events should appear normally (Loading → Complete)
And new tool cards should NOT be discarded
When the final response is complete
Then the turn should end normally
# Previously: premature TURN_COMPLETE caused tool cards to be discarded
Scenario: Multiple nested agents — each completes independently
Given Claude spawns Agent A which spawns Agent B
# T0: Agent A card appears (Loading)
# T+3s: Agent B card appears nested under A (Loading)
# T+8s: Agent B completes → B's card shows Complete
# T+10s: Agent A continues with more work
# T+15s: Agent A completes → A's card shows Complete
# T+18s: Final response streams
When Agent B completes
Then Agent A should continue working (card still Loading)
And streaming should remain active
When Agent A completes
Then the final response should stream normally
When the turn ends
Then all tool cards should show Complete
# [Screenshot: T8-agent-b-complete.png]
# [Screenshot: T10-agent-a-still-working.png]
# [Screenshot: T18-all-complete.png]
# =============================================================================
# INTEGRATION TEST
# =============================================================================
Feature: Cross-Feature Timeline — Full Chat Lifecycle
# This scenario tests the complete lifecycle of a chat session
# with explicit screenshot points at each state transition.
# It exercises: chat, tools, permissions, interrupt, mode switching,
# context display, and session persistence in a single flow.
Scenario: Complete chat lifecycle with tools, permissions, and interrupt
Given I am logged in
And permission mode is "Normal"
# === Phase 1: New Chat ===
When I tap "New Chat" on a project
Then I should see an empty chat view
# [Screenshot: lifecycle-01-empty-chat.png]
# === Phase 2: First Message ===
When I send "Create a hello.txt file with 'Hello World'"
Then my message appears immediately
# [Screenshot: lifecycle-02-message-sent.png]
# === Phase 3: Streaming ===
Then streaming indicator appears
# [Screenshot: lifecycle-03-streaming.png]
# === Phase 4: Tool Start ===
Then a "Write" tool card appears with running status
# [Screenshot: lifecycle-04-tool-running.png]
# === Phase 5: Permission Request (Non-Blocking) ===
Then permission overlay slides up with 3 options: Allow, Allow all, Deny
# Note: CLI terminal also shows its own permission prompt simultaneously
# Mobile overlay and desktop terminal can both answer — first response wins
# [Screenshot: lifecycle-05-permission-overlay.png]
# === Phase 6: Permission Granted ===
When I tap "Allow"
Then overlay dismisses, tool proceeds
# [Screenshot: lifecycle-06-permission-allowed.png]
# === Phase 7: Tool Complete ===
Then tool card shows success (green checkmark)
# [Screenshot: lifecycle-07-tool-success.png]
# === Phase 8: Response Complete ===
Then assistant response appears, streaming stops
And StatusBar shows context usage percentage
# [Screenshot: lifecycle-08-response-complete.png]
# === Phase 9: Second Message (Trigger Interrupt) ===
When I send "Now write a very long essay about this file"
Then streaming begins again
# [Screenshot: lifecycle-09-streaming-again.png]
# === Phase 10: Interrupt ===
When I tap the stop button
Then streaming stops, interrupt marker appears
And any running tools show interrupted status
# [Screenshot: lifecycle-10-interrupted.png]
# === Phase 11: Follow-up After Interrupt ===
When I send "Just describe what you would have written"
Then conversation continues normally
# [Screenshot: lifecycle-11-follow-up.png]
# === Phase 12: Mode Switch ===
When I tap the permission mode to switch to "YOLO"
Then StatusBar updates to show "YOLO"
# [Screenshot: lifecycle-12-mode-yolo.png]
# === Phase 13: Tool Without Permission (YOLO) ===
When I send "Read package.json"
Then the tool executes without permission overlay
# [Screenshot: lifecycle-13-yolo-no-permission.png]
# === Phase 14: Close & Reopen ===
When I go back to the session list
Then this session should appear at the top (most recent)
# [Screenshot: lifecycle-14-session-in-list.png]
When I tap the session to reopen it
Then all messages, tools, and interrupts should be preserved
# [Screenshot: lifecycle-15-reopened-session.png]
# =============================================================================
# DESKTOP ↔ MOBILE SYNC & SESSION MANAGEMENT
# =============================================================================
# These scenarios test bidirectional sync between the desktop CLI (via tmux)
# and the mobile UI (via agent-browser). They require both tmux commands
# and browser interaction.
#
# STEP DEFINITIONS:
# "I type <text> in the desktop terminal"
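# tmux send-keys -t codetap:<window> -l "<text>"
# tmux send-keys -t codetap:<window> Enter
# (-l sends every argument literally, so Enter needs its own call)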
#
# "I start a desktop session"
# run `codetap` in a terminal (creates tmux window + Claude CLI)
#
# "I start a desktop session with /resume"
# in existing Claude CLI, type `/resume` and select a session
# OR run `codetap --resume <session-id>`
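# Illustrative sketch of the "I type <text> in the desktop terminal" step.
# tmux's -l flag makes send-keys treat its arguments as literal text, so this
# sketch sends the text and the Enter key as two separate invocations. The
# helper names are hypothetical and are not taken from the test harness.

```python
import subprocess


def send_keys_commands(window, text, session="codetap"):
    """Build the two tmux invocations: literal text first, then the Enter key."""
    target = f"{session}:{window}"
    return [
        ["tmux", "send-keys", "-t", target, "-l", text],
        ["tmux", "send-keys", "-t", target, "Enter"],
    ]


def type_in_desktop_terminal(window, text):
    """Run both commands; requires a running tmux server with that window."""
    for argv in send_keys_commands(window, text):
        subprocess.run(argv, check=True)
```

# Passing argv lists instead of a shell string avoids quoting problems when the
# typed text contains quotes or other shell metacharacters.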
Feature: Desktop ↔ Mobile Session Discovery
Background:
Given I am logged in
And the server is running with tmux
Scenario: Desktop new session appears in Active tab (A1)
When I start a desktop session via `codetap`
And I type "Hi" in the desktop terminal
Then within 10 seconds the Active tab should show the new session
And it should display the first prompt "Hi" (not a UUID)
And it should show the project name and permission mode
# [Screenshot: sync-A1-desktop-session-in-active.png]
Scenario: Mobile new session creates tmux window (A2)
When I tap "New Chat" on a project
And I send "Hello from mobile"
Then a new tmux window should appear in the codetap session
And the Active tab should show this session
# [Screenshot: sync-A2-mobile-creates-tmux.png]
Scenario: Desktop /resume makes old session active (A3)
Given I have a historical session in the Projects tab
When I type `/resume` in the desktop Claude CLI and select that session
Then within 10 seconds the Active tab should show the resumed session
And it should display the correct firstPrompt from the old session
# [Screenshot: sync-A3-resume-in-active.png]
Scenario: CLI codetap --resume creates mapped session (A4)
Given I have a historical session with a known session ID
When I run `codetap --resume <session-id>` in a new terminal
# tmux windows follow {adapter}-{timestamp} naming: claude-1774210269126, codex-1774210345678
Then a new tmux window named "{adapter}-{timestamp}" should be created
And within 10 seconds the Active tab should show this session
# [Screenshot: sync-A4-cli-resume.png]
Scenario: Multiple active sessions displayed (A5)
When I start 3 desktop sessions via `codetap` in separate windows
And each sends a different first message
Then the Active tab should show all 3 sessions
And each should display its own firstPrompt
# [Screenshot: sync-A5-multiple-active.png]
Scenario: Second terminal detects running server (A6)
Given the server is already running (started by first `codetap`)
When I run `codetap` in a new terminal WITHOUT CLAWTAP_PASSWORD set
Then the CLI should detect the running server via health check
And it should create a new tmux window with Claude Code
And no password prompt should appear
# [Screenshot: sync-A6-second-terminal.png]
Scenario: codetap -a lists active sessions for current project (A7)
Given I have 2 active sessions in the current project directory
When I run `codetap -a`
Then I should see a numbered list with format:
| Field | Example |
| Internal ID | claude-1774210269126 |
| UUID | 625c60d0-aedb-4e0b-b78e-c9fbf0405e67 |
| Preview | First line of tmux pane content |
When I select a session by number
Then I should attach to that tmux window
Scenario: codetap -A lists ALL active sessions across projects (A8)
Given I have active sessions in different project directories
When I run `codetap -A`
Then I should see all active sessions regardless of project
Scenario: codetap stop kills server and all sessions (A9)
Given the server is running with 2 active sessions visible in the Active tab
When I run `codetap stop` in a terminal
Then the server should shut down
And all sessions should disappear from the Active tab
And mobile should show the offline/reconnecting state
# [Screenshot: sync-A9-after-stop.png]
Scenario: codetap --continue resumes most recent session (A10)
Given I have a historical session from a recent conversation
When I run `codetap --continue` in a terminal
Then the most recent session should be resumed
And the Active tab should show the resumed session
# [Screenshot: sync-A10-continue.png]
Feature: CLI codetap new
Background:
Given the CodeTap server is running
Scenario: codetap new starts a Claude session (default adapter)
When I run `codetap new` in a terminal
Then a tmux window named "claude-{timestamp}" should be created
And the Claude CLI should start with --dangerously-skip-permissions
Scenario: codetap new --adapter codex starts a Codex session
When I run `codetap new --adapter codex` in a terminal
Then a tmux window named "codex-{timestamp}" should be created
And the Codex CLI should start with -a never
Scenario: codetap new without --adapter defaults to claude
When I run `codetap new` without specifying --adapter
Then the adapter should default to "claude"
Feature: CLI codetap --resume
Background:
Given the CodeTap server is running
And I have previous sessions from both Claude and Codex
Scenario: codetap --resume with window name
When I run `codetap --resume claude-1774225283`
Then the Claude CLI should resume with --resume claude-1774225283
Scenario: codetap --resume with Claude CLI UUID
Given a Claude session with UUID "81dec4e4-739c-4f24-9b08-23952037fb0f" exists
When I run `codetap --resume 81dec4e4-739c-4f24-9b08-23952037fb0f`
Then the adapter should be auto-detected as "claude" from DB or JSONL scan
And the Claude CLI should resume that session
Scenario: codetap --resume with Codex CLI UUID
Given a Codex session with UUID "019d17de-8a7f-72c3-b879-6fcd21dab303" exists
When I run `codetap --resume 019d17de-8a7f-72c3-b879-6fcd21dab303`
Then the adapter should be auto-detected as "codex" from DB or JSONL scan
And the Codex CLI should resume that session
Scenario: codetap --resume auto-detects adapter from DB
Given a session with id "codex-1774225400" exists in the DB with adapter "codex"
When I run `codetap --resume codex-1774225400` without --adapter
Then the adapter should be detected as "codex" from the DB
And the Codex CLI should resume that session
Scenario: codetap --resume auto-detects adapter from JSONL file scan
Given a Claude JSONL file exists at ~/.claude/projects/.../<uuid>.jsonl
And the session is NOT in the CodeTap DB
When I run `codetap --resume <uuid>`
Then the adapter should be detected as "claude" from the JSONL file scan
Scenario: codetap --adapter codex --resume skips search
When I run `codetap --adapter codex --resume some-id`
Then the search should be skipped
And the Codex CLI should directly attempt to resume "some-id"
Scenario: codetap --resume with unknown ID and no --adapter shows error
When I run `codetap --resume nonexistent-id`
Then an error should be shown: "Session not found: nonexistent-id"
Scenario: codetap --resume with unknown ID + --adapter passes through
When I run `codetap --adapter claude --resume nonexistent-id`
Then the Claude CLI should attempt to resume "nonexistent-id"
And the CLI itself will handle the error if the session doesn't exist
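# Illustrative sketch of the lookup order these scenarios imply: an explicit
# --adapter skips the search, otherwise the DB is checked first, then a JSONL
# scan, and finally an error is raised. The DB-before-JSONL order, the function
# signature, and the dict/callable stand-ins for the DB and the file scan are
# assumptions.

```python
class SessionNotFound(Exception):
    """Raised when no adapter can be determined for a session ID."""


def resolve_adapter(session_id, explicit_adapter=None, db=None, jsonl_scan=None):
    """Return the adapter name to use when resuming `session_id`."""
    if explicit_adapter:
        # --adapter given: skip the search and pass the ID straight through.
        return explicit_adapter
    if db and session_id in db:
        return db[session_id]
    if jsonl_scan:
        found = jsonl_scan(session_id)
        if found:
            return found
    raise SessionNotFound(f"Session not found: {session_id}")
```

# With an explicit adapter, an unknown ID is never rejected here; the target CLI
# reports the error itself, as the pass-through scenario describes.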
Feature: CLI codetap --continue
Background:
Given the CodeTap server is running
Scenario: codetap --continue resumes most recent Claude session
When I run `codetap --continue`
Then the Claude CLI should run with --continue flag
Scenario: codetap --adapter codex --continue resumes most recent Codex session
When I run `codetap --adapter codex --continue`
Then the Codex CLI should run `codex -a never resume --last`
Scenario: codetap --continue without --adapter defaults to claude
When I run `codetap --continue` without specifying --adapter
Then the adapter should default to "claude"
Feature: CLI codetap hooks
Background:
Given the CodeTap server is running
Scenario: codetap hooks install installs Claude + Codex hooks
When I run `codetap hooks install`
Then hooks should be installed in ~/.claude/settings.json
And hooks should be installed in ~/.codex/hooks.json
Scenario: codetap hooks uninstall removes Claude + Codex hooks
When I run `codetap hooks uninstall`
Then CodeTap hooks should be removed from ~/.claude/settings.json
And CodeTap hooks should be removed from ~/.codex/hooks.json
Scenario: hooks install enables codex_hooks feature flag
When I run `codetap hooks install`
Then ~/.codex/config.toml should contain codex_hooks = true under [features]
Feature: CLI codetap -a / -A (session listing)
Background:
Given the CodeTap server is running
And I have active Claude and Codex sessions in tmux
Scenario: codetap -a lists only current project sessions
When I run `codetap -a` in a project directory
Then only sessions with cwd matching the current directory should be listed
Scenario: codetap -A lists all sessions across all projects
When I run `codetap -A`
Then all active sessions from all projects should be listed
Scenario: codetap -a shows adapter label with color
When I run `codetap -a`
Then Claude sessions should show an amber [Claude] label
And Codex sessions should show a green [Codex] label
Scenario: codetap -a shows both window name and CLI session UUID
When I run `codetap -a`
Then each session should display:
| Field | Example |
| Adapter | [Claude] or [Codex] |
| Window | claude-1774225283 |
| UUID | 81dec4e4-739c-4f24-9b08-23952037fb0f |
| Preview | (first line of pane content) |
Scenario: codetap -a with no active sessions shows helpful message
Given no tmux windows exist (except main)
When I run `codetap -a`
Then I should see "No active sessions"
Scenario: codetap -A shows cwd path for each session
When I run `codetap -A`
Then each session should additionally show its working directory path
Scenario: codetap -a with tmux windows not in DB shows adapter as unknown
Given a tmux window exists that was not created by CodeTap
When I run `codetap -a`
Then that session should show adapter as "?" with no UUID
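# Illustrative sketch of the listing rules above: -a filters by the current
# directory, -A lists everything, and windows missing from the DB show "?" with
# no UUID. The data shapes (dicts for tmux windows and DB rows) and the function
# name are hypothetical.

```python
def list_sessions(windows, db, cwd=None):
    """Build listing rows; pass `cwd` for -a behavior, omit it for -A."""
    rows = []
    for window in windows:
        if cwd is not None and window["cwd"] != cwd:
            continue  # -a: skip sessions from other project directories
        meta = db.get(window["name"])
        rows.append({
            "adapter": meta["adapter"] if meta else "?",
            "window": window["name"],
            "uuid": meta["uuid"] if meta else None,
            "cwd": window["cwd"],
        })
    return rows
```

# An empty result would be where the caller prints "No active sessions".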
Feature: Desktop ↔ Mobile Session Lifecycle
Background:
Given I am logged in
And the server is running with tmux
Scenario: Desktop /exit ends session → becomes historical (LC1)
# /exit terminates Claude CLI → tmux window closes → session removed from Active
# BUT the session still exists as historical (JSONL file persists)
Given I have an active session visible in the Active tab
And mobile is connected to the session
When the desktop user types "/exit" in Claude CLI
# T0: Claude CLI exits → tmux window closes → SessionEnd hook fires
Then mobile should receive a SESSION_ENDED event
And the session should disappear from the Active tab
# [Screenshot: lifecycle-LC1-session-ended.png]
# T1: Session is now historical, still visible in Projects drill-down
When I navigate to the project in the Projects tab
Then the session should still appear in the sessions list (historical)
And the green dot should NOT be shown (not active)
# [Screenshot: lifecycle-LC1-historical.png]
Scenario: Full session lifecycle: create → use → exit → resume (LC2)
# Complete lifecycle test: active → historical → active again
# T0: Create session from desktop
When I start a desktop session via `codetap`
And I type "Hello from lifecycle test" in the desktop terminal
Then the Active tab should show the session
# [Screenshot: lifecycle-LC2-T0-active.png]
# T1: Exit → session becomes historical
When the desktop user types "/exit" in Claude CLI
Then the session should disappear from the Active tab
# [Screenshot: lifecycle-LC2-T1-historical.png]
# T2: Resume → session becomes active again
When I run `codetap --resume <session-id>` in a new terminal
Then the Active tab should show the resumed session
And the previous messages should be visible when connecting from mobile
# [Screenshot: lifecycle-LC2-T2-resumed.png]
Scenario: Desktop detaches from tmux → session stays active (LC3)
# User presses Ctrl+B, D to detach → Claude CLI keeps running in background
Given I have an active session started via `codetap`
And mobile is connected to the session
When the desktop user detaches from tmux (Ctrl+B, D)
Then the tmux window should still be running (Claude CLI alive)
And the session should still appear in the Active tab
And mobile should still be able to send messages
And Claude should still respond (running headlessly in tmux)
# [Screenshot: lifecycle-LC3-detached-still-active.png]
Scenario: Desktop re-attaches to tmux after detach (LC4)
Given a session is running in tmux (detached)
When the desktop user runs `tmux attach -t codetap`
Then they should see the Claude CLI exactly where they left off
And the session should continue working normally
Scenario: Mobile disconnect → session persists in tmux (LC5)
Given I have an active session with mobile connected
When I close the mobile browser tab
Then the tmux window should still be running
And the session should still appear in the Active tab (no clients)
When I reopen the mobile app and connect to the session
Then all previous messages should be preserved
And I should be able to continue sending messages
# [Screenshot: lifecycle-LC5-mobile-reconnect.png]
Scenario: Server restart → sessions survive in tmux (LC6)
Given I have 2 active sessions running in tmux
When the server process is killed (simulating crash)
Then the tmux windows should still be running (Claude CLI alive)
When the server is restarted
And I open the mobile app
# Note: Sessions are recovered via SQLite. CLI UUID is the primary key.
# When hooks fire from surviving CLI instances, sessions are found directly by UUID.
When I connect to a known session ID from a previous page
Then the session should re-attach and history should be preserved
# [Screenshot: lifecycle-LC6-server-restart.png]
Scenario: Non-graceful restart restores session by CLI UUID (LC7)
Given a session exists with CLI UUID "d6d56787-bfaf-4312-ae4d-99683ba45459"
When the server crashes without running shutdown
And the tmux window survives
And the server restarts
When the CLI fires a hook (e.g., UserPromptSubmit)
Then the session should be found directly by CLI UUID in the DB
And the session should reappear in Active tab with the same CLI UUID
Scenario: Disconnect button kills tmux window (LC8)
Given I have an active session in the Active tab
When I expand the session card and tap "Disconnect"
Then the tmux window should be killed
And the session should disappear from the Active tab
And if mobile was connected, it should receive SESSION_ENDED
# [Screenshot: lifecycle-LC8-disconnect-kill.png]
Feature: Desktop ↔ Mobile Bidirectional Message Sync
Background:
Given I am logged in
And I have a desktop session running in tmux
And I am connected to that session from mobile
Scenario: Mobile input syncs to desktop (B1)
When I send "Hi" from mobile
Then the desktop tmux pane should show "Hi" being typed
And Claude's response should appear on both mobile and desktop
# [Screenshot: sync-B1-mobile-to-desktop.png]
Scenario: Desktop input syncs to mobile (B2)
When I type "Tell me a joke" in the desktop terminal
Then within 10 seconds mobile should show a user message bubble "Tell me a joke"
And mobile should show the assistant response bubble
# [Screenshot: sync-B2-desktop-to-mobile.png]
Scenario: Alternating input from both sides (B3)
# --- Timeline: Three-round alternating conversation ---
# T0: Mobile sends first
When I send "Round 1 from mobile" from mobile
Then both sides should show the response
# [Screenshot: sync-B3-T0-round1.png]
# T1: Desktop sends second
When I type "Round 2 from desktop" in the desktop terminal
Then mobile should show both the user message and response
# [Screenshot: sync-B3-T1-round2.png]
# T2: Mobile sends third
When I send "Round 3 from mobile" from mobile
Then all 6 messages (3 user + 3 assistant) should be visible on both sides
And the order should be consistent between mobile and desktop
# [Screenshot: sync-B3-T2-round3.png]
Feature: Desktop ↔ Mobile Resume Session Sync
# These test the complete flow: desktop resumes an old session → mobile connects → both chat
# Resume has unique mechanics: different JSONL path lookup, session mapping, history loading
Background:
Given I am logged in
And the server is running with tmux
Scenario: codetap CLI new session → mobile connect → bidirectional chat (RS1)
# Full flow: desktop starts fresh session via CLI, mobile discovers and joins
# T0: Desktop starts session
When I run `codetap` in a terminal
And I type "Hello from desktop" in the desktop terminal
And Claude responds
# [Screenshot: sync-RS1-T0-desktop-started.png]
# T1: Mobile discovers and connects
When I open the Active tab on mobile
Then I should see the session with firstPrompt "Hello from desktop"
When I expand and tap "Connect"
Then I should see the full history: user "Hello from desktop" + assistant response
# [Screenshot: sync-RS1-T1-mobile-connected.png]
# T2: Mobile sends → desktop sees
When I send "Hello from mobile" from mobile
Then the desktop tmux pane should show "Hello from mobile" being typed
And Claude's response should appear on both sides
# [Screenshot: sync-RS1-T2-mobile-to-desktop.png]
# T3: Desktop sends → mobile sees in real-time
When I type "Desktop reply" in the desktop terminal
Then mobile should show a user message "Desktop reply" (blue bubble)
And mobile should show Claude's response
# [Screenshot: sync-RS1-T3-desktop-to-mobile.png]
Scenario: codetap --resume → mobile connect → bidirectional chat (RS2)
# Resume an old session via CLI argument
# NOTE: `codetap --resume` uses --dangerously-skip-permissions (YOLO mode) by default
Given I have a historical session with known ID from a previous conversation
# T0: Desktop resumes (YOLO mode)
When I run `codetap --resume <session-id>` in a terminal
Then the Claude CLI should load the old session's context
And permission mode should be "YOLO" (bypassPermissions)
# [Screenshot: sync-RS2-T0-resumed.png]
# T1: Mobile discovers
When I open the Active tab on mobile
Then I should see the resumed session
When I connect to it
Then I should see the old conversation history plus any new messages
# [Screenshot: sync-RS2-T1-mobile-history.png]
# T2: Bidirectional chat in resumed session
When I send "Continue from mobile" from mobile
Then Claude should respond in context of the old conversation
And the desktop should show the mobile-sent message and response
# [Screenshot: sync-RS2-T2-resume-sync.png]
When I type "Continue from desktop" in the desktop terminal
Then mobile should show the desktop message and response in real-time
# [Screenshot: sync-RS2-T3-resume-desktop.png]
Scenario: Claude CLI /resume → mobile connect → bidirectional chat (RS3)
# Resume via the /resume command inside an already-running Claude CLI
Given I have an active Claude CLI session in tmux
# T0: Desktop uses /resume
When I type "/resume" in the Claude CLI
And I select an old session from the list
Then Claude should load the old session's context
# [Screenshot: sync-RS3-T0-cli-resume.png]
# T1: Mobile discovers the resumed session
When I check the Active tab on mobile
Then the old session should appear as active
When I connect to it
Then the old conversation history should be visible
# [Screenshot: sync-RS3-T1-mobile-connected.png]
# T2: Chat continues with sync
When I send "Question about previous context" from mobile
Then Claude should answer using context from the old session
And desktop should show the mobile message
# [Screenshot: sync-RS3-T2-resume-chat.png]
Scenario: Mobile resumes historical session → desktop window created → sync (RS4)
# User browses old sessions on mobile and reopens one
Given I have a historical session in the Projects tab (not currently active)
# T0: Mobile opens old session
When I navigate to the project and tap on a historical session
Then the chat view should load with full history
# [Screenshot: sync-RS4-T0-history-loaded.png]
# T1: Mobile sends a new message (triggers session resume)
When I send "Continuing this old session from mobile"
Then a new tmux window should be created in the codetap session
And Claude should respond in context of the old conversation
# [Screenshot: sync-RS4-T1-session-resumed.png]
# T2: Desktop can see and interact with the resumed session
When the desktop user attaches to the tmux window
Then they should see Claude's prompt ready for input
When the desktop user types "Desktop also joining"
Then mobile should show the desktop message in real-time
# [Screenshot: sync-RS4-T2-desktop-joins.png]
Scenario: Long response streaming sync (B4)
When I send a complex question from mobile
# T0: Thinking
Then mobile should show the thinking indicator
# [Screenshot: sync-B4-T0-thinking.png]
# T1: Streaming preview
Then the streaming preview should appear with partial response text
# [Screenshot: sync-B4-T1-streaming.png]
# T2: Complete
Then the full response should appear as a single bubble (no duplicate)
And the streaming indicator should disappear
# [Screenshot: sync-B4-T2-complete.png]
Scenario: Tool call response syncs correctly (B5)
When I send "Read the file package.json" from mobile
Then mobile should show a tool card for "Read" with running status
And within 15 seconds the tool should complete (green checkmark)
And the assistant response should appear on mobile
And the desktop terminal should show the same tool execution
# [Screenshot: sync-B5-tool-sync.png]
Feature: Response Display Correctness
Background:
Given I am logged in
And I have an active chat session
Scenario: Single response, no duplicate bubble (C1)
When I send a simple question from mobile
Then I should see exactly one response bubble
And there should NOT be a simultaneous "Responding..." bubble
And there should NOT be two copies of the same response text
# [Screenshot: response-C1-no-duplicate.png]
Scenario: Multi-tool turn, single final response (C2)
When I send a request that triggers multiple tool calls
Then tool cards should appear and transition (running → success)
And the final assistant response should appear as exactly one bubble
And there should be no duplicate or ghost bubbles
# [Screenshot: response-C2-multi-tool.png]
Scenario: Thinking indicator lifecycle (C3)
When I send a question that requires thinking
# T0: Thinking shows
Then the thinking indicator should appear (spinner + verb)
# [Screenshot: response-C3-T0-thinking.png]
# T1: Response replaces thinking
Then when the response appears, the thinking indicator should disappear
And only the response bubble should remain
# [Screenshot: response-C3-T1-response.png]
Scenario: Interrupt then re-send (C4)
When I send a question from mobile
And I tap the stop button during streaming
Then the response should be marked as interrupted
# [Screenshot: response-C4-interrupted.png]
When I send a follow-up message
Then the new response should display normally
And the interrupted marker should remain in history
# [Screenshot: response-C4-followup.png]
Feature: Active Sessions: Expandable Cards & Disconnect
Background:
Given I am logged in
And I have active sessions running
Scenario: Active session shows firstPrompt instead of UUID (A1/5a)
Given a desktop session has been used with the first message "Explain React hooks"
When I view the Active tab
Then the session should display "Explain React hooks" as the title
And it should NOT show a raw UUID like "f925ca56-6093-4ebd-..."
# [Screenshot: active-firstprompt.png]
Scenario: Expand active session card
When I tap on an active session row
Then the card should expand to show additional details
And I should see a "Connect" button
And I should see a "Disconnect" button (in red)
# [Screenshot: active-expanded-card.png]
Scenario: Expanded active session card shows session ID
When I expand an active session card
Then I should see the CLI UUID (e.g. "d6d56787-bfaf-4312...")
And a copy button should be available for the UUID
And the card title remains the firstPrompt (not UUID)
Scenario: Collapse expanded card
Given an active session card is expanded
When I tap on the same session row again
Then the card should collapse back to the compact view
Scenario: Connect to active session
Given an active session card is expanded
When I tap "Connect"
Then I should enter the chat view for that session
And the full message history should load
# [Screenshot: active-connect.png]
Scenario: Disconnect (destroy) active session
Given an active session card is expanded
When I tap "Disconnect"
Then the session should be removed from the Active list
And the tmux window should be killed
And the Active count should decrease by 1
# [Screenshot: active-disconnect.png]
Scenario: Active tab refreshes every 3 seconds
Given I am viewing the Active tab
When a new session becomes active
Then it should appear in the Active tab within 3 seconds
# Previously: polling was 10 seconds, causing delayed discovery
Feature: Reconnect Streaming State Restoration
Background:
Given I am logged in
And I have an active chat session
Scenario: Refresh during idle → no streaming indicator (E1)
Given Claude is idle (waiting for input)
When I refresh the page
Then I should see the full message history
And there should be NO streaming indicator or thinking status
# [Screenshot: reconnect-E1-idle.png]
Scenario: Refresh during thinking → indicator restored (E1b)
When I send a question from mobile
And Claude is currently thinking (spinner visible)
When I refresh the page
Then I should see the message history
And the thinking/streaming indicator should reappear within 1 second
# [Screenshot: reconnect-E1b-thinking-restored.png]
Scenario: Refresh during response → streaming restored (E1c)
When I send a question from mobile
And Claude is currently streaming a response
When I refresh the page
Then I should see the message history
And the streaming indicator should reappear
And when the response completes, the indicator should disappear normally
# [Screenshot: reconnect-E1c-streaming-restored.png]
Scenario: Refresh during desktop-sent thinking → indicator restored (E1d)
# E1b/E1c only test mobile-sent. This covers desktop-sent.
Given I have a desktop session with mobile connected
When the desktop user sends a complex message
And mobile shows the "Working..." indicator
When I refresh the mobile browser
Then I should see the message history
And the "Working..." indicator should reappear within 1 second
And when the response completes, the indicator should disappear normally
# Mechanism: handleReconnect sends SESSION_STATE streaming:true when isProcessing
# [Screenshot: reconnect-E1d-desktop-thinking-restored.png]
Scenario: Refresh during tool execution → tool card restored (E1e)
# Tool statuses are replayed via TOOL_UPDATES on reconnect.
Given a desktop session has a tool in progress (e.g. Read)
And mobile shows the tool card with a spinner
When I refresh the mobile browser
Then I should see the message history
And the tool card should reappear with running status
# Mechanism: handleReconnect sends TOOL_UPDATES with parser.getPendingTools()
# [Screenshot: reconnect-E1e-tool-restored.png]
Scenario: Refresh during permission request → overlay restored (E1f)
Given a desktop session has a pending permission request (e.g. Write tool)
And mobile shows the permission overlay
When I refresh the mobile browser
Then I should see the message history
And the permission overlay should reappear with the same tool name and input
# Mechanism: handleReconnect sends PERMISSION_REQUEST from pendingPermissions
# [Screenshot: reconnect-E1f-permission-restored.png]
Scenario: Refresh during AskUserQuestion → options restored (E1g)
Given a desktop session has a pending AskUserQuestion
And mobile shows the question options UI
When I refresh the mobile browser
Then I should see the message history
And the question options should reappear
# Mechanism: handleReconnect sends PERMISSION_REQUEST from pendingQuestions
# [Screenshot: reconnect-E1g-question-restored.png]
Scenario: Refresh during compacting context → status restored (E1h)
Given a session is compacting context ("Compacting context..." visible)
When I refresh the mobile browser
Then I should see the message history
# NOTE: Compacting is driven by PreCompact hook → thinkingStatus state.
# On reconnect, thinkingStatus is not persisted; PaneMonitor must re-detect it.
# The "Compacting context..." text may not reappear immediately.
# [Screenshot: reconnect-E1h-compacting.png]
Scenario: Refresh with queued message pending (E1i)
# Queued messages are stored in client-side React state (queuedRef).
# They are NOT persisted server-side and will be lost on refresh.
Given I have an active session with Claude processing
And I type a second message (queued, not yet sent)
When I refresh the mobile browser
Then the queued message should be gone (client-only state)
And the input field should be empty
# [Screenshot: reconnect-E1i-queued.png]
Scenario: Refresh after user pressed stop → interrupted state (E1j)
Given Claude was processing and I pressed the stop button
And the chat shows an interrupt marker
When I refresh the mobile browser
Then I should see the full message history including the interrupt marker
And there should be no streaming indicator
# [Screenshot: reconnect-E1j-interrupted.png]
Scenario: Refresh during Agent tool with sub-tools running (E1k)
Given a desktop session has an Agent tool running with sub-tools in progress
And mobile shows the Agent card with sub-tool spinners
When I refresh the mobile browser
Then I should see the message history
And the Agent tool card should reappear with running sub-tools
# Mechanism: handleReconnect sends TOOL_UPDATES with pending tools
# [Screenshot: reconnect-E1k-agent-subtools.png]
Scenario: Refresh during desktop-sent streaming preview (E1l)
# E1c only tests mobile-sent. This covers desktop-sent.
Given I have a desktop session with mobile connected
When the desktop user sends a long message
And mobile shows streaming text preview (partial response)
When I refresh the mobile browser
Then I should see the message history
And the streaming indicator should reappear
And streaming text preview should resume via PaneMonitor
# [Screenshot: reconnect-E1l-desktop-streaming.png]
Scenario: Connect to processing session from Active tab (G7)
Given a desktop session is currently processing a response
When I tap on that session in the Active tab
Then I should see the message history
And the streaming/thinking indicator should be visible
And when the response completes, it should appear normally
# [Screenshot: reconnect-G7-connect-processing.png]
Scenario: Session ended → Active tab updates, history preserved (E4)
Given I have an active session visible in the Active tab
When the Claude CLI session terminates (desktop user types /exit)
Then the session should disappear from the Active tab on next refresh
# Note: the session is still historical: visible in the Projects drill-down, and can be resumed via /resume
When I navigate to the project in the Projects tab
Then the session should still appear in the sessions list (with messages)
# [Screenshot: reconnect-E4-session-ended.png]
Feature: Desktop ↔ Mobile Permission & Mode Sync
Background:
Given I am logged in
And I have a desktop session connected from mobile
And permission mode is "Normal"
Scenario: Permission overlay appears on both sides simultaneously (D1)
When Claude tries to execute a tool requiring permission (e.g. Write file)
Then the desktop terminal should show its own permission prompt
And the mobile should show the permission overlay
And either side can answer; the first response wins
# [Screenshot: sync-D1-both-sides-permission.png]
Scenario: Desktop answers permission → mobile overlay dismisses on turn complete (D2)
Given both desktop and mobile show a permission prompt
When the desktop user answers "Yes" in the terminal
Then the tool should proceed on desktop
# Note: mobile overlay does NOT dismiss immediately when desktop answers.
# It dismisses when TURN_COMPLETE fires (after tool execution finishes),
# because setPermissionRequest(null) is in the TURN_COMPLETE handler.
Then when the turn completes, the mobile overlay should auto-dismiss
Scenario: Mobile answers permission → desktop prompt resolves (D3)
Given both desktop and mobile show a permission prompt
When I tap "Allow" on the mobile overlay
Then the overlay should dismiss
And the desktop terminal should show the tool proceeding
Scenario: Desktop Shift+Tab changes mode → mobile reflects (D5)
Given mobile StatusBar shows "Normal"
When the desktop user presses Shift+Tab to switch to "YOLO"
Then within 2 seconds the mobile StatusBar should update to "YOLO"
# [Screenshot: sync-D5-desktop-mode-change.png]
Scenario: Mobile mode change → desktop reflects (D6)
Given mobile StatusBar shows "Normal"
When I tap the permission mode label on mobile to switch to "Auto-edit"
Then the desktop terminal should show "accept edits on" indicator
# [Screenshot: sync-D6-mobile-mode-change.png]
Scenario: AskUserQuestion from desktop shows on mobile (D4)
When Claude uses the AskUserQuestion tool from desktop
Then mobile should display the ask-question overlay (not permission overlay)
And it should show the question text and selectable options
# [Screenshot: sync-D4-ask-question.png]
When I select an option on mobile
Then the answer should be sent to the desktop terminal
And Claude should continue processing
# [Screenshot: sync-D4-answered.png]
Scenario: Desktop Ctrl+C interrupt → mobile sees interrupt (D7)
Given mobile is connected and Claude is streaming a response
When the desktop user presses Ctrl+C in the terminal
Then the streaming should stop on desktop
And mobile should show the interrupt marker "Interrupted..."
And the streaming indicator should disappear on mobile
# [Screenshot: sync-D7-desktop-interrupt.png]
Feature: Edge Cases
Background:
Given I am logged in
Scenario: Empty session in Active tab (G1)
Given a desktop session was just started but no message sent
When I view the Active tab
Then the session should appear with the session UUID as fallback title
# [Screenshot: edge-G1-empty-session.png]
Scenario: Long streaming preview truncated (G2)
When I send a question that triggers a very long response
Then the streaming preview should truncate at ~200 characters
And the full response should display completely when done
# [Screenshot: edge-G2-long-preview.png]
Scenario: Compacting context indicator (G3)
Given I have a long conversation approaching context limits
When Claude compacts the conversation context
Then mobile should show "Compacting context..." as the thinking status
And when compacting finishes, the indicator should be replaced by normal status
# [Screenshot: edge-G3-compacting.png]
Scenario: Queued message auto-sends after response (G6)
When I send a message and Claude starts responding
And I type a second message and tap send while streaming
# T0: Message queued
Then a queued message bubble should appear with "Queued" badge
# [Screenshot: edge-G6-T0-queued.png]
# T1: First response completes → queued auto-sends
Then when the first response finishes, the queued message should auto-send
And it should appear as a regular user message bubble
# [Screenshot: edge-G6-T1-auto-sent.png]
# =============================================================================
# REGRESSION TESTS
# =============================================================================
# =============================================================================
# BUG REGRESSION: Session Deduplication
# =============================================================================
# Regression tests for: desktop + mobile connecting to same session should
# produce exactly ONE entry in Active tab, not two.
Feature: Regression: Session Deduplication
Background:
Given I am logged in
And the server is running with tmux
# Regression (DEDUP-1): Previously, dual ID system (internal ID vs CLI UUID)
# caused confusion where Connect button passed the wrong ID type.
# Fix: Unified to a single CLI UUID, so no dual-ID confusion is possible.
Scenario: Desktop session + mobile connect → single Active entry (DEDUP-1)
When I start a desktop session via `codetap`
And I type "Say hello in one word" in the desktop terminal
And Claude responds
Then the Active tab should show exactly 1 session with firstPrompt "Say hello in one word"
# [Screenshot: dedup-1-single-before-connect.png]
# Mobile connect should attach to the SAME session, not create a second one
When I tap on the session and tap "Connect"
Then I should enter the chat view with full history
When I go back to the Active tab
Then the Active tab should still show exactly 1 session (not 2)
And it should show "1 connected" (the mobile WebSocket client)
# [Screenshot: dedup-1-single-after-connect.png]
# Reconnect after refresh should still be 1 session
When I refresh the mobile browser and reconnect
Then the Active tab should still show exactly 1 session
# [Screenshot: dedup-1-after-reconnect.png]
# =============================================================================
# BUG REGRESSION: SessionStart Hook → API POST
# =============================================================================
Feature: Regression: SessionStart Hook → API POST
# SessionStart hook must fire POST to /api/hooks/{adapter}/session-start
# (not write to session-map.json). This enables real-time session discovery.
Scenario: codetap new → session appears in Active tab immediately
Given the server is running
When I run `codetap new` in a terminal
And the Claude CLI starts and fires SessionStart hook
Then the session should appear in the Active tab within 5 seconds
And no session-map.json file should be created
# =============================================================================
# BUG REGRESSION: Desktop → Mobile Streaming Indicator
# =============================================================================
# Regression tests for: when desktop sends a message, mobile should immediately
# show a "Working..." / thinking indicator instead of waiting 500ms+ for
# PaneMonitor to detect streaming.
Feature: Regression: Desktop Message Streaming Indicator
Background:
Given I am logged in
And I have a desktop session connected from mobile
Scenario: Desktop sends message → mobile shows immediate indicator (STREAM-1)
# Regression (Bug 1): processing-started event was not forwarded by ClaudeAdapter
# → SESSION_STATE streaming:true never broadcast → TOOL_START/THINKING/TEXT_DELTA gated out.
# Regression (Bug 2): ChatView condition checked last message role instead of
# pendingResponse flag, so even with SESSION_STATE, the indicator failed for desktop-sent messages.
When the desktop user types "Explain React hooks" in the terminal
# T0: Immediate "Working..." indicator via SESSION_STATE (not waiting for PaneMonitor)
Then mobile should show the "Working..." indicator within 500ms
# [Screenshot: stream-1-T0-immediate-indicator.png]
# T1: User message arrives via JSONL watcher (may take 1-2s)
Then mobile should display the user message "Explain React hooks" (blue bubble)
# [Screenshot: stream-1-T1-user-message.png]
# T2: Thinking (PaneMonitor detects thinking state)
Then the indicator should transition to thinking text (e.g. "Analyzing...")
# [Screenshot: stream-1-T2-thinking.png]
# T3: Tool use (if Claude uses tools, TOOL_START events are no longer gated out)
Then tool cards should appear with running indicators
And each tool should transition to completed (green checkmark) when done
# [Screenshot: stream-1-T3-tools.png]
# T4: Response complete
Then the response should appear as a single bubble
And the streaming indicator should disappear
# [Screenshot: stream-1-T4-complete.png]
Scenario: Desktop rapid messages → mobile indicators cycle correctly (STREAM-2)
When the desktop user types "First question" and Claude responds
Then mobile should show indicator → response → idle
When the desktop user types "Second question"
Then mobile should show the indicator again immediately
And previous turn's tool cards should retain their completed (✓) status
# Note: tool statuses are NOT cleared between turns; IDs are unique, so old entries don't interfere
# [Screenshot: stream-2-rapid-turns.png]
Scenario: Desktop sends while mobile shows no indicator → tool events not lost (STREAM-3)
# Regression: TOOL_START was gated on streamingRef.current → dropped when false.
When the desktop user types "Read package.json" in the terminal
Then mobile should show the "Working..." indicator
And a "Read" tool card should appear with running status
And the tool card should transition to completed (not stuck on running)
# [Screenshot: stream-3-tool-not-lost.png]
# =============================================================================
# BUG REGRESSION: Tool Card Expanded View
# =============================================================================
# Regression tests for: expanding a tool card should show friendly per-tool
# display (file path, command, pattern) instead of raw JSON dump.
Feature: Regression: Tool Card Display
Background:
Given I am logged in
And I have an active chat session with completed tool calls
Scenario: Read tool card shows file path, not JSON (TOOLUI-1)
Given Claude has completed a Read tool call for "package.json"
When I tap on the Read tool card to expand it
Then I should see the file path "package.json" as a header
And I should see the file contents as formatted code
And I should NOT see raw JSON like {"file_path": "package.json", "limit": ...}
# [Screenshot: toolui-1-read-expanded.png]
Scenario: Bash tool card shows command and output (TOOLUI-2)
Given Claude has completed a Bash tool call with command "ls -la"
When I tap on the Bash tool card to expand it
Then I should see "$ ls -la" styled as a terminal command
And I should see the command output below
And I should NOT see raw JSON like {"command": "ls -la", "description": ...}
# [Screenshot: toolui-2-bash-expanded.png]
Scenario: Grep tool card shows pattern and results (TOOLUI-3)
Given Claude has completed a Grep tool call with pattern "TODO"
When I tap on the Grep tool card to expand it
Then I should see "Pattern: TODO" as the header
And I should see matching file paths or content lines
# [Screenshot: toolui-3-grep-expanded.png]
Scenario: Edit tool card still shows diff view (TOOLUI-4)
# Existing behavior should NOT regress
Given Claude has completed an Edit tool call
When I tap on the Edit tool card to expand it
Then I should see a diff view with red (removed) and green (added) lines
And a "View full diff" link should be available
# [Screenshot: toolui-4-edit-diff.png]
Scenario: Agent tool card shows description, not raw JSON (TOOLUI-5)
Given Claude has used an Agent tool with description "Explore code-tap codebase"
When I tap on the Agent tool card to expand it
Then I should see "Explore code-tap codebase" as the description
And I should NOT see raw JSON like {"subagent_type": "Explore", "prompt": ...}
# [Screenshot: toolui-5-agent-expanded.png]
# =============================================================================
# BUG REGRESSION: Agent Sub-Tool Display
# =============================================================================
# Regression tests for: Agent/Task tool cards should show their internal
# sub-tool calls (Read, Write, Bash, etc.) with running/completed status.
Feature: Regression: Agent Sub-Tool Display
Background:
Given I am logged in
And I have an active chat session
Scenario: Agent tool shows nested sub-tools (SUBTOOL-1)
# Regression: TranscriptParser ignored agent_progress entries from JSONL,
# so SubagentGroup was never triggered.
When I send a message that triggers an Agent tool (e.g. "search the codebase for X")
# T0: Agent card appears with running spinner
Then an Agent tool card should appear with a loading spinner
And it should show the agent description (e.g. "Explore code-tap codebase")
# [Screenshot: subtool-1-T0-agent-running.png]
# T1: Sub-tools appear as Agent works
Then sub-tool indicators should appear nested under the Agent card
And each sub-tool should show its tool name (Read, Grep, Bash, etc.)
And each should show its own status (running spinner or completed checkmark)
# [Screenshot: subtool-1-T1-sub-tools-running.png]
# T2: Agent completes
Then the Agent card should show a completed status
And all sub-tools should show completed status
And a count badge should show (e.g. "5 tools completed")
# [Screenshot: subtool-1-T2-agent-complete.png]
Scenario: Expand Agent card to see sub-tool details (SUBTOOL-2)
Given Claude has completed an Agent tool with sub-tools
When I tap on the Agent card header
Then it should expand to show all sub-tool cards
And each sub-tool card should be expandable for details
When I tap on a sub-tool "Read" card
Then I should see the file path (not raw JSON)
# [Screenshot: subtool-2-expanded-sub-tools.png]
Scenario: Multiple parallel Agents each show their own sub-tools (SUBTOOL-3)
When I send a message that triggers 2 parallel Agent tools
Then 2 separate Agent cards should appear
And each should show its own sub-tools independently
And sub-tools should NOT be mixed between Agent cards
# [Screenshot: subtool-3-parallel-agents.png]
Scenario: Agent sub-tools in history load (SUBTOOL-4)
Given I have a completed session that used Agent tools
When I reconnect to that session from mobile
Then the Agent tool cards should show the sub-tools from history
And each sub-tool should show completed status
And each sub-tool should display its tool name badge (Read, Bash, Glob, etc.)
# [Screenshot: subtool-4-history-load.png]
# =============================================================================
# BUG REGRESSION: Agent Sub-Tool Badge & Label
# =============================================================================
# Regression tests for: sub-tool cards should show tool name badges (Read, Bash,
# Glob) and the SubagentGroup count label should say "tools" not "agents".
Feature: Regression: Agent Sub-Tool Badge & Label
Background:
Given I am logged in
And I have an active chat session
Scenario: Sub-tool cards show tool name badges (BADGE-1)
# Regression: TOOL_UPDATES stored raw server objects with 'name' field,
# but frontend expected 'toolName' → badge rendered empty.
When I send a message that triggers an Agent tool with sub-tools
Then each sub-tool card should display a tool name badge (Read, Bash, Glob, etc.)
And the badge should NOT be empty or show a dash
# [Screenshot: badge-1-tool-names.png]
Scenario: SubagentGroup label says "tools" not "agents" (BADGE-2)
# Regression: SubagentGroup hardcoded "agents" label for sub-tool count.
When I view an Agent tool card with completed sub-tools
Then the count label should say "N tools completed" (not "N agents completed")
And while running, it should say "N/M tools" (not "N of M agents running...")
# [Screenshot: badge-2-tools-label.png]
# =============================================================================
# BUG REGRESSION: /resume Session Streaming Indicator
# =============================================================================
# Regression tests for: when desktop uses /resume inside CLI to switch sessions,
# subsequent hooks should still resolve to the managed session.
Feature: Regression: /resume Session Streaming
Background:
Given I am logged in
And the server is running with tmux
Scenario: Desktop /resume then sends message → mobile sees indicator (RESUME-1)
# Regression: /resume inside CLI changes session_id internally, but
# resolveSessionId couldn't find the new ID → hooks silently dropped →
# no processing-started emitted → mobile never saw streaming indicator.
Given I have a desktop session started via `codetap`
And mobile is connected to the session
When the desktop user types "/resume" in Claude CLI and selects an old session
And the desktop user types a new message in the resumed session
Then mobile should show the "Working..." indicator within 1 second
And the response should appear on mobile when complete
# [Screenshot: resume-1-indicator.png]
Scenario: /resume session hooks resolve correctly (RESUME-2)
Given I have a desktop session with CLI UUID X
When Claude CLI internally switches to session_id Y (via /resume)
And a UserPromptSubmit hook fires with session_id Y
Then the session should be found by CLI UUID Y in the sessions Map
And the hook should NOT be silently dropped
# Verified via: mobile receives SESSION_STATE streaming=true
# =============================================================================
# BUG REGRESSION: Desktop Client Visibility
# =============================================================================
# Regression tests for: Active tab should show desktop activity indicator
# separately from mobile WebSocket client count.
Feature: Regression: Desktop Client Visibility
Background:
Given I am logged in
And the server is running with tmux
Scenario: Active tab shows desktop indicator when hooks are active (CLIENT-1)
# Regression: "connected" count only showed WebSocket clients. Desktop CLI
# uses HTTP hooks, not WebSocket, so it was never counted.
When I start a desktop session via `codetap` and send a message
Then the Active tab should show "desktop" indicator for that session
And if no mobile is connected, it should NOT show "0 connected"
# [Screenshot: client-1-desktop-only.png]
Scenario: Active tab shows both desktop and mobile (CLIENT-2)
Given I have a desktop session with recent hook activity
And mobile is connected via WebSocket
Then the Active tab should show "desktop · 1 connected"
# [Screenshot: client-2-desktop-and-mobile.png]
# =============================================================================
# BUG REGRESSION: Message Deduplication
# =============================================================================
# Regression tests for: messages should never appear twice on mobile.
# Root cause: JSONL watcher position was stale after HISTORY_LOAD, causing
# entries already in history to be re-emitted as "new" via MESSAGE_COMPLETE.
Feature: Regression: Message Deduplication
Background:
Given I am logged in
And the server is running with tmux
Scenario: Desktop message appears once after mobile reconnect (MSGDEDUP-1)
# Regression: After mobile reconnect, HISTORY_LOAD sent all entries, but
# the watcher's lastByteOffset was from before reconnect → watcher re-emitted
# entries already in history → MESSAGE_COMPLETE appended duplicates.
Given I have a desktop session with some conversation history
And mobile connects and receives history via HISTORY_LOAD
When the desktop user sends a new message
Then the user message should appear exactly ONCE on mobile (blue bubble)
And the assistant response should appear exactly ONCE
And there should be NO duplicate bubbles of the same content
# [Screenshot: msgdedup-1-no-duplicates.png]
Scenario: Messages remain single after mobile browser refresh (MSGDEDUP-2)
Given I have a desktop session with mobile connected
And the conversation has 3+ exchanges
When I refresh the mobile browser
And mobile reconnects and loads history
And the desktop user sends another message
Then all messages should appear exactly once
And the new message and response should each appear exactly once
# [Screenshot: msgdedup-2-after-refresh.png]
# =============================================================================
# REGRESSION: Bug Fix Guards
# =============================================================================
# Each scenario guards against a specific user-visible bug that was previously fixed.
# Format: What the user should see (correct behavior), with comment about what
# previously went wrong.
Feature: Regression: Bug Fix Guards
Scenario: Deny permission actually rejects the tool (REG-DENY-1)
Given a permission overlay is showing for a Write tool
When I tap "Deny"
Then the tool should NOT execute
And Claude should acknowledge the denial
# Previously: Deny sent wrong key to CLI, tool executed anyway
Scenario: HTTPS mode: tools and streaming work end-to-end (REG-HTTPS-1)
Given the server is running in HTTPS mode
And Claude is in Normal permission mode
When I send a message that triggers a Write tool
Then the permission overlay should appear
When I tap Allow
Then the tool should execute and complete
And the response should stream to mobile normally
And TURN_COMPLETE should fire (turn ends cleanly)
# Previously: hooks used http:// but server ran HTTPS → all hooks silently failed,
# TURN_COMPLETE never sent, streaming broken
Scenario: Permission Allow sends correct key → tool executes (REG-PERM-1)
Given a permission overlay is showing
When I tap "Allow"
Then the tool should execute within 5 seconds
And the tool card should transition from Loading to Complete
# Previously: CLI received wrong key sequence (Down+Enter instead of number key),
# selected wrong option or typed into wrong prompt
Scenario: No phantom Enter after permission response (REG-PERM-2)
Given a permission overlay is showing
When I tap "Allow" and the tool completes
And Claude asks another question or shows another permission prompt
Then the new prompt should NOT be auto-answered
And the new prompt should wait for user input normally
# Previously: extra Enter keystroke leaked into the next CLI prompt
Scenario: Agent subtools finish → streaming continues (REG-SUBAGENT-1)
Given Claude is running an Agent with subtools
When the Agent's subtools complete
Then the response text should continue streaming to mobile
And tool cards should NOT be discarded
And subsequent tool events should display normally
# Previously: SubagentStop shared the Stop endpoint → killed streaming monitor,
# sent premature TURN_COMPLETE, set streaming=false → tool events discarded
Scenario: WS stays alive during long operations (REG-WS-1)
Given mobile is connected during a 60-second thinking period
Then the WS connection should remain open
And the thinking indicator should stay visible throughout
# Previously: no ping/pong → WS disconnected after ~30s idle
Scenario: Streaming works after server restart (REG-MONITOR-1)
Given the server was restarted but tmux sessions still exist
When I connect to an existing session from mobile
And the desktop user sends a message
Then streaming text should appear on mobile incrementally
And the response should NOT appear all at once when the turn ends
# Previously: PaneMonitor never started for reconnected sessions,
# response text appeared only after turn complete (no real-time streaming)
Scenario: Permission overlay appears despite desktop mode change (REG-MODE-1)
Given the desktop user changed from YOLO to Normal mode via Shift+Tab
When Claude requests permission for a tool
Then the permission overlay should appear on mobile
# Previously: server cached stale YOLO mode, filtered out PermissionRequest
Scenario: ExitPlanMode shows plan card, not permission overlay (REG-PLAN-1)
Given the session is in Normal permission mode
When Claude exits plan mode with a plan document
Then a plan card with Approve/Reject/YOLO buttons should appear
And NO generic permission overlay (Allow/Deny) should appear
# Previously: ExitPlanMode fired PermissionRequest hook → showed wrong overlay
Scenario: Permission overlay dismissed on all connected clients (REG-DISMISS-1)
Given Mobile A and Mobile B are connected to the same session
And both show a permission overlay
When Mobile A taps "Allow"
Then Mobile A's overlay should dismiss immediately
And Mobile B's overlay should dismiss within 1 second
# Previously: only answering client's overlay dismissed, other clients stuck
Scenario: Send button enables after programmatic text input (REG-INPUT-1)
Given I am in an empty chat
When text is inserted into the input field programmatically (e.g., paste or autofill)
Then the send button should become enabled
# Previously: send button state only tracked React onChange, not DOM input events
Scenario: Messages appear exactly once after reconnect (REG-DEDUP-1)
Given a desktop session has some conversation history
And mobile connects and loads history
When the desktop user sends a new message
Then the user message should appear exactly ONCE on mobile
And the assistant response should appear exactly ONCE
And there should be NO duplicate bubbles
# Previously: JSONL watcher offset was stale after HISTORY_LOAD → re-emitted
# entries already in history as new MESSAGE_COMPLETE events
# =============================================================================
# MULTI-CLIENT SYNC
# =============================================================================
Feature: Multi-Client Mobile-to-Mobile Message Sync
Background:
Given I am logged in
And I have a desktop session with tmux
Scenario: Mobile A message visible on Mobile B (MULTI-1)
# Regression: fromMobile flag caused ALL mobile clients to skip
# the user message, not just the sender.
Given Mobile A and Mobile B are both connected to the same session
When Mobile A sends "Hello from A"
Then Mobile A should show "Hello from A" (blue bubble, optimistic)
And Mobile B should show "Hello from A" (blue bubble, from JSONL)
And both should show the assistant response
# [Screenshot: multi-1-cross-mobile-sync.png]
Scenario: Mobile B message visible on Mobile A (MULTI-2)
Given Mobile A and Mobile B are both connected to the same session
When Mobile B sends "Hello from B"
Then Mobile B should show "Hello from B" (optimistic)
And Mobile A should show "Hello from B" (from JSONL)
# [Screenshot: multi-2-reverse-sync.png]
Scenario: Desktop message visible on all mobile tabs (MULTI-3)
Given Mobile A and Mobile B are both connected to the same session
When the desktop user sends "Hello from desktop"
Then Mobile A and Mobile B should both show the user message and response
# [Screenshot: multi-3-desktop-to-all.png]
Scenario: No duplicate messages on sender (MULTI-4)
# The sender gets the message optimistically + via JSONL.
# senderClientId check prevents duplicates.
Given Mobile A is connected to a session
When Mobile A sends "Test dedup"
Then "Test dedup" should appear exactly ONCE on Mobile A (not twice)
# [Screenshot: multi-4-sender-no-dup.png]
Feature: Multi-Client Active Session Client Count
Background:
Given I am logged in
Scenario: Client count includes desktop and mobile tabs (COUNT-1)
Given a desktop session is active (hooks firing)
And 2 mobile tabs are connected to that session
Then the Active tab should show "desktop · 2 connected"
# [Screenshot: count-1-desktop-plus-2.png]
Scenario: Client count updates when tab closes (COUNT-2)
Given 2 mobile tabs are connected to a session
When one tab is closed
Then within 3 seconds, the Active tab should show "1 connected"
# [Screenshot: count-2-tab-close.png]
Scenario: Opening session tab counts as connected (COUNT-3)
Given a session exists in the Active tab showing "desktop"
When I tap Connect on that session (opening chat view)
Then the Active tab should show "desktop · 1 connected" on next refresh
# [Screenshot: count-3-open-counts.png]
Feature: Multi-Client Permission/Question Overlay Dismiss
Background:
Given I am logged in
And Mobile A and Mobile B are both connected to the same session
Scenario: PermissionRequest dismissed on other client (PERM-DISMISS-1)
# Normal mode: Write tool triggers permission
Given the session is in Normal permission mode
When the assistant tries to use the Write tool
Then both Mobile A and Mobile B should show the permission overlay
When Mobile A taps "Allow"
Then Mobile A's overlay should dismiss immediately (optimistic)
And Mobile B's overlay should dismiss within 1 second (via PERMISSION_DISMISSED)
# [Screenshot: perm-dismiss-1-both-cleared.png]
Scenario: PermissionRequest second client response is no-op (PERM-DISMISS-2)
Given both clients show a permission overlay for the same request
When Mobile A taps "Allow"
And Mobile B taps "Allow" before receiving the dismiss
Then the tool should only execute once (no double keystroke)
And both overlays should dismiss
Scenario: AskUserQuestion dismissed on other client (ASK-DISMISS-1)
When the assistant calls AskUserQuestion with options
Then both Mobile A and Mobile B should show the question overlay
When Mobile A selects an option
Then Mobile A's overlay should dismiss immediately (optimistic)
And Mobile B's overlay should dismiss when TOOL_DONE arrives
# [Screenshot: ask-dismiss-1-both-cleared.png]
Scenario: ExitPlanMode card syncs across clients (PLAN-SYNC-1)
When the assistant calls ExitPlanMode with a plan
Then both Mobile A and Mobile B should show the plan card with Approve/Reject buttons
And neither client should show a permission overlay (Deny/Allow)
# Regression: CLI fires PermissionRequest for ExitPlanMode, but mobile
# must skip it because the plan card provides its own approval UI.
When Mobile A taps "Approve"
Then Mobile A's card should switch to read-only (optimistic, hasUserAfter)
And Mobile B's card should switch to read-only when the approval message syncs via JSONL
# [Screenshot: plan-sync-1-both-readonly.png]
Scenario: ExitPlanMode does not show permission overlay (PLAN-NO-OVERLAY-1)
# Regression: ExitPlanMode fires PermissionRequest hook, which showed
# a generic Deny/Allow overlay with raw allowedPrompts JSON.
# The option indices (0=allow, 2=deny) don't match the CLI's plan
# selector options, causing wrong behavior.
Given the session is in Normal permission mode
When the assistant calls ExitPlanMode with a plan
Then the plan card with Approve/Reject/YOLO buttons should appear
And NO permission overlay (Deny/Allow) should appear
And the user should be able to interact with the plan card directly
# [Screenshot: plan-no-overlay-1.png]
Scenario: New permission request replaces dismissed one (PERM-DISMISS-3)
Given Mobile A dismissed a permission overlay
When a new permission request arrives before the dismiss reaches Mobile B
Then Mobile B should show the NEW request (not dismiss it)
# =============================================================================
# PWA Installation & Standalone Mode
# =============================================================================
Feature: PWA Installation
The app should be installable as a standalone PWA from the home screen.
Scenario: PWA manifest is served correctly
When I open the app URL in Safari
Then the browser should serve /manifest.webmanifest with correct JSON
And it should contain name "CodeTap", display "standalone", 3 icons
And the HTML should include <link rel="manifest">
And the HTML should include <meta name="theme-color" content="#09090b">
And the HTML should include <link rel="apple-touch-icon">
Scenario: Add to Home Screen
Given I am logged into the app in Safari
When I tap Share → Add to Home Screen
Then the "Add to Home Screen" dialog should show
And the app name should be "CodeTap"
And the icon should display the CodeTap logo (not a generic icon)
When I tap "Add"
Then the CodeTap icon should appear on the home screen
# [Screenshot: pwa-homescreen-icon.png]
Scenario: Standalone mode: no Safari chrome
Given CodeTap is installed on the home screen
When I launch CodeTap from the home screen
Then the app should open in standalone mode (no Safari address bar)
And the status bar should use dark theme (#09090b)
And the login page should be displayed (separate cookie jar from Safari)
# [Screenshot: pwa-standalone-login.png]
Scenario: Standalone mode: login and session list
Given CodeTap is open in standalone mode
When I login with the correct password
Then the sessions list should display
And there should be no browser navigation controls visible
# [Screenshot: pwa-standalone-sessions.png]
# =============================================================================
# PWA Push Notification Subscription
# =============================================================================
Feature: Push Notification Subscription
Users can subscribe to push notifications from the PWA.
Scenario: Bell icon only visible in standalone PWA mode
Given I am logged into the app in Safari (regular browser tab)
Then the notification bell icon should NOT appear in the header
# Reason: PushManager is not available outside standalone mode on iOS
Scenario: Bell icon visible in standalone PWA mode
Given I am logged into the app in standalone PWA mode
Then a bell icon (BellOff) should appear in the header next to Logout
And it should be titled "Enable notifications"
# [Screenshot: pwa-bell-off.png]
Scenario: Subscribe to push notifications
Given the bell icon is visible (BellOff state)
When I tap the bell icon
Then the browser should show a notification permission prompt
When I allow notifications
Then the bell icon should change to BellOn (filled)
And the server should have a stored push subscription
# [Screenshot: pwa-bell-on.png]
Scenario: Unsubscribe from push notifications
Given push notifications are enabled (BellOn state)
When I tap the bell icon
Then the bell icon should change back to BellOff
And the server should remove the push subscription
# =============================================================================
# PWA Push Notification Triggers
# =============================================================================
Feature: Push Notification Triggers
Push notifications should fire only when the user is NOT viewing the session.
Background:
Given push notifications are enabled on the mobile PWA
And there is an active Claude session "A" in project "my-project"
Scenario: No notification when viewing the session (session-idle)
Given I am connected to session A in ChatView
When Claude completes a response in session A (Stop hook fires)
Then I should see the response via WebSocket (TURN_COMPLETE)
And I should NOT receive a push notification
And the app badge should NOT increment
Scenario: Notification when not viewing the session (session-idle)
Given I am on SessionsView (not connected to any session)
When Claude completes a response in session A
Then I should receive a push notification:
| title | Claude finished |
| body | Turn complete in my-project |
And the app icon badge should show "1"
# [Screenshot: pwa-push-session-idle.png]
Scenario: Notification when viewing a different session
Given I am connected to session B in ChatView
And session A completes a response
Then I should receive a push notification for session A
And the app icon badge should increment
Scenario: Notification for permission request
Given I am on SessionsView
When Claude in session A requests permission for "Bash"
Then I should receive a push notification:
| title | Permission needed |
| body | Bash in my-project |
And the app icon badge should increment
# [Screenshot: pwa-push-permission.png]
Scenario: Notification for AskUserQuestion
Given I am on SessionsView
When Claude in session A uses AskUserQuestion
Then I should receive a push notification:
| title | Question from Claude |
| body | Waiting for answer in my-project |
And the app icon badge should increment
Scenario: No notification flood during active conversation
Given I am connected to session A in ChatView
When Claude completes 10 responses in rapid succession
Then I should receive 0 push notifications
And the app icon badge should remain unchanged
Scenario: App in background receives notification
Given I am connected to session A in ChatView
When I switch to the home screen (app goes to background)
And the WebSocket disconnects (after ~2-3 seconds)
And Claude completes a response in session A
Then I should receive a push notification
And the app icon badge should show "1"
# [Screenshot: pwa-push-background.png]
Scenario: Multiple sessions notify independently
Given sessions A, B, and C are all active
And I am not connected to any session
When session A completes → push, badge=1
And session B requests permission → push, badge=2
And session C asks a question → push, badge=3
Then the notification center should show 3 notifications
And the app icon badge should show "3"
# =============================================================================
# PWA Notification Click & Navigation
# =============================================================================
Feature: Notification Click Navigation
Tapping a notification should navigate to the correct session.
Scenario: Click notification when app is open
Given the app is open on SessionsView
And I received a notification for session A
When I tap the notification
Then the app should focus (bring to foreground)
And the app should navigate to session A's ChatView
And session A's pending count should clear
# [Screenshot: pwa-notification-click-open.png]
Scenario: Click notification when app is closed
Given the app is not open
And I received a notification for session A
When I tap the notification
Then the app should open with URL /?session=<sessionId>
And after login, the app should navigate to session A's ChatView
# [Screenshot: pwa-notification-click-closed.png]
Scenario: URL parameter ?session= parsed on app load
Given the app is freshly opened with URL /?session=abc123
When I login
Then the app should automatically navigate to session abc123
And the URL should be cleaned up (no ?session= in address)
# =============================================================================
# PWA Badge Count & Pending Indicators
# =============================================================================
Feature: Badge Count Management
App icon badge and session card indicators track unread notifications.
Background:
Given push notifications are enabled
And sessions A (1 pending), B (1 pending), C (1 pending) have notifications
And the app icon badge shows "3"
Scenario: Badge decrements when entering a session
When I open the app and navigate to session A
Then session A's pending count should be cleared
And the app icon badge should update to "2"
And SessionsView should show badges on B and C (not A)
Scenario: Badge clears to zero when all sessions viewed
When I navigate to session A → badge=2
And I navigate to session B → badge=1
And I navigate to session C → badge=0
Then the app icon badge should be cleared completely
Scenario: Pending indicators on Active Sessions list
Given I am on the Active tab in SessionsView
Then sessions with pending notifications should show a red badge with count
And sessions without pending notifications should show no badge
# [Screenshot: pwa-pending-badges.png]
Scenario: Pending indicators update in real-time via SW
Given I am on the Active tab in SessionsView
When a new push notification arrives for session B
Then session B's badge should update without waiting for polling
# (SW postMessage → app refetches pending counts)
Scenario: Notification tag deduplication
Given session B completes 3 times while I am away
Then the notification center should show only 1 notification for B (latest replaces previous)
But the app icon badge should count all 3 (badge = 3 if B is the only pending session)
When I enter session B
Then all 3 pending counts for B are cleared at once
And badge drops to 0
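# Illustrative sketch only (TypeScript; the names below are hypothetical, not
# the server's real bookkeeping). It shows why the notification center and the
# badge can disagree: notifications sharing a tag replace one another, while
# the badge is the sum of per-session pending counts.
#
#   const pending = new Map<string, number>(); // sessionId -> unread count
#
#   function badgeTotal(): number {
#     let total = 0;
#     for (const n of pending.values()) total += n;
#     return total;
#   }
#
#   function recordPush(sessionId: string): number {
#     pending.set(sessionId, (pending.get(sessionId) ?? 0) + 1);
#     return badgeTotal(); // value to show on the app icon
#   }
#
#   function clearSession(sessionId: string): number {
#     pending.delete(sessionId); // entering a session clears all of its counts
#     return badgeTotal();
#   }
#
# With B as the only pending session, three recordPush("B") calls give a badge
# of 3, and clearSession("B") brings it back to 0.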
# =============================================================================
# PWA HTTPS & Certificate
# =============================================================================
Feature: HTTPS Support
The server supports HTTPS for PWA push notification requirements.
Scenario: Server auto-detects HTTPS certificates
Given ~/.codetap/cert.pem and ~/.codetap/key.pem exist
When the server starts
Then it should listen on HTTPS
And the startup log should show "https://0.0.0.0:PORT (HTTPS)"
Scenario: Server falls back to HTTP without certificates
Given ~/.codetap/cert.pem does NOT exist
When the server starts
Then it should listen on HTTP (default behavior)
And the startup log should show "http://0.0.0.0:PORT"
Scenario: codetap cert command generates self-signed certificate
When I run "codetap cert"
Then ~/.codetap/cert.pem and ~/.codetap/key.pem should be created
And the certificate should include the machine's local IP as SAN
And instructions for trusting on iOS and Android should be printed
Scenario: Tailscale HTTPS works for PWA
Given the server is running with Tailscale TLS certificates
When I open the app on a mobile device via the Tailscale hostname
Then the HTTPS connection should be established without certificate errors
And the PWA should be installable
And push notifications should work (secure context satisfied)
Scenario: Permission request works in HTTPS mode
Given ~/.codetap/cert.pem and ~/.codetap/key.pem exist
And the server is running in HTTPS mode
And Claude is in Normal permission mode
When Claude requests permission for a Write tool
Then the permission overlay should appear on mobile
And tapping Allow should approve the tool in CLI
And the tool should execute successfully
Scenario: Streaming text works in HTTPS mode
Given the server is running in HTTPS mode
When Claude generates a long response
Then streaming text should appear incrementally on mobile
And the StatusBar should show context usage percentage
And the final response should display completely
# =============================================================================
# PWA Service Worker
# =============================================================================
Feature: Service Worker Lifecycle
The service worker handles precaching, push events, and notification clicks.
Scenario: Service worker registers on app load
When the app loads for the first time
Then a service worker should register successfully
And static assets should be precached
Scenario: Service worker auto-updates
When a new version of the app is deployed
And the user opens the app
Then the service worker should auto-update (registerType: autoUpdate)
And the new version should be active on next navigation
Scenario: Push event with badge=0 clears app badge
When the server sends a silent push with { data: { badge: 0 } }
Then the service worker should call navigator.clearAppBadge()
And no notification should be shown (silent push, no title)
# =============================================================================
# REGRESSION Tool Status After Permission Deny & Interrupt
# =============================================================================
# Added: 2026-03-21
# Root cause: TURN_COMPLETE handler skipped markToolsAs() when interruptedRef
# was true, leaving denied tool cards stuck in "running/loading" state forever.
# Fix: TURN_COMPLETE now always calls markToolsAs() with the appropriate status
# ('interrupted' if interrupted, 'success' otherwise).
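# Illustrative sketch only (TypeScript; the ToolCard shape and this signature
# of markToolsAs are assumptions, not the real client code). It shows the rule
# the scenarios below guard: finished tools keep their status, and only
# still-running tools take the turn's final status.
#
#   type ToolStatus = "running" | "success" | "interrupted";
#   interface ToolCard { id: string; status: ToolStatus }
#
#   function markToolsAs(
#     tools: ToolCard[],
#     status: "success" | "interrupted",
#   ): ToolCard[] {
#     return tools.map((t) => (t.status === "running" ? { ...t, status } : t));
#   }
#
#   function onTurnComplete(tools: ToolCard[], interrupted: boolean): ToolCard[] {
#     // Always resolve running tools, whether or not the turn was interrupted.
#     return markToolsAs(tools, interrupted ? "interrupted" : "success");
#   }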
Feature: Regression Tool Status After Permission Deny
Background:
Given I am logged in
And permission mode is "Normal"
And I open a new chat
Scenario: Single tool deny: tool card shows interrupted icon (not loading)
When I send "Create a file called /tmp/deny-regression.txt with 'test'"
Then a Write tool card should appear with running status (spinner)
And a permission overlay should appear with Allow and Deny buttons
# [Screenshot: deny-reg-T0-overlay.png]
When I tap "Deny"
Then the permission overlay should dismiss
And the Write tool card should show the interrupted icon (🚫), NOT a spinner
And "Interrupted · What should Claude do instead?" should appear
And the input placeholder should change to "What should Claude do instead?"
# [Screenshot: deny-reg-T1-interrupted-icon.png]
# Previously: tool card stayed as spinner/loading forever
Scenario: Multi-tool deny: completed tools keep success, denied tool shows interrupted
# This is the critical regression: previously ALL tool cards reverted to loading
When I send "Read package.json then create /tmp/multi-deny.txt with the name"
# T0: Read tool starts (auto-approved) → completes → green checkmark
Then a Read tool card should appear and complete with green checkmark
# [Screenshot: multi-deny-T0-read-success.png]
# T1: Write tool starts → permission overlay appears
Then a Write tool card should appear with running status
And a permission overlay should appear
# [Screenshot: multi-deny-T1-write-permission.png]
# T2: User denies → CLI interrupts → TURN_COMPLETE fires
When I tap "Deny"
Then the permission overlay should dismiss
And the Read tool card MUST still show green checkmark (NOT revert to loading)
And the Write tool card should show 🚫 interrupted icon
And "Interrupted · What should Claude do instead?" should appear
# [Screenshot: multi-deny-T2-final-state.png]
# CRITICAL: Read tool status must NOT regress from success to loading/running
Scenario: Deny does not create the file
When I send "Create /tmp/deny-file-check.txt with 'should not exist'"
And the permission overlay appears
When I tap "Deny"
Then /tmp/deny-file-check.txt should NOT exist
Feature: Regression Tool Status After User Abort (Stop Button)
Background:
Given I am logged in
And I open a new chat
Scenario: Abort during streaming: completed tools keep success
Given permission mode is "YOLO" (to avoid permission overlay interference)
When I send a request that triggers multiple tools (e.g. "Read all .ts files")
And some tools complete with a green checkmark while Claude is still streaming
When I tap the stop button
Then completed tools MUST still show green checkmark
And any running tools should show 🚫 interrupted icon
And "Interrupted · What should Claude do instead?" should appear
# [Screenshot: abort-multi-tool-result.png]
Scenario: Abort then re-send: tool cards start fresh
Given I aborted a previous turn
When I send a new message
Then new tool cards should appear with running status
And old interrupted tool cards should remain in history (not removed)
And the new tools should complete normally
Feature: Regression Tool Status After CLI Interrupt (Ctrl+C)
# Desktop user presses Ctrl+C in tmux → mobile should reflect the interrupt
Background:
Given I am logged in
And I have a desktop session connected from mobile
Scenario: Desktop Ctrl+C during multi-tool: completed tools keep success on mobile
Given Claude is executing multiple tools (Read, Grep, etc.) visible on mobile
And some tools have completed (green checkmark) while others are still running
When the desktop user presses Ctrl+C in the tmux terminal
Then on mobile, completed tools MUST still show green checkmark
And running tools should show 🚫 interrupted icon
And "Interrupted · What should Claude do instead?" should appear
And the streaming indicator should disappear
# [Screenshot: cli-interrupt-multi-tool.png]
# Previously: CLI interrupt caused all tool cards to stay as loading spinners
Feature: Regression HTTPS Hook Configuration
# Bug: hook-config.ts hardcoded http:// URLs even when server ran on HTTPS
# Root cause: _isHttps() was removed during TS migration, hookUrl was hardcoded
# Fix: auto-detect HTTPS from cert file existence, use correct protocol + curl -k
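# Illustrative sketch only (TypeScript; isHttps, hookCurlCommand, the port
# parameter, and the /hook path are hypothetical, not the real hook-config.ts).
# It shows the fix described above: derive the protocol from cert file
# existence and add curl's -k flag only for HTTPS.
#
#   import { existsSync } from "node:fs";
#   import { homedir } from "node:os";
#   import { join } from "node:path";
#
#   function isHttps(certDir: string = join(homedir(), ".codetap")): boolean {
#     return existsSync(join(certDir, "cert.pem")) &&
#       existsSync(join(certDir, "key.pem"));
#   }
#
#   function hookCurlCommand(port: number, https: boolean): string {
#     const protocol = https ? "https" : "http";
#     const insecure = https ? "-k " : ""; // -k accepts the self-signed cert
#     return `curl -s ${insecure}-X POST ${protocol}://localhost:${port}/hook`;
#   }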
Background:
Given ~/.codetap/cert.pem and ~/.codetap/key.pem exist
And the server is running in HTTPS mode
Scenario: Hooks use HTTPS URLs when server runs on HTTPS
When the server starts and installs hooks
Then ~/.claude/settings.json hook commands should contain "https://localhost"
And curl commands should include the "-k" flag (for self-signed certs)
# Previously: hooks used "http://localhost" → SSL connection error → all hooks failed silently
Scenario: Permission overlay appears when HTTPS hooks are correctly configured
Given hooks are installed with HTTPS URLs
And permission mode is "Normal"
When Claude requests permission for a Write tool
Then the permission overlay should appear on mobile (within 5 seconds)
And the countdown timer should be visible
And Allow/Deny buttons should be functional
# Previously: overlay never appeared because HTTP hooks couldn't reach HTTPS server
Scenario: Hooks use HTTP URLs when server runs on HTTP
Given no cert files exist
And the server is running in HTTP mode
When the server starts and installs hooks
Then ~/.claude/settings.json hook commands should contain "http://localhost"
And curl commands should NOT include the "-k" flag
Feature: Regression Voice Input Secure Context
# Voice input (Web Speech API) requires secure context (HTTPS or localhost)
Background:
Given I am logged in
And I open a new chat
Scenario: Mic button visible in HTTPS context
Given the app is loaded over HTTPS (or localhost)
Then a microphone button should be visible in the input bar
And it should be positioned between the image button and the textarea
Scenario: Mic button hidden in HTTP context
Given the app is loaded over plain HTTP (e.g. http://192.168.1.x:3456)
Then no microphone button should be visible
# Web Speech API requires secure context; button hidden via isSecureContext check
Scenario: Voice recording toggle
Given the mic button is visible (HTTPS context)
When I tap the mic button
Then the button should pulse red (recording indicator)
And the browser should request microphone permission (if not already granted)
When I tap the mic button again
Then recording should stop
And the button should return to its default gray state
Scenario: Voice transcript appends to existing text
Given I have typed "hello " in the input field
When I activate voice input and say "world"
Then the input field should contain "hello world" (appended, not replaced)
And the message should NOT auto-send (user reviews before pressing Send)
# =============================================================================
# Feature: Insight Block Rendering
# =============================================================================
Feature: Insight Block Display
Scenario: Insight block renders as collapsible card
Given I have an active chat session with an Insight block in the response
Then the Insight block shows as a collapsed card
And the card shows an "Insight" label with a summary
And a chevron icon is visible
Scenario: Insight block expands on tap
Given I see a collapsed Insight card
When I tap the Insight card
Then the card expands to show full markdown content
And the chevron changes to up arrow
Scenario: Insight block collapses on second tap
Given I see an expanded Insight card
When I tap the Insight card again
Then the card collapses back to summary view
Scenario: Multiple Insight blocks in one message
Given I have a response with two Insight blocks separated by text
Then both render as separate collapsible cards
And the text between them renders as normal markdown
Scenario: Message without Insight blocks renders normally
Given I have a response with no Insight delimiters
Then the message renders as plain markdown
Scenario: Insight block in reconnected session history
Given I reconnect to a session that had Insight blocks
Then the Insight blocks render correctly as collapsible cards
# =============================================================================
# MULTI-ADAPTER UI
# =============================================================================
Feature: Multi-Adapter New Chat (Adapter Selection)
Background:
Given I am logged in
And I am viewing sessions within a project
Scenario: New Chat shows Hero Icon adapter selection screen
When I tap "New Chat"
Then I should see an adapter selection screen
And it should display Hero Icons for each available adapter
And the available adapters should include "Claude" and "Codex"
# [Screenshot: adapter-selection.png]
Scenario: Selecting Claude shows Claude-specific options
When I tap "New Chat"
And I select the "Claude" adapter
Then I should see Claude-specific settings cards:
| Setting | Options |
| Model | Sonnet / Opus / Haiku |
| Thinking | Off / Normal / Extended |
And the input placeholder should reflect Claude
# [Screenshot: new-chat-claude-options.png]
Scenario: Selecting Codex shows Codex-specific options
When I tap "New Chat"
And I select the "Codex" adapter
Then I should see Codex-specific settings cards:
| Setting | Options |
| Model | GPT-5.4 / o3 |
| Reasoning Effort | Low / Medium / High / XHigh |
And the input placeholder should reflect Codex
# [Screenshot: new-chat-codex-options.png]
Scenario: Settings cards cycle to next option on tap
Given I selected the "Claude" adapter
And the Model card shows "Sonnet"
When I tap the Model card
Then the Model card should cycle to "Opus"
When I tap the Model card again
Then the Model card should cycle to "Haiku"
When I tap the Model card again
Then the Model card should cycle back to "Sonnet"
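# Illustrative sketch, not part of the original spec: the wrap-around shown above,
# restated for the other settings cards listed in the adapter option tables.
# Assumption: every card cycles through its options in the listed order and wraps
# from the last option back to the first, as the Model card does.
Scenario Outline: Settings card wraps from last option back to first
Given I selected the "<adapter>" adapter
And the <card> card shows "<last>"
When I tap the <card> card
Then the <card> card should cycle back to "<first>"
Examples:
| adapter | card | last | first |
| Claude | Thinking | Extended | Off |
| Codex | Model | o3 | GPT-5.4 |
| Codex | Reasoning Effort | XHigh | Low |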
Scenario: Per-adapter preferences persist across sessions
Given I selected the "Claude" adapter
And I set Model to "Opus" and Thinking to "Extended"
When I navigate away and tap "New Chat" again
And I select the "Claude" adapter
Then Model should still be "Opus" and Thinking should still be "Extended"
And Codex preferences should be independent (unchanged)
Scenario: "Switch to [other]" link swaps adapter, options, and input placeholder
Given I selected the "Claude" adapter
Then a link "Switch to Codex" should be visible
When I tap "Switch to Codex"
Then the adapter should switch to Codex
And the settings cards should update to Codex-specific options
And the input placeholder should update to reflect Codex
And a link "Switch to Claude" should now be visible
# [Screenshot: switch-adapter-link.png]
Scenario: Sending a prompt navigates to chat view with correct adapter
Given I selected the "Claude" adapter
And I type "Hello Claude" in the input field
When I tap Send
Then I should navigate to the chat view
And the StatusBar should show "Claude" as the adapter
And the message should be sent to the Claude CLI
Feature: Multi-Adapter Session List
Background:
Given I am logged in
And I have sessions from both Claude and Codex adapters
Scenario: Session list shows adapter tabs (All / Claude / Codex)
When I view sessions within a project
Then I should see adapter filter tabs: "All", "Claude", "Codex"
And "All" should be selected by default
# [Screenshot: session-list-adapter-tabs.png]
Scenario: Tapping a tab filters sessions by adapter
Given I am viewing the session list
When I tap the "Claude" tab
Then only Claude sessions should be visible
When I tap the "Codex" tab
Then only Codex sessions should be visible
Scenario: Each session row shows adapter badge (Claude=amber, Codex=green)
When I view the session list
Then each session row should display an adapter badge
And Claude sessions should show an amber "Claude" badge
And Codex sessions should show a green "Codex" badge
# [Screenshot: session-row-adapter-badges.png]
Scenario: "All" tab shows sessions from both adapters sorted by time
Given I have sessions from both adapters with different timestamps
When I tap the "All" tab
Then sessions from both Claude and Codex should be visible
And they should be sorted by most recent activity (newest first)
Feature: Multi-Adapter StatusBar
Background:
Given I am logged in
And I have an active chat session
Scenario: StatusBar shows adapter badge instead of "tmux"
When I view an active chat session
Then the StatusBar should show an adapter badge instead of "tmux"
And the badge should display the adapter name
Scenario: Claude sessions show amber "Claude" badge
Given I am viewing a Claude chat session
Then the StatusBar should show an amber "Claude" badge
# [Screenshot: statusbar-claude-badge.png]
Scenario: Codex sessions show green "Codex" badge
Given I am viewing a Codex chat session
Then the StatusBar should show a green "Codex" badge
# [Screenshot: statusbar-codex-badge.png]
Feature: Cross-AI Review Message Action Buttons
Background:
Given I am logged in
And I have an active Claude chat session with at least one completed assistant turn
And both Claude and Codex adapters are available
Scenario: Action buttons appear on completed assistant messages
# T0: Assistant message is fully rendered (not streaming)
When I view a completed assistant message
Then I should see action buttons below the message
And the buttons should include "Copy" and "Send to Codex"
And "Send to Codex" should display the target adapter name dynamically
# [Screenshot: message-action-buttons.png]
Scenario: Action buttons NOT shown on user messages
When I view a user message in the chat
Then no action buttons should appear below the user message
# Action buttons are only for assistant responses
Scenario: Action buttons NOT shown during streaming
# T0: AI is actively streaming a response
Given the AI is currently streaming a response
Then the streaming message should NOT show action buttons
# T1: Streaming completes
When the AI response finishes streaming
Then action buttons should appear below the completed message
# [Screenshot: buttons-appear-after-streaming.png]
Scenario: "Send to" button NOT shown when only one adapter is available
Given only the Claude adapter is available (Codex unavailable)
When I view a completed assistant message
Then only the "Copy" button should appear
And no "Send to" button should be visible
# "Send to [Adapter]" requires at least one other available adapter
Scenario: Action buttons NOT shown in empty or new chat
Given I am in a new chat with no messages
Then no action buttons should be visible anywhere
When I send a message
And the AI has not yet responded
Then no action buttons should be visible
Scenario: Adapter name is dynamic based on available adapters
Given I am in a Codex chat session
And the Claude adapter is available
When I view a completed assistant message
Then the action button should read "Send to Claude" (not "Send to Codex")
# The button always shows the OTHER adapter, not the current one
Scenario: Copy button copies message text to clipboard
Given I see a completed assistant message with action buttons
When I tap the "Copy" button
Then the message content should be copied to the clipboard
And the "Copy" button should briefly show a success indicator
# [Screenshot: copy-success-indicator.png]
Feature: Cross-AI Review - Review Action Menu
Background:
Given I am logged in
And I have an active Claude chat session with a completed assistant message
And both Claude and Codex adapters are available
Scenario: Tapping "Send to [Adapter]" opens popup menu with template options
When I tap the "Send to Codex" button on an assistant message
Then a popup menu should appear with the following template options:
| Template |
| Direct send |
| Code Review |
| Suggest alternatives |
| Custom instruction... |
# [Screenshot: review-action-menu.png]
Scenario: Selecting "Custom instruction..." shows inline text input
When I tap the "Send to Codex" button on an assistant message
And I tap "Custom instruction..."
Then an inline text input field should appear within the menu
And the input should have a placeholder like "Enter your instruction..."
And the input should be focused and the on-screen keyboard should be visible
# [Screenshot: custom-instruction-input.png]
Scenario: Custom instruction submits on Enter
Given the custom instruction input is visible
When I type "Compare this with the official docs" in the input
And I press Enter
Then the review should be created with the custom instruction
And the menu should close
Scenario: Backdrop tap dismisses the action menu
Given the review action menu is visible
When I tap the backdrop (area outside the menu)
Then the menu should dismiss
And no review should be created
Scenario: Menu state resets on reopen (custom input not stale)
Given I opened the review action menu and typed in the custom input
When I dismiss the menu by tapping the backdrop
And I tap "Send to Codex" again
Then the menu should show the initial template list (not the custom input)
And the custom input field should be empty
Feature: Cross-AI Review Creating a Review
Background:
Given I am logged in
And I have an active Claude chat session with multiple completed turns
And both Claude and Codex adapters are available
Scenario: Selecting a template creates child session and opens floating panel
# --- Timeline: Creating a Review ---
# T0: User taps "Send to Codex" on an assistant message
When I tap "Send to Codex" on an assistant message
And I select "Code Review" from the template menu
# T1: Server creates child session, floating panel appears
Then a floating panel should appear at the bottom of the screen
And the panel should occupy roughly the lower half of the viewport
And the parent chat should still be partially visible above the panel
# [Screenshot: T1-floating-panel-opened.png]
Scenario: Panel shows adapter brand color and dynamic title
Given I started a "Code Review" review with Codex
When the floating panel appears
Then the panel header should show "Codex Code Review" as the title
And the header should use the Codex brand color (green)
And an "End" button should be visible in the panel header
# [Screenshot: panel-header-brand-color.png]
Scenario: Panel title reflects selected template
When I tap "Send to Codex" and select "Direct send"
Then the panel title should show "Codex Direct Send"
When I start another review and select "Suggest alternatives"
Then the panel title should show "Codex Suggest Alternatives"
When I start another review with a custom instruction "Check error handling"
Then the panel title should show "Codex Check Error Handling" (truncated if long)
Scenario: Child session runs in same cwd as parent
Given the parent session is running in "/Users/me/my-project"
When I start a review with Codex
Then the child Codex session should launch in "/Users/me/my-project"
And the child session should have codebase access in that directory
Scenario: Context includes conversation history up to anchor message
Given the parent chat has 10 messages (5 user + 5 assistant turns)
When I tap "Send to Codex" on the 3rd assistant message
And I select "Code Review"
Then the child session should receive context including:
| Content |
| Parent conversation history (messages 1-6) |
| The anchor message marked for review |
| The "Code Review" instruction |
Scenario: Context capped at 50 messages / 30KB
Given the parent chat has 80 messages totaling 50KB
When I start a review on a recent assistant message
Then the context sent to the child should include at most 50 messages
And the total context size should not exceed 30KB
And if truncated, the context should begin with "[Earlier conversation omitted]"
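# Illustrative sketch, not part of the original spec: the two caps above exercised
# one at a time. The message counts and sizes in the Examples table are hypothetical
# values, chosen only so that each row exceeds a single limit.
Scenario Outline: Context caps apply independently
Given the parent chat has <messages> messages totaling <size>
When I start a review on a recent assistant message
Then the context sent to the child should include at most 50 messages
And the total context size should not exceed 30KB
Examples:
| messages | size |
| 80 | 20KB |
| 40 | 50KB |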
Feature: Cross-AI Review Floating Panel Interaction
Background:
Given I am logged in
And I have an active review with a Codex child session
And the floating panel is expanded
Scenario: Panel shows child messages with streaming
# T0: Child session is processing the review request
When the child AI begins responding
Then I should see the child's response streaming in the floating panel
And the streaming indicator should be visible in the panel
# T1: Child response completes
When the child AI finishes responding
Then the full response should be visible in the panel
# [Screenshot: T1-child-response-complete.png]
Scenario: Panel has its own input field with distinct placeholder
When I look at the floating panel
Then I should see an input field at the bottom of the panel
And the placeholder should be distinct from the parent input (e.g. "Ask Codex...")
When I type "Can you elaborate on point 3?" in the panel input
And I tap Send in the panel
Then the message should be sent to the child session (not the parent)
And the child AI should begin responding in the panel
Scenario: Handle bar minimizes panel to pill button
When I tap the handle bar at the top of the floating panel
Then the panel should minimize
And a pill-shaped button should appear in the bottom-right corner
And the pill should show the adapter name and template (e.g. "Codex Code Review")
And the pill should have a pulsing dot indicating an active review
# [Screenshot: minimized-pill-button.png]
Scenario: Pill tap re-expands the floating panel
Given the floating panel is minimized to a pill
When I tap the pill button
Then the floating panel should re-expand to its previous size
And the child session messages should still be visible
And the scroll position should be preserved
Scenario: Each child assistant message has Copy and "Send to [Parent]" buttons
When a child assistant message is fully rendered
Then I should see "Copy" and "Send to Claude" buttons below the message
And "Send to Claude" should use the parent adapter name dynamically
# [Screenshot: child-message-action-buttons.png]
Feature: Cross-AI Review Send Back to Parent
Background:
Given I am logged in
And I have an active review with a Codex child session
And the child AI has completed at least one response
Scenario: "Send to [Parent]" injects formatted feedback into parent
# T0: Child response is visible in the floating panel
When I tap "Send to Claude" on a child assistant message
# T1: Feedback is injected into the parent session
Then the parent chat should receive a new message prefixed with "[Review feedback from Codex]:"
And the message content should contain the child's response text
# T2: Parent AI responds to the feedback
And the parent AI should begin processing the feedback
# [Screenshot: T2-parent-receiving-feedback.png]
Scenario: Parent AI responds to the injected feedback
Given I sent a child message back to the parent via "Send to Claude"
When the parent AI finishes responding to the feedback
Then the parent's response should be visible in the parent chat
And the parent should reference or address the review feedback
And the floating panel should remain open (review is still active)
Scenario: "Send to [Parent]" button disabled when parent is streaming
Given the parent AI is currently streaming a response
When I look at a child assistant message's action buttons
Then the "Send to Claude" button should be disabled (greyed out)
And tapping the disabled button should show a toast: "Wait for the current turn to complete"
# [Screenshot: send-to-parent-disabled.png]
Scenario: 409 returned when parent is busy
Given the parent session is currently processing a message
When I attempt to send a child message back to the parent
Then the server should respond with a 409 Conflict status
And a toast should notify me that the parent is busy
And the child message should remain unsent (can retry later)
Feature: Cross-AI Review Ending a Review
Background:
Given I am logged in
And I have an active review with a Codex child session
And the floating panel is visible
Scenario: End button kills child session and panel disappears
# T0: Floating panel is visible with active child session
When I tap the "End" button in the panel header
# T1: Child tmux session is terminated, panel disappears
Then the floating panel should disappear
And the child session should be terminated
And the minimized pill should also disappear (if it was visible)
And the parent chat should return to its full-height layout
# [Screenshot: T1-review-ended.png]
Scenario: JSONL preserved for history after ending
Given I ended the review by tapping "End"
When I navigate away and return to the parent session later
Then the ended review should be visible as a collapsed card in the history
And the child session's conversation data should be preserved
Scenario: Can start a new review after ending the previous one
Given I ended a review
When I tap "Send to Codex" on another assistant message
And I select a template
Then a new review should start successfully
And a new floating panel should appear
And the previous ended review should remain as a collapsed card in history
Feature: Cross-AI Review One Active Review Constraint
Background:
Given I am logged in
And I have an active Claude chat session
And both Claude and Codex adapters are available
Scenario: Only one active review at a time
Given I have an active review with Codex running
When I view the parent chat
Then I should see exactly one floating panel (or pill)
And no second floating panel or pill should be present
Scenario: Second review attempt shows confirmation dialog
Given I have an active review with Codex running
When I tap "Send to Codex" on a different assistant message
And I select "Direct send" from the template menu
Then a confirmation dialog should appear
And the dialog should say "End current review to start a new one?"
And the dialog should have "Confirm" and "Cancel" buttons
# [Screenshot: confirm-end-review-dialog.png]
Scenario: Confirming ends first review and starts new one
Given the confirmation dialog is visible
When I tap "Confirm"
Then the current review should end (child session terminated)
And a new review should start with the newly selected message and template
And the floating panel should show the new review's content
Scenario: Cancelling keeps existing review
Given the confirmation dialog is visible
When I tap "Cancel"
Then the dialog should dismiss
And the existing review should remain active
And the floating panel should continue showing the current review
Feature: Cross-AI Review Session Filtering
Background:
Given I am logged in
And I have an active Claude chat session with an active Codex review child
Scenario: Child session NOT shown in session list
When I navigate to the session list
Then the Codex child session should NOT appear in the session list
And only the parent Claude session should be listed
# Child sessions are filtered out at the API layer
Scenario: Child session NOT shown in active sessions tab
When I view the active sessions tab
Then the Codex child session should NOT appear as an active session
And only the parent Claude session should show as active
Scenario: Active session count excludes child sessions
Given the parent Claude session is active
And the child Codex review session is active
When I view the session count badge
Then the count should show 1 active session (not 2)
# Child sessions do not inflate the active count
Scenario: Push notifications suppressed for child sessions
Given I am not viewing the parent chat (e.g. on the session list)
When the child Codex session produces an output event
Then no push notification should appear for the child session
# Notifications for child sessions would be confusing to the user
Feature: Cross-AI Review History View
Background:
Given I am logged in
And I have a parent Claude session with one ended Codex review
And the review was anchored to a specific assistant message
Scenario: Block-start marker appears at anchor message position
When I scroll to the assistant message that triggered the review
Then a block-start marker should appear after the anchor message
And the marker should read "Codex Code Review started" (adapter + template)
And the marker should be styled as a horizontal divider with label
# [Screenshot: block-start-marker.png]
Scenario: Collapsed card shows adapter, title, count, and summary
When I view the block-start marker area
Then a collapsed review card should appear immediately after the marker
And the card should display:
| Field | Example |
| Adapter | Codex |
| Title | Code Review |
| Count | 4 messages |
| Summary | First line of the child AI's response |
And the card should show "Tap to expand" hint
# [Screenshot: collapsed-review-card.png]
Scenario: Block-end marker appears after collapsed card for ended reviews
When I view the area below the collapsed review card
Then a block-end marker should appear: "Review ended"
And the marker should be styled similarly to the block-start marker
# [Screenshot: block-end-marker.png]
Scenario: Active review shows "in progress" marker instead of end marker
Given the review is still active (not ended)
When I scroll to the anchor message area
Then the block-start marker should appear
And the collapsed card should show an "in progress" indicator
And no block-end marker should be present
Scenario: Parent messages during review period render normally
Given the parent received messages while the review was active
When I scroll through the chat history
Then parent messages sent during the review period should render normally
And parent messages should appear below the collapsed review card
And parent messages should NOT be nested inside the review card
Scenario: Multiple ended reviews render correctly
Given the parent session has 3 ended reviews on different messages
When I scroll through the full chat history
Then each review should have its own block-start marker, card, and block-end marker
And the reviews should appear at their respective anchor message positions
And the chat flow between reviews should be uninterrupted
Scenario: Tapping collapsed card opens read-only view
When I tap on a collapsed review card
Then a read-only panel should open showing the full child conversation
And the panel should show all child messages in order
And no input field should be present (read-only)
And a close button should be available to dismiss the panel
# [Screenshot: read-only-review-panel.png]
Scenario: Compacted anchor message shows card at end as fallback
Given the anchor message was compacted or is no longer individually identifiable
When I view the chat history
Then the collapsed review card should appear at the end of the visible messages
And a note should indicate the original anchor position is unavailable
Feature: Cross-AI Review Reconnect & Persistence
Background:
Given I am logged in
And I have an active Claude chat session with an active Codex review
Scenario: Page refresh restores the floating panel
# T0: Floating panel is visible with active child session
When I refresh the browser page
# T1: Page reloads and reconnects
Then the floating panel should reappear automatically
And the child session messages should be restored
And the panel should show the correct adapter and title
# [Screenshot: T1-panel-restored-after-refresh.png]
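Scenario: Server restart resumes review when child tmux window still exists
# T0: Active review is in progress
When the server restarts
And the child tmux window still exists
And I reconnect to the parent session
# T1: Server finds the child tmux window
Then the review should resume
And the floating panel should reappear
Scenario: Server restart marks review ended when child tmux window is gone
# T0: Active review is in progress
When the server restarts
And the child tmux window is gone
And I reconnect to the parent session
# T1: Server finds no child tmux window
Then the review should be marked as ended
And the ended review should appear as a collapsed card in history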
Scenario: WebSocket reconnect restores the panel
Given the WebSocket connection drops temporarily
When the WebSocket reconnects
Then the floating panel should restore to its previous state (expanded or minimized)
And child session messages should continue streaming if the child is active
Scenario: Parent session destroyed cascades to end child review
Given the parent tmux session is killed externally
When the server detects the parent session is gone
Then the child review session should be automatically ended
And the review should be marked with ended_at in the database
Scenario: Child tmux crash auto-ends the review
Given the child tmux window crashes or is killed externally
When the server detects the child session is gone
Then the review should be automatically marked as ended
And the floating panel should disappear
And a toast should notify: "Review session ended unexpectedly"
Scenario: Navigate away and back restores the panel
# T0: Floating panel is visible
When I navigate to the session list
# T1: Panel is no longer visible (different view)
And I navigate back to the parent chat session
# T2: Panel restores
Then the floating panel should reappear
And the child session state should be preserved
# [Screenshot: T2-panel-restored-after-navigation.png]
Feature: Cross-AI Review Multi-Client
Background:
Given I am logged in from two browser tabs (Tab A and Tab B)
And both tabs are viewing the same parent Claude chat session
And both Claude and Codex adapters are available
Scenario: Both tabs see the same review panel
Given a review is active with a Codex child session
When I view Tab A
Then the floating panel should be visible with the child session
When I switch to Tab B
Then the floating panel should also be visible with the same child session
# Both tabs receive the same WebSocket broadcasts
Scenario: Tab A starts review, Tab B sees panel appear
# T0: Neither tab has a review panel
When I start a review from Tab A by tapping "Send to Codex"
And I select "Code Review" from the template menu
# T1: Tab A shows the floating panel
Then Tab A should show the floating panel
# T2: Tab B receives REVIEW_STARTED broadcast
When I switch to Tab B
Then Tab B should also show the floating panel
And the panel should display the same adapter and title as Tab A
# [Screenshot: T2-tab-b-panel-synced.png]
Scenario: Tab A ends review, Tab B sees panel disappear
Given a review is active and both tabs show the floating panel
# T0: Tab A taps "End"
When I tap "End" in the panel header on Tab A
# T1: Tab A's panel disappears
Then Tab A's floating panel should disappear
# T2: Tab B receives REVIEW_ENDED broadcast
When I switch to Tab B
Then Tab B's floating panel should also have disappeared
And both tabs should show the parent chat at full height
Feature: Multi-Adapter Session Context Menu
Background:
Given I am logged in
And I am viewing the session list
Scenario: Long-pressing a session row shows context menu bottom sheet
When I long-press on a session row
Then a context menu bottom sheet should slide up from the bottom
And the bottom sheet should display the session's first prompt as a title
# [Screenshot: session-context-menu.png]
Scenario: Active sessions show "Use as reference" option
Given the session is active (currently running)
When I long-press on the session row
Then the context menu should show "Use as reference in [Adapter]" option
And the adapter name should be dynamic (the other available adapter)
And no "Hand off" option should be present
Scenario: Inactive sessions show "Use as reference" option
Given the session is inactive (historical)
When I long-press on the session row
Then the context menu should show "Use as reference in [Adapter]" option
And the adapter name should be dynamic (the other available adapter)
Scenario: Tapping backdrop dismisses context menu
Given the context menu bottom sheet is visible
When I tap the backdrop (area outside the bottom sheet)
Then the context menu should dismiss
And I should return to the session list view
Feature: Multi-Adapter Permission Mode Startup
Background:
Given I am logged in
Scenario: Sessions always start with bypass-permissions flag
When I start a new chat session (any adapter)
Then the CLI process should launch with the bypass-permissions flag
And the StatusBar should reflect the pending permission mode
# Bypass ensures the session starts without blocking on initial permission setup
Scenario: After startup, permission mode switches to user's chosen mode
Given a new session just started with bypass-permissions
When the CLI process is ready
Then the permission mode should switch to the user's chosen mode
And the StatusBar should update to show the active mode (e.g. "Normal")
# [Screenshot: permission-mode-after-startup.png]
Scenario: Mid-session permission mode switching works for all 4 modes (Claude)
Given I have an active Claude chat session
Then I should be able to cycle through all 4 permission modes:
| Mode |
| Normal |
| Auto-edit |
| Plan |
| YOLO |
And each mode change should take effect immediately
And the StatusBar should reflect the current mode
Scenario: Codex sessions always run in YOLO mode (no approvals)
Given I start a new Codex chat session
Then the permission mode should be set to YOLO (-a never)
And the permission mode label should indicate YOLO
And no permission mode cycling should be available
# Codex adapter does not support mid-session permission changes