clawtap/docs/superpowers/specs/2026-03-27-interactive-prompts-design.md
kuannnn 0fcf66fc22 feat: ClawTap v0.2.0
2026-03-27 14:46:00 +08:00

# Interactive Prompts — Normalized Handling Across All Adapters
**Date**: 2026-03-27
**Status**: Draft
## Problem
Gemini and Codex CLIs show interactive terminal prompts (tool confirmation, ask user question, plan approval, etc.) that ClawTap doesn't detect or handle. The UI freezes with a blinking cursor while the CLI waits for input. Claude's prompts are handled via hooks, but Gemini/Codex have no equivalent hook mechanism for runtime prompts.
## Design Principle
**One normalized format, adapter-agnostic frontend.**
Each adapter detects prompts in its own way (Claude: hooks, Gemini/Codex: pane monitor), but all emit the same `InteractivePrompt` format. Frontend renders based on format, not adapter identity.
## Normalized Types
```typescript
interface InteractivePrompt {
  requestId: string;
  type: 'permission' | 'question' | 'plan' | 'loop-detected';
  title: string;
  description: string;
  toolName?: string;
  toolInput?: any;
  options?: { value: string; label: string }[]; // rendered as a button list
  textInput?: { placeholder?: string };         // rendered as a text input field
}

interface PromptResponse {
  requestId: string;       // matches InteractivePrompt.requestId
  selectedOption?: string; // `value` of the chosen option
  textValue?: string;      // free-text answer or feedback
}
```
Render logic:
- `options` only → button list (permission overlay)
- `textInput` only → text input field (ask question)
- Both → buttons + text field (plan mode: approve buttons + feedback)
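The render rules above can be sketched as a small pure function. This is a minimal illustration, not the component's actual code: the `PromptLayout` names and the `selectLayout` helper are hypothetical, and the `message-only` fallback covers a case the spec does not describe (neither field present).

```typescript
// Trimmed copy of the spec's prompt shape so the sketch is self-contained.
interface InteractivePrompt {
  requestId: string;
  type: 'permission' | 'question' | 'plan' | 'loop-detected';
  title: string;
  description: string;
  options?: { value: string; label: string }[];
  textInput?: { placeholder?: string };
}

// Hypothetical layout names; the spec only describes the layouts in prose.
type PromptLayout = 'buttons' | 'text' | 'buttons-and-text' | 'message-only';

function selectLayout(prompt: InteractivePrompt): PromptLayout {
  const hasOptions = (prompt.options?.length ?? 0) > 0;
  const hasText = prompt.textInput !== undefined;
  if (hasOptions && hasText) return 'buttons-and-text'; // plan mode
  if (hasOptions) return 'buttons';                     // permission overlay
  if (hasText) return 'text';                           // ask question
  return 'message-only';                                // assumption: not covered by the spec
}
```

Because the decision depends only on which fields are present, the frontend never needs to know which adapter produced the prompt.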
## Changes
### Change 1: Gemini pane monitor — detect interactive prompts
**File:** `server/adapters/gemini/pane-monitor.ts`
Add prompt detection to the existing poll loop. On each poll, check for known prompt patterns before processing streaming text:
**Tool Confirmation** ("Action Required"):
```
Pattern: content includes "Action Required" AND numbered options (● N.)
Extract: title, description (command/file/tool info), options list
Emit: { type: 'permission', options: [...] }
```
**AskUser** ("Answer Questions"):
```
Pattern: content includes "Answer Questions" AND "> " input prompt
Extract: question text, input type (detect options vs free text)
Emit: { type: 'question', textInput or options }
```
**Plan Approval** ("Approval"):
```
Pattern: content includes "Approval" header with plan content
Extract: plan text, approval options + feedback field
Emit: { type: 'plan', options: [...], textInput: { placeholder: 'Type your feedback...' } }
```
**Loop Detection**:
```
Pattern: content includes "potential loop was detected"
Extract: options
Emit: { type: 'loop-detected', options: [...] }
```
Dedup: track the last emitted `requestId` to avoid re-emitting the same prompt on every poll.
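A rough sketch of how detection and dedup could fit together for two of the patterns above (tool confirmation and loop detection). Several things here are assumptions rather than part of the spec: the pane text is assumed to be already stripped of box-drawing borders, each option is assumed to sit on its own line as `N. label` with an optional `●` marker, and the `requestId` is derived from a hash of the prompt text. All helper names are hypothetical.

```typescript
interface PromptOption { value: string; label: string }

interface DetectedPrompt {
  requestId: string;
  type: 'permission' | 'loop-detected';
  title: string;
  description: string;
  options: PromptOption[];
}

// Assumed option-line shape: optional "●" selection marker, then "N. label".
const OPTION_LINE = /^\s*(?:●\s*)?(\d+)\.\s+(.+?)\s*$/;

function parseOptions(content: string): PromptOption[] {
  const options: PromptOption[] = [];
  for (const line of content.split('\n')) {
    const m = OPTION_LINE.exec(line);
    if (m) options.push({ value: m[1], label: m[2] });
  }
  return options;
}

// Non-empty lines between the header line and the first option line.
function extractDescription(content: string, header: string): string {
  const lines = content.split('\n');
  const start = lines.findIndex(l => l.includes(header));
  const body: string[] = [];
  for (const line of lines.slice(start + 1)) {
    if (OPTION_LINE.test(line)) break;
    if (line.trim() !== '') body.push(line.trim());
  }
  return body.join('\n');
}

// Small deterministic string hash (djb2 variant) so the same prompt yields the same id.
function stableId(text: string): string {
  let h = 5381;
  for (let i = 0; i < text.length; i++) h = ((h * 33) ^ text.charCodeAt(i)) >>> 0;
  return h.toString(16);
}

function detectGeminiPrompt(content: string): DetectedPrompt | null {
  const options = parseOptions(content);
  let type: DetectedPrompt['type'];
  let header: string;
  if (content.includes('potential loop was detected')) {
    type = 'loop-detected';
    header = 'potential loop was detected';
  } else if (content.includes('Action Required') && options.length > 0) {
    type = 'permission';
    header = 'Action Required';
  } else {
    return null;
  }
  const description = extractDescription(content, header);
  const fingerprint = [type, description, ...options.map(o => o.label)].join('|');
  return { requestId: `${type}-${stableId(fingerprint)}`, type, title: header, description, options };
}

// Tracks the last emitted requestId; forgetting it once the prompt disappears is a design assumption.
class PromptDeduper {
  private lastId: string | null = null;
  shouldEmit(prompt: DetectedPrompt | null): boolean {
    if (!prompt) { this.lastId = null; return false; }
    if (prompt.requestId === this.lastId) return false;
    this.lastId = prompt.requestId;
    return true;
  }
}
```

Hashing only the description and option labels, rather than the whole pane, keeps the id stable when unrelated parts of the screen (spinners, cursor) change between polls.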
### Change 2: Codex pane monitor — detect interactive prompts
**File:** `server/adapters/codex/pane-monitor.ts`
**Command/File/Network Approval**:
```
Pattern: content includes "(y) Yes, proceed" or "Would you like to run"
Extract: command/file info, available options (y/a/p/d/n)
Emit: { type: 'permission', options: [...] }
```
**Request User Input**:
```
Pattern: content includes "Question" header with "> " input
Extract: question text, options or free text field
Emit: { type: 'question', textInput or options }
```
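As an illustration of the approval pattern, the sketch below extracts the shortcut options from captured pane text. It assumes each option appears on its own line as `(key) label`; the real Codex layout may differ, and `detectCodexApproval` is a hypothetical name.

```typescript
interface PromptOption { value: string; label: string }

interface CodexApproval {
  type: 'permission';
  title: string;
  options: PromptOption[];
}

// Assumed option-line shape: "(y) Yes, proceed", one option per line.
const SHORTCUT_LINE = /^\s*\(([a-z])\)\s+(.+?)\s*$/;

function detectCodexApproval(content: string): CodexApproval | null {
  const looksLikeApproval =
    content.includes('(y) Yes, proceed') || content.includes('Would you like to run');
  if (!looksLikeApproval) return null;

  const options: PromptOption[] = [];
  for (const line of content.split('\n')) {
    const m = SHORTCUT_LINE.exec(line);
    // The shortcut key doubles as the option value, so the response can be sent as a single keystroke.
    if (m) options.push({ value: m[1], label: m[2] });
  }
  return { type: 'permission', title: 'Approval required', options };
}
```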
### Change 3: Gemini adapter — respondPermission / respondQuestion
**File:** `server/adapters/gemini/gemini-tmux-adapter.ts`
Add methods to send keystrokes based on `PromptResponse`:
- **Permission (numbered options):** Navigate Down × N to selected option, press Enter
- **Question (text):** Type answer text, press Enter
- **Question (select):** Navigate to selected option, press Enter
- **Plan:** Select approve option OR type feedback, press Enter
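One way to express these rules is a pure function that turns a `PromptResponse` into a key sequence, leaving delivery (for example via tmux) to the adapter. This is a sketch under assumptions: the cursor is taken to start on the first option, so N is the zero-based index of the selected option, and a non-empty `textValue` is taken to win over `selectedOption` for plan prompts. `Key` and `geminiKeys` are hypothetical names.

```typescript
interface PromptOption { value: string; label: string }
interface PromptResponse { requestId: string; selectedOption?: string; textValue?: string }

// A key is either a named key or literal text to type.
type Key = 'Down' | 'Enter' | { text: string };

function geminiKeys(options: PromptOption[], response: PromptResponse): Key[] {
  // Free text (question answer or plan feedback): type it, then confirm.
  if (response.textValue !== undefined && response.textValue !== '') {
    return [{ text: response.textValue }, 'Enter'];
  }
  // Option selection: move down from the first option to the chosen one, then confirm.
  const index = options.findIndex(o => o.value === response.selectedOption);
  if (index < 0) throw new Error(`Unknown option: ${String(response.selectedOption)}`);
  const moves: Key[] = Array.from({ length: index }, (): Key => 'Down');
  return [...moves, 'Enter'];
}
```

Keeping this step free of side effects makes the mapping easy to unit test without a running CLI.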
### Change 4: Codex adapter — respondPermission / respondQuestion
**File:** `server/adapters/codex/codex-tmux-adapter.ts`
- **Permission:** Send the keyboard shortcut key (y/a/p/d/n)
- **Question (text):** Type answer text, press Enter
- **Question (select):** Navigate to option, press Enter
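The Codex rules differ mainly in the permission case, where a single shortcut key is sent. The sketch below assumes that key needs no Enter afterwards (the spec says only to send the key) and that select-style questions start with the cursor on the first option; `codexKeys` is a hypothetical helper.

```typescript
interface PromptOption { value: string; label: string }
interface PromptResponse { requestId: string; selectedOption?: string; textValue?: string }

type Key = 'Down' | 'Enter' | { text: string };

// Shortcut keys listed in the spec for Codex approvals.
const CODEX_SHORTCUTS = new Set(['y', 'a', 'p', 'd', 'n']);

function codexKeys(
  type: 'permission' | 'question',
  response: PromptResponse,
  options: PromptOption[] = [],
): Key[] {
  if (type === 'permission') {
    const key = response.selectedOption ?? '';
    if (!CODEX_SHORTCUTS.has(key)) throw new Error(`Unknown shortcut: ${key}`);
    return [{ text: key }]; // assumption: the shortcut alone confirms the choice
  }
  // Question with free text: type the answer, then confirm.
  if (response.textValue !== undefined && response.textValue !== '') {
    return [{ text: response.textValue }, 'Enter'];
  }
  // Question with options: navigate to the chosen option, then confirm.
  const index = options.findIndex(o => o.value === response.selectedOption);
  if (index < 0) throw new Error(`Unknown option: ${String(response.selectedOption)}`);
  const moves: Key[] = Array.from({ length: index }, (): Key => 'Down');
  return [...moves, 'Enter'];
}
```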
### Change 5: Claude adapter — normalize existing events
**File:** `server/adapters/claude/tmux-adapter.ts`
The existing `permission-request` and `ask-question` events already work but use an adapter-specific format. Map them to the `InteractivePrompt` format in the `session-manager.ts` event handlers (a minimal change: add the missing fields).
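The mapping might look roughly like the sketch below. The legacy payload shapes (`LegacyPermissionRequest`, `LegacyAskQuestion`) and the `allow` / `deny` option values are purely hypothetical, since this spec does not show the current Claude event format; only the target `InteractivePrompt` shape comes from the spec.

```typescript
interface InteractivePrompt {
  requestId: string;
  type: 'permission' | 'question' | 'plan' | 'loop-detected';
  title: string;
  description: string;
  toolName?: string;
  toolInput?: unknown;
  options?: { value: string; label: string }[];
  textInput?: { placeholder?: string };
}

// HYPOTHETICAL legacy payloads, used only to illustrate the mapping.
interface LegacyPermissionRequest { requestId: string; toolName: string; toolInput: unknown }
interface LegacyAskQuestion { requestId: string; question: string; options?: string[] }

function fromPermissionRequest(e: LegacyPermissionRequest): InteractivePrompt {
  return {
    requestId: e.requestId,
    type: 'permission',
    title: `Allow ${e.toolName}?`,
    description: JSON.stringify(e.toolInput),
    toolName: e.toolName,
    toolInput: e.toolInput,
    // Assumed option values; the real permission choices may differ.
    options: [{ value: 'allow', label: 'Allow' }, { value: 'deny', label: 'Deny' }],
  };
}

function fromAskQuestion(e: LegacyAskQuestion): InteractivePrompt {
  const prompt: InteractivePrompt = {
    requestId: e.requestId,
    type: 'question',
    title: 'Question',
    description: e.question,
    toolName: 'AskUserQuestion',
  };
  const choices = e.options ?? [];
  // Choices become buttons; otherwise fall back to a free-text field.
  if (choices.length > 0) prompt.options = choices.map(c => ({ value: c, label: c }));
  else prompt.textInput = {};
  return prompt;
}
```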
### Change 6: Session manager — unified event handling
**File:** `server/session-manager.ts`
Current event listeners:
```typescript
adapter.on('permission-request', ...) // broadcasts WS.PERMISSION_REQUEST
adapter.on('ask-question', ...)       // broadcasts WS.PERMISSION_REQUEST (toolName: 'AskUserQuestion')
```
Change to emit unified format:
```typescript
adapter.on('interactive-prompt', (sessionId, prompt: InteractivePrompt) => {
  broadcast(sessionId, { type: WS.INTERACTIVE_PROMPT, ...prompt });
});
```
### Change 7: Frontend — InteractivePromptOverlay component
**File:** `src/components/InteractivePromptOverlay.tsx` (new)
Replaces `PermissionOverlay` and `AskQuestion` with a single adapter-agnostic component:
- Renders `title` + `description`
- If `options` → button list
- If `textInput` → text input field
- If both → buttons + text field (plan mode layout)
- Sends `PromptResponse` back via WS
**File:** `src/components/ChatView.tsx` + `src/components/FloatingReviewPanel.tsx`
Replace `PermissionOverlay` / `AskQuestion` usage with `InteractivePromptOverlay`.
### Change 8: useChat — handle new WS message type
**File:** `src/hooks/useChat.ts`
Replace `WS.PERMISSION_REQUEST` handler with `WS.INTERACTIVE_PROMPT`:
```typescript
case WS.INTERACTIVE_PROMPT:
  setInteractivePrompt(msg); // replaces setPermissionRequest
  break;
```
### Change 9: Remaining _waitForReady prompts
**File:** `server/adapters/gemini/gemini-tmux-adapter.ts`
Add detection for remaining startup prompts:
- Privacy Notice → send Esc
- Multi-folder trust → send option 1
- IDE integration nudge → send "No"
These are handled in `_waitForReady` (auto-bypass), not sent to frontend.
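The auto-bypass rules could be kept as a small table that `_waitForReady` consults on each check. In the sketch below only the `Privacy Notice` match string and the three actions come from this spec; the match expressions for the trust and IDE prompts are guesses at the on-screen wording, and every name (`StartupRule`, `matchStartupPrompt`) is hypothetical.

```typescript
// What to send once a startup prompt is recognized.
type StartupAction =
  | { kind: 'key'; key: 'Escape' }
  | { kind: 'option'; index: number }
  | { kind: 'choice'; label: string };

interface StartupRule {
  name: string;
  matches: (pane: string) => boolean;
  action: StartupAction;
}

const STARTUP_RULES: StartupRule[] = [
  {
    name: 'privacy-notice',
    matches: pane => pane.includes('Privacy Notice'),
    action: { kind: 'key', key: 'Escape' },
  },
  {
    name: 'multi-folder-trust',
    // Assumed wording: the prompt mentions both "trust" and "folder".
    matches: pane => /trust/i.test(pane) && /folder/i.test(pane),
    action: { kind: 'option', index: 1 },
  },
  {
    name: 'ide-nudge',
    // Assumed wording: the nudge mentions "IDE" as a standalone word.
    matches: pane => /\bIDE\b/.test(pane),
    action: { kind: 'choice', label: 'No' },
  },
];

// Returns the first matching rule, or null when the pane shows no known startup prompt.
function matchStartupPrompt(pane: string): StartupRule | null {
  return STARTUP_RULES.find(rule => rule.matches(pane)) ?? null;
}
```

A table like this keeps new startup prompts a one-entry change and makes it explicit that none of them reach the frontend.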
## What Doesn't Change
- **WS protocol**: Still uses WebSocket for all communication
- **tmux management**: Still uses tmux for CLI process management
- **JSONL watching**: Still uses file watchers for message history
- **Claude hooks**: Still uses hooks for Claude tool events (just normalizes the output format)
## Scope Notes
**Phase 1 (this spec):**
- Gemini: tool confirmation, ask user, plan approval, loop detection
- Codex: command/file/network approval, request user input
- Claude: normalize existing events to new format
- Frontend: InteractivePromptOverlay component
**Phase 2 (future):**
- Diff rendering in file edit permissions
- Codex MCP elicitation forms (structured multi-field forms)
- Codex multi-agent thread approval routing