ac35ba3594
- move developing.md and specification.md into docs/ - update readme links to new locations - add missing telegram.py module documentation - fix engines.py location (was incorrectly under runners/) - remove stale "moved to markdown.py" entries
547 lines
20 KiB
Markdown
547 lines
20 KiB
Markdown
# Takopi Specification v0.2.0 [2025-12-31]
|
||
|
||
This document specifies Takopi v0.2.0 behavior and architecture in a way that is testable, evolvable, and explicitly aligned with the goals:
|
||
|
||
- **Better testability**
|
||
- **Runner abstraction** to support future runners (e.g., Claude Code)
|
||
- **Telegram remains the only bot client** (adding another is unlikely)
|
||
- **Parallel runs are allowed across different threads**, but runs for the **same thread must be serialized** to avoid corrupting history
|
||
|
||
This is a normative spec using **MUST / SHOULD / MAY** language. Sections labeled **Decision** capture choices that should remain stable unless intentionally changed.
|
||
|
||
------
|
||
|
||
## 1. Scope and goals
|
||
|
||
### 1.1 Goals (v0.2.0)
|
||
|
||
1. Provide a Telegram bot that runs an “exec agent” (runner) and streams progress updates with periodic edits.
|
||
2. Support “thread continuation” via a **resume command** embedded in chat messages.
|
||
3. Support **parallel execution across different threads** (different resume tokens).
|
||
4. Enforce **serialization per thread** (same resume token) to avoid concurrent mutation of the same engine conversation/history.
|
||
5. Establish a stable, Takopi-owned **normalized event model** that runners produce and renderers consume.
|
||
6. Keep architecture modular enough to add another runner in a future version with minimal changes.
|
||
|
||
### 1.2 Non-goals (v0.2.0)
|
||
|
||
- Adding additional bot clients besides Telegram (Discord/Slack/etc.) is out of scope.
|
||
- Implementing auto-selection of multiple runners is not required (but should be prepared for).
|
||
- Streaming partial assistant answers token-by-token is not required (progress UI is event-driven; final answer is delivered at completion).
|
||
- Supporting engines that cannot provide stable action IDs is out of scope (see §5.4).
|
||
|
||
------
|
||
|
||
## 2. Terminology
|
||
|
||
- **Runner / Engine**: Implementation that executes an agent process (Codex today; Claude Code later) and produces Takopi events.
|
||
- **Thread**: The engine-side conversation identifier. In Takopi this is represented as a **ResumeToken**.
|
||
- **ResumeToken**: A Takopi-owned structured identifier: `{ engine: EngineId, value: str }`.
|
||
- **ResumeLine**: A runner-owned string representation embedded in chat; **canonical** representation is the engine CLI command (Decision §4.1).
|
||
- **Takopi Event**: A normalized event dict emitted by a runner and consumed by renderers/bridge.
|
||
- **Progress Message**: Telegram message that is edited periodically to show live status.
|
||
- **Final Message**: Telegram message containing final answer + resume line + status.
|
||
|
||
------
|
||
|
||
## 3. Architecture overview
|
||
|
||
### 3.1 Layers and responsibilities (strict boundaries)
|
||
|
||
**Domain Model (Takopi-owned)**
|
||
|
||
- Defines: `ResumeToken`, `TakopiEvent`, `Action` (including the terminal `completed` event).
|
||
- No Telegram, no subprocess, no engine JSON.
|
||
|
||
**Runner Interface (Takopi-owned)**
|
||
|
||
- Defines `Runner` protocol: `run()`, `extract_resume()`, `format_resume()`, etc.
|
||
- Runners are trusted producers of Takopi events (Decision §5.2).
|
||
|
||
**Runner Implementations (engine-owned logic)**
|
||
|
||
- Codex runner translates engine-specific stream into Takopi events.
|
||
- Each runner enforces per-thread serialization (MUST, §6.2).
|
||
|
||
**Renderers (Takopi-owned)**
|
||
|
||
- Pure functions/state machines that consume Takopi events and produce markdown strings.
|
||
- No engine-specific parsing.
|
||
- No Telegram API calls.
|
||
|
||
**Bridge (Telegram orchestration)**
|
||
|
||
- Receives Telegram updates and turns them into runner invocations.
|
||
- Maintains throttled progress editing.
|
||
- Handles cancellation `/cancel`.
|
||
- Owns Telegram markdown constraints (limits, entity formatting).
|
||
|
||
### 3.2 Module naming and one-word modules (v0.2.0 refactor target)
|
||
|
||
Recommended module layout (single-word filenames, clean layering):
|
||
|
||
- `takopi/model.py`
|
||
Domain types: events, actions, resume token.
|
||
- `takopi/runner.py`
|
||
Runner protocol.
|
||
- `takopi/runners/codex.py`
|
||
Codex runner implementation.
|
||
- `takopi/runners/mock.py`
|
||
Script/mock runner for tests.
|
||
- `takopi/render.py`
|
||
Progress renderer and event-to-text formatting.
|
||
- `takopi/bridge.py`
|
||
Telegram orchestration; main loop and message handler.
|
||
- `takopi/cli.py`
|
||
Typer/CLI entrypoints, config loading, engine selection.
|
||
- `takopi/markdown.py`
|
||
Markdown sanitization + Telegram entity prep.
|
||
|
||
**Rationale:**
|
||
The normalized event model MUST NOT live under `runners/` because it is core domain state shared by bridge and renderer.
|
||
|
||
------
|
||
|
||
## 4. Resume tokens and resume lines
|
||
|
||
### 4.1 Decision: canonical resume representation is engine CLI command
|
||
|
||
The canonical representation of “resume” embedded in chat is the runner’s **engine CLI resume command**, e.g.:
|
||
|
||
- Codex: ``codex resume <uuid>``
|
||
|
||
Takopi MUST treat the runner as the authority for:
|
||
|
||
- formatting a `ResumeToken` into a `ResumeLine`
|
||
- extracting a `ResumeToken` from message text
|
||
|
||
Takopi MAY introduce additional Takopi-owned metadata lines in the future (e.g., `resume: codex:<uuid>`), but **v0.2.0 canonical remains the CLI command**.
|
||
|
||
### 4.2 ResumeToken structure (Takopi-owned)
|
||
|
||
```python
|
||
@dataclass(frozen=True, slots=True)
|
||
class ResumeToken:
|
||
engine: str # EngineId (string)
|
||
value: str
|
||
```
|
||
|
||
### 4.3 Runner resume codec interface (MUST)
|
||
|
||
Each runner MUST implement:
|
||
|
||
- `format_resume(token: ResumeToken) -> str`
|
||
Returns a ResumeLine suitable for embedding in Telegram markdown (usually inside backticks).
|
||
- `extract_resume(text: str) -> ResumeToken | None`
|
||
Extracts a ResumeToken from arbitrary message text.
|
||
- `is_resume_line(line: str) -> bool`
|
||
Fast check used for truncation safety (to preserve the resume line during trimming).
|
||
|
||
**Constraints:**
|
||
|
||
- `format_resume()` MUST raise or otherwise fail if `token.engine != runner.engine`.
|
||
- `extract_resume()` MUST return `None` if it cannot confidently parse a resume command for its engine.
|
||
|
||
### 4.4 Resume extraction behavior in the bridge (v0.2.0)
|
||
|
||
Given a user message `text` and optional reply-to message `reply_text`:
|
||
|
||
1. The bridge MUST attempt `runner.extract_resume(text)`.
|
||
2. If not found, the bridge MUST attempt `runner.extract_resume(reply_text)` if present.
|
||
3. If still not found, run starts as a **new thread** (`resume=None`).
|
||
|
||
**Future note (non-normative):**
|
||
For multi-runner auto-selection, the bridge MAY attempt extraction across all registered runners. This is not required for v0.2.0.
|
||
|
||
------
|
||
|
||
## 5. Normalized event model (Takopi-owned)
|
||
|
||
### 5.1 Decision: events are trusted after normalization
|
||
|
||
Runners are responsible for producing well-formed Takopi events. Downstream consumers (render/bridge) SHOULD assume validity and may fail fast if invariants are violated (Decision §5.2).
|
||
|
||
### 5.2 Event types (minimum set)
|
||
|
||
Takopi MUST support the following event types:
|
||
|
||
1. `started`
|
||
2. `action`
|
||
3. `completed`
|
||
|
||
### 5.3 Required fields by event type
|
||
|
||
#### 5.3.1 `started`
|
||
|
||
Required:
|
||
|
||
- `type: "started"`
|
||
- `engine: EngineId`
|
||
- `resume: ResumeToken`
|
||
|
||
Optional:
|
||
|
||
- `title: str` (human-readable session/agent label)
|
||
- `meta: dict` (debug/diagnostic payloads)
|
||
|
||
#### 5.3.2 `action`
|
||
|
||
Required:
|
||
|
||
- `type: "action"`
|
||
- `engine: EngineId`
|
||
- `action: Action`
|
||
- `phase: "started" | "updated" | "completed"`
|
||
|
||
Optional:
|
||
|
||
- `ok: bool` (typically present when `phase="completed"`)
|
||
- `message: str` (freeform status/warning text)
|
||
- `level: "debug" | "info" | "warning" | "error"`
|
||
|
||
#### 5.3.3 `completed`
|
||
|
||
Required:
|
||
|
||
- `type: "completed"`
|
||
- `engine: EngineId`
|
||
- `ok: bool` (success/failure of the run)
|
||
- `answer: str` (final assistant response text; may be empty)
|
||
|
||
Optional:
|
||
|
||
- `resume: ResumeToken` (final resume token for the run; new or existing, if known)
|
||
- `error: str | None` (fatal error message, if any)
|
||
- `usage: dict` (engine usage/telemetry, if provided)
|
||
|
||
### 5.4 Action schema (MUST, per your Decision #4)
|
||
|
||
Actions MUST have stable IDs.
|
||
|
||
```python
|
||
@dataclass(frozen=True, slots=True)
|
||
class Action:
|
||
id: str # required
|
||
kind: str # required, stable taxonomy
|
||
title: str # required, short label
|
||
detail: dict[str, Any] # required, structured details
|
||
```
|
||
|
||
**Definition (v0.2.0):**
|
||
“Stable” means **stable within a single run**: the same underlying action MUST keep the same `Action.id` across all events in that run, and `Action.id` values MUST be unique within the run. Takopi does not require action IDs to remain stable across different runs/resumes.
|
||
|
||
Action kinds SHOULD be from a stable set (extensible):
|
||
|
||
- `command`
|
||
- `tool`
|
||
- `file_change`
|
||
- `web_search`
|
||
- `note`
|
||
- `turn`
|
||
- `warning`
|
||
- `telemetry`
|
||
- `note`
|
||
|
||
Runners MAY include additional kinds, but renderers MAY treat unknown kinds as `note`.
|
||
|
||
The `detail` dict is **freeform per runner**; no per-kind schema is enforced. Renderers SHOULD handle missing or unexpected fields gracefully.
|
||
|
||
The `ok` field semantics are **runner-defined**. For example, a runner MAY treat `grep` exit code 1 (no match) as `ok=True` if contextually appropriate.
|
||
|
||
**User-visible warnings and errors:** runners SHOULD surface these as `action` events with `phase="completed"` (typically `kind="warning"` or `kind="note"`) and `ok=False`, rather than introducing additional event types.
|
||
|
||
------
|
||
|
||
## 6. Runner interface and concurrency semantics
|
||
|
||
### 6.1 Runner protocol (MUST)
|
||
|
||
```python
|
||
class Runner(Protocol):
|
||
engine: str
|
||
|
||
def run(
|
||
self,
|
||
prompt: str,
|
||
resume: ResumeToken | None,
|
||
) -> AsyncIterator[TakopiEvent]: ...
|
||
```
|
||
|
||
### 6.2 Per-thread serialization (MUST; core invariant)
|
||
|
||
**Invariant:** At most one active run may operate on the same thread (same `ResumeToken`) at a time.
|
||
|
||
- Parallel runs are allowed only if they target **different** threads.
|
||
- Runs targeting the same thread MUST be queued and executed sequentially.
|
||
- This invariant MUST be enforced by the runner implementation (even if used outside the bridge).
|
||
|
||
**Critical requirement for new sessions:**
|
||
If `resume is None`, the runner MUST acquire the per-thread lock **as soon as the new thread's ResumeToken becomes known**, and MUST do so **before emitting `started`** to downstream consumers.
|
||
|
||
This prevents:
|
||
|
||
- a second run resuming the thread while the original "new session" run is still active
|
||
- history corruption due to concurrent engine operations
|
||
|
||
**Bridge note (non-normative):**
|
||
The bridge may enforce FIFO scheduling per thread to avoid emitting multiple progress messages for the same thread while a run is already in-flight.
|
||
|
||
**Codex note (non-normative):**
|
||
Codex emits `thread.started` (with `thread_id`) before any `turn.*` / `item.*` events for both new and resumed runs. Codex MAY emit top-level warning `error` lines (e.g., config warnings) before `thread.started`; the Codex runner translates these warnings into `action` events with `phase="completed"` and yields them in the same order as received (so `started` is not guaranteed to be the first yielded event). If the subprocess exits before `thread.started` is observed, no `started` can be emitted and the bridge reports an error without a resume line.
|
||
|
||
Codex also emits exactly one `agent_message`/`assistant_message` per turn; the runner uses that message text as `completed.answer`.
|
||
|
||
### 6.3 Run completion event (MUST)
|
||
|
||
```python
|
||
@dataclass(frozen=True, slots=True)
|
||
class CompletedEvent:
|
||
type: Literal["completed"]
|
||
engine: EngineId
|
||
ok: bool # success/failure of the run
|
||
resume: ResumeToken | None = None # final resume token for the run (new or existing, if known)
|
||
answer: str # final assistant response text (may be empty)
|
||
```
|
||
|
||
`completed` MUST be the final event of a successful run.
|
||
|
||
### 6.4 Event delivery semantics (MUST)
|
||
|
||
Event ordering is significant. The system MUST ensure:
|
||
|
||
- Events are yielded to the consumer in the same order they are produced by the runner.
|
||
- Event delivery MUST NOT spawn unbounded background tasks per event.
|
||
- If the consumer stops iteration early (break/cancel/exception), the runner MUST abort the run (best-effort) and release any held resources.
|
||
|
||
### 6.5 Crash and error handling
|
||
|
||
If the runner subprocess crashes or exits uncleanly:
|
||
|
||
- The bridge MUST publish an error status message.
|
||
- If `started` was received, the bridge MUST include the resume line in the error message.
|
||
|
||
------
|
||
|
||
## 7. Bridge (Telegram orchestration)
|
||
|
||
### 7.1 Responsibilities
|
||
|
||
The bridge MUST:
|
||
|
||
- Poll Telegram updates.
|
||
- Resolve resume token (from message text or reply target).
|
||
- Start runner execution with appropriate cancellation support.
|
||
- Maintain progress rendering and Telegram edits (rate-limited).
|
||
- Publish final answer and include resume line.
|
||
- Support `/cancel` to cancel the run associated with an in-flight progress message.
|
||
|
||
**Queuing behavior:**
|
||
|
||
- Multiple prompts to the same thread are queued and executed sequentially.
|
||
- There is no queue depth limit; all prompts are accepted.
|
||
|
||
### 7.1.1 Scheduling algorithm (MUST)
|
||
|
||
The bridge MUST implement per-thread FIFO scheduling in a way that does not require spawning one task per queued job.
|
||
|
||
**Definitions:**
|
||
|
||
- `ThreadKey := f"{resume.engine}:{resume.value}"`
|
||
- `Job := (chat_id, user_msg_id, text, resume: ResumeToken | None)`
|
||
|
||
**Required behavior:**
|
||
|
||
- For `resume != None`, the bridge MUST enqueue the job into `pending_by_thread[ThreadKey]` and ensure exactly one worker drains that queue sequentially.
|
||
- If a run starts with `resume == None` but later emits `started(resume=token)`, the bridge MUST treat that run as the in-flight job for `ThreadKey(token)` for scheduling purposes until it completes.
|
||
- A thread worker MUST exit when its queue is empty; the bridge SHOULD avoid retaining per-thread state for inactive threads.
|
||
|
||
The bridge MUST NOT:
|
||
|
||
- parse engine-native events
|
||
- encode engine-specific rules beyond resume extraction via runner
|
||
|
||
### 7.2 Progress behavior
|
||
|
||
- The bridge SHOULD send an initial progress message quickly (“running…”).
|
||
- The bridge SHOULD edit the progress message no more frequently than every 2 seconds.
|
||
- The bridge SHOULD avoid edits if rendered content has not changed.
|
||
|
||
### 7.3 Resume line inclusion
|
||
|
||
The progress renderer and/or final message MUST include the canonical resume line once known:
|
||
|
||
- If `started` has been received, the progress view SHOULD include the resume line.
|
||
- The final message MUST include the resume line.
|
||
|
||
**Important:** because the resume line may appear during progress updates, the bridge MUST treat `started` as the point at which the thread key becomes known for scheduling and cancellation routing.
|
||
|
||
### 7.4 Cancellation `/cancel`
|
||
|
||
- The bridge MUST allow the user to cancel a run in progress by sending `/cancel` in reply to the progress message (or by other defined mapping).
|
||
- Cancel MUST terminate the runner process via **SIGTERM** and stop further progress edits.
|
||
- After cancellation, the bridge MUST publish a "cancelled" status message and SHOULD include the resume line if known.
|
||
- If `/cancel` is sent with additional text, the additional text is ignored; only cancellation occurs.
|
||
|
||
### 7.5 Telegram markdown constraints
|
||
|
||
The bridge MUST:
|
||
|
||
- escape/prepare markdown per Telegram rules
|
||
- enforce Telegram message length limits (including after escaping)
|
||
- avoid truncating away the resume line (use runner `is_resume_line()`)
|
||
|
||
If truncation is required:
|
||
|
||
- the bridge MUST keep the resume line intact
|
||
- the bridge SHOULD preserve the **head** (beginning) of content and add an ellipsis marker before truncation point
|
||
|
||
------
|
||
|
||
## 8. Renderer (progress and final formatting)
|
||
|
||
### 8.1 Renderer responsibilities
|
||
|
||
Renderers MUST:
|
||
|
||
- be deterministic functions of Takopi events and internal state
|
||
- produce markdown text and (optionally) entity annotations
|
||
|
||
Renderers MUST NOT:
|
||
|
||
- depend on engine-native events
|
||
- call Telegram APIs
|
||
- perform blocking operations
|
||
|
||
### 8.2 Progress renderer state
|
||
|
||
The progress renderer SHOULD maintain:
|
||
|
||
- session title
|
||
- current running actions and their latest summaries
|
||
- completed actions and status
|
||
- resume token if known
|
||
|
||
If the runner emits multiple `action` events for the same `Action.id` while it is still running (e.g., repeated `phase="started"` or `phase="updated"`), the progress renderer SHOULD treat these as updates and collapse them into a single line (replacing the prior running line rather than appending a new one).
|
||
|
||
### 8.3 Final rendering
|
||
|
||
Final output MUST include:
|
||
|
||
- status line (`done` / `error` / `cancelled`)
|
||
- final `answer`
|
||
- resume line
|
||
|
||
------
|
||
|
||
## 9. Configuration and engine selection
|
||
|
||
### 9.1 v0.2.0 behavior (Decision #5)
|
||
|
||
- A single runner/engine is selected at startup via config/CLI (default: Codex).
|
||
- Resume extraction uses only the selected runner’s parser.
|
||
- If the user attempts to resume a thread created by a different engine, resume extraction will fail and the bot treats it as a new thread.
|
||
|
||
### 9.2 Future behavior (non-normative)
|
||
|
||
Takopi MAY support:
|
||
|
||
- trying all registered runners’ `extract_resume` to auto-select a runner for resumes
|
||
- falling back to default runner when no resume is present
|
||
|
||
The architecture SHOULD keep this future change localized to a `RunnerRegistry` / router.
|
||
|
||
------
|
||
|
||
## 10. Testing requirements (v0.2.0)
|
||
|
||
### 10.1 Test categories (MUST)
|
||
|
||
1. **Runner contract tests**
|
||
- Emits exactly one `started`
|
||
- All actions have required fields and stable IDs
|
||
- `completed.resume` matches started token (when present)
|
||
- Event ordering is preserved
|
||
- `ok` semantics match intended behavior
|
||
2. **Runner serialization tests (critical)**
|
||
- Serializes concurrent runs for the same `ResumeToken`
|
||
- For `resume=None`, acquires per-thread lock once the token is known (before emitting `started`)
|
||
3. **Bridge per-thread scheduling tests (critical)**
|
||
- Enqueue two prompts for the same `ResumeToken`
|
||
- Assert the bridge does not start the second run until the first completes
|
||
4. **Bridge progress throttling tests**
|
||
- Edits no more frequently than configured interval
|
||
- No edits without changes
|
||
- Truncation preserves resume line
|
||
5. **Cancellation tests**
|
||
- `/cancel` terminates run
|
||
- “cancelled” status produced
|
||
- resume line included if known
|
||
6. **Renderer formatting tests**
|
||
- Correct rendering of actions
|
||
- Stable formatting under event sequences
|
||
|
||
### 10.2 Test tooling guidelines (SHOULD)
|
||
|
||
- Provide **event factories** in tests for readability.
|
||
- Provide a deterministic fake clock/sleep.
|
||
- Use a script/mock runner to simulate event sequences.
|
||
|
||
------
|
||
|
||
## 11. Open design notes / evolution hooks
|
||
|
||
### 11.1 Takopi-owned resume tags (future discussion)
|
||
|
||
Even though canonical is engine CLI command in v0.2.0, Takopi MAY later add a Takopi-owned unambiguous line such as:
|
||
|
||
- `resume: codex:<uuid>`
|
||
|
||
Benefits:
|
||
|
||
- easier multi-runner routing
|
||
- resilience to CLI syntax changes
|
||
- simpler truncation and extraction
|
||
|
||
This is not required for v0.2.0.
|
||
|
||
### 11.2 EngineId typing
|
||
|
||
To reduce friction adding new runners, v0.2.0 SHOULD treat engine IDs as strings (or a `NewType(str)`), not a closed Literal union.
|
||
|
||
------
|
||
|
||
## 12. Changelog template (for evolving this spec)
|
||
|
||
- v0.2.0 [2025-12-31]
|
||
- Establish Takopi normalized event model and runner protocol
|
||
- Canonical resume representation is engine CLI command
|
||
- Enforce per-thread serialization including new sessions once token is known
|
||
- Telegram-only bridge with progress edits + cancellation
|
||
- Recommended module split into one-word modules
|
||
- Clarify: `ok` semantics are runner-defined, `detail` is freeform
|
||
- Clarify: bridge queues per thread (FIFO)
|
||
- Clarify: SIGTERM for cancellation, `/cancel` ignores accompanying text
|
||
- Clarify: truncation preserves head + resume line
|
||
- Clarify: crash publishes error with resume if known
|
||
|
||
------
|
||
|
||
## Appendix A: Example end-to-end flow (informative)
|
||
|
||
1. User sends: “Refactor this module and run tests.”
|
||
2. Bridge resolves resume token:
|
||
- none in message, none in reply → `resume=None`
|
||
3. Bridge sends a progress message: “Running…”
|
||
4. Runner starts and emits:
|
||
- `started(engine="codex", resume={engine:"codex", value:"<uuid>"})`
|
||
- `action(id="1", kind="command", title="pytest", detail={...}, phase="started")`
|
||
- `action(id="1", ok=True, phase="completed", ...)`
|
||
- `completed(resume=..., ok=True, answer="...")`
|
||
5. Progress renderer now includes resume line:
|
||
- ``codex resume <uuid>``
|
||
6. User replies to progress message with follow-up prompt.
|
||
7. Bridge extracts resume via runner, chooses same thread, and queues it behind the in-flight run if still active.
|
||
8. Final message includes:
|
||
- “done”
|
||
- final answer
|
||
- resume line ``codex resume <uuid>``
|