Files
takopi/specification.md
T
2026-01-01 01:30:46 +04:00

547 lines
20 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Takopi Specification v0.2.0 [2025-12-31]
This document specifies Takopi v0.2.0 behavior and architecture in a way that is testable, evolvable, and explicitly aligned with the goals:
- **Better testability**
- **Runner abstraction** to support future runners (e.g., Claude Code)
- **Telegram remains the only bot client** (adding another is unlikely)
- **Parallel runs are allowed across different threads**, but runs for the **same thread must be serialized** to avoid corrupting history
This is a normative spec using **MUST / SHOULD / MAY** language. Sections labeled **Decision** capture choices that should remain stable unless intentionally changed.
------
## 1. Scope and goals
### 1.1 Goals (v0.2.0)
1. Provide a Telegram bot that runs an “exec agent” (runner) and streams progress updates with periodic edits.
2. Support “thread continuation” via a **resume command** embedded in chat messages.
3. Support **parallel execution across different threads** (different resume tokens).
4. Enforce **serialization per thread** (same resume token) to avoid concurrent mutation of the same engine conversation/history.
5. Establish a stable, Takopi-owned **normalized event model** that runners produce and renderers consume.
6. Keep architecture modular enough to add another runner in a future version with minimal changes.
### 1.2 Non-goals (v0.2.0)
- Adding additional bot clients besides Telegram (Discord/Slack/etc.) is out of scope.
- Implementing auto-selection of multiple runners is not required (but should be prepared for).
- Streaming partial assistant answers token-by-token is not required (progress UI is event-driven; final answer is delivered at completion).
- Supporting engines that cannot provide stable action IDs is out of scope (see §5.4).
------
## 2. Terminology
- **Runner / Engine**: Implementation that executes an agent process (Codex today; Claude Code later) and produces Takopi events.
- **Thread**: The engine-side conversation identifier. In Takopi this is represented as a **ResumeToken**.
- **ResumeToken**: A Takopi-owned structured identifier: `{ engine: EngineId, value: str }`.
- **ResumeLine**: A runner-owned string representation embedded in chat; **canonical** representation is the engine CLI command (Decision §4.1).
- **Takopi Event**: A normalized event dict emitted by a runner and consumed by renderers/bridge.
- **Progress Message**: Telegram message that is edited periodically to show live status.
- **Final Message**: Telegram message containing final answer + resume line + status.
------
## 3. Architecture overview
### 3.1 Layers and responsibilities (strict boundaries)
**Domain Model (Takopi-owned)**
- Defines: `ResumeToken`, `TakopiEvent`, `Action` (including the terminal `completed` event).
- No Telegram, no subprocess, no engine JSON.
**Runner Interface (Takopi-owned)**
- Defines `Runner` protocol: `run()`, `extract_resume()`, `format_resume()`, etc.
- Runners are trusted producers of Takopi events (Decision §5.2).
**Runner Implementations (engine-owned logic)**
- Codex runner translates engine-specific stream into Takopi events.
- Each runner enforces per-thread serialization (MUST, §6.2).
**Renderers (Takopi-owned)**
- Pure functions/state machines that consume Takopi events and produce markdown strings.
- No engine-specific parsing.
- No Telegram API calls.
**Bridge (Telegram orchestration)**
- Receives Telegram updates and turns them into runner invocations.
- Maintains throttled progress editing.
- Handles cancellation `/cancel`.
- Owns Telegram markdown constraints (limits, entity formatting).
### 3.2 Module naming and one-word modules (v0.2.0 refactor target)
Recommended module layout (single-word filenames, clean layering):
- `takopi/model.py`
Domain types: events, actions, resume token.
- `takopi/runner.py`
Runner protocol.
- `takopi/runners/codex.py`
Codex runner implementation.
- `takopi/runners/mock.py`
Script/mock runner for tests.
- `takopi/render.py`
Progress renderer and event-to-text formatting.
- `takopi/bridge.py`
Telegram orchestration; main loop and message handler.
- `takopi/cli.py`
Typer/CLI entrypoints, config loading, engine selection.
- `takopi/markdown.py`
Markdown sanitization + Telegram entity prep.
**Rationale:**
The normalized event model MUST NOT live under `runners/` because it is core domain state shared by bridge and renderer.
------
## 4. Resume tokens and resume lines
### 4.1 Decision: canonical resume representation is engine CLI command
The canonical representation of “resume” embedded in chat is the runners **engine CLI resume command**, e.g.:
- Codex: ``codex resume <uuid>``
Takopi MUST treat the runner as the authority for:
- formatting a `ResumeToken` into a `ResumeLine`
- extracting a `ResumeToken` from message text
Takopi MAY introduce additional Takopi-owned metadata lines in the future (e.g., `resume: codex:<uuid>`), but **v0.2.0 canonical remains the CLI command**.
### 4.2 ResumeToken structure (Takopi-owned)
```python
@dataclass(frozen=True, slots=True)
class ResumeToken:
engine: str # EngineId (string)
value: str
```
### 4.3 Runner resume codec interface (MUST)
Each runner MUST implement:
- `format_resume(token: ResumeToken) -> str`
Returns a ResumeLine suitable for embedding in Telegram markdown (usually inside backticks).
- `extract_resume(text: str) -> ResumeToken | None`
Extracts a ResumeToken from arbitrary message text.
- `is_resume_line(line: str) -> bool`
Fast check used for truncation safety (to preserve the resume line during trimming).
**Constraints:**
- `format_resume()` MUST raise or otherwise fail if `token.engine != runner.engine`.
- `extract_resume()` MUST return `None` if it cannot confidently parse a resume command for its engine.
### 4.4 Resume extraction behavior in the bridge (v0.2.0)
Given a user message `text` and optional reply-to message `reply_text`:
1. The bridge MUST attempt `runner.extract_resume(text)`.
2. If not found, the bridge MUST attempt `runner.extract_resume(reply_text)` if present.
3. If still not found, run starts as a **new thread** (`resume=None`).
**Future note (non-normative):**
For multi-runner auto-selection, the bridge MAY attempt extraction across all registered runners. This is not required for v0.2.0.
------
## 5. Normalized event model (Takopi-owned)
### 5.1 Decision: events are trusted after normalization
Runners are responsible for producing well-formed Takopi events. Downstream consumers (render/bridge) SHOULD assume validity and may fail fast if invariants are violated (Decision §5.2).
### 5.2 Event types (minimum set)
Takopi MUST support the following event types:
1. `started`
2. `action`
3. `completed`
### 5.3 Required fields by event type
#### 5.3.1 `started`
Required:
- `type: "started"`
- `engine: EngineId`
- `resume: ResumeToken`
Optional:
- `title: str` (human-readable session/agent label)
- `meta: dict` (debug/diagnostic payloads)
#### 5.3.2 `action`
Required:
- `type: "action"`
- `engine: EngineId`
- `action: Action`
- `phase: "started" | "updated" | "completed"`
Optional:
- `ok: bool` (typically present when `phase="completed"`)
- `message: str` (freeform status/warning text)
- `level: "debug" | "info" | "warning" | "error"`
#### 5.3.3 `completed`
Required:
- `type: "completed"`
- `engine: EngineId`
- `ok: bool` (success/failure of the run)
- `answer: str` (final assistant response text; may be empty)
Optional:
- `resume: ResumeToken` (final resume token for the run; new or existing, if known)
- `error: str | None` (fatal error message, if any)
- `usage: dict` (engine usage/telemetry, if provided)
### 5.4 Action schema (MUST, per your Decision #4)
Actions MUST have stable IDs.
```python
@dataclass(frozen=True, slots=True)
class Action:
id: str # required
kind: str # required, stable taxonomy
title: str # required, short label
detail: dict[str, Any] # required, structured details
```
**Definition (v0.2.0):**
“Stable” means **stable within a single run**: the same underlying action MUST keep the same `Action.id` across all events in that run, and `Action.id` values MUST be unique within the run. Takopi does not require action IDs to remain stable across different runs/resumes.
Action kinds SHOULD be from a stable set (extensible):
- `command`
- `tool`
- `file_change`
- `web_search`
- `note`
- `turn`
- `warning`
- `telemetry`
- `note`
Runners MAY include additional kinds, but renderers MAY treat unknown kinds as `note`.
The `detail` dict is **freeform per runner**; no per-kind schema is enforced. Renderers SHOULD handle missing or unexpected fields gracefully.
The `ok` field semantics are **runner-defined**. For example, a runner MAY treat `grep` exit code 1 (no match) as `ok=True` if contextually appropriate.
**User-visible warnings and errors:** runners SHOULD surface these as `action` events with `phase="completed"` (typically `kind="warning"` or `kind="note"`) and `ok=False`, rather than introducing additional event types.
------
## 6. Runner interface and concurrency semantics
### 6.1 Runner protocol (MUST)
```python
class Runner(Protocol):
engine: str
def run(
self,
prompt: str,
resume: ResumeToken | None,
) -> AsyncIterator[TakopiEvent]: ...
```
### 6.2 Per-thread serialization (MUST; core invariant)
**Invariant:** At most one active run may operate on the same thread (same `ResumeToken`) at a time.
- Parallel runs are allowed only if they target **different** threads.
- Runs targeting the same thread MUST be queued and executed sequentially.
- This invariant MUST be enforced by the runner implementation (even if used outside the bridge).
**Critical requirement for new sessions:**
If `resume is None`, the runner MUST acquire the per-thread lock **as soon as the new thread's ResumeToken becomes known**, and MUST do so **before emitting `started`** to downstream consumers.
This prevents:
- a second run resuming the thread while the original "new session" run is still active
- history corruption due to concurrent engine operations
**Bridge note (non-normative):**
The bridge may enforce FIFO scheduling per thread to avoid emitting multiple progress messages for the same thread while a run is already in-flight.
**Codex note (non-normative):**
Codex emits `thread.started` (with `thread_id`) before any `turn.*` / `item.*` events for both new and resumed runs. Codex MAY emit top-level warning `error` lines (e.g., config warnings) before `thread.started`; the Codex runner translates these warnings into `action` events with `phase="completed"` and yields them in the same order as received (so `started` is not guaranteed to be the first yielded event). If the subprocess exits before `thread.started` is observed, no `started` can be emitted and the bridge reports an error without a resume line.
Codex also emits exactly one `agent_message`/`assistant_message` per turn; the runner uses that message text as `completed.answer`.
### 6.3 Run completion event (MUST)
```python
@dataclass(frozen=True, slots=True)
class CompletedEvent:
type: Literal["completed"]
engine: EngineId
ok: bool # success/failure of the run
resume: ResumeToken | None = None # final resume token for the run (new or existing, if known)
answer: str # final assistant response text (may be empty)
```
`completed` MUST be the final event of a successful run.
### 6.4 Event delivery semantics (MUST)
Event ordering is significant. The system MUST ensure:
- Events are yielded to the consumer in the same order they are produced by the runner.
- Event delivery MUST NOT spawn unbounded background tasks per event.
- If the consumer stops iteration early (break/cancel/exception), the runner MUST abort the run (best-effort) and release any held resources.
### 6.5 Crash and error handling
If the runner subprocess crashes or exits uncleanly:
- The bridge MUST publish an error status message.
- If `started` was received, the bridge MUST include the resume line in the error message.
------
## 7. Bridge (Telegram orchestration)
### 7.1 Responsibilities
The bridge MUST:
- Poll Telegram updates.
- Resolve resume token (from message text or reply target).
- Start runner execution with appropriate cancellation support.
- Maintain progress rendering and Telegram edits (rate-limited).
- Publish final answer and include resume line.
- Support `/cancel` to cancel the run associated with an in-flight progress message.
**Queuing behavior:**
- Multiple prompts to the same thread are queued and executed sequentially.
- There is no queue depth limit; all prompts are accepted.
### 7.1.1 Scheduling algorithm (MUST)
The bridge MUST implement per-thread FIFO scheduling in a way that does not require spawning one task per queued job.
**Definitions:**
- `ThreadKey := f"{resume.engine}:{resume.value}"`
- `Job := (chat_id, user_msg_id, text, resume: ResumeToken | None)`
**Required behavior:**
- For `resume != None`, the bridge MUST enqueue the job into `pending_by_thread[ThreadKey]` and ensure exactly one worker drains that queue sequentially.
- If a run starts with `resume == None` but later emits `started(resume=token)`, the bridge MUST treat that run as the in-flight job for `ThreadKey(token)` for scheduling purposes until it completes.
- A thread worker MUST exit when its queue is empty; the bridge SHOULD avoid retaining per-thread state for inactive threads.
The bridge MUST NOT:
- parse engine-native events
- encode engine-specific rules beyond resume extraction via runner
### 7.2 Progress behavior
- The bridge SHOULD send an initial progress message quickly (“running…”).
- The bridge SHOULD edit the progress message no more frequently than every 2 seconds.
- The bridge SHOULD avoid edits if rendered content has not changed.
### 7.3 Resume line inclusion
The progress renderer and/or final message MUST include the canonical resume line once known:
- If `started` has been received, the progress view SHOULD include the resume line.
- The final message MUST include the resume line.
**Important:** because the resume line may appear during progress updates, the bridge MUST treat `started` as the point at which the thread key becomes known for scheduling and cancellation routing.
### 7.4 Cancellation `/cancel`
- The bridge MUST allow the user to cancel a run in progress by sending `/cancel` in reply to the progress message (or by other defined mapping).
- Cancel MUST terminate the runner process via **SIGTERM** and stop further progress edits.
- After cancellation, the bridge MUST publish a "cancelled" status message and SHOULD include the resume line if known.
- If `/cancel` is sent with additional text, the additional text is ignored; only cancellation occurs.
### 7.5 Telegram markdown constraints
The bridge MUST:
- escape/prepare markdown per Telegram rules
- enforce Telegram message length limits (including after escaping)
- avoid truncating away the resume line (use runner `is_resume_line()`)
If truncation is required:
- the bridge MUST keep the resume line intact
- the bridge SHOULD preserve the **head** (beginning) of content and add an ellipsis marker before truncation point
------
## 8. Renderer (progress and final formatting)
### 8.1 Renderer responsibilities
Renderers MUST:
- be deterministic functions of Takopi events and internal state
- produce markdown text and (optionally) entity annotations
Renderers MUST NOT:
- depend on engine-native events
- call Telegram APIs
- perform blocking operations
### 8.2 Progress renderer state
The progress renderer SHOULD maintain:
- session title
- current running actions and their latest summaries
- completed actions and status
- resume token if known
If the runner emits multiple `action` events for the same `Action.id` while it is still running (e.g., repeated `phase="started"` or `phase="updated"`), the progress renderer SHOULD treat these as updates and collapse them into a single line (replacing the prior running line rather than appending a new one).
### 8.3 Final rendering
Final output MUST include:
- status line (`done` / `error` / `cancelled`)
- final `answer`
- resume line
------
## 9. Configuration and engine selection
### 9.1 v0.2.0 behavior (Decision #5)
- A single runner/engine is selected at startup via config/CLI (default: Codex).
- Resume extraction uses only the selected runners parser.
- If the user attempts to resume a thread created by a different engine, resume extraction will fail and the bot treats it as a new thread.
### 9.2 Future behavior (non-normative)
Takopi MAY support:
- trying all registered runners `extract_resume` to auto-select a runner for resumes
- falling back to default runner when no resume is present
The architecture SHOULD keep this future change localized to a `RunnerRegistry` / router.
------
## 10. Testing requirements (v0.2.0)
### 10.1 Test categories (MUST)
1. **Runner contract tests**
- Emits exactly one `started`
- All actions have required fields and stable IDs
- `completed.resume` matches started token (when present)
- Event ordering is preserved
- `ok` semantics match intended behavior
2. **Runner serialization tests (critical)**
- Serializes concurrent runs for the same `ResumeToken`
- For `resume=None`, acquires per-thread lock once the token is known (before emitting `started`)
3. **Bridge per-thread scheduling tests (critical)**
- Enqueue two prompts for the same `ResumeToken`
- Assert the bridge does not start the second run until the first completes
4. **Bridge progress throttling tests**
- Edits no more frequently than configured interval
- No edits without changes
- Truncation preserves resume line
5. **Cancellation tests**
- `/cancel` terminates run
- “cancelled” status produced
- resume line included if known
6. **Renderer formatting tests**
- Correct rendering of actions
- Stable formatting under event sequences
### 10.2 Test tooling guidelines (SHOULD)
- Provide **event factories** in tests for readability.
- Provide a deterministic fake clock/sleep.
- Use a script/mock runner to simulate event sequences.
------
## 11. Open design notes / evolution hooks
### 11.1 Takopi-owned resume tags (future discussion)
Even though canonical is engine CLI command in v0.2.0, Takopi MAY later add a Takopi-owned unambiguous line such as:
- `resume: codex:<uuid>`
Benefits:
- easier multi-runner routing
- resilience to CLI syntax changes
- simpler truncation and extraction
This is not required for v0.2.0.
### 11.2 EngineId typing
To reduce friction adding new runners, v0.2.0 SHOULD treat engine IDs as strings (or a `NewType(str)`), not a closed Literal union.
------
## 12. Changelog template (for evolving this spec)
- v0.2.0 [2025-12-31]
- Establish Takopi normalized event model and runner protocol
- Canonical resume representation is engine CLI command
- Enforce per-thread serialization including new sessions once token is known
- Telegram-only bridge with progress edits + cancellation
- Recommended module split into one-word modules
- Clarify: `ok` semantics are runner-defined, `detail` is freeform
- Clarify: bridge queues per thread (FIFO)
- Clarify: SIGTERM for cancellation, `/cancel` ignores accompanying text
- Clarify: truncation preserves head + resume line
- Clarify: crash publishes error with resume if known
------
## Appendix A: Example end-to-end flow (informative)
1. User sends: “Refactor this module and run tests.”
2. Bridge resolves resume token:
- none in message, none in reply → `resume=None`
3. Bridge sends a progress message: “Running…”
4. Runner starts and emits:
- `started(engine="codex", resume={engine:"codex", value:"<uuid>"})`
- `action(id="1", kind="command", title="pytest", detail={...}, phase="started")`
- `action(id="1", ok=True, phase="completed", ...)`
- `completed(resume=..., ok=True, answer="...")`
5. Progress renderer now includes resume line:
- ``codex resume <uuid>``
6. User replies to progress message with follow-up prompt.
7. Bridge extracts resume via runner, chooses same thread, and queues it behind the in-flight run if still active.
8. Final message includes:
- “done”
- final answer
- resume line ``codex resume <uuid>``