# Takopi Specification v0.2.0 (minimal) [2025-12-31]
This document is **normative**. The words **MUST**, **SHOULD**, and **MAY** express requirements.
## 1. Scope
Takopi v0.2.0 specifies:
- A **Telegram** bot bridge that runs an agent **Runner** and posts:
  - a throttled, edited **progress message**
  - a **final message** with the final answer and a resume line
- **Thread continuation** via a **resume command** embedded in chat messages
- **Parallel runs across different threads**
- **Serialization within a thread** (no concurrent runs on the same thread)
- A Takopi-owned **normalized event model** produced by runners and consumed by renderers/bridge
Out of scope for v0.2.0:
- Non-Telegram clients (Slack/Discord/etc.)
- Auto-selecting among multiple runners
- Token-by-token streaming of the assistant's final answer
- Engines/runners that cannot provide **stable action IDs** within a run
## 2. Terminology
- **EngineId**: string identifier of an engine (e.g., `"codex"`).
- **Runner**: Takopi adapter that executes an engine process and yields **Takopi events**.
- **Thread**: a single engine-side conversation, identified in Takopi by a **ResumeToken**.
- **ResumeToken**: Takopi-owned thread identifier `{ engine: EngineId, value: str }`.
- **ResumeLine**: a runner-owned string embedded in chat that represents a ResumeToken.
- **Run**: a single invocation of `Runner.run(prompt, resume)`.
- **TakopiEvent**: a normalized event emitted by a runner and consumed by renderers/bridge.
- **Progress message**: a Telegram message that is periodically edited during a run.
- **Final message**: a Telegram message that includes run status, final answer, and resume line.
## 3. Resume tokens and resume lines
### 3.1 Decision: canonical resume line is the engine CLI resume command
The canonical ResumeLine embedded in chat MUST be the engine's CLI resume command, e.g.:
- `codex resume <id>`
- `claude --resume <id>`
Takopi MUST treat the runner as authoritative for:
- formatting a ResumeToken into a ResumeLine
- extracting a ResumeToken from message text
### 3.2 ResumeToken schema (Takopi-owned)
```python
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class ResumeToken:
    engine: str  # EngineId
    value: str
```
### 3.3 Runner resume codec (MUST)
Each runner MUST implement:
* `format_resume(token: ResumeToken) -> str`
* `extract_resume(text: str) -> ResumeToken | None`
* `is_resume_line(line: str) -> bool`
Constraints:
* `format_resume()` MUST fail if `token.engine != runner.engine`.
* `extract_resume()` MUST return `None` if it cannot **confidently** parse a resume line for its engine.
### 3.4 Bridge resume resolution (MUST)
Given `text` (user message) and optional `reply_text` (the message being replied to):
1. The bridge MUST attempt `runner.extract_resume(text)`.
2. If not found, it MUST attempt `runner.extract_resume(reply_text)` if present.
3. If still not found, the run MUST start with `resume=None` (new thread).
## 4. Normalized event model
### 4.1 Decision: events are trusted after normalization
Runners are responsible for emitting well-formed Takopi events. Consumers (renderer/bridge) SHOULD assume validity and MAY fail fast on invariant violations.
### 4.2 Supported event types (minimum set)
Takopi MUST support:
* `started`
* `action`
* `completed`
Minimal runner mode is supported:
* A runner MAY emit only `started` and `completed`.
* If `action` events are emitted, `phase="completed"` alone is valid (no requirement to emit `started`/`updated` phases).
### 4.3 Event schemas
All events MUST include `engine: EngineId` and `type`.
#### 4.3.1 `started`
Required:
* `type: "started"`
* `engine: EngineId`
* `resume: ResumeToken`
Optional:
* `title: str`
* `meta: dict`
#### 4.3.2 `action`
Required:
* `type: "action"`
* `engine: EngineId`
* `action: Action`
* `phase: "started" | "updated" | "completed"`
Optional:
* `ok: bool` (typically on `phase="completed"`)
* `message: str`
* `level: "debug" | "info" | "warning" | "error"`
Notes:
* `phase="completed"` alone is valid.
#### 4.3.3 `completed`
Required:
* `type: "completed"`
* `engine: EngineId`
* `ok: bool` (overall run success/failure)
* `answer: str` (final assistant answer; MAY be empty)
Optional:
* `resume: ResumeToken` (final token; new or existing, if known)
* `error: str | None` (fatal error message, if any)
* `usage: dict` (telemetry/usage if available)
### 4.4 Action schema (MUST; stable IDs)
Actions MUST have stable IDs within a run:
```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True, slots=True)
class Action:
    id: str
    kind: str
    title: str
    detail: dict[str, Any]
```
Stability requirements:
* Within a single run, the same underlying action MUST keep the same `Action.id` across events.
* `Action.id` values MUST be unique within a run.
* IDs do **not** need to be stable across different runs/resumes.
Action kinds SHOULD come from an extensible stable set, e.g.:
* `command`, `tool`, `file_change`, `web_search`, `turn`, `warning`, `telemetry`, `note`
Unknown kinds MAY be rendered as `note`.
`detail` is freeform; no per-kind schema is required.
`ok` semantics are runner-defined.
User-visible warnings/errors SHOULD be surfaced as `action` events (typically `kind="warning"` or `kind="note"`, `phase="completed"`, `ok=False`) rather than introducing new event types.
## 5. Runner protocol and concurrency
### 5.1 Runner protocol (MUST)
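```python
from __future__ import annotations  # ResumeToken (§3.2) and TakopiEvent (§4.3) are defined elsewhere

from typing import AsyncIterator, Protocol


class Runner(Protocol):
    engine: str  # EngineId

    def run(
        self,
        prompt: str,
        resume: ResumeToken | None,
    ) -> AsyncIterator[TakopiEvent]: ...
```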
### 5.2 Per-thread serialization (MUST; core invariant)
Define:
* `ThreadKey(resume) := f"{resume.engine}:{resume.value}"`
Invariant:
* At most **one** active run may operate on the same `ThreadKey` at a time.
Rules:
* Runs for different ThreadKeys MAY run in parallel.
* Runs for the same ThreadKey MUST be queued and executed sequentially.
* This invariant MUST be enforced by the runner implementation even if used outside the Telegram bridge.
New thread rule (`resume is None`):
* When the runner learns the new thread's ResumeToken, it MUST:
  * acquire the per-thread lock for that token
  * do so **before emitting** `started(resume=token)`
### 5.3 `started` emission and ordering
* If the runner obtains a ResumeToken for the run, it MUST emit exactly one `started` event containing that token.
* The runner MAY emit `action` events before `started` (e.g., pre-init warnings). Consumers MUST NOT assume `started` is the first event.
### 5.4 Completion
* If the run reaches `started` and then terminates under the runner's control (success or detected failure), the runner MUST emit exactly one `completed` event, and it MUST be the last event.
* If the runner never obtains a ResumeToken (e.g., fatal failure before session init), it MAY emit no `started` and no `completed`.
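The ordering rules in §5.3 and §5.4 lend themselves to a small contract checker, for example in tests. The sketch below assumes events are represented as plain dicts with a `type` key and that the stream comes from a run that terminated under the runner's control; `check_run_contract` is a hypothetical name.

```python
def check_run_contract(events: list[dict]) -> list[str]:
    """Return a list of contract violations for one finished, runner-controlled run."""
    problems: list[str] = []
    types = [event["type"] for event in events]
    started = types.count("started")
    completed = types.count("completed")

    if started > 1:
        problems.append("more than one `started` event")
    if completed > 1:
        problems.append("more than one `completed` event")
    if completed >= 1 and types[-1] != "completed":
        problems.append("`completed` is not the last event")
    if started == 1 and completed == 0:
        problems.append("`started` was emitted but `completed` was not")
    # Note: `action` before `started` is allowed, so it is deliberately not flagged.
    return problems
```

An empty stream is also accepted, which corresponds to a run that failed before any ResumeToken was obtained.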
### 5.5 Event delivery semantics (MUST)
* Events MUST be yielded in the order produced by the runner.
* The runner MUST NOT spawn unbounded background tasks per event.
* If the consumer stops iterating early (cancel/break/exception), the runner MUST abort the run best-effort and release any held locks/resources.
## 6. Bridge (Telegram orchestration)
### 6.1 Responsibilities (MUST)
The bridge MUST:
* Receive Telegram updates
* Resolve resume token (per §3.4)
* Schedule runs per thread (per §6.2)
* Start runner execution with cancellation support
* Maintain a progress message with rate-limited edits
* Publish a final message containing status, answer, and resume line (when known)
* Support `/cancel` for in-flight runs
The bridge MUST NOT:
* parse engine-native streams/events
* embed engine-specific rules beyond calling runner resume extraction/formatting
Queue depth:
* There is no queue depth limit; all prompts are accepted.
### 6.2 Scheduling (MUST)
Definitions:
* `Job := (chat_id, user_msg_id, text, resume: ResumeToken | None)`
Required behavior:
* For `resume != None`, the bridge MUST enqueue jobs into `pending_by_thread[ThreadKey(resume)]`.
* For each ThreadKey, exactly one worker (or equivalent mechanism) MUST drain the queue sequentially.
* A worker MUST exit when its queue is empty; the bridge SHOULD avoid retaining state for inactive threads.
* The implementation MUST avoid spawning one long-lived task per queued job (bounded concurrency).
Runs that start as new threads:
* If a job starts with `resume=None` and later yields `started(resume=token)`, the bridge MUST treat that run as the in-flight job for `ThreadKey(token)` until it completes (for scheduling and cancellation routing).
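A minimal sketch of the required scheduling behavior, assuming jobs are reduced to their text and thread keys are plain strings. `ThreadScheduler`, `run_job`, and `join` are hypothetical names; error handling is simplified (if a job raises, the remaining queued jobs for that thread are dropped).

```python
import asyncio
from collections import deque
from typing import Awaitable, Callable


class ThreadScheduler:
    """Hypothetical scheduler: one worker per active thread key, FIFO per key."""

    def __init__(self, run_job: Callable[[str, str], Awaitable[None]]) -> None:
        self._run_job = run_job
        self._pending: dict[str, deque[str]] = {}
        self._workers: dict[str, asyncio.Task[None]] = {}

    def enqueue(self, thread_key: str, text: str) -> None:
        self._pending.setdefault(thread_key, deque()).append(text)
        if thread_key not in self._workers:  # at most one worker per thread
            self._workers[thread_key] = asyncio.create_task(self._drain(thread_key))

    async def _drain(self, thread_key: str) -> None:
        queue = self._pending[thread_key]
        try:
            while queue:
                await self._run_job(thread_key, queue.popleft())
        finally:
            # The worker exits when its queue is empty and leaves no state behind.
            del self._pending[thread_key]
            del self._workers[thread_key]

    async def join(self) -> None:
        while self._workers:
            await asyncio.gather(*list(self._workers.values()))


async def demo() -> list[str]:
    log: list[str] = []

    async def run_job(key: str, text: str) -> None:
        log.append(f"start {key} {text}")
        await asyncio.sleep(0.01)
        log.append(f"end {key} {text}")

    scheduler = ThreadScheduler(run_job)
    scheduler.enqueue("codex:a", "first")
    scheduler.enqueue("codex:a", "second")
    scheduler.enqueue("codex:b", "other")
    await scheduler.join()
    return log
```

Because there is no `await` between the final emptiness check and the cleanup in `finally`, a job enqueued concurrently either lands in the live queue or starts a fresh worker; it cannot be lost in between.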
### 6.3 Progress message behavior
* The bridge SHOULD send an initial progress message quickly (e.g., “Running…”).
* The bridge SHOULD edit the progress message no more frequently than every **2 seconds**.
* The bridge SHOULD skip edits when rendered content is unchanged.
* Once `started` is observed, the progress view SHOULD include the canonical ResumeLine.
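The throttling and skip-if-unchanged rules can be sketched as a small gate with an injectable clock, which also makes the behavior testable with fake time. `ProgressThrottle` and `should_edit` are hypothetical names; a skipped edit is not retried by this helper, so a real bridge would presumably flush the latest content once the interval has elapsed.

```python
import time
from typing import Callable


class ProgressThrottle:
    """Hypothetical gate deciding whether a progress edit should be sent now."""

    def __init__(
        self,
        min_interval: float = 2.0,  # seconds between edits, per the SHOULD above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._last_sent_at: float | None = None
        self._last_text: str | None = None

    def should_edit(self, text: str) -> bool:
        now = self._clock()
        if text == self._last_text:
            return False  # rendered content is unchanged
        if self._last_sent_at is not None and now - self._last_sent_at < self._min_interval:
            return False  # too soon after the previous edit
        self._last_sent_at = now
        self._last_text = text
        return True
```

In tests, passing `clock=lambda: fake_now[0]` with a mutable `fake_now` list lets the interval be advanced deterministically.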
### 6.4 Final message requirements (MUST)
The final output MUST include:
* a status line (`done` / `error` / `cancelled`)
* the final `answer` (if any)
* the ResumeLine if known (and MUST include it if `started` was received)
### 6.5 Cancellation `/cancel` (MUST)
* The bridge MUST allow users to cancel a run in progress by sending `/cancel` in reply to the progress message (or by an equivalent mapping defined by the bridge).
* Cancellation MUST terminate the runner process via **SIGTERM**.
* After cancellation, the bridge MUST stop further progress edits and publish a “cancelled” status message.
* The bridge SHOULD include the ResumeLine if known.
* Any additional text after `/cancel` is ignored.
### 6.6 Telegram markdown + truncation (MUST)
The bridge MUST:
* escape/prepare Telegram markdown correctly
* enforce Telegram message length limits (including after escaping)
* avoid truncating away the ResumeLine (using `runner.is_resume_line()`)
If truncation is required:
* the bridge MUST keep the ResumeLine intact
* the bridge SHOULD preserve the beginning of the content and insert an ellipsis at the truncation point
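The truncation rules above can be sketched as follows. This example assumes the text has already been escaped, that the ResumeLine belongs at the end of the message, and that length is measured with Python's `len()`; the function name `truncate_keep_resume` and the `TELEGRAM_LIMIT` value of 4096 are assumptions of this sketch rather than values fixed by the spec.

```python
from typing import Callable

TELEGRAM_LIMIT = 4096  # assumption: message length limit applied by the bridge


def truncate_keep_resume(
    text: str,
    is_resume_line: Callable[[str], bool],
    limit: int = TELEGRAM_LIMIT,
    ellipsis: str = "…",
) -> str:
    """Cut the body from the end, keeping every resume line intact."""
    if len(text) <= limit:
        return text

    lines = text.splitlines()
    resume_lines = [line for line in lines if is_resume_line(line)]
    body = "\n".join(line for line in lines if not is_resume_line(line))

    # Everything after the cut: the ellipsis, then the resume line(s) if any.
    suffix = ellipsis
    if resume_lines:
        suffix += "\n\n" + "\n".join(resume_lines)

    budget = limit - len(suffix)
    if budget < 0:
        raise ValueError("limit is too small to keep the resume line intact")
    return body[:budget].rstrip() + suffix
```

The beginning of the content is preserved, the ellipsis marks the truncation point, and the resume line is appended after the cut rather than being counted as truncatable body text.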
### 6.7 Crash/error handling (MUST)
If the runner crashes or exits uncleanly:
* the bridge MUST publish an error status message
* if `started` was received, the bridge MUST include the ResumeLine in that error message
## 7. Renderer
Renderers MUST:
* be deterministic functions/state machines over Takopi events + internal renderer state
* produce Telegram-ready markdown (or markdown + entities)
* tolerate `action` events that are “completed-only” (no prior `started`/`updated`)
Renderers MUST NOT:
* depend on engine-native event formats
* call Telegram APIs
* perform blocking I/O
Action update collapsing:
* If multiple `action` events share the same `Action.id`, renderers SHOULD treat later `started`/`updated` events as updates (replace the prior running line rather than appending).
## 8. Configuration and engine selection
Decision (v0.2.0):
* Exactly one runner is selected at startup via a CLI subcommand (no default).
* If no engine subcommand is provided, Takopi prints an engine chooser panel and exits.
* Resume extraction uses only the selected runner.
* If a user provides a resume line for a different engine, extraction fails and the bridge treats the message as a new thread (`resume=None`).
## 9. Testing requirements (MUST)
Tests MUST cover:
1. **Runner contract**
   * If a token is obtained: exactly one `started`
   * Action schema validity (required fields; stable unique IDs within run)
   * Event ordering preserved
   * `completed` emitted and last for controlled termination after `started`
2. **Runner serialization**
   * Concurrent runs for the same ResumeToken serialize
   * `resume=None` runs acquire the per-thread lock once token is known and before emitting `started`
3. **Bridge per-thread scheduling**
   * FIFO per ThreadKey
   * second job for same thread does not start until first completes
4. **Progress throttling**
   * edits not more frequent than configured interval
   * no edit when content unchanged
   * truncation preserves ResumeLine
5. **Cancellation**
   * `/cancel` terminates run and produces “cancelled”
   * ResumeLine included if known
6. **Renderer formatting**
   * completed-only actions render correctly
   * repeated events for same Action.id collapse as intended
Test tooling SHOULD include event factories, deterministic/fake time, and a script/mock runner.