SPEC.md — @robota-sdk/agent-cli
Scope
Interactive terminal AI coding assistant. A React + Ink-based TUI, corresponding to Claude Code. A thin CLI layer built on top of agent-sdk, responsible only for the terminal UI.
Boundaries
- Does NOT own Session/SessionStore: handled internally by `@robota-sdk/agent-sdk`; CLI must NOT import from `@robota-sdk/agent-sessions`
- Does NOT own tools: assembled internally by `@robota-sdk/agent-sdk`; CLI must NOT import from `@robota-sdk/agent-tools`
- Does NOT own permissions/hooks: public types imported from `@robota-sdk/agent-core`; permission callback type (`TInteractivePermissionHandler`) owned by `@robota-sdk/agent-sdk`
- Does NOT own config/context loading: loaded internally by the `InteractiveSession` constructor
- Does NOT own automatic project memory capture, retrieval, approval policy, or memory storage: handled by `@robota-sdk/agent-sdk`; CLI/TUI may only render command output and notices
- Does NOT own edit checkpoint capture, storage, or restore algorithms: handled by `@robota-sdk/agent-sdk`; CLI/TUI may only route `/rewind`, render command output, and later provide picker chrome over SDK data
- OWNS: Provider composition (receives provider definitions, reads config, selects an injected definition, creates instance, passes to `InteractiveSession`)
- Does NOT own `InteractiveSession`: imported from `@robota-sdk/agent-sdk`
- Does NOT own `CommandRegistry`, `BuiltinCommandSource`, `SkillCommandSource`: all imported from `@robota-sdk/agent-sdk`
- Does NOT use `SystemCommandExecutor` directly: uses `session.executeCommand(name, args)` instead
- Does NOT own `ITerminalOutput`/`ISpinner`: SSOT is `@robota-sdk/agent-core`
- OWNS: Ink TUI components, permission-prompt (terminal UI), CLI argument parsing, `useInteractiveSession` hook
- OWNS: CLI package-version update checks and user-level update-check cache
- OWNS: CLI-only command modules for terminal UI configuration, including `/statusline`
- Does NOT own `PluginCommandSource`: imported from `@robota-sdk/agent-sdk`
- Does NOT own `plugin-hooks-merger`: moved to `@robota-sdk/agent-sdk`
Import Rules
| Source | Allowed | Examples |
|---|---|---|
| `agent-sdk` | SDK-owned APIs | `InteractiveSession`, `TInteractivePermissionHandler` |
| `agent-core` | Public types + utilities only | `TUniversalMessage`, `TPermissionMode`, `createSystemMessage`, `getModelName` |
| `agent-core` | ❌ Internal engine | `Robota`, `ExecutionService`, `ConversationStore` |
| `agent-sessions` | ❌ Forbidden | SDK provides its own session and permission types |
| `agent-tools` | ❌ Forbidden | SDK assembles tools internally |
| `agent-provider-*` | ✅ Provider definition assembly only | CLI composes injected `IProviderDefinition[]`; provider packages own defaults and factories |
Architecture
The CLI is a pure TUI layer. All business logic (session lifecycle, slash command execution, tool orchestration, abort handling) lives in @robota-sdk/agent-sdk's InteractiveSession. The CLI:
- Reads config to determine which provider profile to use.
- Resolves the profile `type` against an injected `IProviderDefinition[]`.
- Creates the provider instance by calling `definition.createProvider(config)`.
- Creates `InteractiveSession({ cwd, provider })`; config and context loading happen internally inside the SDK.
- Subscribes to `InteractiveSession` events and converts them to React state for rendering.
Provider Profile Creation
The CLI owns provider profile resolution and provider definition composition. It must not branch on provider type names to decide defaults, required fields, setup prompts, endpoint probes, or constructor behavior. Those values come from injected IProviderDefinition records.
Settings may define an active provider profile:
{
"currentProvider": "gemma",
"providers": {
"gemma": {
"type": "gemma",
"model": "supergemma4-26b-uncensored-v2",
"apiKey": "lm-studio",
"baseURL": "http://localhost:1234/v1"
},
"openai": {
"type": "openai",
"model": "<openai-compatible-model>",
"apiKey": "$ENV:OPENAI_API_KEY"
},
"qwen": {
"type": "qwen",
"model": "qwen3.6-plus",
"apiKey": "$ENV:DASHSCOPE_API_KEY",
"options": {
"builtInWebTools": {
"webSearch": true,
"webFetch": true
}
}
}
}
}

Gemma-family local models served through LM Studio must use a `type: "gemma"` profile so the provider package can apply Gemma-specific channel-marker projection. `type: "openai"` remains model-family neutral and must not filter Gemma markers.
Provider resolution order:
- `currentProvider` plus `providers[currentProvider]`
- Legacy `provider`
- Defaults supplied by the resolved provider definition
Provider profiles may include options. The CLI passes this bag through to definition.createProvider(config) without interpreting provider-specific keys. Provider packages own the shape, validation, defaults, and behavior for their options.
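The resolution order and the options pass-through can be sketched as follows. This is a minimal illustration, not the CLI's actual code: the `ProviderProfile`, `Settings`, and `DefinitionWithDefaults` shapes and the `resolveProfile` helper are hypothetical, and the sketch assumes the legacy `provider` field holds a single profile object.

```typescript
// Hypothetical shapes for illustration; the real types live in the SDK and provider packages.
interface ProviderProfile {
  type: string;
  model?: string;
  apiKey?: string;
  baseURL?: string;
  options?: Record<string, unknown>;
}

interface Settings {
  currentProvider?: string;
  providers?: Record<string, ProviderProfile>;
  provider?: ProviderProfile; // assumed shape of the legacy field
}

interface DefinitionWithDefaults {
  type: string;
  defaults: Partial<ProviderProfile>;
}

function resolveProfile(
  settings: Settings,
  definitions: DefinitionWithDefaults[],
): ProviderProfile {
  // 1. currentProvider plus providers[currentProvider]; 2. legacy provider.
  const selected =
    (settings.currentProvider !== undefined
      ? settings.providers?.[settings.currentProvider]
      : undefined) ?? settings.provider;
  if (!selected) throw new Error("No provider profile configured");

  const definition = definitions.find((d) => d.type === selected.type);
  if (!definition) throw new Error(`Unknown provider type: ${selected.type}`);

  // 3. Definition defaults fill omitted fields. The options bag is merged
  //    generically; no provider-specific key is read here.
  return {
    ...definition.defaults,
    ...selected,
    options: { ...definition.defaults.options, ...selected.options },
  };
}
```

Note that nothing in the helper branches on a concrete provider name; every provider-specific value arrives through the definition record.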
Provider definition contract:
| Field | Owner | CLI behavior |
|---|---|---|
| `type` | Provider package or CLI assembly | Match settings profile `type` to a definition |
| `aliases` | Provider package | Optional compatibility names resolved by generic lookup |
| `displayName` | Provider package | Optional human-readable provider label for setup lists |
| `description` | Provider package | Optional provider description for setup lists and errors |
| `defaults` | Provider package | Fill omitted model/apiKey/baseURL/timeout values |
| `defaults.options` | Provider package | Optional provider-owned option defaults passed through generically |
| `setupSteps` | Provider package | Drive interactive setup prompts without type branches |
| `requiresApiKey` | Provider package | Validate profiles consistently |
| `probeProfile` | Provider package | Optional endpoint/profile test hook |
| `createProvider` | Provider package | Build concrete provider instance |
The default CLI binary assembles definitions from provider packages. Alternate embeddings can pass their own definitions into startCli({ providerDefinitions }). Compatibility provider names such as google for the canonical Gemini provider must be represented as provider-definition aliases, not as CLI provider-name branches.
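A generic alias lookup might look like the sketch below. The `AliasedDefinition` shape and the `findDefinition` helper are hypothetical names introduced for illustration; only the `type` and `aliases` fields come from the contract table above.

```typescript
// Hypothetical minimal view of a provider definition: only the fields needed for lookup.
interface AliasedDefinition {
  type: string;
  aliases?: string[];
}

// Match by canonical type first, then by any declared alias.
// The lookup never mentions a concrete provider name.
function findDefinition<T extends AliasedDefinition>(
  definitions: T[],
  requestedType: string,
): T | undefined {
  return definitions.find(
    (d) =>
      d.type === requestedType ||
      (d.aliases ?? []).some((alias) => alias === requestedType),
  );
}
```

With a definition such as `{ type: "gemini", aliases: ["google"] }`, a settings profile that says `"type": "google"` resolves to the Gemini definition without any CLI-side branch.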
Provider Configuration UX
The CLI owns provider setup and provider profile writes. Default writes go to ~/.robota/settings.json; .claude/settings.json compatibility is read-only for Robota-specific provider profile creation.
Supported setup flags:
| Flag | Behavior |
|---|---|
| `--configure` | Run interactive provider setup and exit |
| `--configure-provider <profile>` | Upsert a provider profile and exit unless a prompt is also provided |
| `--provider <profile>` | Select an existing provider profile for this invocation |
| `--set-current` | Persist the selected or configured profile as `currentProvider` |
| `--type <type>` | Provider implementation type used by `--configure-provider` |
| `--base-url <url>` | Provider API base URL |
| `--api-key <value>` | Store a literal API key |
| `--api-key-env <name>` | Store `$ENV:<name>`, not the current environment value |
First-run setup must offer the injected provider definitions as a selectable list when stdin/stdout are TTYs. The list is generated from IProviderDefinition[] and may render displayName, type, and description, but it must not branch on concrete provider names. Selecting a provider starts the same provider setup flow used by runtime provider setup.
Non-interactive print/headless execution must not prompt. Missing provider config must produce an actionable error generated from the injected provider definitions, pointing to robota --configure and robota --configure-provider without hardcoded provider-specific examples.
Environment-variable API key references use the $ENV:NAME form. If a required provider API key resolves to an unset environment variable, setup validation or provider construction must fail with a clear error before any provider request is sent. A literal unresolved $ENV:NAME string must never be sent as an API key.
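The `$ENV:NAME` rule can be illustrated with a small resolver. This is a sketch under stated assumptions: `resolveApiKey` is a hypothetical helper, the environment is passed in as a plain record so the example stays self-contained, and an empty value is treated the same as an unset one.

```typescript
const ENV_PREFIX = "$ENV:";

// Hypothetical helper: resolves a stored apiKey value before any request is sent.
function resolveApiKey(
  value: string | undefined,
  env: Record<string, string | undefined>,
  requiresApiKey: boolean,
): string | undefined {
  if (value === undefined || value === "") {
    if (requiresApiKey) throw new Error("Provider requires an API key, but none is configured");
    return undefined;
  }
  // Literal keys are returned unchanged.
  if (!value.startsWith(ENV_PREFIX)) return value;

  const name = value.slice(ENV_PREFIX.length);
  const resolved = env[name];
  if (resolved === undefined || resolved === "") {
    if (requiresApiKey) throw new Error(`Environment variable ${name} is not set`);
    // Never forward the literal "$ENV:NAME" string as a key.
    return undefined;
  }
  return resolved;
}
```

Failing inside the resolver, rather than at request time, keeps the error actionable and guarantees the unresolved reference never reaches a provider.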
Provider slash commands are CLI side effects rendered through generic TUI interactions:
| Command | Behavior |
|---|---|
| `/provider` | Show current provider and subcommands |
| `/provider current` | Show active profile, type, model, and baseURL |
| `/provider list` | Show provider profiles from merged settings |
| `/provider use <profile>` | Confirm, persist `currentProvider`, and restart the session |
| `/provider add` | Start provider setup without a selected type; the CLI setup controller emits a generic choice interaction generated from injected definitions |
| `/provider add <type>` | Start setup for the selected provider type |
| `/provider test [profile]` | Validate fields and optionally probe the endpoint |
Provider changes must follow the existing /model restart pattern: command returns structured data, CLI side-effect handlers perform settings writes after confirmation or setup completion, and the App remounts with a new provider instance.
Provider setup prompt semantics must live outside Ink components. provider-setup-flow owns provider setup steps, defaults, required-field validation, environment-reference validation, masked-field metadata, and final IProviderSetupInput construction. provider-setup-interaction owns provider selection options and maps provider setup state into generic choice/text interaction descriptors. Interactive rendering components must not import provider setup modules or provider definitions; they may only render generic interaction descriptors and pass submitted values back to the CLI side-effect handler.
TUI input semantics must live outside Ink components. src/ui/flows/* owns prompt and input state transitions, shortcut meaning, selection bounds, slash autocomplete command selection, paste label insertion, and CJK cursor movement. Components may only translate useInput key data into flow actions, apply returned state, render the result, and call external callbacks.
Flow ownership:
| Flow module | Owns | Thin shell consumers |
|---|---|---|
| `text-prompt-flow.ts` | text prompt editing, submit/cancel effects, validation state | `TextPrompt`, `InteractivePrompt` text rendering |
| `selection-flow.ts` | bounded/wrapping selection, select/cancel effects, viewport scrolling | `ListPicker`, `MenuSelect`, `InteractivePrompt` choice rendering |
| `confirm-prompt-flow.ts` | confirmation shortcuts and option selection | `ConfirmPrompt` |
| `permission-prompt-flow.ts` | permission shortcuts and true/allow-session/false decisions | `PermissionPrompt` |
| `input-area-flow.ts` | slash autocomplete movement, command completion, prompt history, queue cancel, paste labels | `InputArea` |
| `cjk-text-input-flow.ts` | printable filtering, cursor movement, bracketed paste, submit effects | `CjkTextInput` |
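To make the "thin shell" split concrete, here is a sketch of what a pure selection flow could look like. The state, action, and effect shapes are hypothetical and are not taken from `selection-flow.ts`; the point is only that bounds, wrapping, and select/cancel meaning are decided outside the component.

```typescript
// Hypothetical shapes; the real flow module may differ.
interface SelectionState {
  index: number; // currently highlighted item
  count: number; // number of selectable items
  wrap: boolean; // true = wrapping selection, false = bounded selection
}

type SelectionAction = "up" | "down" | "select" | "cancel";

type SelectionEffect =
  | { kind: "none" }
  | { kind: "selected"; index: number }
  | { kind: "cancelled" };

function reduceSelection(
  state: SelectionState,
  action: SelectionAction,
): { state: SelectionState; effect: SelectionEffect } {
  const last = Math.max(state.count - 1, 0);
  switch (action) {
    case "up": {
      // At the top: wrap to the last item, or stay put when bounded.
      const index = state.index > 0 ? state.index - 1 : state.wrap ? last : 0;
      return { state: { ...state, index }, effect: { kind: "none" } };
    }
    case "down": {
      // At the bottom: wrap to the first item, or stay put when bounded.
      const index = state.index < last ? state.index + 1 : state.wrap ? 0 : last;
      return { state: { ...state, index }, effect: { kind: "none" } };
    }
    case "select":
      return state.count > 0
        ? { state, effect: { kind: "selected", index: state.index } }
        : { state, effect: { kind: "none" } };
    case "cancel":
      return { state, effect: { kind: "cancelled" } };
  }
}
```

A component would then only map `useInput` key data to one of the four actions, store the returned state, and forward the effect to its callbacks.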
bin.ts → cli.ts (arg parsing + provider definition composition)
  ├── createStatusLineCommandModule() (CLI-owned command module)
  └── ui/render.tsx → App.tsx (Ink TUI)
        ├── useInteractiveSession (ONLY React↔SDK bridge)
        │     ├── InteractiveSession({ cwd, provider })
        │     │     (from @robota-sdk/agent-sdk; config/context loaded internally)
        │     ├── TuiStateManager (owned by agent-cli)
        │     │     holds history: IHistoryEntry[] ← primary state for message list
        │     │     syncs from interactiveSession.getFullHistory() on each update
        │     ├── CommandRegistry (from @robota-sdk/agent-sdk)
        │     │     ├── BuiltinCommandSource (from @robota-sdk/agent-sdk)
        │     │     ├── SkillCommandSource (from @robota-sdk/agent-sdk)
        │     │     └── PluginCommandSource (from @robota-sdk/agent-sdk)
        │     └── session.executeCommand() (slash commands routed via SDK)
        ├── MessageList.tsx (renders IHistoryEntry[]; EntryItem dispatches on category)
        ├── InputArea.tsx (bottom input area, slash detection)
        ├── SessionStatusBar.tsx (connects statusline settings + git branch to renderer)
        ├── StatusBar.tsx (pure status bar renderer, shows primary activity state)
        ├── PermissionPrompt.tsx (arrow-key selection)
        └── SlashAutocomplete.tsx (command popup with scroll)

Dependency chain:

agent-cli ─→ agent-sdk ─→ agent-sessions ─→ agent-core
    │            ├─→ agent-tools ────────────→ agent-core
    │            └─────────────────────────→ agent-core (direct: types, utilities)
    ├──────────────────────────────────────→ agent-core (direct: public types only)
    └──────────────────────────────────────→ agent-provider-* (provider definitions)

StatusBar Display
The StatusBar shows real-time session information:
┌──────────────────────────────────────────────────────────────────────────┐
│ Activity: Thinking | Mode: default | my-session | git: feat/x | Claude Sonnet 4.6 | Context: 45% (90K/200K) | msgs: 12 │
└──────────────────────────────────────────────────────────────────────────┘

| Field | Source | Description |
|---|---|---|
| Mode | session.getPermissionMode() | Current permission mode |
| Model | getModelName(config.provider.model) | Human-readable model name (e.g., "Claude Sonnet 4.6") |
| Git | resolveGitBranch(cwd) | Current git branch when available and enabled |
| Context | session.getContextState().usedPercentage | Context usage with K/M formatting (e.g., "90K/1M") |
| msgs | message count | Number of messages in conversation |
| Session | session.getName() | Session name (shown only when a name is set) |
| Activity | CLI-derived display state | Left-side primary activity label |
Activity priority is deterministic and renderer-owned:
- active tool calls (`Tools xN`)
- foreground model waiting (`Thinking`)
- active background work (`Background xN`)
- queued prompt (`Queued`)
- idle (`Idle`)
When a prompt is queued behind foreground work, the activity row keeps the active work as primary and appends queued as secondary metadata. SDK session state remains the source of truth; StatusBar receives derived display counts and does not infer provider or execution semantics.
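The priority order can be expressed as a small pure function. This is a sketch, not the renderer's code: `ActivityInput`, `ActivityLabel`, and `deriveActivity` are hypothetical names, and the sketch assumes the `queued` secondary label is appended only behind foreground work (tool calls or model waiting), as the paragraph above describes.

```typescript
// Hypothetical derived display counts handed to the status bar.
interface ActivityInput {
  activeToolCount: number;
  isThinking: boolean;
  backgroundTaskCount: number;
  queuedPromptCount: number;
}

interface ActivityLabel {
  primary: string;
  secondary?: string;
}

// Foreground work keeps the primary slot; a queued prompt becomes secondary metadata.
function withQueue(primary: string, queued: boolean): ActivityLabel {
  return queued ? { primary, secondary: "queued" } : { primary };
}

function deriveActivity(input: ActivityInput): ActivityLabel {
  const queued = input.queuedPromptCount > 0;
  if (input.activeToolCount > 0) return withQueue(`Tools x${input.activeToolCount}`, queued);
  if (input.isThinking) return withQueue("Thinking", queued);
  if (input.backgroundTaskCount > 0) return { primary: `Background x${input.backgroundTaskCount}` };
  if (queued) return { primary: "Queued" };
  return { primary: "Idle" };
}
```

Because the function takes only counts and flags, it can be unit-tested without a session and carries no provider or execution semantics.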
/statusline Slash Command
/statusline is a CLI-owned ICommandModule, not an SDK core built-in. The CLI composes it into InteractiveSession alongside other command modules so the SDK only sees the generic ICommandModule interface.
Supported commands:
| Command | Behavior |
|---|---|
| `/statusline on` | Persist `statusline.enabled=true` in `~/.robota/settings.json` |
| `/statusline off` | Persist `statusline.enabled=false` |
| `/statusline git on` | Persist `statusline.gitBranch=true` |
| `/statusline git off` | Persist `statusline.gitBranch=false` |
| `/statusline reset` | Restore default status line fields |
Defaults are enabled=true and gitBranch=true. The command returns structured data, useSlashRouting converts it into a CLI side-effect flag, and useSideEffects persists the setting and updates React state. StatusBar remains a pure renderer.
Session Name Display
Session name appears in three locations when set (via --name or /rename):
- Input box top border: right-aligned title embedded in the border with background color matching the border and black bold text:

  ┌──────────────────────────────────────── "my-session" ──┐
  │ > Type a message                                       │
  └────────────────────────────────────────────────────────┘

- Terminal title: ANSI escape `\x1b]0;Robota — <name>\x07` updates the terminal tab/window title
- StatusBar: displayed in magenta alongside mode, model, and context info
Context Color Coding
| Range | Color | Meaning |
|---|---|---|
| 0-69% | Green | Healthy |
| 70-89% | Yellow | Approaching limit |
| 90%+ | Red | Near limit, compaction imminent |
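The thresholds in the table map directly onto a pure helper. `contextColor` is a hypothetical name, and the sketch assumes fractional percentages below 70 count as green and those from 70 up to (but not including) 90 count as yellow.

```typescript
type ContextColor = "green" | "yellow" | "red";

// Hypothetical helper mirroring the table: 0-69% green, 70-89% yellow, 90%+ red.
function contextColor(usedPercentage: number): ContextColor {
  if (usedPercentage >= 90) return "red";
  if (usedPercentage >= 70) return "yellow";
  return "green";
}
```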
TUI Visual Grammar
The CLI TUI renders structured session/runtime data. It must not parse assistant prose to infer state, and it must not add provider/model-specific presentation branches. Rendering components may format data they receive, but ownership of the data remains in the SDK/session/runtime layer.
Output Surface Ownership
| Surface | Owner | Data Source | Rendering Contract |
|---|---|---|---|
| Chat messages | MessageList | IHistoryEntry[] chat entries | Show stable role labels and markdown-rendered assistant content |
| Tool summaries | MessageList | structured tool-summary event data | Show compact one-line tool rows plus structured details such as diffs/output |
| Streaming assistant response | StreamingIndicator | SDK text deltas | Show current assistant text without persisting duplicate rendered state |
| Live tool execution | StreamingIndicator | SDK tool state events | Show current tool state using the shared status marker set |
| Background work | BackgroundTaskPanel | SDK background task events | Show a one-level tree of running and retained terminal jobs |
| Status/activity | StatusBar and SessionStatusBar | session state, context state, settings | Show current activity and session metadata in the primary scan path |
| Diff blocks | ToolDiffBlock and markdown renderer | structured diff lines | Render diff bodies through fenced diff markdown; keep metadata outside the body |
| Setup/permission prompts | prompt components | CLI flow descriptors | Render generic interactions only; prompt semantics remain in flow modules |
Shared Markers
| State | Marker | Meaning |
|---|---|---|
| Running/queued | □ | Work exists but is not terminal |
| Completed/success | ■ | Work reached a successful terminal state |
| Failed/error | ■ | Work reached an error state, colored as error |
| Cancelled/denied | ■ | Work ended by user or policy decision |
| Omitted transcript | ... | Additional persisted output is hidden from preview |
Colors remain renderer-owned: green for success/healthy state, yellow for warning or user-decision state, red for error/near-limit state, cyan for active assistant work, and dim text for secondary metadata.
Layout Rules
- Prefer one-level trees for grouped activity: a short group label followed by aligned child rows.
- Keep labels human-readable. Avoid raw class names, untrimmed JSON, or provider-specific implementation names in user-facing rows.
- Keep previews bounded and whitespace-normalized. Long output must show a transcript/omitted-lines hint instead of expanding indefinitely.
- Keep persistent raw data in session/log records even when the TUI renders a compact preview.
- Place active state in the primary scan path. Passive metadata may remain dim or right-aligned, but model/tool/background activity must be visible without scanning the far edge of the terminal.
- Keep code/diff colorization centralized through markdown rendering or dedicated formatting helpers, not ad hoc line coloring in each component.
Testing Requirements
- Pure formatting helpers must have unit tests for status markers, truncation, omitted-line counts, and narrow-output labels.
- Ink components must have render tests for the same states using representative structured data.
- Changes that add a new output surface must update this section or explain why an existing surface owns the behavior.
Command Output Summary Rendering
Command-like tool summaries render a compact command row plus a bounded output preview. The contract is:
- Applies only to command execution tools (`Bash`, `BackgroundProcess`) that provide `toolResultData`.
- The visible preview shows at most four output lines.
- If output has additional lines, render `... +N lines (full output in session transcript)`.
- Non-zero command `exitCode`, `success=false`, or tool `result=error` renders the tool row as failed even when the tool transport itself completed.
- Structured `stdout` and `stderr` are kept distinct; stderr preview lines are prefixed with `[stderr]`.
- Empty successful output shows only the compact command row.
- Full result data remains in SDK/session records; the TUI renders only the bounded projection.
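The contract above can be sketched as a pure projection function. The `CommandResult` and `CommandPreview` shapes and the `buildPreview` helper are hypothetical; the sketch also assumes that blank lines are dropped and that stdout lines are listed before stderr lines, neither of which is stated in this spec.

```typescript
const MAX_PREVIEW_LINES = 4;

// Hypothetical subset of the structured tool result data.
interface CommandResult {
  stdout: string;
  stderr: string;
  exitCode?: number;
  success?: boolean;
}

interface CommandPreview {
  failed: boolean;
  lines: string[];
}

function buildPreview(result: CommandResult): CommandPreview {
  // Assumption: blank lines carry no information in a compact preview.
  const nonEmpty = (text: string): string[] =>
    text.split("\n").filter((line) => line.trim() !== "");

  const all = [
    ...nonEmpty(result.stdout),
    ...nonEmpty(result.stderr).map((line) => `[stderr] ${line}`),
  ];

  const lines = all.slice(0, MAX_PREVIEW_LINES);
  const hidden = all.length - lines.length;
  if (hidden > 0) {
    lines.push(`... +${hidden} lines (full output in session transcript)`);
  }

  // A completed transport can still carry a failed command.
  const failed =
    (result.exitCode !== undefined && result.exitCode !== 0) ||
    result.success === false;

  return { failed, lines };
}
```

For empty successful output the function returns `{ failed: false, lines: [] }`, so the component renders only the compact command row.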
Edit Diff Hunk Rendering
Edit tool summaries render context-aware hunks rather than isolated changed lines. The rendering contract is:
- Default context is three unchanged lines before and after the edited span when the modified file can be read.
- Diff bodies are fenced as `diff` markdown and rendered through the shared markdown renderer.
- Hunk headers, context lines, additions, and removals are represented as structured diff line data before rendering.
- File path, truncation state, and omitted-line counts remain outside the markdown body.
- If file context is unavailable, the renderer falls back to changed lines only rather than failing the tool summary.
- Large diffs are truncated by visible hunk groups when possible, preserving the first changed hunk before omitting additional lines.
Usage Summary Rendering
Usage summary rows render persisted SDK usage-summary history entries. The CLI must:
- Render usage near the assistant turn that produced it rather than in a detached dashboard.
- Show whether usage is exact or estimated.
- Show prompt, completion, and total token counts when available.
- Label monetary cost as unknown unless the SDK provides exact or configured pricing data.
- Avoid provider/model branches; all display data comes from the SDK-owned `IUsageSnapshot`.
- Subscribe to `context_update` so the status bar refreshes when a request is sent and again after provider usage reconciliation.
Context Management (CLI Layer)
/compact Slash Command
/compact # Default compaction
/compact focus on API changes # Custom focus instructions

- Calls `session.compact(instructions)`
- Displays before/after context percentage
- Shows "Context compressed: 85% → 32%" message
Auto-Compaction Notification
When auto-compaction triggers (at ~83.5% threshold), the UI shows a system message notifying the user.
Tool Call Display
Real-Time Tool Execution (Streaming)
During session.run(), tool execution is displayed in real-time via the onToolExecution callback. The streaming display shows Tools: first, then Robota: in execution order:
Tools:
✓ Read(/src/index.ts)
✓ Bash(ls -la)
⟳ Glob(**/*.md)
Robota:
Checking the file structure now...

Behavior:
- `onToolExecution` fires `start` when a tool begins and `end` when it completes
- Running tools show `⟳` (yellow), completed tools show `✓` (green)
- Format: `ToolName(firstArgValue)`, with the first argument truncated to 80 chars, matching post-run summary style
- Completed tools remain visible until `session.run()` finishes (not removed on `end`)
- `Tools:` and `Robota:` sections each have a blank line below the label and between sections
- When no tools and no streaming text, renders nothing (empty fragment); "Thinking..." is shown by `StatusBar`
Post-Run Tool Summary
After each session.run() completes, tool calls from the session history are extracted and displayed as a single grouped message:
Tool: [5 tools]
Read(/Users/jungyoun/Documents/dev/robota/.agents/tasks/apps-web-sep...)
Bash(ls -la .agents/tasks/)
Glob(**/*.md)

- All tool calls from a run are grouped into one `role: 'tool'` message
- Format: `ToolName(firstArgValue)`, with the first argument value extracted from JSON and truncated to 80 chars
- Displayed after the assistant response in the message list
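The `ToolName(firstArgValue)` format could be produced by a helper like the one below. `formatToolCall` is a hypothetical name; the sketch assumes arguments arrive as a JSON object string, that "first argument" means the first key in that object, and that the 80-character limit applies before the `...` suffix is appended.

```typescript
const MAX_ARG_LENGTH = 80;

// Hypothetical formatter for one tool call row.
function formatToolCall(toolName: string, argsJson: string): string {
  let firstValue = "";
  try {
    const parsed: unknown = JSON.parse(argsJson);
    if (parsed !== null && typeof parsed === "object") {
      const values = Object.values(parsed as Record<string, unknown>);
      if (values.length > 0) {
        const first = values[0];
        // Non-string values are shown in their JSON form.
        firstValue = typeof first === "string" ? first : JSON.stringify(first);
      }
    }
  } catch {
    // Malformed arguments fall back to an empty argument display.
    firstValue = "";
  }

  const shown =
    firstValue.length > MAX_ARG_LENGTH
      ? `${firstValue.slice(0, MAX_ARG_LENGTH)}...`
      : firstValue;
  return `${toolName}(${shown})`;
}
```

For example, `formatToolCall("Bash", '{"command":"ls -la"}')` yields `Bash(ls -la)`.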
Slash Commands
| Command | Description |
|---|---|
| `/help` | Show available commands |
| `/clear` | Clear conversation history |
| `/mode [mode]` | Show/change permission mode |
| `/model [model]` | Select AI model (shows confirmation prompt, restarts session) |
| `/language [lang]` | Set response language (ko, en, ja, zh), saves and restarts |
| `/compact [instructions]` | Compress context window |
| `/cost` | Show session info |
| `/context` | Context window info |
| `/permissions` | Permission rules |
| `/memory` | Route project memory commands to SDK |
| `/rewind` | Route edit checkpoint list/restore commands to SDK |
| `/background` | Route background task controls to SDK |
| `/plugin [subcommand]` | Plugin management |
| `/resume` | Show session picker to resume a saved session |
| `/rename <name>` | Rename the current session (name displayed in StatusBar) |
| `/exit` | Exit CLI |
Slash Command Autocomplete
Typing / as the first character in the input triggers an autocomplete popup. The popup filters commands in real-time as the user types.
Interaction:
- Arrow Up/Down: Navigate items
- Tab: Insert highlighted command into input field (does NOT execute). User can continue typing args or press Enter to execute.
- Enter: Insert and execute the highlighted command immediately
- Esc: Dismiss popup, keep typed text
- Backspace past `/`: Dismiss popup
Subcommand Navigation:
Commands with subcommands (e.g., /mode, /model) show a nested submenu when selected:
> /mode
+-------------------------------------+
| plan |
| default |
| acceptEdits |
| bypassPermissions |
+-------------------------------------+

Visual Grouping:
Commands are grouped by source with separators: built-in commands appear first, followed by discovered skill commands.
/model — Model Change Flow
The /model command lists available models as subcommands with the format Claude Opus 4.6 (1M). Model definitions come from the CLAUDE_MODELS registry in @robota-sdk/agent-core.
Subcommand display:
> /model
+-------------------------------------+
| Claude Opus 4.6 (1M) |
| Claude Sonnet 4.6 (1M) |
| Claude Haiku 4.5 (200K) |
+-------------------------------------+

Model change flow:
- User selects a model from the subcommand list
- A `ConfirmPrompt` appears: "Change model to Claude Opus 4.6? The CLI will restart."
- If confirmed (Yes / `y`): settings are written to `~/.robota/settings.json` and the CLI exits (user restarts manually)
- If cancelled (No / `n`): returns to normal input
ListPicker Component
A generic list picker overlay (ListPicker.tsx) for selecting an item from a list. Used by the session resume flow to display saved sessions.
Props:
| Prop | Type | Description |
|---|---|---|
| `title` | `string` | Header text above the list |
| `items` | `Array<{ label, value }>` | Items to display. `label` is shown, `value` is returned on select |
| `onSelect` | `(value: string) => void` | Callback when an item is selected |
| `onCancel` | `() => void` | Callback when ESC is pressed |
Interaction: Arrow Up/Down to navigate, Enter to select, ESC to cancel.
ConfirmPrompt Component
A reusable confirmation prompt with arrow-key selection (ConfirmPrompt.tsx). Used by /model change and available for other yes/no confirmations.
Props:
| Prop | Type | Default | Description |
|---|---|---|---|
| `message` | `string` | (required) | Message above the options |
| `options` | `string[]` | `['Yes', 'No']` | Options to select from |
| `onSelect` | `(index: number) => void` | (required) | Callback with selected index |
Interaction: Arrow keys to navigate, Enter to confirm. For 2-option prompts, y selects the first option, n selects the second.
/plugin — Plugin Management
The /plugin command manages bundle plugins. Subcommands:
| Subcommand | Description |
|---|---|
| `/plugin install <name>` | Install a plugin from marketplace or local path |
| `/plugin uninstall <name>` | Remove an installed plugin |
| `/plugin enable <name>` | Enable a disabled plugin |
| `/plugin disable <name>` | Disable a plugin without uninstalling |
| `/plugin list` | List installed plugins with status |
| `/plugin marketplace` | Browse available plugins from configured sources |
Installed plugins contribute skills via PluginCommandSource, which discovers skills from each plugin's bundle manifest and makes them available as slash commands alongside project and user skills.
React↔SDK Bridge
useInteractiveSession is the single boundary between React and the SDK. It:
- Creates `InteractiveSession({ cwd, provider, commandModules })` and `CommandRegistry` once (via `useRef`, never recreated on re-render). The provider instance is passed in from the caller; `InteractiveSession` handles config/context loading internally.
- Creates a `TuiStateManager` instance that holds `history: IHistoryEntry[]` as the primary state for the message list. On each execution update (when `thinking` transitions to `false`, or on `complete`/`interrupted`), the hook delegates to `TuiStateManager` to sync state from `interactiveSession.getFullHistory()`.
- Subscribes to `InteractiveSession` events (`text_delta`, `tool_start`, `tool_end`, `thinking`, `complete`, `interrupted`, `error`, `background_task_event`) and converts them to React state.
- Exposes `handleSubmit`, `handleAbort`, `handleCancelQueue`, and `handleShutdown` as stable callbacks to the TUI.
- Routes slash commands via `session.executeCommand(name, args)`; no `SystemCommandExecutor` is instantiated directly by the CLI.
- Manages the permission queue (serialises concurrent permission requests).
No other hook or component interacts with InteractiveSession directly.
Plugin Hook Merging
Plugin hook merging (resolving ${CLAUDE_PLUGIN_ROOT} and merging hook groups) is handled internally by @robota-sdk/agent-sdk. The CLI does not perform hook merging.
App.tsx
App.tsx is a thin JSX shell (~220 lines). It:
- Calls `useInteractiveSession` and `usePluginCallbacks`.
- Wraps `handleSubmit` only to process TUI-specific side effects (`_pendingModelId`, `_pendingLanguage`, `_resetRequested`, `_exitRequested`, `_triggerPluginTUI`) that require Ink APIs (`useApp().exit`).
- Contains no queue logic, no abort logic, no session business logic.
Tool List Visibility
The StreamingIndicator (showing active tools) is rendered when isThinking || activeTools.length > 0. Streaming state (streamBuf, activeTools) is cleared at the start of a new execution (when thinking: true), not at the end. This means the tool list stays visible after execution completes or is aborted, until the next execution begins.
Streaming Text Debounce
TuiStateManager.onTextDelta debounces notify() calls to reduce React re-render and markdown rendering frequency. Text deltas are accumulated in streamBuf immediately (no data loss), but notify() fires at most once per STREAMING_DEBOUNCE_MS (default 300ms). This limits renderMarkdown() invocations to ~3/second instead of per-token (hundreds/second). A createDebouncedNotify utility manages the timer lifecycle; flush() is called on completion/interruption/error to clean up.
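The timer lifecycle described above can be sketched as follows. This is not the package's `createDebouncedNotify`; the name `createDebouncedNotifySketch`, its signature, and the choice that `flush()` delivers a pending notification immediately are all assumptions made for illustration.

```typescript
const STREAMING_DEBOUNCE_MS = 300;

interface DebouncedNotify {
  notify(): void; // request a notification; bursts collapse into one call
  flush(): void; // deliver any pending notification now and clear the timer
}

// Hypothetical sketch: fires the callback at most once per delay window.
function createDebouncedNotifySketch(
  callback: () => void,
  delayMs: number = STREAMING_DEBOUNCE_MS,
): DebouncedNotify {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const fire = (): void => {
    timer = undefined;
    callback();
  };

  return {
    notify(): void {
      // Keep an already scheduled notification in place instead of rescheduling.
      if (timer === undefined) timer = setTimeout(fire, delayMs);
    },
    flush(): void {
      if (timer !== undefined) {
        clearTimeout(timer);
        fire();
      }
    },
  };
}
```

In this shape the text buffer is updated on every delta by the caller, while the callback (and therefore the markdown render) runs at most once per window; calling `flush()` on completion, interruption, or error ensures the final text is rendered and no timer outlives the run.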
Command Registry Architecture
The slash command system uses an extensible registry pattern. Multiple ICommandSource implementations provide commands, and the CommandRegistry aggregates them. CommandRegistry, BuiltinCommandSource, and SkillCommandSource are all owned by @robota-sdk/agent-sdk. Slash command execution is routed through session.executeCommand(name, args) — the CLI does not instantiate SystemCommandExecutor directly. The CLI adds PluginCommandSource and any injected ICommandModule sources generically.
Reusable CLI/TUI code must not special-case command module names such as /agent. It accepts commandModules and registers them with the SDK registry. The package binary may choose product defaults by passing modules into startCli().
ICommandSource Interface
interface ICommandSource {
  name: string;
  getCommands(): ISlashCommand[];
}

ISlashCommand Interface

interface ISlashCommand {
  name: string;
  description: string;
  source: string;
  skillContent?: string; // Full SKILL.md content (skill commands only)
  subcommands?: ISlashCommand[];
  execute?: (args: string) => void | Promise<void>;
}

Command Sources
| Source | Class | Owner | Description |
|---|---|---|---|
| Built-in | BuiltinCommandSource | @robota-sdk/agent-sdk | Built-in commands with subcommands for /mode, /model |
| Modules | ICommandModule | Module package | Optional command modules injected by composition |
| Skills | SkillCommandSource | @robota-sdk/agent-sdk | Discovered from 4 scan paths (see Skill Discovery) |
| Plugins | PluginCommandSource | @robota-sdk/agent-sdk | Skills provided by installed bundle plugins |
Skill Discovery (Multi-Path)
Skills are discovered at session start from directories scanned by SkillCommandSource (agent-sdk), in priority order (highest first, deduplicated by name). Paths are defined in agent-sdk's SPEC.md; the CLI uses them as-is:
| Priority | Path | Scope |
|---|---|---|
| 1 | .claude/skills/*/SKILL.md | Project (Claude Code native) |
| 2 | .claude/commands/*.md | Project (Claude Code compatible) |
| 3 | ~/.robota/skills/*/SKILL.md | User global (Robota native) |
| 4 | .agents/skills/*/SKILL.md | Project (Robota native) |
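Deduplication by name in priority order can be illustrated with a short helper. `DiscoveredSkill` and `dedupeSkills` are hypothetical names; the actual scan lives in `SkillCommandSource` inside agent-sdk and may be implemented differently.

```typescript
// Hypothetical record produced by scanning one of the paths above.
interface DiscoveredSkill {
  name: string;
  path: string;
  priority: number; // 1 = highest priority, matching the table
}

// Keep the highest-priority entry for each skill name.
function dedupeSkills(skills: DiscoveredSkill[]): DiscoveredSkill[] {
  const byName = new Map<string, DiscoveredSkill>();
  const ordered = [...skills].sort((a, b) => a.priority - b.priority);
  for (const skill of ordered) {
    if (!byName.has(skill.name)) byName.set(skill.name, skill);
  }
  return Array.from(byName.values());
}
```

If both `.claude/skills/audit/SKILL.md` and `.agents/skills/audit/SKILL.md` exist, only the first (priority 1) survives as `/audit`.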
Skill Frontmatter Schema
Each SKILL.md may contain YAML frontmatter with the following fields:
| Field | Type | Required | Description |
|---|---|---|---|
| `name` | string | No | Display name (default: directory name) |
| `description` | string | No | Short description for autocomplete |
| `allowed-tools` | string[] | No | Tools the skill is allowed to use |
| `context` | string | No | Execution context: `fork`, `agent` |
| `model` | string | No | Override model for this skill |
| `max-turns` | number | No | Maximum conversation turns |
| `invocation` | string | No | Invocation method: `user`, `auto-invoke`, `model-only` |
If no frontmatter is found, the directory name is used as the command name.
Variable Substitution
Skill content supports variable substitution before injection:
| Variable | Description |
|---|---|
| `$ARGUMENTS` | User-provided arguments after the command |
| `${CLAUDE_SESSION_ID}` | Current session identifier |
| `${CLAUDE_MODEL}` | Current model identifier |
| `${PROJECT_DIR}` | Project root directory path |
| `${USER_HOME}` | User home directory path |
Variables are substituted at invocation time, not at discovery time.
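A minimal sketch of invocation-time substitution is shown below. The `SkillVariables` shape and the `substituteVariables` name are assumptions for illustration; the actual substitution logic is not specified here. The sketch uses a single pass so that substituted values are never re-scanned, and it leaves unknown `${...}` names untouched.

```typescript
// Hypothetical container for the values listed in the table above.
interface SkillVariables {
  args: string;
  sessionId: string;
  model: string;
  projectDir: string;
  userHome: string;
}

function substituteVariables(content: string, vars: SkillVariables): string {
  const braced: Record<string, string> = {
    CLAUDE_SESSION_ID: vars.sessionId,
    CLAUDE_MODEL: vars.model,
    PROJECT_DIR: vars.projectDir,
    USER_HOME: vars.userHome,
  };
  // One pass over the content: either a ${NAME} form or the bare $ARGUMENTS form.
  return content.replace(
    /\$\{([A-Z_]+)\}|\$ARGUMENTS\b/g,
    (match: string, name: string | undefined) => {
      if (name === undefined) return vars.args; // matched $ARGUMENTS
      return Object.prototype.hasOwnProperty.call(braced, name) ? braced[name] : match;
    },
  );
}
```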
Shell Command Preprocessing
Skill content supports inline shell command execution using the !`command` syntax. The shell command is executed and its stdout replaces the markup in the skill content before injection. This enables dynamic content like file listings or environment values.
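The preprocessing step can be sketched as a regex replacement over the inline shell markup. The `ShellRunner` type and `preprocessShellCommands` name are hypothetical; the runner is injected so the sketch makes no claim about how the real implementation spawns the shell or whether it trims stdout.

```typescript
// Hypothetical runner: receives the command text and returns its stdout.
type ShellRunner = (command: string) => string;

// Replace each !`command` occurrence with the stdout of that command.
function preprocessShellCommands(content: string, run: ShellRunner): string {
  return content.replace(/!`([^`]+)`/g, (_match: string, command: string) => run(command));
}
```

In a Node runtime the injected runner could wrap `execSync(command, { encoding: "utf8" })` from `node:child_process`; that choice is an assumption, not something this SPEC prescribes.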
Skill Execution Features
| Feature | Value | Description |
|---|---|---|
| context: fork | Fork context | Skill runs in a forked session, preserving the parent context |
| context: agent | Agent context | Skill runs as a sub-agent with its own isolated session |
| allowed-tools | Tool whitelist | Restricts which tools the skill can use during execution |
Skill Invocation Methods
| Method | Trigger | Description |
|---|---|---|
| user | User types /skillname | Default: user explicitly invokes via slash command |
| auto-invoke | Model decides | Model can invoke the skill automatically when relevant |
| model-only | Model-initiated only | Not shown in user autocomplete, model-only access |
Skill Execution
When a skill slash command is selected, the full SKILL.md content (after variable substitution and shell preprocessing) is injected into the session prompt wrapped in <skill> tags. The model receives both the skill instructions and any user-provided arguments.
interactiveSession.submit(input, displayInput, rawInput) is called with three arguments:
- input: the expanded skill content for the model
- displayInput: the display form shown to the user (e.g., /audit)
- rawInput: the qualified name form used for hook matching (e.g., /rulebased-harness:audit some-args); if no qualified name is found, falls back to displayInput
The qualified name is resolved via registry.resolveQualifiedName(cmd) so that hook matchers can identify which plugin's skill was invoked.
Type Ownership
| Type | Location | Purpose |
|---|---|---|
| ITerminalOutput | src/types.ts | Terminal I/O DI interface (duplicate — SSOT is agent-core) |
| ISpinner | src/types.ts | Spinner handle (duplicate — SSOT is agent-core) |
| IPermissionRequest | src/ui/types.ts | Permission prompt React state |
| ISlashCommand | src/commands/types.ts | CLI alias for ICommand from agent-sdk |
| ICommandSource | src/commands/types.ts | Re-export of ICommandSource from agent-sdk |
Public API Surface
| Export | Kind | Description |
|---|---|---|
| startCli | function | CLI entry point |
| ITerminalOutput | type | Terminal I/O DI interface |
| ISpinner | type | Spinner handle |
Note: createSession() is internal to agent-sdk and is NOT re-exported. The CLI uses InteractiveSession directly. index.ts does not re-export SDK types; consumers should import those directly from @robota-sdk/agent-sdk.
File Structure
src/
├── bin.ts ← Binary entry point
├── cli.ts ← Config loading, Ink render invocation
├── print-terminal.ts ← ITerminalOutput for print mode (-p)
├── types.ts ← ITerminalOutput, ISpinner
├── index.ts ← Public exports (startCli, ITerminalOutput, ISpinner); no SDK re-exports
├── commands/
│ ├── types.ts ← ISlashCommand, ICommandSource interfaces
│ ├── builtin-source.ts ← Re-export shim: `export { BuiltinCommandSource } from '@robota-sdk/agent-sdk'`
│ ├── command-registry.ts ← Re-export shim: `export { CommandRegistry } from '@robota-sdk/agent-sdk'`
│ ├── skill-source.ts ← Re-export shim: `export { SkillCommandSource } from '@robota-sdk/agent-sdk'`
│ ├── plugin-source.ts ← PluginCommandSource (legacy local copy; main flow uses SDK version)
│ ├── skill-executor.ts ← Skill execution helpers (fork/inject modes); not in main flow
│ │ (main flow uses buildSkillPrompt from @robota-sdk/agent-sdk)
│ └── slash-executor.ts ← IPluginCallbacks interface + plugin TUI handler functions
│ (executeSlashCommand not in main flow; main flow uses session.executeCommand())
├── utils/
│ ├── cli-args.ts ← CLI argument parsing and validation
│ ├── settings-io.ts ← Settings file read/write/update/delete
│ ├── provider-factory.ts ← AI provider resolution from injected definitions
│ ├── provider-setup-flow.ts ← Provider setup field flow and final setup input construction
│ ├── provider-setup-interaction.ts← Provider setup to generic choice/text interaction mapping
│ ├── interactive-prompt.ts ← Generic prompt descriptor types shared by CLI use cases and TUI rendering
│ ├── tool-call-extractor.ts ← Tool call display extraction from history
│ ├── paste-labels.ts ← Paste label insertion and expansion for multiline paste
│ └── edit-diff.ts ← Edit diff computation and formatting for display
└── ui/
├── App.tsx ← Thin JSX shell (~220 lines); no queue/abort/session logic
├── hooks/
│ ├── useInteractiveSession.ts ← ONLY React↔SDK bridge; delegates to TuiStateManager for
│ │ history: IHistoryEntry[] state; converts InteractiveSession
│ │ events to React state (streamingText, activeTools, etc.)
│ ├── TuiStateManager.ts ← Holds history: IHistoryEntry[]; syncs from getFullHistory();
│ │ manages windowing (MAX_RENDERED_MESSAGES) and local event entries
│ └── usePluginCallbacks.ts ← Plugin TUI callback wiring
├── flows/
│ ├── text-prompt-flow.ts ← Text prompt editing, validation, submit/cancel effects
│ ├── selection-flow.ts ← Shared bounded/wrapping selection state machine
│ ├── confirm-prompt-flow.ts ← Confirmation shortcuts and option selection
│ ├── permission-prompt-flow.ts← Permission shortcuts and decision mapping
│ ├── input-area-flow.ts ← Slash autocomplete, prompt history, and paste-label input flow
│ └── cjk-text-input-flow.ts ← CJK-aware text editing and paste flow
├── render.tsx ← Ink render() invocation
├── MessageList.tsx ← Renders IHistoryEntry[] via EntryItem (dispatches on category)
├── InputArea.tsx ← Bottom fixed input (CjkTextInput), slash detection
├── SessionStatusBar.tsx ← Statusline settings + git branch adapter
├── StatusBar.tsx ← Mode, model, git branch, context %, message count, Thinking
├── PermissionPrompt.tsx ← Allow/Deny arrow-key selection (useInput)
├── StreamingIndicator.tsx ← Real-time Tools:/Robota: display during run()
├── SlashAutocomplete.tsx ← Command autocomplete popup (scroll, highlight)
├── CjkTextInput.tsx ← Custom text input with Korean IME support
├── ConfirmPrompt.tsx ← Reusable arrow-key confirmation prompt
├── WaveText.tsx ← Wave color animation for waiting indicator
├── ListPicker.tsx ← Generic list picker overlay (session resume, etc.)
├── InteractivePrompt.tsx ← Generic choice/text prompt renderer for CLI interactions
├── ToolDiffBlock.tsx ← Tool diff metadata shell with Markdown diff body rendering
├── MenuSelect.tsx ← Arrow-key menu selection component (Plugin TUI)
├── PluginTUI.tsx ← Plugin management TUI (screen stack navigation)
├── TextPrompt.tsx ← Text input prompt component (Plugin TUI)
├── plugin-tui-handlers.ts ← Plugin TUI action handlers (install, uninstall, etc.)
├── render-markdown.ts ← Markdown rendering for terminal output
├── InkTerminal.ts ← No-op ITerminalOutput
└── types.ts ← IPermissionRequest
Note: CommandRegistry, BuiltinCommandSource, SkillCommandSource, PluginCommandSource, and SystemCommandExecutor are owned by @robota-sdk/agent-sdk. The CLI does not use SystemCommandExecutor directly; slash command execution goes through session.executeCommand(name, args). The CLI's src/commands/ directory holds re-export shims (builtin-source.ts, command-registry.ts, skill-source.ts) for backward compatibility, plus slash-executor.ts (plugin TUI handlers and IPluginCallbacks interface) and skill-executor.ts (fork/inject execution helpers). The CLI's src/index.ts exports only startCli and local CLI types.
CLI Usage
robota # Interactive TUI
robota -p "prompt" # Print mode (one-shot)
robota -c # Continue last session (most recent by cwd)
robota --continue # Same as -c
robota -r <id> # Resume session by ID or name
robota --resume [id] # Resume session (shows picker if no ID given)
robota -c --fork-session # Fork from last session (new ID, restored context)
robota --name <name> # Set session name on startup
robota --reset # Delete user settings and exit
robota --model <model> # Model override
robota --language <lang> # Response language (ko, en, ja, zh)
robota --permission-mode <mode> # plan | default | acceptEdits | bypassPermissions
robota --max-turns <n> # Limit turns
robota --output-format <fmt> # text | json | stream-json (print mode only)
robota --system-prompt <text> # Replace system prompt (print mode only)
robota --append-system-prompt <text> # Append to system prompt (print mode only)
robota --check-update # Check npm for the latest CLI version and exit
robota --disable-update-check # Skip interactive startup update check for this invocation
robota --version # Version
Print Mode and Headless Transport
Print mode (-p) delegates execution to @robota-sdk/agent-transport-headless via createHeadlessTransport. The CLI creates an InteractiveSession, attaches the headless transport via session.attachTransport(transport), calls transport.start(), then calls session.shutdown({ reason: 'prompt_input_exit' }) before exiting with transport.getExitCode().
Any command modules supplied to startCli({ commandModules }) are passed to the same InteractiveSession in both print mode and TUI mode.
--output-format controls how the response is written to stdout:
| Format | Description |
|---|---|
| text | Plain text response (default) |
| json | Single JSON object with type, result, session_id |
| stream-json | Newline-delimited JSON with content_block_delta events |
--system-prompt and --append-system-prompt are parsed but not yet connected to InteractiveSession; wiring them up requires SDK-level support for custom system prompt injection. Until then, the flags are reserved for future implementation.
Stdin Pipe
When -p is specified with no positional argument and stdin is piped (not a TTY), the CLI reads the full stdin stream as the prompt:
echo "Explain this" | robota -p
cat file.ts | robota -p "Review this code"
If both stdin and a positional argument are provided, stdin content is prepended to the prompt.
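The prompt resolution described above can be sketched as a small pure function. The `resolvePrintPrompt` name is hypothetical, and the blank-line separator between stdin content and the positional prompt is an assumption; the text above only states that stdin is prepended.

```typescript
// stdinText is undefined when stdin is a TTY (nothing piped).
function resolvePrintPrompt(
  positional: string | undefined,
  stdinText: string | undefined,
): string | undefined {
  const piped = stdinText !== undefined && stdinText.length > 0 ? stdinText : undefined;
  if (piped !== undefined && positional !== undefined) {
    // Assumption: a blank line separates piped content from the prompt.
    return `${piped}\n\n${positional}`;
  }
  return piped !== undefined ? piped : positional;
}
```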
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success or interrupted |
| 1 | Error during execution |
CLI Update Check
The CLI owns package-version update checks because they are distribution UX, not SDK agent behavior. This feature is exclusive to @robota-sdk/agent-cli. The SDK, providers, session store, and command modules must not know about npm, package manager commands, or CLI release cadence.
Update-check behavior:
- Startup checks are enabled by default only for interactive TUI startup and rate-limited by a product-level TTL constant.
- The default cache TTL is 24 hours.
- Registry lookup uses the npm package metadata endpoint for @robota-sdk/agent-cli.
- Registry URL, timeout, package name, and TTL are CLI-owned constants. They are not written into settings.json during startup.
- Registry lookup failure must never prevent interactive, print, or headless startup.
- Update notices must not be written into project session history.
- TUI notices are rendered as transient UI outside MessageList.
- Print/headless execution (robota -p, JSON output, and streaming JSON output) must not schedule automatic startup update checks and must not emit startup update notices. This keeps automation, pipes, and structured stdout/stderr contracts deterministic without requiring --disable-update-check.
- The CLI may show the command npm install -g @robota-sdk/agent-cli@latest, but it must not execute install/update commands without explicit user confirmation.
Operational cache lives in ~/.robota/update-check.json and is not part of .robota/sessions. Cache fields include package name, checked timestamp, current version, latest version, and the last non-fatal error message if a registry lookup failed.
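The TTL rate limit can be sketched as a freshness check against the cached timestamp. The field names in `UpdateCheckCache` and the `shouldCheckRegistry` helper are assumptions chosen to mirror the cache fields listed above; the actual JSON keys are not specified here.

```typescript
const UPDATE_CHECK_TTL_MS = 24 * 60 * 60 * 1000; // default cache TTL: 24 hours

// Hypothetical field names mirroring the cache fields described above.
interface UpdateCheckCache {
  packageName: string;
  checkedAt: string; // ISO 8601 timestamp of the last registry lookup
  currentVersion: string;
  latestVersion?: string;
  lastError?: string;
}

// Returns true when a new registry lookup is due.
function shouldCheckRegistry(
  cache: UpdateCheckCache | undefined,
  now: Date,
  ttlMs: number = UPDATE_CHECK_TTL_MS,
): boolean {
  if (cache === undefined) return true;
  const checkedAt = Date.parse(cache.checkedAt);
  if (Number.isNaN(checkedAt)) return true; // unreadable cache: treat as stale
  return now.getTime() - checkedAt >= ttlMs;
}
```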
robota --check-update forces a registry lookup and exits after printing one of:
- update available notice with the install command;
- already-current notice;
- registry failure message.
robota --disable-update-check disables only the current interactive startup invocation. Persistent policy storage is not part of the first implementation.
Session Resolution Logic
| Flag | Behavior |
|---|---|
| --continue / -c | Finds the most recent session matching the current working directory and resumes it (reuses original session ID, continues writing to the same session file) |
| --resume [id] | If an ID or name is provided, resumes that session (reuses original session ID). If omitted, shows a session picker |
| --fork-session | Boolean flag, used with --continue or --resume. Creates a new session (fresh UUID) but restores context from the resumed session. Original file preserved |
| --name <name> | Sets the session name. Can be combined with other flags |
When --resume is used without a value, a ListPicker overlay is shown with all saved sessions. The user selects one to resume.
Session Storage
The CLI constructs SessionStore with the current project path .robota/sessions, not the generic user-level default. Every resumable session record must stay beside the project logs and must include provider messages, UI history, the exact system prompt, and registered tool schemas. This makes /continue, /resume, and local debugging inspect the same project-local .robota tree.
Tool Output Limits
- Universal cap: Tool output is capped at 30,000 characters. Outputs exceeding this limit are middle-truncated (first and last portions are kept, with a truncation marker in the middle).
- Glob entry limit: The Glob tool defaults to a maximum of 1,000 entries per invocation to prevent oversized responses.
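Middle truncation can be sketched as below. The `middleTruncate` name, the even head/tail split, and the marker wording are assumptions; the rule above only requires that the first and last portions are kept with a truncation marker between them. In this sketch the marker is added on top of the kept characters.

```typescript
const MAX_TOOL_OUTPUT_CHARS = 30000;

function middleTruncate(output: string, limit: number = MAX_TOOL_OUTPUT_CHARS): string {
  if (output.length <= limit) return output;
  const headLength = Math.ceil(limit / 2);
  const tailLength = Math.floor(limit / 2);
  const omitted = output.length - headLength - tailLength;
  // Assumed marker format; the real marker text is not specified here.
  const marker = `\n... [${omitted} characters truncated] ...\n`;
  return output.slice(0, headLength) + marker + output.slice(output.length - tailLength);
}
```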
First-Run Setup
When no settings file exists (~/.robota/settings.json, .robota/settings.json, or .robota/settings.local.json), the CLI prompts for initial setup:
- Anthropic API key (input masked with asterisks)
- Response language (ko/en/ja/zh, default: en)
Creates ~/.robota/settings.json with provider config and language setting. The language is injected into the system prompt as "Always respond in {language}." and persists across compaction.
Use robota --reset to delete the user settings file and return to the first-run state.
Session Logging
Session logging is an SDK-internal concern. The CLI does not configure or manage log files. For logging details (JSONL format, log paths, event types), see the agent-sdk SPEC.
Tool Execution Display
Tool execution uses a unified visual style across real-time streaming and post-execution summary.
Icons and Colors
| State | Icon | Color | Strikethrough | When |
|---|---|---|---|---|
| Running | ⟳ | yellow | no | Tool is executing |
| Success | ✓ | green | no | Tool completed successfully |
| Error | ✗ | red | yes | Tool execution failed |
| Denied | ⊘ | yellowBright | yes | Permission denied |
Labels
- Tools:/Tool: headers use white bold (visible on dark terminals).
- Tool count badge: [N tools] in white dim.
Argument Truncation
Long tool arguments are truncated with middle ellipsis, keeping the last 30 characters visible:
- Before: Read(/Users/jungyoun/Documents/dev/robota/packages/agent-sdk/src/plugins/ver...)
- After: Read(/Users/jungyoun/Documents/dev/...sdk/src/plugins/very-long/file.ts)
This ensures file names and important suffixes remain visible.
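A minimal sketch of tail-preserving truncation follows. The `truncateArgument` name and the `maxLength` parameter are assumptions (the total display budget is not stated above); only the 30-character tail comes from the rule itself.

```typescript
const TAIL_CHARS = 30; // characters kept at the end of the argument
const ELLIPSIS = "...";

function truncateArgument(arg: string, maxLength: number): string {
  // Never "truncate" into something longer than the original.
  if (arg.length <= Math.max(maxLength, TAIL_CHARS + ELLIPSIS.length)) return arg;
  const headChars = Math.max(0, maxLength - TAIL_CHARS - ELLIPSIS.length);
  return arg.slice(0, headChars) + ELLIPSIS + arg.slice(-TAIL_CHARS);
}
```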
Plugin Skill Display
Plugin skills show the plugin hint before the description:
- Format: /skill-name (plugin-name) description
- Example: /audit (rulebased-harness) Audits your project's harness setup
Assistant Markdown Diff Rendering
Assistant responses are rendered as Markdown through render-markdown.ts. A fenced code block with the diff language identifier is the canonical way for the assistant to show proposed code changes inside normal prose:
```diff
- const oldValue = true;
+ const newValue = true;
```
Rules:
- render-markdown.ts owns assistant Markdown diff rendering.
- diff fenced blocks receive line-level terminal colors: removed lines red, added lines green, hunk headers cyan, diff metadata dim.
- Color is controlled by renderer options and terminal color environment. With color disabled, the same diff text remains readable without ANSI escape codes.
- General fenced code blocks continue through marked-terminal; only diff fenced blocks take the Robota line-level path.
- Tool execution summaries use the same Markdown diff body rendering path while keeping file path, truncation, permissions, and streaming status as structured UI metadata outside the fenced block.
Edit Diff Display
When an Edit tool summary includes diff lines, the CLI shows a compact diff below the tool line. This gives the user immediate visibility into what changed without inspecting the file.
Source: old_string and new_string from the Edit tool arguments.
Ownership: tool-diff-summary.ts converts structured IDiffLine[] data into a Markdown fenced diff body. ToolDiffBlock.tsx renders structured metadata around that body and delegates the diff body itself to renderMarkdown(). There must not be a second bespoke diff-coloring policy for tool summaries.
Display format:
✓ Edit(src/provider.ts)
│ src/provider.ts
```diff
- 42 | const DEFAULT_MAX_TOKENS = 4096;
+ 42 | const maxTokens = getModelMaxOutput(modelId);
```
Rules:
- Show the file path as a header line.
- Diff body lines use Markdown diff prefixes: - for removed, + for added, and a leading space for context lines.
- Line numbers are included inside the diff body text as PREFIX NN | content so they remain readable with colors disabled.
- File path is structured metadata outside the Markdown diff body.
- Truncation is structured metadata outside the Markdown diff body: max display lines: 12. If the diff exceeds 12 lines, render the first 10 lines plus ... and N more lines.
- If old_string and new_string are identical (no-op edit), show nothing.
- Diff is shown in both the real-time streaming indicator (after tool completes) and the post-execution summary.
- Post-execution tool-summary entries must render from structured data.tools when present so persisted diffFile and diffLines are not lost. The plain summary string is a fallback for legacy entries only.
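The truncation and no-op rules can be sketched as follows. The helper names are hypothetical and diff lines are modeled as plain strings rather than the structured IDiffLine[] data; treat this as an illustration of the 12/10 rule only.

```typescript
const MAX_DIFF_DISPLAY_LINES = 12;
const TRUNCATED_HEAD_LINES = 10;

// A no-op edit (identical strings) produces no diff display at all.
function shouldShowDiff(oldString: string, newString: string): boolean {
  return oldString !== newString;
}

// Up to 12 lines are shown as-is; longer diffs show 10 lines plus a summary line.
function truncateDiffLines(lines: string[]): string[] {
  if (lines.length <= MAX_DIFF_DISPLAY_LINES) return lines;
  const hidden = lines.length - TRUNCATED_HEAD_LINES;
  return lines.slice(0, TRUNCATED_HEAD_LINES).concat(`... and ${hidden} more lines`);
}
```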
Permission prompt integration (future):
When a permission prompt is shown for an Edit tool, the diff should be displayed alongside the Allow/Deny prompt so the user can see what will change before approving.
Keyboard Controls
Message Display Order (fixed)
The display order is Tool → Robota, fixed and identical for streaming, normal completion, and ESC abort:
During streaming (real-time):
You: [user prompt] ← MessageList (visible immediately on submit)
System: Invoking skill: audit ← MessageList (visible immediately, skills only)
Tool: ⟳ Read(file.ts) ← StreamingIndicator (real-time, below MessageList)
⟳ Edit(file.ts)
Robota: [streaming text...] ← StreamingIndicator (real-time)
You: and System: messages are visible from the start of streaming, not delayed until completion. Messages are synced from InteractiveSession on both thinking=true (execution start) and thinking=false (execution end). Only Tool: and Robota: are handled by StreamingIndicator during streaming.
After completion or abort (final state):
You: [user prompt] ← MessageList
Tool: ✓ Read(file.ts) ← MessageList (tool summary message, inserted before Robota)
✓ Edit(file.ts)
Robota: [response] ← MessageList
System: Interrupted by user. ← MessageList (abort only)
Mechanism:
- During streaming: StreamingIndicator renders activeTools + streamingText in real-time (Tool → Robota order). Each tool occupies exactly one line: onToolEnd uses findIndex to update only the first matching running entry (not all entries with the same tool name).
- Individual tool-start and tool-end events are recorded as IHistoryEntry in the session history for persistence, but MessageList does not render them (returns empty fragment). They exist only for session resume and debugging.
- On complete/interrupt/error: InteractiveSession.pushToolSummaryMessage() inserts a formatted tool summary into the messages array BEFORE the Robota response. Then activeTools is cleared and StreamingIndicator disappears.
- Result: Tool → Robota order is preserved in both real-time and final state. Tool information transitions from StreamingIndicator (live) to MessageList (permanent).
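The "first matching running entry" update can be sketched like this. The `ActiveTool` shape and the `applyToolEnd` name are assumptions for illustration; only the use of findIndex comes from the mechanism above.

```typescript
type ToolStatus = "running" | "success" | "error" | "denied";

// Hypothetical shape of one StreamingIndicator row.
interface ActiveTool {
  name: string;
  args: string;
  status: ToolStatus;
}

// Mark only the first still-running entry for this tool name as finished.
function applyToolEnd(
  tools: ActiveTool[],
  name: string,
  status: Exclude<ToolStatus, "running">,
): ActiveTool[] {
  const index = tools.findIndex((tool) => tool.name === name && tool.status === "running");
  if (index === -1) return tools;
  return tools.map((tool, i) => (i === index ? { ...tool, status } : tool));
}
```

Updating by index rather than by name keeps two concurrent calls to the same tool on separate lines, so the second call stays in the running state until its own end event arrives.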
Ctrl+C — Graceful Shutdown
Ink render uses exitOnCtrlC: false. The first Ctrl+C is handled by App.tsx, renders Shutting down..., and calls useInteractiveSession.handleShutdown('prompt_input_exit'). That delegates to InteractiveSession.shutdown(), so foreground abort, managed background task cancellation, session persistence, and SessionEnd hooks run in the SDK-owned lifecycle before the TUI exits.
Slash-command restarts and exits (/exit, provider/model/language restart, reset) also call InteractiveSession.shutdown() before useApp().exit(). The CLI owns only signal/UI wiring; it must not enumerate or kill SDK-managed background work directly.
ESC — Abort Execution
ESC aborts the current execution gracefully and keeps the session running (unlike Ctrl+C, which shuts the session down and exits the TUI):
- ESC key handler in App.tsx calls handleAbort() (from useInteractiveSession). The App-level ESC listener remains mounted and guards permission, plugin, and session-picker overlays inside the handler instead of toggling useInput({ isActive }).
- handleAbort sets isAborting: true and calls interactiveSession.abort()
- AbortSignal propagates through the entire stack (ExecutionService -> Provider -> streamWithAbort)
- executeRound calls commitAssistant('interrupted'): the partial response is saved to conversation history with state: 'interrupted'. Text is ALWAYS preserved (no stripping).
- InteractiveSession emits the interrupted event; the thinking event fires with false
Rendering state on abort (onInterrupted handler):
- Tool list: pushToolSummaryMessage() inserts tool summary into messages (before Robota). Then activeTools is cleared; tool info lives in MessageList now, not StreamingIndicator.
- Streaming text: cleared (streamBuf = '', setStreamingText('')). The interrupted response is committed to message history.
- isAborting: cleared by onThinking(false) handler.
- Border color: yellow (aborting) → green (normal) after onThinking(false).
- useInteractiveSession's onThinking(false) handler:
  - Sets isAborting: false
  - Re-syncs messages from interactiveSession.getMessages(); interrupted messages are already committed
  - Messages with msg.state === 'interrupted' show an interrupted indicator in the UI
- After abort, conversation continues normally; history includes the interrupted assistant message and any tool results
- History is the SSOT for all message content. Append-only, read-only: no edit, no delete.
What appears in the UI after ESC:
Tool: ← in MessageList (from pushToolSummaryMessage)
✓ Read(file.ts)
⟳ Edit(file.ts)
Robota: ← in MessageList (committed interrupted response)
[partial response text...]
System: ← in MessageList
Interrupted by user.
Tool → Robota order preserved. StreamingIndicator is cleared (activeTools = []).
Prompt History Navigation
In InputArea, up/down arrows follow shell-style prompt history navigation:
- Up recalls the newest submitted prompt first, then moves toward older prompts.
- Down moves toward newer prompts and restores the in-progress draft after the newest history item.
- Empty prompts and consecutive duplicates are not added to prompt history.
- Restored session history contributes user chat entries to the prompt history list.
- The behavior is owned by input-area-flow.ts; InputArea applies returned value/cursor/state changes.
- InputArea disables CjkTextInput vertical arrow handling so the parent prompt-history flow owns up/down semantics.
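The navigation rules above can be sketched as a small pure state machine. All names here (`HistoryState`, `recordPrompt`, `historyUp`, `historyDown`) are hypothetical and do not describe the actual input-area-flow.ts API; treating whitespace-only prompts as empty is also an assumption.

```typescript
interface HistoryState {
  entries: string[]; // oldest first
  index: number | null; // null means the user is editing the draft
  draft: string; // in-progress input saved when navigation starts
}

interface HistoryMove {
  state: HistoryState;
  value: string; // text the input should display after the key press
}

// Empty prompts and consecutive duplicates are not recorded.
function recordPrompt(entries: string[], prompt: string): string[] {
  if (prompt.trim() === "") return entries;
  if (entries[entries.length - 1] === prompt) return entries;
  return entries.concat(prompt);
}

function historyUp(state: HistoryState, current: string): HistoryMove {
  if (state.entries.length === 0) return { state, value: current };
  if (state.index === null) {
    const index = state.entries.length - 1; // newest prompt first
    return { state: { ...state, index, draft: current }, value: state.entries[index] };
  }
  const index = Math.max(0, state.index - 1); // move toward older prompts
  return { state: { ...state, index }, value: state.entries[index] };
}

function historyDown(state: HistoryState, current: string): HistoryMove {
  if (state.index === null) return { state, value: current };
  if (state.index >= state.entries.length - 1) {
    // Past the newest item: restore the in-progress draft.
    return { state: { ...state, index: null }, value: state.draft };
  }
  const index = state.index + 1;
  return { state: { ...state, index }, value: state.entries[index] };
}
```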
Up/Down Arrows — Visual Line Navigation
CjkTextInput can move the cursor between wrapped visual lines when enableVerticalNavigation=true. InputArea sets this to false because its product-level up/down semantics are prompt history navigation.
Architecture:
- Cursor-only manipulation: text is never modified, only the flow's cursor position changes
- External value sync with cursorHint: when the parent sets value, cursor position is determined by the cursorHint prop: null (default) moves cursor to end (tab completion, clear), a number moves cursor to that position (paste). cursorHint is consumed once and reset to null after use.
- Helpers in cjk-text-input-flow.ts:
  - displayOffset(chars, charIndex, width) → cumulative display column offset, accounting for CJK line-end gaps
  - charIndexAtDisplayOffset(chars, targetOffset, width) → char index closest to target offset
- Up arrow: cursor = charIndexAtDisplayOffset(chars, offset - availableWidth, width)
- Down arrow: cursor = charIndexAtDisplayOffset(chars, offset + availableWidth, width)
- Uses string-width for CJK character support (2 columns per CJK character)
Available width calculation:
- InputArea computes availableWidth from Ink 7 useWindowSize().columns minus layout constants
- availableWidth = terminalColumns - BORDER_HORIZONTAL - PADDING_LEFT - PROMPT_WIDTH
- Named constants (no magic numbers): BORDER_HORIZONTAL = 2, PADDING_LEFT = 1, PROMPT_WIDTH = 2 ("> ")
- Layout constants are co-located with InputArea (the component that owns the layout)
- availableWidth is passed to CjkTextInput as a prop when visual navigation is enabled
Behavior:
- Up arrow when already on first visual line: no-op (target offset < 0)
- Down arrow when already on last visual line: no-op (target offset exceeds text)
- Column position is preserved across line moves via offset arithmetic
- Terminal resize recalculates available width via useWindowSize()
Paste Handling
Paste event lifecycle:
- CjkTextInput uses Ink 7 usePaste, which owns bracketed paste enable/disable while the input is focused
- render.tsx must not globally toggle DECSET 2004; paste lifecycle belongs to the active input hook, not the app renderer
- usePaste delivers the complete pasted string to cjk-text-input-flow as a single event, separate from useInput
- cjk-text-input-flow normalizes \r\n/\r to \n before deciding whether to insert text or emit a paste-label effect
- Legacy bracketed paste marker handling remains in the flow as a fallback for callers that receive [200~/[201~ through useInput
- Deterministic boundary detection: no debounce or timing heuristics
Single-line vs multiline paste:
- Single-line paste (no \n): inserted directly into the input at the current cursor position via insertAtCursor
- Multiline paste (contains \n): routed to onPaste(text, cursorPosition) → InputArea.handlePaste inserts a [Pasted text #N +M lines] label at the current cursor position, stores content in pasteStore
- On submit, expandPasteLabels() replaces labels with actual content from pasteStore
- Paste store is cleared after each submit
Fallback for terminals without bracketed paste:
- Multi-char input containing \n or \r is treated as a single paste (original heuristic)
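The label round trip can be sketched as below. The function signatures, the Map-based store, and the reading of M as the pasted line count are assumptions; only the label format and the names expandPasteLabels and pasteStore come from the text above.

```typescript
// Normalize \r\n and bare \r to \n before classifying the paste.
function normalizeNewlines(text: string): string {
  return text.replace(/\r\n?/g, "\n");
}

// Assumption: M in the label is the number of lines in the pasted text.
function makePasteLabel(id: number, text: string): string {
  const lineCount = text.split("\n").length;
  return `[Pasted text #${id} +${lineCount} lines]`;
}

// Replace each label with its stored content; unknown labels are left as typed.
function expandPasteLabels(input: string, pasteStore: Map<number, string>): string {
  return input.replace(
    /\[Pasted text #(\d+) \+\d+ lines\]/g,
    (label: string, id: string) => {
      const content = pasteStore.get(Number(id));
      return content === undefined ? label : content;
    },
  );
}
```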
Plugin Management TUI
The /plugin command opens an interactive TUI for managing bundle plugins, built with MenuSelect, TextPrompt, and ConfirmPrompt components.
Screen Stack Navigation
The TUI uses a screen stack pattern with 8 screens:
| Screen | Description |
|---|---|
| main | Top-level menu (Marketplace / Installed / Exit) |
| marketplace-list | List of configured marketplace sources |
| marketplace-action | Actions for a selected source (Browse / Add / Back) |
| marketplace-browse | Browse plugins from a selected source |
| marketplace-install-scope | Choose install scope (project / user) |
| marketplace-add | Add a new marketplace source URL |
| installed-list | List of installed plugins with enable/disable state |
| installed-action | Actions for a selected plugin (Enable/Disable / Uninstall / Back) |
ESC navigates back in the stack. When the stack is empty, the TUI closes and returns to the normal input area.
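The screen stack pattern can be sketched as an immutable array of screen names. The `pushScreen` and `handleEscape` helpers are hypothetical, and the sketch interprets "when the stack is empty" as the moment ESC pops the last remaining screen.

```typescript
type PluginScreen =
  | "main"
  | "marketplace-list"
  | "marketplace-action"
  | "marketplace-browse"
  | "marketplace-install-scope"
  | "marketplace-add"
  | "installed-list"
  | "installed-action";

// Navigating forward pushes the next screen on top of the stack.
function pushScreen(stack: PluginScreen[], next: PluginScreen): PluginScreen[] {
  return stack.concat(next);
}

// ESC pops the top screen; an empty result means the TUI should close.
function handleEscape(stack: PluginScreen[]): { stack: PluginScreen[]; closed: boolean } {
  const remaining = stack.slice(0, -1);
  return { stack: remaining, closed: remaining.length === 0 };
}
```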
Subagent Execution
Subagent execution (Agent tool, fork sessions, agent definition loading) is managed by @robota-sdk/agent-sdk internally. The CLI does not own subagent lifecycle state — InteractiveSession handles subagent and background task lifecycle.
The CLI owns Node runtime process adapters. It injects createManagedShellProcessRunner() into InteractiveSession as a kind: 'process' background task runner. SDK composition then exposes the separate BackgroundProcess tool; the existing foreground Bash tool remains unchanged.
The CLI also injects createChildProcessSubagentRunnerFactory() into InteractiveSession as the production subagent runner factory. The factory receives SDK-assembled subagent dependencies, but the runner starts a child Node worker and sends only serializable config/context/provider/agent-definition data over IPC. The worker reconstructs its provider inside the child process using the same concrete provider profile the CLI used for the parent session.
child-process-subagent-runner-result.ts owns child-worker result orchestration for the adapter: IPC message validation, timeout timer cleanup, early-exit errors, and transcript metadata projection. child-process-subagent-runner.ts remains the process factory and payload composer.
Agent command behavior is not owned by the TUI. The Robota binary can compose @robota-sdk/agent-command-agent as a default command module, but reusable CLI UI code only handles generic command modules.
Child-process subagent runner responsibilities:
- fork one worker process per subagent job
- pass ISubagentSpawnRequest, agent definition, parent config/context, permission mode, and serialized provider profile over IPC
- expose child pid on the background task state
- forward worker text/tool IPC messages to BackgroundTaskManager progress events
- create an append-only subagent transcript at .robota/logs/PARENT_SESSION_ID/subagents/AGENT_ID.jsonl and make /agent read AGENT_ID read that transcript while the worker is still running
- forward cancellation to the worker and terminate it after a grace period
- forward follow-up prompts to workers that support input
- keep runtime-owned lifecycle state inside BackgroundTaskManager; the CLI owns only the Node process adapter
When an agent request sets isolation: 'worktree', the CLI composes the runtime-owned WorktreeSubagentRunner exposed through SDK contracts around the child-process runner and injects a CLI-owned GitWorktreeIsolationAdapter.
The runtime worktree runner owns worktree lifecycle orchestration:
- delegate non-worktree requests unchanged
- run isolated workers with cwd set to the prepared worktree path
- remove clean worktrees on success or worker failure
- preserve dirty worktrees and return worktreePath plus branchName in result metadata
- fire SDK hook notifications for WorktreeCreate and WorktreeRemove when configured
The CLI-owned Git adapter implements only local Git/filesystem I/O:
- create a temporary branch and worktree before the worker starts
- remove the worktree and branch when the worktree remains clean
- report whether the worktree has local edits
When a user invokes a skill slash command with context: fork, the CLI must call interactiveSession.executeSkillCommand(...). The CLI may render a skill-invocation event, but it must not convert fork skills into plain prompt injection. This keeps fork execution deterministic and preserves the CLI as a thin TUI shell.
When a user asks in normal conversation to call or delegate to an agent, the request is handled by the model through the SDK-owned Agent tool. The CLI only displays the resulting tool execution events and final assistant response.
Background agent task lifecycle and progress are projected into TuiStateManager.backgroundTasks through the runtime-owned event union exposed as the SDK background_task_event event. Text deltas are accumulated into a short preview, and tool start/end events update the current action. React components must render this state only; they must not own task transition or cancellation logic.
TuiStateManager owns presentation-only visibility policy. Clean completed tasks remain visible as an unread completion notice until the next accepted user turn, then leave the always-visible background panel without calling closeBackgroundTask(). Failed, cancelled, non-zero exit, signal-terminated, and worktree/branch-bearing terminal tasks remain visible until explicit close or acknowledge. /background list and /background read continue to use the SDK runtime registry, so tasks hidden from the panel remain inspectable until runtime close or session cleanup.
BackgroundTaskPanel renders active and recently completed background tasks as a one-level tree headed by Background work. Each child row is built by the pure formatBackgroundTaskRow formatter and contains a compact status marker, human-readable agent/process label, secondary metadata such as idle time or timeout reason, and a short whitespace-normalized preview. The always-visible panel must not expose raw task IDs; task IDs remain available through /background list and /background read. The status marker uses the panel's existing status colors instead of rendering status words in the always-visible task list. User controls are routed through SDK system commands:
| Command | Behavior |
|---|---|
| `/background` or `/background list` | List current background tasks |
| `/background read <task-id> [offset]` | Read stdout/stderr log lines |
| `/background cancel <task-id>` | Cancel one queued/running task |
| `/background close <task-id>` | Dismiss one terminal task |
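As an illustration of the row format described for `BackgroundTaskPanel`, a pure formatter might look like the sketch below. The function name comes from this spec, but the input shape, marker glyphs, and 40-character preview limit are assumptions; status colors are omitted.

```typescript
// Hypothetical input shape; note that it carries no raw task ID.
interface IBackgroundTaskRowInput {
  status: 'queued' | 'running' | 'completed' | 'failed' | 'cancelled';
  label: string;      // human-readable agent/process label
  metadata?: string;  // e.g. idle time or timeout reason
  preview: string;
}

// Assumed glyphs; the panel colors them instead of printing status words.
const STATUS_MARKERS: Record<IBackgroundTaskRowInput['status'], string> = {
  queued: '○',
  running: '●',
  completed: '✓',
  failed: '✗',
  cancelled: '⊘',
};

const ROW_PREVIEW_MAX = 40; // assumed limit

function formatBackgroundTaskRow(task: IBackgroundTaskRowInput): string {
  // Collapse newlines and runs of spaces so the preview fits on one row.
  const normalized = task.preview.replace(/\s+/g, ' ').trim();
  const preview =
    normalized.length > ROW_PREVIEW_MAX
      ? normalized.slice(0, ROW_PREVIEW_MAX - 1) + '…'
      : normalized;
  const parts = [`${STATUS_MARKERS[task.status]} ${task.label}`];
  if (task.metadata) parts.push(`(${task.metadata})`);
  if (preview) parts.push(preview);
  return parts.join(' ');
}
```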
For implementation details of subagent/background execution (Agent tool, context: fork skills, background task manager, agent definition scanning), see the agent-sdk and agent-runtime SPEC files.
Background job groups are SDK-owned orchestration state. The TUI may render group view models derived from background_job_group_event, but it must not decide group completion, aggregate raw logs, trigger continuations, or own retry/wait behavior. Group waiting and summaries are exposed through SDK APIs and /agent wait command behavior.
Memory Management
Project Memory Review Surface
Project memory behavior is SDK-owned. The CLI and TUI must not extract memory candidates, select relevant topics, decide approval policy, or write .robota/memory files directly. They route /memory commands through session.executeCommand() and render returned messages/data.
Supported SDK-owned project memory commands exposed through the CLI:
| Command | CLI responsibility |
|---|---|
| `/memory list` | Render memory index/topic paths returned by the SDK |
| `/memory show [topic]` | Render memory index or topic content returned by the SDK |
| `/memory add ...` | Pass arguments to the SDK command; render save/dedup result |
| `/memory pending` | Render pending automatic candidates returned by the SDK |
| `/memory approve ID` | Pass the selected candidate ID to the SDK; render save result |
| `/memory reject ID` | Pass the selected candidate ID to the SDK; render reject result |
| `/memory used` | Render SDK-reported memory references used in the latest turn |
Pending-memory notices emitted into InteractiveSession history are presentation data only. TUI components may style or position them, but must not infer candidate IDs or mutate memory state outside SDK commands.
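The routing rule for SDK-owned commands such as `/memory` can be sketched as a thin pass-through. `session.executeCommand(name, args)` is named in this spec, but the argument type (a single string), the result shape (`{ message }`), and the helper names below are assumptions for illustration.

```typescript
// Hypothetical minimal view of the SDK session used by the CLI.
interface ICommandSession {
  executeCommand(name: string, args: string): Promise<{ message: string }>;
}

// Splits "/memory approve abc123" into a command name and its raw argument string.
function parseSlashCommand(input: string): { name: string; args: string } | null {
  const match = /^\/(\S+)\s*([\s\S]*)$/.exec(input.trim());
  if (!match) return null;
  return { name: match[1], args: match[2] };
}

// The CLI forwards the command and renders the result; it never touches memory files.
async function routeSlashCommand(
  session: ICommandSession,
  input: string,
  render: (text: string) => void,
): Promise<boolean> {
  const parsed = parseSlashCommand(input);
  if (!parsed) return false; // not a slash command
  const result = await session.executeCommand(parsed.name, parsed.args);
  render(result.message);
  return true;
}
```

Because subcommands such as `approve` or `reject` travel inside the argument string, the CLI needs no knowledge of candidate IDs or approval policy.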
Edit Checkpointing
Edit checkpoint behavior is SDK-owned. The CLI and TUI must not snapshot files, restore files, inspect checkpoint manifests directly, or decide rollback ordering. They route /rewind commands through session.executeCommand() and render returned messages/data.
Supported SDK-owned edit checkpoint commands exposed through the CLI:
| Command | CLI responsibility |
|---|---|
| `/rewind list` | Render checkpoint summaries returned by the SDK |
| `/rewind restore <checkpoint>` | Pass the selected checkpoint ID to the SDK |
| `/rewind code <checkpoint>` | Alias for SDK code restore; render the restore result |
Future Esc Esc or picker UI is terminal chrome only. The picker must call SDK APIs or commands; it must not duplicate checkpoint storage or restore algorithms.
Message Windowing
TuiStateManager keeps only the most recent 100 entries (MAX_RENDERED_MESSAGES) in history: IHistoryEntry[]. Older entries are dropped from the render tree to prevent unbounded memory growth. Full conversation history is preserved in the session store on disk.
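A minimal sketch of the windowing step, assuming it amounts to keeping the tail of the array. `windowHistory` is a hypothetical helper name; `MAX_RENDERED_MESSAGES` and its value come from the text above.

```typescript
const MAX_RENDERED_MESSAGES = 100;

// Returns a new array holding at most `max` of the most recent entries.
function windowHistory<T>(
  history: readonly T[],
  max: number = MAX_RENDERED_MESSAGES,
): T[] {
  return history.length > max ? history.slice(-max) : history.slice();
}
```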
Tool State Cleanup
Completed tool execution states are trimmed to the most recent 50 entries (MAX_COMPLETED_TOOLS). Running tools are always kept. This prevents the activeTools array from growing without bound during tool-heavy responses.
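One way to express this trimming rule while preserving display order is shown below. The `isRunning` flag and the `trimToolStates` helper are hypothetical; only `MAX_COMPLETED_TOOLS` and its value come from the text.

```typescript
const MAX_COMPLETED_TOOLS = 50;

// Drops the oldest completed entries beyond `max`; running tools are never dropped.
function trimToolStates<T extends { isRunning: boolean }>(
  tools: readonly T[],
  max: number = MAX_COMPLETED_TOOLS,
): T[] {
  const completedCount = tools.filter((tool) => !tool.isRunning).length;
  let toDrop = Math.max(0, completedCount - max);
  // Walk oldest-first so the survivors keep their original relative order.
  return tools.filter((tool) => {
    if (tool.isRunning || toDrop === 0) return true;
    toDrop -= 1;
    return false;
  });
}
```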
React.memo
The MessageItem component uses React.memo to skip re-renders when message props are unchanged, reducing CPU load and indirect memory pressure from Ink's full-tree reconciliation.
Message Architecture
The CLI uses IHistoryEntry (from @robota-sdk/agent-core, re-exported by @robota-sdk/agent-sdk) as the primary message type for the message list. TUniversalMessage is still used in lower-level contexts (session history access, type guards, provider calls). There is no local IChatMessage type.
Type Unification
- `IHistoryEntry[]` is the primary type held by `TuiStateManager` and passed to `MessageList`
- `MessageList` renders entries via `EntryItem`, which dispatches on `entry.category`:
  - `'chat'` entries: rendered as conversation messages (user, assistant, system, tool)
  - `'event'` entries: rendered based on `entry.type` (e.g., `'tool-summary'` renders the tool call list, `'skill-invocation'` renders a system notice)
- `entry.id` (UUID) is used as the React key for message list rendering
- `TUniversalMessage` is still used where needed (type guards, provider API calls, `getMessages()` for backward compat)
- `msg.state === 'interrupted'` shows an interrupted indicator in the UI
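The category dispatch performed by `EntryItem` can be sketched without React as a function from entry to a description of what gets rendered. `THistoryEntrySketch` is a simplified stand-in for `IHistoryEntry`, whose real fields may differ, and `describeEntry` is a hypothetical name.

```typescript
// Simplified stand-in for IHistoryEntry; field names beyond id, category, and type are assumed.
type THistoryEntrySketch =
  | { id: string; category: 'chat'; role: 'user' | 'assistant' | 'system' | 'tool'; content: string }
  | { id: string; category: 'event'; type: string };

function describeEntry(entry: THistoryEntrySketch): string {
  if (entry.category === 'chat') {
    // Chat entries render as conversation messages.
    return `message from ${entry.role}: ${entry.content}`;
  }
  // Event entries render according to their type.
  switch (entry.type) {
    case 'tool-summary':
      return 'tool call list';
    case 'skill-invocation':
      return 'system notice';
    default:
      return `event: ${entry.type}`;
  }
}
```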
Message State in useInteractiveSession
- `history: IHistoryEntry[]` React state is managed by `TuiStateManager` and derived from `interactiveSession.getFullHistory()`.
- After each execution (when `thinking` transitions to `false`), the hook delegates to `TuiStateManager` to sync `history` from `interactiveSession.getFullHistory()`; the session is the SSOT for all history content.
- `addMessage` appends a local system message directly to React state (used for command output and error notices that are not part of the AI conversation). These are wrapped as `IHistoryEntry` with `category: 'event'` before insertion.
- After abort: interrupted messages are already committed to session history by `InteractiveSession`; the hook re-syncs from full history, so no separate streaming text ref is needed.
Tool Message Type Guards
Tool messages use the isToolMessage(msg) type guard for safe access to msg.name.
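To illustrate why the guard matters, the sketch below uses a simplified message union in which only tool messages carry `name`. `TMessageSketch` and `toolNames` are hypothetical, and this local `isToolMessage` is a stand-in for the real guard, whose signature may differ.

```typescript
// Simplified stand-in for TUniversalMessage; only tool messages have a name.
type TMessageSketch =
  | { role: 'user' | 'assistant' | 'system'; content: string }
  | { role: 'tool'; content: string; name: string };

// Local stand-in for the real type guard.
function isToolMessage(
  msg: TMessageSketch,
): msg is Extract<TMessageSketch, { role: 'tool' }> {
  return msg.role === 'tool';
}

// After the guard, msg.name is accessible without casts.
function toolNames(messages: readonly TMessageSketch[]): string[] {
  return messages.filter(isToolMessage).map((msg) => msg.name);
}
```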
Known Limitations
- Korean IME on macOS Terminal.app: Ink's renderer shifts the input area during IME composition, causing Terminal.app to crash (SIGSEGV). Fixed by adding a permanent blank line below the input area, which stabilizes the cursor position during IME composition. Use iTerm2 for the best experience.
- `CjkTextInput`: Custom text input component with try-catch error handling, non-printable character filtering, `setCursorPosition` removed to minimize IME interaction surface, and optional visual-line-aware up/down arrow navigation for wrapped text.
Dependencies
@robota-sdk/agent-cli requires Node.js 22+ because Ink 7 requires Node.js 22 and React 19.2+.
| Package | Purpose |
|---|---|
| `@robota-sdk/agent-command-agent` | Optional default `/agent` command module composed by the Robota binary |
| `@robota-sdk/agent-sdk` | `InteractiveSession`, `CommandRegistry`, command sources, plugin management, re-exported runtime contracts |
| `@robota-sdk/agent-core` | Public types (`TPermissionMode`, `TToolArgs`, `TUniversalMessage`, etc.) |
| `@robota-sdk/agent-provider-anthropic` | Default provider definition contributed by the Robota binary |
| `@robota-sdk/agent-provider-openai` | Default provider definition contributed by the Robota binary |
| `@robota-sdk/agent-provider-gemma` | Default provider definition contributed by the Robota binary |
| `@robota-sdk/agent-transport-headless` | Headless runner for print mode (`-p`) execution |
| `ink` 7, `react` 19.2+ | TUI rendering |
| `ink-select-input` | Arrow-key selection (permission prompt) |
| `ink-spinner` | Loading spinner |
| `chalk` | Terminal colors |
| `ink-text-input` | Base text input (extended by `CjkTextInput`) |
| `marked`, `marked-terminal` | Markdown parsing and terminal rendering |
| `cli-highlight` | Syntax highlighting for code blocks |
| `string-width` | Unicode-aware string width calculation |