The four substrates skillscript expects
Skillscript-runtime is substrate-neutral and assumes you have (or will choose):
- A filesystem — for skill source files (.skill.md), trace records, the bundled sqlite databases. Sandbox via container, chroot, or limited-privilege process — operator’s call.
- A data store — for retrieval and writes from skill ops. Could be SQLite-FTS (bundled), a vector database, an in-house store, an Obsidian-style notes system, a memory broker — whatever you already have.
- An LLM endpoint — Ollama running locally (bundled), a hosted API like OpenAI / Anthropic / Azure, or your own inference server.
- An agent harness — where skill output is delivered. Could be tmux sessions, a webhook receiver, an in-house agent runtime, or no harness at all (skills run for their text output only).
These map to four connector contracts: SkillStore, DataStore, LocalModel, AgentConnector. Plus McpConnector for any external tool you want to invoke from a skill body.
What the runtime promises connector implementors
You only need to know what methods the runtime will call. Everything else — where data lives, what fields you honor, internal authorization, expiration, indexing — is your implementation choice. The contracts leave these to you:
- Where the data lives (sqlite / your DB / hosted service / vector store)
- What metadata fields your substrate honors or ignores (kwargs passed via metadata.<key> ride through; you choose what to do with them)
- Vaults, namespaces, tenants, access-control — substrate-specific
- Expiration / decay / pinning / reranking — substrate-specific
- Authentication into your own backend — your code, your decision
Case 1 vs Case 2 — the load-bearing wiring decision
This is the most important architectural choice you’ll make.
Case 1 — typed-contract wiring (substrate-portable)
You implement the typed connector contracts (DataStore, LocalModel, etc.) against your substrate. The bridge classes (DataStoreMcpConnector, LocalModelMcpConnector) surface them as canonical $ data_read / $ llm dispatch.
Case 2 — MCP-tools wiring (substrate-locked)
Your substrate exposes itself as MCP tools (via a local MCP server or a remote one). You wire it as an McpConnector, and skills reference its tools by name with substrate-specific kwargs.
MCP transport — two paths. The protocol’s wire layer is the same; the transport differs:
- Stdio MCP (most common for community servers — YouTrack, GitHub, Linear, etc.): the MCP server is a binary you spawn as a child process and communicate with via stdin/stdout. Wired via RemoteMcpConnector. Set "framing": "newline" explicitly — mcp-remote and most spec-compliant MCP stdio servers use newline-delimited JSON. The legacy lsp default hangs silently until init_timeout against newline-framed servers. See configuration.md — RemoteMcpConnector config — stdio framing. The command (server binary, node/npx, uv/uvx, etc.) must be on the runtime host’s PATH at spawn time. A programmatic bootstrap inherits its launching shell’s PATH; the CLI inherits the user’s shell PATH at invocation.
- HTTP MCP / Streamable HTTP (Anthropic’s hosted MCP, GitHub MCP, Linear MCP, etc.): the MCP server speaks JSON-RPC over HTTP with Server-Sent Events for the stream channel. Three ways to wire it:
  - (a) Stdio bridge — RemoteMcpConnector + npx mcp-remote https://... runs a node child process that bridges HTTP MCP into stdio for the runtime to consume. mcp-remote auto-negotiates the transport (Streamable HTTP first, falling back to SSE); add --sse only if the server requires SSE explicitly. Works today; adds the bridge subprocess overhead per call.
  - (b) Direct HTTP connector (bundled) — HttpMcpConnector speaks Streamable HTTP MCP directly, no subprocess. Substrate-neutral: works against any MCP server speaking the spec. Wired declaratively. Identity-propagation config:
    - identityHeader — when set, the connector reads ctx.agentId per call and threads it as both a per-call request header AND the session-pinning key. Each distinct agent identity gets its own session, pinned to that identity at server-side initialize time. Required for substrates that pin sessions to the initializing identity (the common case for memory-substrate MCPs). Omit it when every caller shares one identity — all calls then share a single default session.
    - maxPoolSize — optional cap on the per-identity session pool (LRU eviction by access recency). Default unlimited; set when your substrate has session-count limits or you want bounded resource use.

    When identityHeader is set, supports_identity_propagation: true is declared in runtime_capabilities. The RuntimeCapabilitiesConformance suite then requires Level 1 + Level 2 probes wired via flagProbes — see Connector Contract Reference for the probe contract.
  - (c) Custom direct connector — fork examples/connectors/McpConnectorTemplate/ when you need behavior the bundled HttpMcpConnector doesn’t cover (e.g., a non-spec auth handshake, tool-name normalization, custom retry logic).
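The per-identity session pool described for identityHeader and maxPoolSize can be pictured with a small sketch. This is not the bundled HttpMcpConnector code: SessionPool, Session, and the agent names are hypothetical, and the only behavior taken from the text is one session per distinct identity with least-recently-used eviction once a cap is set.

```typescript
// Hypothetical sketch: one session per agent identity, LRU-evicted at a cap.
type Session = { identity: string; id: number };

class SessionPool {
  // A Map iterates in insertion order, so its first key is the least recently used.
  private sessions = new Map<string, Session>();
  private nextId = 1;
  private maxPoolSize: number;

  // Default is unlimited, matching the documented maxPoolSize default.
  constructor(maxPoolSize: number = Infinity) {
    this.maxPoolSize = maxPoolSize;
  }

  get(identity: string): Session {
    const existing = this.sessions.get(identity);
    if (existing !== undefined) {
      // Re-insert so this identity becomes the most recently used entry.
      this.sessions.delete(identity);
      this.sessions.set(identity, existing);
      return existing;
    }
    if (this.sessions.size >= this.maxPoolSize) {
      const oldest = this.sessions.keys().next().value;
      if (oldest !== undefined) this.sessions.delete(oldest);
    }
    const created: Session = { identity, id: this.nextId++ };
    this.sessions.set(identity, created);
    return created;
  }

  has(identity: string): boolean {
    return this.sessions.has(identity);
  }
}

const pool = new SessionPool(2);
const first = pool.get("agent-a");
pool.get("agent-b");
const again = pool.get("agent-a"); // reuses the pinned session; agent-a is now most recent
pool.get("agent-c");               // pool is full, so agent-b (least recent) is evicted
```

A real connector would also close the evicted session on the server side; the sketch only tracks membership.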
A skill that references a Case 2 connector such as my_store is locked to it: its specific kwargs (region, cluster) and response shape. To move to a different substrate, every call site has to be rewritten.
⚠ Programmatic adopters: use bootstrapFromEnv() and it loads $SKILLSCRIPT_HOME/connectors.json (plus .env + SKILLSCRIPT_*) for you, exactly like the CLI. Only raw bootstrap() skips them — it reads connectors.json solely when you pass connectorsConfigPath, and skipping it makes $ name.tool calls fail at runtime with unknown-connector and no hint that the file simply wasn’t read. See §“Programmatic bootstrap path” below — bootstrapFromEnv() is the recommended default; the raw-bootstrap() explicit-wiring pattern is for hand-assembled custom substrates.
Picking — the tradeoff
| Aspect | Case 1 (typed) | Case 2 (MCP) |
|---|---|---|
| Skill portability | ✓ portable | ✗ substrate-locked |
| Substrate feature coverage | Limited to typed contract surface | Full substrate surface |
| Implementation effort | Implement typed interface | Wire existing MCP server |
| Best for | Skills you want to ship | Substrate-specific power features |
You can also wire both, data_read (typed-contract via bridge) AND my_store (MCP), and let skills opt into portability by which connector name they reference.
For the substrate-portability claim to hold, the substrates you care about must be Case-1-wired.
Joe Programmer setup walkthrough
1. Install + initialize
skillfile init creates ~/.skillscript/ with skills/, traces/, an empty connectors.json, and a config.toml stub.
If you’re writing a custom bootstrap (not just using the bundled CLI): install the package locally (npm install skillscript-runtime in your adopter project) so tsx/node can resolve it alongside your project dependencies. A global install is convenient for the bundled CLI but doesn’t expose the package to your custom bootstrap’s module resolution.
The package is ESM-only. npm init -y produces a CJS package.json by default, which will fail your first bootstrap run with top-level-await / ESM-import errors. Switch your adopter project to ESM before authoring bootstrap code by setting "type": "module" in its package.json.
2. Decide on substrate wiring
For each of the four substrates (data store, LLM, agent harness, MCP tools), decide Case 1 or Case 2. The onboarding scaffold (examples/onboarding-scaffold/) is Case 1 end-to-end against a file-backed data store + OpenAI + tmux.
3. Configure runtime knobs
Create skillscript.config.json in your $SKILLSCRIPT_HOME. Inside it, ${VAR} substitutes against process.env. See skillscript.config.json.example in the repo for the full surface.
4. Wire your substrates
For the bundled CLI path (no custom code): use connectors.json to declare your MCP servers; use OPENAI_API_KEY / OLLAMA_BASE_URL env vars; run skillfile dashboard --config ./skillscript.config.json.
For custom substrates: write your own bootstrap. See examples/custom-bootstrap.example.ts and examples/onboarding-scaffold/bootstrap.ts for complete worked walkthroughs.
Security knobs that adopters wiring real substrates should know about:
- The approval boundary (secured mode). SKILLSCRIPT_SECURED_MODE=true enforces that only key-signed skills perform effects. Default-deny by design — leave it off only for trusted-author / single-operator setups. Full detail, the approve flow, and key custody: Approval + secured mode.
- Filesystem path allowlist. SKILLSCRIPT_FS_ALLOWLIST is default-deny — file_read / file_write refuse every path until you wire roots. See Filesystem path allowlist. (Keep your approval-key directory out of it.)
- Per-connector tool allowlists — allowed_tools on each connectors.json MCP connector entry restricts which tools that connector can dispatch. Three-state (undefined = allow all, [] = allow none, listed = exactly those). Tier-1 disallowed-tool lint + runtime defense-in-depth refuse out-of-list dispatch. allowed_tools belongs at the entry top-level, sibling to class and config — NOT inside the config block. The loader hard-errors on misplacement (placing it inside config would otherwise silently allow every tool — the worst-case failure mode for a security control). See docs/configuration.md §“Named MCP connector instances” for the JSON shape.
- Shell-execution discipline — shell(command="...") runs structured-spawn by default (binary on PATH, whitespace-tokenized argv, no bash). shell(command="...", unsafe=true) opts into bash interpretation (pipes, $VAR, command substitution) and refuses to fire unless the runtime is configured with enable_unsafe_shell = true in config.toml. Lint flags every unsafe=true op tier-2 to keep audit posture visible. See scaffold/config.toml for the documented default + help({topic:"lint-codes"}) for the unsafe-shell-disabled rule.
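The three-state allowed_tools rule and the placement check can be sketched as follows. This is an illustration of the semantics described above, not the loader's code: ConnectorEntry, isToolAllowed, assertAllowedToolsPlacement, and the tool names are hypothetical, and only the class / config / allowed_tools sibling layout comes from the text.

```typescript
// Hypothetical entry shape: allowed_tools sits beside class and config.
type ConnectorEntry = {
  class: string;
  config: Record<string, unknown>;
  allowed_tools?: string[];
};

// undefined = allow all, [] = allow none, listed = exactly those tools.
function isToolAllowed(entry: ConnectorEntry, tool: string): boolean {
  if (entry.allowed_tools === undefined) return true;
  return entry.allowed_tools.includes(tool);
}

// Misplacement must fail loudly: nested inside config the list would be
// ignored, which is indistinguishable from "allow all".
function assertAllowedToolsPlacement(entry: ConnectorEntry): void {
  if ("allowed_tools" in entry.config) {
    throw new Error("allowed_tools must be a sibling of class and config, not nested inside config");
  }
}

const open: ConnectorEntry = { class: "RemoteMcpConnector", config: {} };
const closed: ConnectorEntry = { class: "RemoteMcpConnector", config: {}, allowed_tools: [] };
const scoped: ConnectorEntry = {
  class: "RemoteMcpConnector",
  config: {},
  allowed_tools: ["search_issues"], // hypothetical tool name
};
const misplaced: ConnectorEntry = {
  class: "RemoteMcpConnector",
  config: { allowed_tools: ["search_issues"] },
};

let misplacementCaught = false;
try {
  assertAllowedToolsPlacement(misplaced);
} catch {
  misplacementCaught = true;
}
```

Note that the empty list and the missing key are opposite outcomes, which is why the placement check matters.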
If you write a custom McpConnector class, register it with registerConnectorClass before loading config.
5. Two-instance posture
Running dev-skillscript alongside an adopter-wiring instance on the same machine: give each instance its own port (see §7).
6. Running as a supervised service
For cron/event fires to be reliable, the dashboard/MCP host must run supervised — surviving both reboot and crash. The scheduler is in-process, at-most-once, with no catch-up (see Trigger model — durability honesty): a down process silently misses fires. A cron skill like 0 3 * * * only fires if the process happens to be up at 3am. nohup-from-a-shell isn’t sufficient (dies on reboot/logout/crash).
Use the OS supervisor: a macOS LaunchAgent (RunAtLoad + KeepAlive) or a Linux systemd unit.
macOS — ~/Library/LaunchAgents/com.skillscript.adopter.plist:
Load it with launchctl load ~/Library/LaunchAgents/com.skillscript.adopter.plist.
Linux — /etc/systemd/system/skillscript-adopter.service:
Enable it with systemctl enable --now skillscript-adopter.service.
The load-bearing gotcha — pin PATH in the supervisor environment. launchd and systemd run with a bare PATH (typically /usr/bin:/bin), not your shell’s. The runtime’s spawned binaries — node/npx for RemoteMcpConnector (mcp-remote bridges), uv/uvx for Python MCP servers, every binary on the shell allowlist (gh, curl, git, etc.) — must be on that pinned PATH. Omit it and connectors + shell ops fail with “command not found” under the supervisor even though they work from your shell. This is the same class of environment-specific break as forgetting to pin TMUX_TMPDIR in a tmux launcher’s plist.
This is the same PATH concern as the Case-2 stdio host-PATH note, on two surfaces: the shell that launches the runtime ad hoc AND the supervisor that keeps it up persistently.
A few additional gotchas:
- WorkingDirectory matters. Set it to the bootstrap’s directory so relative imports (./connectors-config.ts) and node_modules resolve correctly. Without it, the supervisor runs from / and ESM imports fail.
- Single supervisor only. Don’t also run the dashboard manually (nohup skillfile dashboard &) alongside the supervised instance — two listeners on the same port either collide or round-robin events, dropping fires silently.
- .env still loads. The runtime’s process.loadEnvFile call reads .env from the working directory at boot, independent of the supervisor’s environment block. Don’t duplicate secrets in both — keep them in .env and let the bootstrap load them.
7. Wire your agent to the runtime over MCP
With the service running, point your agent at the runtime’s /rpc endpoint.
Simplest path (Claude Code + similar agent hosts): just ask. “Add the skillscript MCP server at http://localhost:7878/rpc” — Claude Code writes the config for you. Skip the JSON below unless you want to understand what landed or you’re on a host that doesn’t write its own config.
Manual config — standard HTTP MCP (modern agent hosts that speak HTTP MCP directly) — drop the runtime in your agent’s MCP config (~/.claude.json, .mcp.json, claude_desktop_config.json, etc.):
For agent hosts that speak only stdio MCP, bridge through mcp-remote the same way you’d bridge any HTTP MCP server. Pair this with the framing: "newline" note from Case 2 if you wire it the other direction (runtime → external HTTP MCP).
- Port matches what the runtime binds. Default 7878 (SKILLSCRIPT_PORT env var / dashboard.port config). Two-instance setups (dev + adopter side-by-side per §5) need distinct ports — e.g., adopter on 7879. Whatever the supervisor unit specifies is what the agent config must point at.
- Auth + bind address. /rpc has no auth and binds to 127.0.0.1 by default — the right posture for a local agent talking to a local runtime. If you set SKILLSCRIPT_HOST=0.0.0.0 to reach the runtime across the network, front it with your own auth layer (reverse proxy + bearer token, mTLS, etc.) — same posture as the /event HTTP ingress, which has its own bearer-token gate (SKILLSCRIPT_EVENT_INGRESS_AUTH_TOKEN).
Approval + secured mode
Every skill carries a lifecycle status — Draft → Approved → Disabled. Only Approved skills execute; Draft is the safety gate (a skill under authoring or review), Disabled is retired. That status gate applies in both runtime modes. What the mode decides is whether an Approved skill must additionally be cryptographically keyed:

| | Unsecured | Secured (SKILLSCRIPT_SECURED_MODE=true) |
|---|---|---|
| Draft skill | refused | refused |
| Approved skill | runs — a bare # Status: Approved is sufficient (unkeyed) | runs only if the body carries a valid operator signature |
| Tamper-evidence | none (an Approved body can be edited freely) | yes — editing the body breaks the signature |
| Effectful ops ($, shell, file_*, notify) | run | refused unless the skill is signed (the effectsAuthorized capability gate) |
| Who can grant approval | anyone who can set the status | only a holder of the operator’s private key |
The gate applies on every execution path (including /event, MCP, and in-skill composition). Unsecured is the current default; secured is the recommended posture for any multi-author or networked deployment.
The shell + filesystem allowlists (below) are always enforced, in both modes — they bound what a running skill may touch. Secured mode adds the orthogonal boundary on whether an unapproved skill runs at all.
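The mode table can be read as a small decision function. The sketch below is an assumption-labeled model of that table, not the runtime's gate: SkillState, Decision, and gate are hypothetical names, and hasValidSignature stands in for whatever signature check the runtime performs.

```typescript
type Status = "Draft" | "Approved" | "Disabled";

// Hypothetical view of a skill at execution time.
interface SkillState {
  status: Status;
  hasValidSignature: boolean; // assumed to be computed by a prior verify step
}

interface Decision {
  runs: boolean;
  effectsAuthorized: boolean;
}

function gate(skill: SkillState, securedMode: boolean): Decision {
  // The status gate applies in both modes: only Approved skills execute.
  if (skill.status !== "Approved") {
    return { runs: false, effectsAuthorized: false };
  }
  // Unsecured: a bare Approved status is sufficient, effects included.
  if (!securedMode) {
    return { runs: true, effectsAuthorized: true };
  }
  // Secured: running and performing effects both require a valid signature.
  return { runs: skill.hasValidSignature, effectsAuthorized: skill.hasValidSignature };
}

const draftUnsecured = gate({ status: "Draft", hasValidSignature: false }, false);
const approvedUnsecured = gate({ status: "Approved", hasValidSignature: false }, false);
const approvedSecuredUnsigned = gate({ status: "Approved", hasValidSignature: false }, true);
const approvedSecuredSigned = gate({ status: "Approved", hasValidSignature: true }, true);
```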
Asymmetric signing — how secured approval works
Approval is an Ed25519 signature, not a shared secret:
- Private key — held by the operator, read only by the approve flow. That flow is either skillfile approve at a terminal (the runtime process never loads the key) or, if you opt into in-browser approval, the dashboard’s passcode-gated /approve — which reads the key only after a live human enters the session passcode, never on its own. For maximum isolation, keep signing terminal-side and run the runtime under a uid that cannot read the private key — then a co-resident agent can’t forge approvals even with full read access to the runtime process. Same-uid (or in-browser signing) still works, but isn’t fully isolated (a documented 1.0 precondition; managed custody closes it later).
- Public key — non-secret; the runtime reads it to verify signatures on every execution. No hot-path secret is the point: verification needs only the public key.
skillfile init in secured mode generates the keypair if absent (private written 0600, default ~/.config/skillscript/approval.{key,pub}, deliberately outside SKILLSCRIPT_HOME, the agent-readable data dir).
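The asymmetry is easy to see with Node's built-in crypto module. This sketch is not the skillfile approve implementation: which bytes the runtime actually signs and how it stores the signature are not specified here, so signing the raw body text is an assumption, and the body string is a placeholder.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Operator side: the private key exists only where approval happens.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// Assumption: the signature covers the skill body bytes as-is.
const body = Buffer.from("placeholder skill body");

// Ed25519 takes no separate digest algorithm, hence the null first argument.
const signature = sign(null, body, privateKey);

// Runtime side: verification needs only the public key.
const verifiesOriginal = verify(null, body, publicKey, signature);

// Any edit to the body invalidates the signature (tamper-evidence).
const edited = Buffer.from("placeholder skill body, edited");
const verifiesEdited = verify(null, edited, publicKey, signature);
```

Because verify never touches the private key, the process that executes skills can run without read access to it.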
The approve flow
| Action | How |
|---|---|
| Approve one skill (review the body, then sign) | skillfile approve <name> at a terminal |
| Batch-approve everything pending | skillfile reapprove --apply (dry-run preview: skillfile reapprove) |
| Approve from the dashboard — terminal signing | the Approvals queue surfaces the pending set, each skill’s security signals, and its effectful footprint; copy the emitted skillfile approve command and run it at your terminal |
| Approve from the dashboard — in-browser | with SKILLSCRIPT_APPROVAL_PASSCODE set, the queue’s Approve button signs server-side after a one-time passcode unlock |
- Terminal signing. With no passcode configured, the dashboard is a pure review surface — pending queue, per-skill security signals, effectful-footprint checklist — but /approve is disabled and signing happens at your terminal where the key lives. The right posture when the dashboard host and the key-holder’s terminal are the same machine.
- In-browser signing. With SKILLSCRIPT_APPROVAL_PASSCODE set, the dashboard can sign without a terminal round-trip, and holds no standing signing power: the first approve in a browser session prompts for the passcode (POST /unlock), unlocking signing for that session only (~15-minute idle TTL, server-side, cookie-bound). A live human with the passcode is in the loop for every session — the runtime never gains the ability to approve on its own. Pair it with SKILLSCRIPT_DASHBOARD_AUTH_TOKEN (network hygiene on the dashboard) when the dashboard is reachable beyond localhost.
skill_write / skill_status cannot grant approval in secured mode — an agent-written or status-flipped skill lands Draft until a key-holder signs it, enforced at the MCP handler regardless of the SkillStore substrate. (In unsecured mode, skill_status → Approved is honored directly, unkeyed.)
Misconfiguration is loud. If secured mode and a passcode are both set but signing isn’t actually wired (no SkillStore reaches the dashboard, or no readable key), the dashboard surfaces the gap as a banner and a stderr boot-log line and stays review-only — skills remain inert (fail-closed) while you see and fix it. The boot-log is what reaches a headless / programmatic adopter who never opens the dashboard.
Migrating an existing skill set
Turning secured mode on means any skill that’s Approved without a valid signature is refused at execution. skillfile reapprove sweeps the store, reports the set needing re-blessing, and --apply re-signs them in one pass — so you don’t run skillfile approve once per skill.
Bundled demos
Bundled example skills ship as # Status: Draft — a signature baked at package-build time could never validate on your install (the key is per-operator). skillfile init locally approves the three seeded demos with your machine’s authority (secured → provision keypair + sign; unsecured → bare Approved), so they’re runnable immediately after init.
Deleting skills
Deleting a skill is an operator-only action — there is no agent / MCP delete surface (an agent can author and disable, but only the operator removes). Two surfaces:

| Surface | How |
|---|---|
| CLI | skillfile delete <name> |
| Dashboard | the Delete skill button on a skill’s detail view |
Delete is permanent (FilesystemSkillStore unlinks the files; SqliteSkillStore hard-cascades both tables). The skill’s triggers are dropped from the live scheduler, and its name frees up immediately for a fresh skill_write/store. The safety is the confirm step plus the reverse-dependency check, not recoverability.
Before deleting, both surfaces run a best-effort reverse-dependency scan — which other skills statically reference the target via $ execute_skill(name="…") or inline(skill="…"). If any do, the delete is blocked until you confirm (“triage-flow references this — delete anyway?” in the dashboard; --force on the CLI). The scan is literal-name only: a runtime-resolved name="${VAR}" reference can’t be detected statically, so a dynamic caller may slip past the warning — and because delete is permanent, treat the scan as a heads-up, not a guarantee.
Adopter SkillStores implement delete(name) per the contract; the runtime only requires “remove from normal views,” so a soft-delete (tombstone + filter) is a valid adopter-side choice if you need recovery or audit-grade retention — recovery semantics are the store’s concern. See Connector Contract Reference.
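A literal-name reverse-dependency scan of the kind described above might look like the sketch below. It is an illustration under assumptions, not the shipped scanner: findStaticCallers and the sample skill sources are hypothetical, and the regular expression only covers the two reference forms named in the text.

```typescript
// Returns the names of skills whose source statically references `target`.
function findStaticCallers(target: string, skills: Record<string, string>): string[] {
  const callers: string[] = [];
  for (const [name, source] of Object.entries(skills)) {
    if (name === target) continue;
    // Matches execute_skill(name="...") and inline(skill="...") with a literal name.
    const pattern = /(?:execute_skill\(\s*name|inline\(\s*skill)\s*=\s*"([^"]*)"/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(source)) !== null) {
      if (match[1] === target) {
        callers.push(name);
        break;
      }
    }
  }
  return callers;
}

// Hypothetical corpus: one literal caller, one dynamic caller.
const corpus: Record<string, string> = {
  "cleanup": "placeholder body",
  "triage-flow": '$ execute_skill(name="cleanup")',
  "dynamic-caller": '$ execute_skill(name="${TARGET}")',
};

const callersOfCleanup = findStaticCallers("cleanup", corpus);
```

The dynamic caller is invisible to the scan because the captured name is the literal text ${TARGET}, which is exactly the gap the warning above describes.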
Shell binary allowlist
The runtime enforces a default-deny operator allowlist for binaries reachable via shell(...) ops. Skill authors are agents, agents are a weak trust anchor (hallucination, prompt-injection, no human-in-loop at scale), and operator-side scoping converts “a human reviews every skill” from discipline into an enforced constraint at the language level.
The behavior
Two independent operator axes — do not conflate:

| Axis | Operator switch | Controls |
|---|---|---|
| Binary scope | SKILLSCRIPT_SHELL_ALLOWLIST | Which binaries shell(...) can invoke |
| Syntax scope | SKILLSCRIPT_ENABLE_UNSAFE_SHELL | Whether bash interpretation (pipes / $VAR / $(...)) is permitted |

| Skill op | Binary on allowlist | Binary off allowlist |
|---|---|---|
| shell(command="X ...") (structural) | runs | refused with ShellBinaryNotAllowedError |
| shell(argv=["X", ...]) (explicit token list) | runs | refused with ShellBinaryNotAllowedError |
| shell(command="X ...", unsafe=true) with enableUnsafeShell=true | runs (if bash on allowlist) | refused (off-list bash blocks ALL unsafe shell) |
| shell(command="X ...", unsafe=true) with enableUnsafeShell=false | refused with UnsafeShellDisabledError (syntax axis fires first) | — |
No in-skill keyword overrides the binary allowlist: not the unsafe keyword, not # Autonomous: true, not approved="reason". Binary scope is an operator boundary the author cannot talk past.
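The two axes compose in a fixed order, which the following sketch makes explicit. It is a model of the table, not the dispatcher's code: ShellOp, Policy, and decide are hypothetical names, and treating bash as the only binary checked on the unsafe path is an assumption drawn from the table's "if bash on allowlist" wording.

```typescript
type ShellOp = { binary: string; unsafe?: boolean };
type Policy = { allowlist: string[]; enableUnsafeShell: boolean };
type Verdict = "runs" | "refused: unsafe shell disabled" | "refused: binary not allowed";

function decide(op: ShellOp, policy: Policy): Verdict {
  if (op.unsafe === true) {
    // Syntax axis fires first (UnsafeShellDisabledError in the table).
    if (!policy.enableUnsafeShell) return "refused: unsafe shell disabled";
    // Assumption: on the unsafe path the gated binary is bash itself.
    return policy.allowlist.includes("bash") ? "runs" : "refused: binary not allowed";
  }
  // Structural path (command= or argv=): the first token must be allowlisted
  // (ShellBinaryNotAllowedError in the table otherwise).
  return policy.allowlist.includes(op.binary) ? "runs" : "refused: binary not allowed";
}

const locked: Policy = { allowlist: ["curl"], enableUnsafeShell: false };
const unsafeNoBash: Policy = { allowlist: ["curl"], enableUnsafeShell: true };
const unsafeWithBash: Policy = { allowlist: ["curl", "bash"], enableUnsafeShell: true };

const structuralAllowed = decide({ binary: "curl" }, locked);
const structuralRefused = decide({ binary: "ssh" }, locked);
const unsafeDisabled = decide({ binary: "curl", unsafe: true }, locked);
const unsafeBashOffList = decide({ binary: "curl", unsafe: true }, unsafeNoBash);
const unsafeBashOnList = decide({ binary: "curl", unsafe: true }, unsafeWithBash);
```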
Three call shapes — command=, argv=, unsafe=true
Three ways to invoke a binary. Each picks a different point on the safety/expressiveness curve:
| Form | When to reach for it |
|---|---|
| shell(command="curl -s https://example.com/${ID}") -> R | Simple commands, no spaces or quote-special chars in substituted values. The string is whitespace-tokenized + quote-stripped; structural spawn (no shell). |
| shell(argv=["say", "-v", "${VOICE}", "-f", "${PATH}"]) -> R | Args with spaces, JSON payloads, dynamic strings. Each list element is exactly one argv token; substitution per element doesn’t re-tokenize. Strictly safer than unsafe=true — no shell process, no metacharacter interpretation, injection-surface zero. The right tool when a ${VAR} may contain whitespace. |
| shell(command="echo hi && date", unsafe=true) -> R | Pipes, redirects, shell built-ins, command-substitution. Requires enableUnsafeShell: true at the runtime; lint flags every appearance (tier-2 unsafe-shell-op). |
argv= is mutually exclusive with command= and with unsafe=true (parse errors on either combination — argv is execv-class, there’s no shell to opt into). The same allowlist gate applies to argv[0].
Quote-trap lint (shell-quoted-var-in-command)
The pattern shell(command="echo '${VAR}'") looks safe — quotes around the substitution — but the structural tokenizer respects quotes during the original whitespace split, then strips them. After substitution, if VAR contains quote characters (Jamie's) the matching can drift. Tier-2 lint shell-quoted-var-in-command fires on this pattern and points at the argv= form as the safer answer. Works fine for known-simple values; the lint exists because the failure mode is silent-wrong, not loud-error.
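The quote trap is easier to see with a toy tokenizer. The sketch below is not the runtime's tokenizer: substitute and tokenize are hypothetical helpers, and the assumption that substitution happens before the whitespace split is mine, chosen because it reproduces the silent drift described above.

```typescript
// Replace ${NAME} markers from a plain variable map (hypothetical helper).
function substitute(template: string, vars: Record<string, string>): string {
  return template.replace(/\$\{(\w+)\}/g, (_m: string, name: string) => vars[name] ?? "");
}

// Split on whitespace, honoring single and double quotes, then drop the quotes.
function tokenize(command: string): string[] {
  const tokens: string[] = [];
  let current = "";
  let inToken = false;
  let quote: string | null = null;
  for (const ch of command) {
    if (quote !== null) {
      if (ch === quote) quote = null;
      else current += ch;
    } else if (ch === "'" || ch === '"') {
      quote = ch;
      inToken = true;
    } else if (/\s/.test(ch)) {
      if (inToken) {
        tokens.push(current);
        current = "";
        inToken = false;
      }
    } else {
      current += ch;
      inToken = true;
    }
  }
  if (inToken) tokens.push(current);
  return tokens;
}

const template = "echo '${VAR}'";

// Known-simple value: the quotes pair up as the author expected.
const simple = tokenize(substitute(template, { VAR: "Jamie" }));

// Value containing a quote: the apostrophe closes the author's quote early,
// so the argument splits and the apostrophe itself disappears.
const drifted = tokenize(substitute(template, { VAR: "Jamie's cat" }));

// argv form: each element is one token, so nothing is re-split.
const argvForm = ["echo", "Jamie's cat"];
```

No error is raised in the drifted case, which is why a lint rather than a runtime failure is the practical guard.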
Discovering your corpus’s binaries
The default-deny posture means a freshly-installed runtime with no allowlist wired refuses every shell() call. Use skillfile shell-audit to enumerate the binaries your skill corpus invokes:
Add the resulting allowlist to $SKILLSCRIPT_HOME/.env (review/narrow as desired). The audit is the canonical path — running it lets you make explicit policy decisions instead of discovering refusals through runtime errors.
Programmatic bootstrap path
Most adopters operate via the web dashboard after a one-time skillfile init. To stand that dashboard up from your own code, use bootstrapFromEnv() — the blessed entry point that wires a full runtime + DashboardServer from $SKILLSCRIPT_HOME exactly the way the CLI does: it loads .env, skillscript.config.json, and connectors.json, resolves the whole SKILLSCRIPT_* env cascade, calls bootstrap(), wires declarative triggers, and assembles the server.
Then call start(). Options: mode, home (default $SKILLSCRIPT_HOME), configPath, connectorsConfigPath, port, host, and overrides. Precedence: explicit option > overrides (passed to bootstrap(), wins last) > SKILLSCRIPT_* env > skillscript.config.json > default.
To wire your own custom substrate (e.g. a remote SkillStore), two paths:
- Declarative — if it’s expressible in connectors.json via the { "type": "custom", "module": "./my-store.js", "export": "MyStore" } substrate form, just declare it; bootstrapFromEnv() honors substrate.* like the CLI.
- Instance injection — pass a pre-built instance through overrides, the common path for a store that needs constructor wiring you hold in code. Everything else (data store, models, connectors, env cascade) still auto-wires from $SKILLSCRIPT_HOME. overrides accepts any bootstrap() option (skillStore / dataStore / localModel / agentConnector / allowlists / …).
Reach for raw bootstrap() only when you’re hand-assembling a registry that neither connectors.json nor overrides can express.
⚠ Migrating an existing hand-assembled bootstrap to bootstrapFromEnv()? Move any options you previously hardcoded on bootstrap() / new DashboardServer({...}) to their SKILLSCRIPT_* env equivalents — bootstrapFromEnv() resolves them from env, so a value you drop reverts to the default. Two to watch (adopter-verified, 0.24.0): enableUnsafeShell: true → SKILLSCRIPT_ENABLE_UNSAFE_SHELL=true (fails loud — unsafe ops just refuse, you’ll notice); and mcpCallerIdentityHeader: "X-Agent-Id" → SKILLSCRIPT_MCP_CALLER_IDENTITY_HEADER=X-Agent-Id, which fails silently — drop it and skill-author attribution quietly reverts to the store’s default writer identity, with no error. (bootstrap()-level opts like enableUnsafeShell can instead go through overrides; the DashboardServer-level mcpCallerIdentityHeader is env-only via bootstrapFromEnv — set the env var.) Verify after migrating: send your identity header on a /rpc skill_write and confirm the captured author.
Raw bootstrap() — the CLI-auto-vs-programmatic-explicit asymmetry
bootstrapFromEnv() does the discovery steps below for you. If you call bootstrap() directly instead (hand-assembling substrates), be aware the CLI (skillfile dashboard / serve) does several discovery steps automatically — load $SKILLSCRIPT_HOME/.env, read SKILLSCRIPT_* env vars, load $SKILLSCRIPT_HOME/connectors.json — that raw bootstrap() does NOT. Each surface then needs explicit wiring:
| Surface | CLI path | Programmatic path |
|---|---|---|
| .env (env vars in a dotfile) | auto-loaded from $SKILLSCRIPT_HOME/.env | call process.loadEnvFile() yourself BEFORE bootstrap() |
| SKILLSCRIPT_* env vars | read from process.env automatically | read automatically once env is loaded (bootstrap() env-fallback) |
| connectors.json (MCP wiring) | auto-discovered at $SKILLSCRIPT_HOME/connectors.json | pass connectorsConfigPath: "/path/to/connectors.json" to bootstrap() |
bootstrap() stays decoupled from the dotenv + SKILLSCRIPT_HOME convention so embedders can drive every input explicitly. The cost is doc-prominence — three surfaces, all silent-empty on omission, all noted here.
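For your shell allowlist to work on the programmatic path, ensure the env var is in process.env BEFORE calling bootstrap(). Two patterns: load the dotfile with process.loadEnvFile() first, or export the variable in the environment that launches the process.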
bootstrap() env fallback semantics: when opts.shellAllowlist === undefined, the runtime reads SKILLSCRIPT_SHELL_ALLOWLIST from process.env (comma-separated, trimmed). When opts.shellAllowlist is supplied — including the explicit [] deny-all — the option is authoritative and env does NOT widen it. This is security-load-bearing: an adopter passing shellAllowlist: [] to assert lockdown gets lockdown regardless of ambient env.
| opts.shellAllowlist | SKILLSCRIPT_SHELL_ALLOWLIST env | Effective allowlist |
|---|---|---|
| undefined (omitted) | unset | undefined → default-deny |
| undefined (omitted) | "curl,jq" | ["curl", "jq"] (env fallback) |
| undefined (omitted) | "" (explicit empty) | [] (explicit deny-all from env) |
| ["curl"] | anything | ["curl"] (explicit opt wins) |
| [] (explicit) | "ssh,kubectl" | [] (explicit deny-all wins; env does NOT widen) |
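The fallback table reduces to a few lines of logic. The function below is a sketch that reproduces the table row by row; resolveShellAllowlist is a hypothetical name, and dropping empty entries after the split is an assumption needed to map the empty string to an empty list.

```typescript
// opt: the value passed as opts.shellAllowlist (undefined when omitted).
// envValue: the raw SKILLSCRIPT_SHELL_ALLOWLIST string (undefined when unset).
function resolveShellAllowlist(
  opt: string[] | undefined,
  envValue: string | undefined,
): string[] | undefined {
  // A supplied option is authoritative, including the explicit [] deny-all.
  if (opt !== undefined) return opt;
  // Nothing supplied anywhere: undefined, which the runtime treats as default-deny.
  if (envValue === undefined) return undefined;
  // Env fallback: comma-separated, trimmed, empty entries dropped.
  return envValue
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

const rowUnset = resolveShellAllowlist(undefined, undefined);
const rowEnvList = resolveShellAllowlist(undefined, "curl, jq");
const rowEnvEmpty = resolveShellAllowlist(undefined, "");
const rowOptWins = resolveShellAllowlist(["curl"], "ssh,kubectl");
const rowLockdown = resolveShellAllowlist([], "ssh,kubectl");
```

The last row is the security-relevant one: an explicit empty option is never widened by ambient env.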
Trust model — lint vs. runtime
Lint is local advisory; runtime is authoritative. The shell-binary-not-allowed lint rule checks against the author’s environment allowlist (their local .env or the linter’s loaded config), which may differ from production. Passing lint does NOT guarantee the call will run.
- Authors lint with their own allowlist for immediate feedback (“you can’t use ssh in your dev env, fix this skill”).
- CI pipelines lint with the deployment’s .env to catch production-specific gaps before merge.
- Runtime is the authoritative gate — refuses off-list binaries at first dispatch regardless of lint state.
The scope caveat — don’t oversell
The allowlist scopes which binaries are reachable, not what they do. Kills ssh / kubectl / rm / nc wholesale — that’s real value. Does NOT stop misuse of a permitted binary: curl on the list still means curl-to-anywhere (exfil possible to attacker-controlled URLs).
Argument-level policy (host allowlists, URL pattern matching) and OS-level egress controls are separate, heavier layers — explicitly out of scope here. Don’t let this create false confidence that shell is “solved.” For the unsafe path, the proper binary-scope layer is OS-level:
- Restricted PATH (only allowed binaries reachable to the runtime process)
- execve / seccomp filters (kernel-level allowed-binary set)
- Container with curated /usr/bin (immutable distroless image)
Statically scanning a bash -c <body> for binary names would be unsound: e=curl; $e ..., $(printf cur)l ..., eval, xargs, and var-built command names all defeat it. Against agent-author threat models, false confidence is worse than no enforcement.
Dashboard observability
The dashboard SPA at http://<host>:<port> exposes two security-focused surfaces for the observe→promote loop:
Security view (#security route). Cross-skill list of blocked shell attempts — {skill, target, binary, body, timestamp} per refused call. Aggregated by binary so you can see at a glance “what did skills try to invoke that I haven’t allowlisted.” Backed by the blocked_shell_attempts MCP tool, which filters trace records by blocked_reason: "binary-not-allowed". Runtimes without the tool exposed degrade cleanly to an “unavailable” note.
Skill detail view — footprint checklist + security signals + source highlighting. Each skill’s detail page (clicking a skill name from #skills) shows:
- A “What this skill touches” panel right at the approve action: the skill’s effectful footprint — the MCP connectors it dispatches to, built-in $ ops, shell binaries, and counts of file_write / file_read / unsafe-shell / notify ops, or “nothing effectful to authorize” for a pure skill. This is AST-derived — the same op enumeration the capability gate authorizes — so it’s the authoritative surface the skill can touch once signed, not a textual guess. The least-privilege checklist the approver reads before signing.
- A “Security signals” panel: aggregated counts of shell ops + binaries used, unsafe-shell count, # Autonomous: true, per-op approved="..." authorizations, mutation ops ($ skill_write / $ data_write / file_write), wake-class @session deliveries, cron triggers. (A heuristic source scan — it adds risk-framing signals the footprint doesn’t, e.g. the approved= bypass and the autonomous flag; the two panels complement each other.)
- Inline tinted highlighting on the skill source <pre> body. Two tiers: orange for HIGH-tier signals (unsafe=true, # Autonomous: true, approved="...", mutation ops); yellow for MEDIUM-tier signals (shell(...) calls, notify(agent="X@session", ...) wake-class deliveries). Reviewers scan-prioritize: orange = review carefully; yellow = worth noting.
The same footprint is also exposed on each skill_list entry and in the skill_preflight contract, so an agent composing against a skill reads the same truth the human approver does.
Future direction
Per-skill capability declaration: skills declare what shell binaries they need in their frontmatter. The effective scope becomes declared ∩ allowlist, so each skill’s shell footprint becomes self-documenting and auditable. Slated for a future ring once the chokepoint + observability surfaces ship.
Filesystem path allowlist
The runtime enforces a default-deny operator allowlist for paths reachable via file_read(...) / file_write(...) ops. It is the third operator boundary, mirroring the shell allowlist with the same rationale (skill authors are agents; the operator scopes which parts of the filesystem skills may touch).
| Operator switch | Controls |
|---|---|
| SKILLSCRIPT_FS_ALLOWLIST | Comma-separated roots under which file_read / file_write may operate |
- Default-deny. An unset or empty allowlist refuses every file op with FilePathNotAllowedError, so a freshly-installed runtime does no file I/O until you wire roots. Applies to file_read as well as file_write (a read-then-emit is an exfiltration path).
- Canonicalized before the check. Both the target and each allowed root are resolved to their real absolute path (realpath, component-by-component), so .. traversal and symlink evasion can’t escape an allowed root; the classic allowlist bypasses are closed.
- Off-allowlist is final. No in-skill keyword (approved=, # Autonomous: true) escapes it. Keep secret / key directories OUT of the allowlist: a skill must never be able to read the operator’s approval key. (This is why the default key path lives outside SKILLSCRIPT_HOME.)
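The canonicalize-then-check rule above can be sketched as follows. This is a minimal illustration, not the runtime's implementation: the helper names parseRoots and isPathAllowed are hypothetical, and the sketch only handles targets that already exist (a not-yet-created file_write target would need its parent directory resolved instead). It uses fs.realpathSync.native so that symlinks and .. are resolved by the OS component-by-component.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: parse a comma-separated allowlist value.
// An unset or empty value yields no roots, which means default-deny.
function parseRoots(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Hypothetical helper: true only when the real path of `target` sits inside
// the real path of at least one allowed root. Anything unresolvable is denied.
function isPathAllowed(target: string, roots: string[]): boolean {
  let realTarget: string;
  try {
    realTarget = fs.realpathSync.native(target);
  } catch {
    return false; // missing file or dangling symlink: refuse
  }
  return roots.some((root) => {
    let realRoot: string;
    try {
      realRoot = fs.realpathSync.native(root);
    } catch {
      return false;
    }
    const rel = path.relative(realRoot, realTarget);
    const escapes =
      rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
    return !escapes;
  });
}
```

Resolving both sides before comparing is what closes the bypasses: a symlink inside an allowed root that points elsewhere resolves to its true location and fails the containment check.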
Secrets
A skill can reference an operator-provisioned secret by name, use it at a sink, and never read it back. This is how a credential (a bearer token, an API key) reaches a shell(...) or $ connector.tool call without living in the skill source, the transcript, or a trace.
Three parts:
- Declare with # Requires: secret.NAME.
- Place with a {{secret.NAME}} marker, only inside a sink (a shell(...) op or a $ connector.tool arg). It is deliberately distinct from ${VAR}: ${VAR} is readable substitution over the skill’s variable scope and cannot reach a secret; {{secret.NAME}} resolves only at the sink and is never bound to a variable.
- Provision the value as an env var (operator side, e.g. .env):
| Operator switch | Controls |
|---|---|
| SKILLSCRIPT_SECRET_<NAME> | The value {{secret.NAME}} resolves to. The SKILLSCRIPT_SECRET_ prefix scopes which env is secret-reachable: {{secret.PATH}} looks for SKILLSCRIPT_SECRET_PATH, never $PATH. |
Lint flags misuse at author time (secret-use-only: a marker in emit / $set / a condition / an # Output: template / a file_* / notify op / an op (fallback:); secret-undeclared: a marker with no # Requires; secret-dynamic-name: a non-literal name like {{secret.${VAR}}}). The runtime is the authoritative gate (it survives a lint bypass): it resolves only names declared in # Requires, only at the sink, and the trace + any error message render the {{marker}}, never the value. Fail-closed: an unprovisioned or undeclared secret aborts the op.
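The runtime-side rules (declared names only, prefix-scoped env lookup, fail-closed, marker-not-value in traces and errors) can be illustrated with a small sketch. The function name resolveAtSink, its return shape, and the error wording are assumptions made for illustration; only the marker syntax and the SKILLSCRIPT_SECRET_ prefix come from the text above.

```typescript
// Matches {{secret.NAME}} with a literal name only; a dynamic name such as
// {{secret.${VAR}}} does not match and is therefore never resolved.
const SECRET_MARKER = /\{\{secret\.([A-Za-z0-9_]+)\}\}/g;

// Hypothetical sketch of sink-time resolution.
// `declared` stands in for the names listed under # Requires.
function resolveAtSink(
  sinkText: string,
  declared: ReadonlySet<string>,
  env: Record<string, string | undefined>,
): { resolved: string; traceText: string } {
  const resolved = sinkText.replace(SECRET_MARKER, (marker: string, name: string) => {
    if (!declared.has(name)) {
      // Error text carries the marker, never a value.
      throw new Error(`undeclared secret: ${marker}`);
    }
    const value = env[`SKILLSCRIPT_SECRET_${name}`];
    if (value === undefined) {
      throw new Error(`unprovisioned secret: ${marker}`);
    }
    return value;
  });
  // The trace keeps the original text, so only the marker is ever recorded.
  return { resolved, traceText: sinkText };
}
```

Because the lookup always prepends the prefix, a marker named PATH can only ever read SKILLSCRIPT_SECRET_PATH; the ordinary PATH variable stays out of reach.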
Provisioning is operator-only. A skill author names a secret; only the operator sets it (in the runtime env / .env). Keep secret values out of skill source and out of the fs allowlist, exactly like the approval key.
Threat-model boundary — read this before granting a secret. A secret you authorize a skill to use in a shell, that skill can also exfiltrate — e.g. shell("echo {{secret.X}}") -> OUT then emit("${OUT}"), or a curl that POSTs it to any host. This is inherent to passing a credential into a shell command and is not prevented by the runtime. Two controls:
- Approval review. In secured mode a human approves the skill before it runs — review shell-sink skills for exactly this (a command that echoes or forwards the credential).
- Prefer a connector sink. A
$ connector.toolwhere the connector holds the credential out of band is stronger than a shellcurl: the value never round-trips through skill-visible stdout. Use a connector sink for any credential whose exposure you can’t accept.
ps; a future vault-backed SecretProvider closes that with sink-aware injection. The SecretProvider seam is already in place — an adopter can supply a vault-backed provider via bootstrap({ secretProvider }) or bootstrapFromEnv’s overrides with no skill changes.)
Trigger model
The trigger surface is two primitives.

| Primitive | Purpose |
|---|---|
| cron | Time-based fires (unchanged) |
| event | Generic external-signal fires via HTTP POST to /event |
/event when relevant. Keeps the runtime substrate-neutral; the trigger surface stays tight. The DeliveryMeta.origin.trigger_kind receiver enum exposes only "cron" | "event" | "webhook" | "agent" | "cli" | "dashboard" | "inline".
The event primitive
Skills declare an event trigger in their frontmatter:
run_id is the runtime’s trace_id — adopters paste it into the dashboard /fires view or query via the fires({trace_id}) MCP tool to look up completion status.
Wire contract
| HTTP status | Meaning |
|---|---|
| 200 | Accepted into THIS process’s in-memory queue. {run_id, durability: "in-process"}. Skill fires async; the 200 does NOT mean skill-completed |
| 400 | Body not JSON; or event_name missing/empty; or params don’t match declared (missing required or unknown extras) |
| 401 | Auth token configured + missing/wrong Authorization: Bearer <token> |
| 404 | /event route not enabled OR event_name not registered |
| 405 | Wrong method (POST-only) |
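As an illustration of the wire contract, a caller might assemble the request like this. The sketch is hedged: buildEventRequest and EventAccepted are hypothetical names, the host and port are placeholders, and the assumption that params travel under a params key alongside event_name is inferred from the 400 row above rather than confirmed.

```typescript
// Shape of a successful 200 body, per the table above.
interface EventAccepted {
  run_id: string;
  durability: "in-process";
}

interface EventRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: build a POST /event request without sending it.
function buildEventRequest(
  baseUrl: string,
  eventName: string,
  params: Record<string, unknown>,
  token?: string,
): EventRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (token) {
    // Only needed when the operator configured an auth token (see the 401 row).
    headers["Authorization"] = `Bearer ${token}`;
  }
  return {
    url: new URL("/event", baseUrl).toString(),
    init: {
      method: "POST",
      headers,
      body: JSON.stringify({ event_name: eventName, params }),
    },
  };
}

// A 200 means "queued in this process", not "skill completed".
function isAccepted(status: number, body: unknown): body is EventAccepted {
  return (
    status === 200 &&
    typeof body === "object" &&
    body !== null &&
    typeof (body as { run_id?: unknown }).run_id === "string"
  );
}
```

A caller would pass the result to fetch(req.url, req.init), then keep the returned run_id to look up completion status later.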
# Autonomous: true for event/cron skills doing mutations
Skills fired by cron OR event have no interactive author to confirm mutation ops. The runtime’s mutation gate ($ data_write / $ skill_write / file_write / mutating MCP tools) requires explicit authorization in non-interactive contexts. Three authorization paths exist:
?? / ask() in same target) is for interactive contexts (CLI / dashboard) and doesn’t apply to cron/event fires — there’s no user to ask.
Without one of these, the runtime throws UnconfirmedMutationError at the mutation op + the skill fails its fire. Symptom in the trace: class: "UnconfirmedMutationError" on the offending op.
The mutation gate is identical for cron + event sources — both are non-interactive. Lint surfaces unconfirmed-mutation as tier-2 warning at compile time so authors get the signal before the first fire.
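The gate's decision can be sketched as a single check. This is an illustrative model only: the MutationContext shape and the function name are hypothetical, and the real gate may weigh more inputs (for example secured-mode approval). What it does reflect from the text is that an autonomous header or a per-op approved= kwarg authorizes anywhere, while an in-target confirmation only counts when there is a user to ask.

```typescript
class UnconfirmedMutationError extends Error {
  constructor(op: string) {
    super(`unconfirmed mutation: ${op}`);
    this.name = "UnconfirmedMutationError";
  }
}

// Hypothetical view of what the gate looks at for one mutation op.
interface MutationContext {
  interactive: boolean;       // CLI / dashboard (true) vs cron / event (false)
  autonomous: boolean;        // skill header carries # Autonomous: true
  approvedReason?: string;    // per-op approved="..." kwarg, if present
  confirmedInTarget: boolean; // a preceding ?? / ask() confirmed in this target
}

function assertMutationAuthorized(op: string, ctx: MutationContext): void {
  if (ctx.autonomous) return;
  if (ctx.approvedReason) return;
  // Confirmation needs a user, so it cannot authorize cron / event fires.
  if (ctx.interactive && ctx.confirmedInTarget) return;
  throw new UnconfirmedMutationError(op);
}
```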
event_name semantics
- Public contract addressed by POSTers. The skill behind the event can swap without breaking callers; the skill_name is private impl.
- Case-insensitive at register + lookup (normalized to lowercase internally).
- 1:1: one event_name → exactly one skill in the deployment. Fan-out is NOT supported; a skill can call other skills via $ execute_skill if needed.
- Cross-skill rebind allowed but audited: registering an event_name that’s already bound to a different skill replaces the binding (last-write-wins) AND logs a visible line (event_name 'X' rebound: skill A → skill B). Prevents silent cross-skill hijack without blocking the declarative re-save path. Same skill re-registering its own event is a silent upsert.
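These four rules fit in a few lines of code. The class below is a hypothetical in-memory model (EventRegistry is not a runtime export as far as this document states); it shows lowercase normalization, the 1:1 binding, the audited last-write-wins rebind, and the silent same-skill upsert.

```typescript
// Hypothetical in-memory model of event_name binding semantics.
class EventRegistry {
  private bindings = new Map<string, string>();
  private log: (line: string) => void;

  constructor(log: (line: string) => void = console.log) {
    this.log = log;
  }

  register(eventName: string, skillName: string): void {
    const key = eventName.toLowerCase(); // case-insensitive at register
    const previous = this.bindings.get(key);
    if (previous !== undefined && previous !== skillName) {
      // Cross-skill rebind: allowed, but never silent.
      this.log(`event_name '${key}' rebound: skill ${previous} → skill ${skillName}`);
    }
    this.bindings.set(key, skillName); // last-write-wins; same skill is an upsert
  }

  lookup(eventName: string): string | undefined {
    return this.bindings.get(eventName.toLowerCase()); // case-insensitive at lookup
  }
}
```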
Enabling the /event HTTP ingress
Off by default. Two operator knobs:
With eventIngressEnabled=true, the route mounts on the same HTTP server as the dashboard/RPC (no second port). Default bind is 127.0.0.1 (localhost-only); adopters wanting 0.0.0.0 external reach pass --host 0.0.0.0 explicitly. The DMZ assumption is enforced by the bind, not by hope.
Durability honesty
200 = ACCEPTED, not durable. The skill is queued in this process’s memory; if the process crashes or restarts before the queue drains, the fire is lost. The durability: "in-process" field on every successful response self-describes this, so adopters reading the response know the contract without consulting docs. Cron triggers have the same property today (no catch-up replay if the process was down).
Durable / at-least-once delivery is a v2 if real adopter need surfaces. Don’t oversell the 200 contract; build durable queueing on top yourself if you need it now.
Bridging external sources
Anything that isn’t time-based is an external adapter that POSTs to /event:
- Session start/end: your harness POSTs to /event after boot or before shutdown if you want a skill to fire at session boundaries.
- Agent events: your agent emits POST /event when the event of interest occurs.
- File-watch: an external watcher script (inotify / chokidar / fswatch) POSTs to /event on changes. The file-watching itself is standard OS plumbing; you own that script.
- Sensors: same pattern as file-watch; the sensor adapter POSTs to /event.
# Triggers: sources fail to parse with a tier-1 parse error. skillfile lint against your corpus surfaces them.
Output template — body text IS the output
A skill’s body text (prose lines that don’t sit inside a target) is its declarative output template. The runtime interpolates ${VAR} references against final vars and publishes the rendered string as the skill’s canonical output (outputs.text or the agent/template/file payload, depending on # Output: kind). No emit() ceremony required for the common case. The template may appear anywhere a target doesn’t (at the top, at the bottom, or split between targets), so authors can put output where their language instincts suggest (most write the output line after the compute that fills it in).
The shape
# Vars: defaults are. The author writes the sentence they want as output; the compute block fills in the holes.
Complementary channels — template vs. emit
Two independent output channels:

| Channel | Source | Consumer |
|---|---|---|
| Canonical output (outputs.text, agent/template/file payloads) | Body template if authored, else joined emit() text | The caller / lifecycle-hook recipient / file write |
| Transcript (emissions[]) | All emit(text=...) calls + reasoning ops (?, @) | Trace records, dashboard /fires view, debug log |
When a skill has both a body template and emit(text=...) calls, the template owns canonical output; emit() populates transcript only. This is intentional: emit is the debugging / reasoning channel. The emit-with-template advisory lint surfaces this to confirm intent.
If you want emit to be your canonical output, don’t write a body template. Emit-only skills route emissions to the canonical output channel.
Output kinds
The body template populates whatever # Output: kind the skill declares:
- # Output: text (default) - caller reads outputs.text
- # Output: agent: <name> - rendered template delivered as augment payload to <name> via AgentConnector
- # Output: template: <name> - rendered template delivered as playbook payload
- # Output: file: <path> - rendered template written via file ops in the body
Target syntax — what counts as a target
A target is <name>: alone on a line, with an indented op-block on the next non-blank line. Anything else in the body region (content after a colon, a bare Note: with no following op-block, plain prose) reads as template text.
Two examples that are legal but worth knowing:
needs: keyword, which makes intent unambiguous:
report: fetch_issues shape (deps after colon, no needs: keyword) is still parsed as a target when an indented op-block follows on the next line, but needs: is the recommended form for new skills.
One template region per skill
A skill may carry template prose in one region. Splitting it into two (some prose above the targets AND some below) raises a tier-1 parse error: “skill has body template content in two places (before targets AND after targets). Pick one location.” Picking is loud over silent-concat; the author chooses where the template lives, the runtime doesn’t guess.
Lints to know
- unset-template-var (tier-1): every ${VAR} in the template must resolve to a # Vars: / # Requires: input, an ambient ref (NOW, USER, …), or a $set / -> binding somewhere in the skill body. Tier-1 because an unbound ref renders empty silently.
- template-looks-like-target (tier-2): bare <word>: alone in the template region, no following op-block. Ambiguous shape; the author may have meant a target.
- connector-as-tool (tier-1): $ <connector> <tool> space-separated catches the muscle-memory foot-gun (CLI-style git status). Bare-form dispatch treats the first token as the tool name, sending name: "<connector>" to the MCP server, which replies with a misdirecting Tool '<connector>' not found. The two correct shapes are $ <connector>.<tool> args (dotted) or $ <tool> args (bare-tool; runtime resolves).
- remote-result-needs-parse (tier-3): ${R|length} on an R bound by $ dispatch. Per-MCP-server result shapes vary: if the server returns prose-wrapped or non-JSON text, the value binds as a STRING and |length returns the string’s char-count instead of element-count. Add $ json_parse ${R} -> P after the dispatch and use ${P|length}. Suppressed when the skill already does $ json_parse ${R} somewhere; your defensive parse is taken as intent.
- body-template-detected (tier-3): non-blank, non-# lines in the body region, no ${...} interpolations, no text-consuming # Output: declaration. Suggests “I wrote prose; it became template by accident.” Prefix with # to mark as comments, or add an interpolation / # Output: to confirm intent.
- emit-with-template (tier-3): skill has both a template AND emit(text=...) calls. Confirms the channel-shift is intentional.
Fallback semantics — (fallback: ...) op-trailer and |fallback: filter
Two surfaces, one emptiness predicate. Both fire when the upstream value is any of:
- empty string after trim()
- empty array ([])
- null / undefined
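Written out as code, the shared predicate might look like the sketch below. The function names are hypothetical and this is not the runtime's source; it simply encodes the three cases listed above and nothing more, so values such as 0, false, or an empty object do not trigger a fallback.

```typescript
// Hypothetical sketch of the shared emptiness predicate.
function isEmptyForFallback(value: unknown): boolean {
  if (value === null || value === undefined) return true;
  if (typeof value === "string") return value.trim() === "";
  if (Array.isArray(value)) return value.length === 0;
  return false; // numbers, booleans, objects, non-empty strings and arrays
}

// Both surfaces reduce to the same choice between upstream value and fallback.
function withFallback(value: unknown, fallback: unknown): unknown {
  return isEmptyForFallback(value) ? fallback : value;
}
```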
Op-trailer — (fallback: "value")
Trails the -> R binding on $ dispatch ops, shell(...), file_read(...). Binds the fallback value to R when the op throws OR produces an empty result.
result.fallbacks[] so callers can distinguish “real value” from “fallback substituted.” The original error message rides on the fallback record’s reason field; the op completes cleanly.
Filter — ${VAR|fallback:"value"}
Substitution-time fallback inside a template or kwarg. Same emptiness predicate; binds to the filter argument when the upstream value is empty.
${RAW|trim|fallback:"-"} applies trim first, then fallback if the trim left an empty string.
When to use which
- Op-trailer for binding-time protection — the variable lands populated regardless of the op’s outcome. Downstream consumers (other ops, template renders) don’t need to know whether the value is real or fallback.
- Filter for substitution-time defaulting — the variable may legitimately be empty/unset, you just want a placeholder at the render site.
Wiring the AgentConnector
AgentConnector is the substrate-neutral delivery surface for # Output: agent: X / # Output: template: X lifecycle hooks and notify() / exchange() ops. The runtime calls into the contract; your impl decides where the payload lands (webhook, tmux session, file drop, IPC pipe, Slack thread, your own agent harness, etc.).
The full contract surface — methods, payload shapes, receipt shapes, the agent@session targeting convention, the graceful-degradation rule — lives in Connector Contract Reference §AgentConnector. This section covers the wiring path for adopters: how to bring an impl online so the runtime uses it.
Two wiring paths
Same shape as the other substrate slots — programmatic (recommended for custom impls today) or declarative (connectors.json, restricted to bundled types).
(a) Programmatic — for adopter-written impls. Construct the connector in your bootstrap and pass it via BootstrapOpts.agentConnector:
bootstrap() calls registry.registerAgentConnector("primary", ...) for you. health_check() fires during registration — wiring failures throw at boot, not at first delivery.
(b) Declarative connectors.json — for bundled types and the (deferred) custom-via-dynamic-import path:
| Value | Behavior |
|---|---|
null (or omitted) | NoOpAgentConnector — silent fallback. deliver() / wake() log + resolve; # Output: agent: declarations complete with a stderr warning. Lets a runtime start with no harness wired. |
"noop" | Same as null but explicitly stated. |
Object with "type": "custom" | Adopter impl resolved by dynamic-import (deferred — surfaces a clear error today; use programmatic path). |
Precedence
Same as other substrate slots:
- Programmatic BootstrapOpts.agentConnector: explicit, highest priority.
- Declarative connectors.json substrate.agent_connector: deployment-durable.
- Built-in default: NoOpAgentConnector. Skills with # Output: agent: fire warnings, not errors.
health_check() throw if it can’t start.
Worked example
The canonical bundled example is examples/connectors/HttpWebhookAgentConnector/, a complete AgentConnector impl against an HTTP-webhook substrate. It demonstrates:
- Per-agent URL routing (HTTP_WEBHOOK_AGENTS JSON env)
- Optional wake_url per agent: present means wake-capable, absent means degrade-on-wake
- Bearer + HMAC auth (combinable)
- Tolerant receipt synthesis (substrate returns substrate-shaped JSON; connector translates to canonical DeliveryReceipt)
- Tests covering the deliver / wake / health-check / request-response surface
agent@session opaque composite. Every messaging substrate needs either bare-identity OR specific-live-session addressing. The contract keeps agent_id opaque; sessions ride as "<agent>@<session>" (e.g. "alice@laptop-tab-3") or via WakeOpts.session_id. Substrates that care decompose; substrates that don’t ignore.
The runtime address-routes skill-author surfaces (notify() + # Output: agent: / template:) on @session presence: bare → your deliver(), composite → your wake(). You receive whichever method the runtime decided; your job is to honor what arrives. For wake(), expect the FULL composite ("<agent>@<session>") — decompose to route to the right session:
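One way to decompose the composite is sketched below. parseAgentAddress and AgentAddress are hypothetical names; splitting on the first @ and treating an empty session part as a bare identity are assumptions, since the text only shows the "<agent>@<session>" shape.

```typescript
interface AgentAddress {
  agentId: string;
  sessionId?: string;
}

// Hypothetical helper: split "<agent>@<session>" into its parts.
// Assumption: the first "@" is the separator; a bare id has no session.
function parseAgentAddress(target: string): AgentAddress {
  const at = target.indexOf("@");
  if (at === -1) return { agentId: target };
  const sessionId = target.slice(at + 1);
  // Assumption: "alice@" is treated the same as bare "alice".
  if (sessionId === "") return { agentId: target.slice(0, at) };
  return { agentId: target.slice(0, at), sessionId };
}
```

Substrates that do not distinguish sessions can skip this entirely and pass the composite through as an opaque id.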
wake() must not throw because your substrate lacks interrupt capability. Distinguish capability-gap (degrade) from operational-fault (throw):
WakeReceipt.woken distinguish “the substrate woke them” from “the substrate stored the payload for later” without needing per-substrate knowledge.
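The degrade-versus-throw distinction can be sketched as follows. Everything here is illustrative: SubstrateSketch, wakeSketch, and the reduced receipt type are hypothetical stand-ins, and the real wake() signature and WakeReceipt carry whatever the Connector Contract Reference defines. Only the woken flag and the session_id echo are taken from this section.

```typescript
// Reduced stand-in for the real receipt; only fields named in this section.
interface WakeReceiptSketch {
  woken: boolean;
  session_id?: string;
}

// Hypothetical substrate with an optional interrupt capability.
interface SubstrateSketch {
  canInterrupt: boolean;
  store(agentId: string, payload: string): Promise<void>;
  interrupt(agentId: string, sessionId: string, payload: string): Promise<void>;
}

async function wakeSketch(
  substrate: SubstrateSketch,
  target: string, // full composite, e.g. "alice@laptop-tab-3"
  payload: string,
): Promise<WakeReceiptSketch> {
  const at = target.indexOf("@");
  const agentId = at === -1 ? target : target.slice(0, at);
  const sessionId = at === -1 ? undefined : target.slice(at + 1);

  if (!substrate.canInterrupt || sessionId === undefined) {
    // Capability gap: degrade to store-for-later and report woken: false.
    await substrate.store(agentId, payload);
    return { woken: false };
  }
  // Operational faults (network down, auth rejected) are not caught here,
  // so they propagate to the runtime as errors.
  await substrate.interrupt(agentId, sessionId, payload);
  return { woken: true, session_id: sessionId };
}
```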
Pattern 3 — session echo on receipts. When your substrate routes to a specific session, echo it back on DeliveryReceipt.session_id / WakeReceipt.session_id. Dashboards rendering “delivered to alice@laptop-tab-3” rather than just “delivered to alice” depend on this.
Pattern 4 — read meta.origin.caller_agent_id to attribute, not for scope. The DeliveryMeta envelope your deliver() receives carries origin.caller_agent_id = the authenticated caller who fired the dispatch (not the skill’s owner — those are separate semantics; see Connector Contract Reference §field semantics). Use it for attribution — rendering from <caller> on the receiving end, audit logs, accountability — not for authorization scoping. Outbound substrate scoping should derive from the skill owner (which the runtime applies at the connector layer via ctx.agentId, not via the envelope). If caller_agent_id is undefined on a delivery you receive, it means the chain originated from a non-human trigger (cron / event / scheduler) — your substrate should attribute it as “system-fired” or similar, not assume an identity.
Pattern 5 — surface non-fatal notes via DeliveryReceipt.warnings. When your substrate needs to signal something non-fatal about a delivery — “stripped @session because verb is deliver”, “rate-limit hint”, “fan-out: delivered to 3 sessions” — return them as warnings: string[] on the receipt instead of writing to stderr. The runtime echoes warnings onto AgentDeliveryReceiptRecord.receipt.warnings, where the dashboard can render them and observability tools can scrape them. Stderr noise gets lost; receipt warnings are structured + caller-visible.
When to fork vs. when to write fresh
- Fork HttpWebhookAgentConnector when your substrate is HTTP-shaped and your changes are: tweaked auth (OAuth, mTLS), retry policy, different routing model. Most production deployments end up here.
- Write fresh when your substrate is fundamentally non-HTTP (tmux, file drop, gRPC, websocket-push). Implement the five required methods (list_agents, deliver, wake, health_check, request_response) + optional agent_status. Use NoOpAgentConnector as the minimal-shape reference.
tests/HttpWebhookAgentConnector.test.ts is a useful template), wire via BootstrapOpts.agentConnector, and let health_check() enforce the “fail-at-boot, not at first delivery” property.
Authoring posture — who owns the skills you write
Every skill stored in a SkillStore carries a SkillMeta.author field captured at first-write. The author is then load-bearing at dispatch time: the runtime threads it into ctx.agentId so identity-scoped substrates (memory stores, multi-tenant DBs) read and write under that scope.
How author is captured depends on how the skill gets written:
- CLI / dashboard / direct programmatic API. When you call SkillStore.store(name, body) from your own code (CLI, bootstrap, scripts) or via the dashboard’s approval flow, the SkillStore captures author from its bundled default. FilesystemSkillStore uses os.userInfo().username; adopter stores capture from their own auth context.
- MCP skill_write from a single-tenant host. If only one agent (or one human) calls your runtime, you don’t need to configure anything: the SkillStore.store() default-author logic above applies. Skip the multi-agent section below.
- MCP skill_write from a multi-agent host. If multiple agents share one runtime instance via MCP (e.g., a host that bridges several authenticated agents into one transport), the runtime can’t tell them apart at the protocol layer. See the next section.
Direct-write authoring path
Adopters whose SkillStore is backed by an addressable substrate (e.g., a memory store) can author skills by writing the substrate record directly, without going through the MCP skill_write handler. This captures SkillMeta.author from the substrate’s own writer-identity (whatever the direct-write API authenticates as).
Gotcha: in secured mode, a direct-write declaring # Status: Approved without a valid signature is forced to Draft — the substrate can’t mint approval, only the operator’s key can. Publish by writing Draft, then signing via skillfile approve <name> or the dashboard Approvals queue (both preserve the captured author). In unsecured mode a bare # Status: Approved direct-write is honored as-is. Either way, in-skill $ skill_write always lands its child Draft regardless of body declaration. See Approval + secured mode.
Discovering connector tools while authoring
When a skill dispatches an MCP connector tool ($ <connector>.<tool> ...), the runtime helps an author get the call right without a guess-and-run loop:
- runtime_capabilities lists the wired connectors and their tool names (a compact menu). Pull one tool’s full argument schema on demand with runtime_capabilities({ tool: "<connector>.<tool>" }): the manual for the tool you’re about to call, kept out of the default response so it stays small.
- lint_skill validates the args you wrote against that schema: an unknown arg name ($ ddg.search querry=...) or a missing required arg surfaces as a tier-2 warning at author time instead of failing at dispatch. Degrades silently when a connector’s schema isn’t reachable (no false positives).
- skill_preflight shows, for the connector tools the skill actually calls, both their input schema and, once any approved run has dispatched the tool, its last-observed output shape (keys + types), so you know what -> R will bind before you run.
allowed_tools gate: a tool the operator gated off is never surfaced or validated against. The observed-output-shape cache holds keys + types only (not values) and lives operator-local under ~/.skillscript/.
Identity propagation — for multi-agent hosts
Skip this section if your runtime serves one agent (CLI tools, single-user dashboards, hobby deployments). The default (SkillMeta.author captured from the SkillStore’s writer identity) already attributes authorship correctly when there’s only one writer.
This section is for adopters whose runtime is fronted by an MCP host that bridges multiple authenticated agents into one transport (e.g., a multi-agent gateway, or a multi-tenant SaaS where agents share a runtime pool).
The gap MCP doesn’t close on its own
JSON-RPC over HTTP doesn’t carry a standard “calling identity” field. Without an extra convention, every skill_write call into your runtime stamps SkillMeta.author = <runtime's own writer identity>, regardless of which agent on the host actually originated the call. Subsequent execute_skill dispatches then run under the wrong scope. Identity-scoped reads return the runtime’s own data, not the calling agent’s.
Opt-in: a configurable inbound header
When you configure dashboard.mcpCallerIdentityHeader, the runtime reads that header on every /rpc request and threads its value as the caller-identity through to skill_write. The handler stamps SkillMeta.author = <header value>. Different callers with different header values get distinct stored authors.
- Header lookup is case-insensitive (Node lowercases inbound header names).
- Absent header on a configured runtime → caller identity is undefined for that request → SkillStore.store() falls back to its default author capture (existing single-tenant behavior). Backwards-compatible: hosts that don’t inject identity behave exactly as before.
- Empty header value → treated as absent.
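The three lookup rules reduce to a small function. The sketch below is hypothetical (resolveCallerIdentity is not a documented runtime function); it assumes a Node-style headers object whose keys are already lowercased, which is how Node's http module exposes inbound headers.

```typescript
// Hypothetical sketch of caller-identity resolution for one /rpc request.
// `headers` is a Node-style map with lowercased keys.
function resolveCallerIdentity(
  headers: Record<string, string | string[] | undefined>,
  configuredHeader: string | undefined,
): string | undefined {
  // Feature is opt-in: with no configured header, no inbound claim is trusted.
  if (!configuredHeader) return undefined;
  const raw = headers[configuredHeader.toLowerCase()];
  const value = Array.isArray(raw) ? raw[0] : raw;
  // Absent and empty are treated alike: fall back to default author capture.
  if (value === undefined || value === "") return undefined;
  return value;
}
```

An undefined result means the store's own default author capture applies, which keeps single-tenant hosts behaving exactly as before.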
Trust model
The runtime trusts the host’s header attestation. There’s no signature verification on the identity claim (distinct from secured-mode approval, which is signature-verified: different boundary, different mechanism), so anyone reaching the runtime with a forged X-Agent-Id could claim to be anyone. The runtime is not the authentication boundary; the host is. Bilateral trust:
- The host (your MCP gateway) authenticates the agent via its own auth surface (OAuth, JWT, session cookies, mTLS, whatever fits your platform) and injects the verified identity into the outbound X-Agent-Id header.
- The runtime trusts the host because you configured it to (mcpCallerIdentityHeader is opt-in; unset means “I don’t trust any inbound identity claim, fall back to my own writer identity”).
Do not expose the /rpc endpoint directly to untrusted clients with this configuration. Run behind your host’s auth-enforcing reverse proxy or in a trusted-network deployment.
Inbound vs outbound — same header, two layers
Connectors like HttpMcpConnector use the same header name (X-Agent-Id by convention) for outbound calls to substrates; see the HttpMcpConnector configuration above. The two are NOT the same value in general:
- Inbound (this section) = request-scoped caller: who’s currently calling the runtime via MCP.
- Outbound (HttpMcpConnector.identityHeader) = dispatch-scoped owner: derived from SkillMeta.author of the skill being executed, asserted to the substrate so reads land in the owner’s scope.
The two are linked through SkillMeta.author. The runtime captures inbound caller identity at skill_write (stamps it as the skill’s author); at execute time, the runtime threads author into ctx.agentId; the outbound connector asserts that to the substrate. The same X-Agent-Id header carries two different identity claims at the two boundaries; the stored author is the bridge.
Critical: never forward an inbound X-Agent-Id header straight to an outbound connector. The skill’s owner is who should access the substrate, not the current caller. If anyone invokes alice’s skill and the outbound used the caller’s identity instead of alice’s, the substrate would scope to the caller — a setuid hazard. The runtime keeps the two separate; outbound identity is always derived from author at dispatch.
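A toy model makes the separation concrete. AuthorLedger is a hypothetical class, not part of the runtime; it captures the author at first write and derives the outbound identity from that stored author, deliberately ignoring whoever is calling at dispatch time.

```typescript
// Hypothetical model: author captured at first write, reused at dispatch.
class AuthorLedger {
  private authors = new Map<string, string>();

  // skill_write: stamp the author once, from the inbound caller identity
  // if present, else from the store's default writer identity.
  recordWrite(
    skillName: string,
    callerIdentity: string | undefined,
    defaultAuthor: string,
  ): void {
    if (!this.authors.has(skillName)) {
      this.authors.set(skillName, callerIdentity ?? defaultAuthor);
    }
  }

  // execute_skill: the outbound identity is the stored author.
  // The current caller is accepted only to show that it is NOT used.
  outboundIdentity(skillName: string, _currentCaller: string | undefined): string {
    const author = this.authors.get(skillName);
    if (author === undefined) throw new Error(`unknown skill: ${skillName}`);
    return author;
  }
}
```

If bob invokes a skill alice wrote, the substrate still sees alice: the read lands in the owner's scope and the caller gains no extra reach.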
Verification
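After wiring + restart, run a smoke test: send a skill_write with the configured header set to "alice", then read back the stored skill’s author. If it isn’t "alice", either the config field is unset, the header didn’t reach the runtime (check your proxy / host wiring), or you sent the request with a different header name than configured.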
Conventions for upstream-merge-friendly modifications
If your wiring needs require modifying skillscript-runtime source (rather than just configuration), follow these conventions to minimize merge friction.
1. Prefer dedicated adopter files over editing upstream
Put your code in dedicated paths upstream won’t touch, such as a local/ directory. Upstream changes to files like src/connectors/data-store.ts then won’t conflict with your local/ files.
2. Use the public registration API; don’t edit the closed-set Map
KNOWN_CONNECTOR_CLASSES in src/connectors/config.ts is upstream-owned. Add your classes via registerConnectorClass(name, entry) from your bootstrap instead. Closes the merge-conflict bait of editing that file every release.
3. Mark unavoidable upstream-file edits with sentinels
When you genuinely have to edit an upstream file, mark the change with a // ADOPTER:myorg comment. The prefix is greppable across merges; your future-self can re-evaluate whether the modification is still needed when upstream changes the surrounding code.
4. Treat src/bootstrap.ts as reference, not canonical
For the standard “wire the whole runtime + dashboard from $SKILLSCRIPT_HOME like the CLI does” case — which is most adopters — call bootstrapFromEnv() (see §“Programmatic bootstrap path”). It’s the supported public entry point; you don’t hand-assemble anything, and you swap substrates declaratively via connectors.json. Remember to await wired.registry.disposeAll() on shutdown so stdio-bridged connector children (RemoteMcpConnector) are reaped, not orphaned — bootstrapFromEnv() callers own teardown (the CLI does it for you).
Drop to raw bootstrap() + Registry only when you’re hand-assembling a substrate that connectors.json can’t express — import the public APIs (Registry, the connector classes, loadConnectorsConfig, loadSkillscriptConfig, etc.) rather than modifying the bundled bootstrap, which churns on every upstream release.
See examples/custom-bootstrap.example.ts for a worked walkthrough.
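For the teardown reminder in this section, an idempotent shutdown wrapper is one possible pattern. The sketch assumes only what the text states, namely that the wired object exposes registry.disposeAll(); WiredLike and makeShutdown are hypothetical names, and the actual return type of bootstrapFromEnv() is not shown here.

```typescript
// Minimal structural type covering the one call this sketch relies on.
interface WiredLike {
  registry: { disposeAll(): Promise<void> };
}

// Hypothetical helper: returns a shutdown function that disposes once,
// even if several signals or callers trigger it.
function makeShutdown(wired: WiredLike): () => Promise<void> {
  let pending: Promise<void> | undefined;
  return () => {
    if (pending === undefined) {
      pending = wired.registry.disposeAll();
    }
    return pending;
  };
}

// Possible wiring in an adopter entry point (not executed here):
//   const shutdown = makeShutdown(wired);
//   process.once("SIGINT", () => { void shutdown().finally(() => process.exit(0)); });
//   process.once("SIGTERM", () => { void shutdown().finally(() => process.exit(0)); });
```

Caching the promise means a second signal arriving mid-teardown waits on the same disposal instead of starting another.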
Substrate ship-status
| Substrate | Shipped contract | Shipped impls | Shipped bridge |
|---|---|---|---|
| SkillStore | ✓ 8 methods (load / query / store / update_status / delete / versions / metadata / staticCapabilities) | FilesystemSkillStore, SqliteSkillStore | n/a |
| DataStore | ✓ 3 methods (query / write / get) | SqliteDataStore | ✓ DataStoreMcpConnector |
| LocalModel | ✓ 1 method (run) | OllamaLocalModel | ✓ LocalModelMcpConnector |
| McpConnector | ✓ 1 method (call) | RemoteMcpConnector, CallbackMcpConnector | n/a |
| AgentConnector | ✓ 5 required (list_agents / deliver / wake / health_check / request_response) + 1 optional (agent_status) | NoOpAgentConnector (default), HttpWebhookAgentConnector | n/a |
SqliteDataStoreis a deliberately minimal reference impl. It satisfies the contract (query/write/get/staticCapabilities/manifest) with FTS-style tag/text retrieval. It does NOT support semantic retrieval, pinning, decay scoring, or thread-status filtering (the relevantsupports_*flags are all false). Deployments that need richer query semantics forkexamples/connectors/DataStoreTemplate/and wire their backing substrate.- SkillStore and DataStore have different lifecycle models — by design. SkillStore is mutable / versioned / named CRUD (Draft→Approved→Disabled→Delete with audit trail). DataStore is append-only with query/get (no per-record lifecycle in the contract). If you back both onto one substrate, you’re serving both lifecycle models at once. Substrates that conflate “data record expiry” with “skill expiry” silently break authored code; the contract doesn’t enforce this, you handle it impl-side.
- Durability is implementer’s responsibility. The typed contracts assume durable storage. Neither interface declares “writes live forever” — but the runtime + lint + dashboard all behave as if writes persist indefinitely. If your substrate has GC / TTL / decay scoring, build adopter-side guards (pin-rules, retention policies, periodic re-pin sweeps) or pick a substrate posture that satisfies “durable forever.” Silent staleness is the failure mode the contract won’t catch.
- Mutation ops require runtime-enforced authorization. $ data_write / file_write / $ <mutating-name-tool> (write/update/delete/etc.) fire UnconfirmedMutationError at the runtime boundary unless the skill carries # Autonomous: true (cron/agent-fired) OR a preceding ?? / ask(...) confirms in the same target OR the op carries an approved="reason" per-op kwarg. This fires regardless of how the skill was invoked: execute_skill({name}) AND execute_skill({source}) honor the gate identically; lint stays advisory. Adopters running unattended skills programmatically should set # Autonomous: true at the header.
- In-skill writes have asymmetric trust models. $ skill_write lands its child as # Status: Draft regardless of body declaration; the bridge forces it. Authoring an executable artifact has unbounded blast radius (the child fires arbitrarily many times in arbitrary contexts); the Draft default keeps autonomously-written skills out of the immediate execution loop. $ data_write writes verbatim: one bad data row is bounded blast radius. SkillStore impls receive the body already Draft-stamped; DataStore impls receive entries as authored.
- Your DataStore.write() is never called if the mutation gate rejects the skill. The runtime gates $ data_write (and other mutation ops) upstream of the bridge, so substrates only see authorized writes. If your own probes hit UnconfirmedMutationError, that’s a skill-body issue (missing # Autonomous: true / ?? / approved=), not a substrate issue.
- Filter scope is enforced at the bridge. DataStoreMcpConnector rejects every filter key outside the substrate’s declared manifest().supported_filters set, throwing UnsupportedFilterError. This prevents silent scope leaks where unsupported filters get dropped without the caller knowing. Per-call opt-out: permissive_filters: true acknowledges “unknown keys are advisory; substrate may ignore them.” Substrate implementors: declare every filter your query() actually honors so the bridge validates against your truth, not a guess.
- FTS matching strictness varies by substrate. The DataStore.query() contract names the modes (fts/semantic/rerank) but doesn’t pin down matching semantics within each mode: token-OR, phrase-tokens, fuzzy, exact, FTS5-syntax-passthrough, etc. are all conformant. The bundled data-store-roundtrip demo asserts N ≥ 1 (a successful round-trip) rather than a specific count, which works across any FTS-supporting substrate. For adopters who need deterministic exact-count reads (round-trip tests, idempotency checks, exact-record-matched fetches), the portable strict-match path is domain_tags=[...] filtering: the bridge enforces tag-key against supported_filters, and substrates declaring supports_tag_filter: true honor exact-tag any-of-match per the contract. Use FTS for relevance ranking against open content; use tag filters when you need to be sure you got the specific record you wrote.
- Durable-forever opt-in via expires_at: null. DataWrite.expires_at accepts a unix timestamp for finite expiry, null to opt into “durable forever” (the portable verb for substrates with default TTL: AMP memory vaults, Redis with default expiry, hosted memory APIs), or omitted (substrate’s default lifecycle, may be durable or may have decay). Substrates that are durable-by-default (the bundled SqliteDataStore) treat null as a no-op. Substrates with default sweep should map null to their pin / no-decay flag.
- Two trigger primitives, both functional. cron (time-based) and event (HTTP /event ingress, named registration). All other concepts that look like triggers (session-start, agent-event, file-watch, sensor) are adapter responsibilities: the adapter POSTs /event when relevant. Keeps the runtime substrate-neutral; the trigger surface stays tight.
- Output kinds are intentionally substrate-neutral. # Output: accepts text / agent: <name> / template: <name> / file: <path> / none. Substrate-specific values (slack:, card:, etc.) are out of scope; adopters wanting Slack / WhatsApp / Discord / etc. delivery use either $ slack.post ... MCP dispatch inside the skill body OR deliver via agent: <name> and let the receiving agent decide.
- Authorization is signature-based in secured mode. An Approved skill runs unkeyed in unsecured mode (a bare # Status: Approved is sufficient) and requires a valid Ed25519 operator signature in secured mode, verified on every execution against the operator’s public key, with the private key held off the runtime. See Approval + secured mode for the model, the approve flow, key custody, and migration. The shell + filesystem allowlists bound blast radius in both modes; secured mode is what gates whether an unapproved skill runs at all.
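The mutation-gate bullets above can be read as a three-way OR evaluated before the substrate is ever touched. The following is a minimal sketch of that decision, not the runtime's actual code: SkillContext, MutationOp, confirmedInTarget, assertMutationAllowed, and gatedWrite are hypothetical names introduced for illustration; only UnconfirmedMutationError, # Autonomous: true, and approved= come from the text.

```typescript
// Hypothetical sketch of the mutation gate; field and function names are assumptions.
class UnconfirmedMutationError extends Error {
  constructor(opName: string) {
    super(`Mutation op "${opName}" was not confirmed, approved, or autonomous`);
    this.name = "UnconfirmedMutationError";
  }
}

interface SkillContext {
  autonomous: boolean; // skill header carries "# Autonomous: true"
}

interface MutationOp {
  name: string; // e.g. "data_write"
  confirmedInTarget: boolean; // a preceding ?? / ask(...) confirmed in the same target
  approved?: string; // per-op approved="reason" kwarg
}

// Throws unless at least one of the three authorizations is present.
function assertMutationAllowed(skill: SkillContext, op: MutationOp): void {
  const allowed =
    skill.autonomous || op.confirmedInTarget || op.approved !== undefined;
  if (!allowed) {
    throw new UnconfirmedMutationError(op.name);
  }
}

// The gate sits upstream of the substrate: write() runs only for authorized ops.
function gatedWrite(skill: SkillContext, op: MutationOp, write: () => void): void {
  assertMutationAllowed(skill, op);
  write();
}
```

Because the check runs before write(), a substrate probe that sees UnconfirmedMutationError never reached the store, which matches the "skill-body issue, not a substrate issue" guidance.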
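The filter-scope rule above amounts to a set-membership check against the manifest before the query reaches the substrate. The sketch below illustrates that check under stated assumptions: the Manifest and QueryOptions shapes and the checkFilterScope function are hypothetical, and only supported_filters, permissive_filters, and UnsupportedFilterError are names taken from the text.

```typescript
// Hypothetical sketch of bridge-side filter validation; shapes are assumptions.
class UnsupportedFilterError extends Error {
  readonly keys: string[];
  constructor(keys: string[]) {
    super(`Unsupported filter key(s): ${keys.join(", ")}`);
    this.name = "UnsupportedFilterError";
    this.keys = keys;
  }
}

interface Manifest {
  supported_filters: string[]; // every filter key query() actually honors
}

interface QueryOptions {
  permissive_filters?: boolean; // per-call opt-out: unknown keys become advisory
}

// Returns the filters to forward to the substrate, or throws when a key falls
// outside the declared set and the caller has not opted into permissive mode.
function checkFilterScope(
  filters: Record<string, unknown>,
  manifest: Manifest,
  options: QueryOptions = {},
): Record<string, unknown> {
  const supported = new Set(manifest.supported_filters);
  const unknownKeys = Object.keys(filters).filter((key) => !supported.has(key));
  if (unknownKeys.length > 0 && !options.permissive_filters) {
    throw new UnsupportedFilterError(unknownKeys);
  }
  return filters;
}
```

Under this reading, a substrate that under-declares its filters causes loud failures rather than silent scope leaks, which is why declaring every honored key matters.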
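The three states of expires_at described above (timestamp, null, omitted) are easy to collapse by accident, since a loose check treats null and undefined alike. A minimal sketch of how a substrate with a default TTL might keep them distinct follows; the Lifecycle type, the content field, and resolveLifecycle are hypothetical, and only DataWrite.expires_at and its three meanings come from the text.

```typescript
// Hypothetical sketch of mapping expires_at onto a substrate lifecycle.
interface DataWrite {
  content: string; // placeholder payload field, an assumption for this sketch
  expires_at?: number | null; // unix timestamp, null, or omitted
}

type Lifecycle =
  | { kind: "expires"; at: number } // finite expiry at the given unix timestamp
  | { kind: "pinned" } // "durable forever": map to the pin / no-decay flag
  | { kind: "substrate-default" }; // omitted: whatever the substrate does by default

function resolveLifecycle(entry: DataWrite): Lifecycle {
  // Strict comparisons keep "omitted" and "null" apart.
  if (entry.expires_at === undefined) {
    return { kind: "substrate-default" };
  }
  if (entry.expires_at === null) {
    return { kind: "pinned" };
  }
  return { kind: "expires", at: entry.expires_at };
}
```

A durable-by-default substrate could treat the pinned and substrate-default branches identically, which is the no-op behavior the text attributes to the bundled SqliteDataStore.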
Skill discovery + cross-agent composition
If you back your SkillStore against a substrate that ALSO holds general data records (one substrate serving both contracts), skill discovery can use the canonical $ data_read surface to find skills via tag/query:
skill_list MCP tool (which calls SkillStore.query()). The $ data_read-as-discovery pattern is for the niche case where skills and other records share a backing store with rich tag/query semantics.
Contributing — dispatch-shape discipline
The multi-layer-promise pattern (lint passes; runtime fails, or vice versa) is the recurring failure mode for dispatch-shape work. validateQualifiedDispatch is the shared validator lint and runtime both call. To prevent the next recurrence, every PR that introduces a new dispatch shape (a new way of writing $ ... ops, a new connector class entry point, a new lifecycle hook on # Output:) must land with:
- Lint test: fixture that exercises the shape with lint only (lint(source, {registry}))
- Runtime test: same shape executed end-to-end (executeSkillByName or executeSkillFromSource)
- E2E test: the full user path (write skill → store → execute via MCP, or trigger fire → dispatch)
McpConnectorClass-shaped contracts should also implement staticTools(): string[] | null whenever the tool surface is closed and knowable at compile time. Doing so lifts unknown-tool-on-connector from “advisory you fix at runtime” to “tier-1 error caught at compile time” for every adopter who wires your class.
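As a rough illustration of why a declared tool list helps, the sketch below shows a compile-time check that can reject an unknown tool only when the connector class exposes a closed surface. It assumes staticTools is a static method on the class; the connector classes, their tool names (other than post, which echoes $ slack.post above), and checkTool are hypothetical.

```typescript
// Hypothetical connector classes; only the staticTools() signature is from the text.
class SlackLikeConnector {
  // Closed tool surface: the full list is known before any skill runs.
  static staticTools(): string[] | null {
    return ["post", "react"];
  }
}

class DynamicConnector {
  // Open tool surface: tools are only discoverable at runtime.
  static staticTools(): string[] | null {
    return null;
  }
}

type ToolCheck = "ok" | "unknown-tool" | "deferred-to-runtime";

// Sketch of the check a compiler or linter could run for "$ connector.tool".
function checkTool(
  connectorClass: { staticTools(): string[] | null },
  tool: string,
): ToolCheck {
  const tools = connectorClass.staticTools();
  if (tools === null) {
    return "deferred-to-runtime"; // nothing to validate against yet
  }
  return tools.includes(tool) ? "ok" : "unknown-tool";
}
```

With a null return the check can only defer, so a typo in a tool name surfaces at runtime; with a declared list the same typo is caught before the skill ever executes.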
Resources
- Onboarding scaffold: examples/onboarding-scaffold/ (complete adopter deployment with a file-backed data store + OpenAI + tmux)
- Custom bootstrap walkthrough: examples/custom-bootstrap.example.ts (registering custom MCP connector classes)
- Connectors example: scaffold/connectors.json (annotated connectors.json shape)
- Language reference: docs/language-reference.md (skill syntax + frontmatter + lint codes)
- Connector contracts: docs/connector-contract-reference.md (substrate-neutral contract surfaces)
- Configuration: docs/configuration.md (connectors.json shape + substrate selection)