Why the outer container runs with seccomp=unconfined, how bwrap is the real boundary against prompt-injected subagent bash calls, and the OrbStack /proc workaround

The outer typeclaw container runs every agent and subagent in one trust boundary. The user's .env, GH_TOKEN, and agent folder are all visible to anything inside. That's fine for the agent's own work. It's not fine when a subagent reads attacker-controlled content (a PR diff, a webpage, a stack trace) and a prompt injection steers it into emitting hostile bash. For that case the boundary needs to live below the agent process, in the kernel.

The kernel primitive for this is bwrap(1) (bubblewrap) — a setuid-less namespace sandboxer that wraps a single subprocess in its own user/pid/mount/net namespace. Per bash call, the wrapper rebuilds an empty rootfs view containing only what the caller's policy declared it needs.

The base layer — the apt package and the docker run flag that make bwrap available — is documented below. The reusable command builder that consumes them lives in src/sandbox/: an origin-agnostic primitive that takes a bash command plus a SandboxPolicy and returns the bwrap-wrapped argv. It knows nothing about subagents or the agent runtime; a consumer resolves a policy from whatever context it has and calls the builder.

Role-based path hiding

The bash builtin is wired through the builder in wrapAgentToolAsCustomToolDefinition (src/agent/plugin-tools.ts), after the tool.before guards inspect the raw command and immediately before pi-coding-agent executes it. The policy is not declared per subagent; it is derived per call from the live turn's resolved role. The container mount is unchanged: the whole agent folder stays bind-mounted at /agent. For a low-trust role the sandbox does two things to that tree: it hides a role-dependent subset (so masked paths can't be read) and it binds the whole agent folder read-only, then re-exposes only a narrow set of free-write scratch zones read-write. So a sandboxed bash call can write only to those zones; everywhere else (.git/, node_modules/, the agent-dir root) is EROFS, which closes the path that previously let bash sidestep the nonWorkspaceWrite guard. trusted+ has no masks and runs unsandboxed, keeping full read-write access (it needs .git/, node_modules/, lockfiles for git and package managers).

resolveHiddenPaths (src/sandbox/hidden-paths.ts) maps the resolved role to a deny-list via two filesystem-visibility permissions, phrased as capabilities to see so the role tower stays monotonic (guest ⊆ member ⊆ trusted ⊆ owner):

Role	`fs.see.private`	`fs.see.secrets`	Hidden from bash	Bash writable zones
guest	—	—	`workspace/`, `memory/`, `sessions/` (dirs) + `.env`, `secrets.json` (files)	`public/`, `mounts/`
member	✓	—	`.env`, `secrets.json`	`workspace/`, `public/`, `mounts/`
trusted, owner	✓	✓	nothing	everything (unsandboxed)
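
As a rough illustration of the table above, the mapping from the two permissions to a deny-list could look like the sketch below. The names (`ROLE_GRANTS`, `hiddenPathsFor`, `HiddenPaths`) are hypothetical and are not the actual `resolveHiddenPaths` signature; only the role grants and path names are taken from the table.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not the real module.
type Permission = "fs.see.private" | "fs.see.secrets";

// Built-in grants, copied from the table above.
const ROLE_GRANTS = new Map<string, readonly Permission[]>([
  ["guest", []],
  ["member", ["fs.see.private"]],
  ["trusted", ["fs.see.private", "fs.see.secrets"]],
  ["owner", ["fs.see.private", "fs.see.secrets"]],
]);

const PRIVATE_DIRS = ["workspace", "memory", "sessions"];
const SECRET_FILES = [".env", "secrets.json"];

interface HiddenPaths {
  dirs: string[];  // agent-root-relative directories to mask
  files: string[]; // agent-root-relative files to mask
}

function hiddenPathsFor(role: string | undefined): HiddenPaths {
  // Fail safe to guest: an undefined or unknown role holds neither grant.
  const grants = (role !== undefined ? ROLE_GRANTS.get(role) : undefined) ?? [];
  return {
    dirs: grants.includes("fs.see.private") ? [] : [...PRIVATE_DIRS],
    files: grants.includes("fs.see.secrets") ? [] : [...SECRET_FILES],
  };
}
```

Phrasing the permissions as capabilities to see means a missing grant always adds to the deny-list, so an unknown role can never end up with fewer masks than guest.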

Directories are hidden with --tmpfs; single files with --ro-bind-data, the only bwrap primitive that masks a file (the rendered command self-opens an empty fd via a 3< /dev/null redirection because the bash tool's spawn does not inherit extra fds). Mount ordering is load-bearing — bwrap applies ops in order and the last op on a path wins, so the builder emits --ro-bind /agent first, then the masks, then the read-write zone binds (--bind /agent/workspace /agent/workspace, etc.) last. The writable set (src/sandbox/writable-zones.ts) is deliberately narrower than the nonWorkspaceWrite allowlist — a blanket RW bind is coarser than the write/edit guards, so only genuinely free-write scratch zones qualify: dirs workspace/, public/, mounts/, plus the existing root files (AGENTS.md, cron.json, typeclaw.json, …). packages/ (executable plugin code) and .agents/skills/ (validated by the skillAuthoring guard) are excluded; those writes go through the guarded write/edit tool only, so bash is never more powerful than write/edit. The set is then intersected with the role's masks (subtractMasked): a zone that is hidden for this role is dropped before it can be re-bound, so a guest's masked workspace/ is never re-exposed read-write — that is why a guest's writable set is just public/, mounts/. Only paths that exist and are not symlinks are bound read-write — a workspace -> /etc symlink at a zone root would otherwise grant outside write access. A missing root file is not pre-created (that would make a read-only command mutate the project), so bash can edit an existing cron.json but creating a new one goes through the write/edit tool. Resolution fails safe to guest: an undefined or unmatched origin holds neither grant, so everything maskable is hidden and writes are confined. A security.bypass.* fallback keeps custom roles (which may never name the fs.see.* strings) working by capability. 
When a role needs the sandbox but bwrap is unavailable, the call fails closed — it never runs unsandboxed. trusted+ has no masks, so its bash runs unchanged whether or not bwrap is present.
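
The mount ordering described above can be sketched as a pure function that returns the agent-folder portion of the argv. This is an illustration only, not the real src/sandbox/build.ts: `AgentMounts` and `agentMountArgs` are hypothetical names, and giving each masked file its own fd number (starting at 3) is an assumption of this sketch. The existence and symlink checks on writable zones are omitted.

```typescript
// Hypothetical sketch of the ordering only; not the real builder.
interface AgentMounts {
  agentDir: string;        // e.g. "/agent"
  hiddenDirs: string[];    // agent-root-relative, masked with --tmpfs
  hiddenFiles: string[];   // agent-root-relative, masked with --ro-bind-data
  writableZones: string[]; // candidate read-write zones
}

// Drop any zone that is, or sits under, a hidden directory.
function subtractMasked(zones: string[], hiddenDirs: string[]): string[] {
  return zones.filter(
    (z) => !hiddenDirs.some((h) => z === h || z.startsWith(h + "/")),
  );
}

function agentMountArgs(m: AgentMounts): string[] {
  const at = (rel: string): string => `${m.agentDir}/${rel}`;
  // 1. Whole agent folder read-only.
  const args: string[] = ["--ro-bind", m.agentDir, m.agentDir];
  // 2. Masks: an empty tmpfs over each directory, empty data over each file.
  for (const dir of m.hiddenDirs) args.push("--tmpfs", at(dir));
  m.hiddenFiles.forEach((file, i) => {
    // Assumption: one fd per file, each opened from /dev/null by the caller.
    args.push("--ro-bind-data", String(3 + i), at(file));
  });
  // 3. Read-write zones last, after removing anything the role cannot see.
  for (const zone of subtractMasked(m.writableZones, m.hiddenDirs)) {
    args.push("--bind", at(zone), at(zone));
  }
  return args;
}
```

For a guest policy this yields no `--bind /agent/workspace` at all, because `workspace` is removed before the read-write binds are emitted.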

The agent folder's top-level public/ directory is the deliberate inverse of this deny-list: it is never masked for any role and is in the read-write allowlist, so a guest turn can read and write it through both bash and the file tools. It is scaffolded empty (with a .gitkeep) by typeclaw init. It sits at the agent-folder root rather than under workspace/ on purpose — workspace/ is an arbitrary free-write zone with no reserved subdir names, so a magic workspace/public/ would silently expose any subdir an agent happened to name public. A top-level sibling keeps the deny-list a flat, carve-out-free set (public/ is simply absent from it) and makes the public/private split legible in the path itself.

The bash masks alone would be hollow: every non-bash tool (read, grep, find, ls, edit, write, plus find_entry, look_at, and the channel attachment tools) runs in the main process, outside any sandbox, so a restricted role could read back through them exactly what bash masking denies. The privateSurfaceRead guard (src/bundled-plugins/security/policies/private-surface-read.ts) closes that at the tool.before boundary, driven by the same resolveHiddenPaths deny-list. It is deliberately fail-closed: rather than whitelisting a known set of file tools (which fails open the moment a new reader ships), it scans every non-bash tool's arguments — recursively, since paths hide in nested shapes like look_at's images[].path and channel_send's attachments[].path — and blocks any string that resolves under a hidden directory. bash is the only tool exempt, because its access is already contained by the bwrap masks above. It enforces the whole deny-list — the hidden directories AND the secret files (.env, secrets.json) — across every non-bash tool, so the file-tool surface enforces exactly the same deny-list as the bash masks ("two enforcement points, one deny-list"). It does not delegate the secret files to the pre-existing secretExfilRead guard: that guard only inspects read/grep/find/ls (not edit/write/look_at/channel_send) and is acknowledgement-bypassable, so delegating would leave the secrets reachable through the tools it does not cover. secretExfilRead remains as independent defense in depth for the four tools it does cover.

The guard resolves each candidate to its real path before matching — it follows symlinks on every existing path component (walking up to the nearest existing ancestor for not-yet-created targets), and resolves the deny-list entries the same way, because the agent dir itself may sit under a symlink. Lexical resolution alone was a hole: a guest could plant public/leak -> ../.env (or -> ../memory) via sandboxed bash, then read it back through read/look_at/a channel attachment whose path lexically lands in the guest-visible public/. Real-path resolution closes that.

Which arguments count as paths is field-aware, not value-blind. A universal denylist of free-text key names (text, query, prompt, oldText, filename, …) is skipped so ordinary prose like channel_reply({ text: "the memory leak" }) does not trip the guard. Crucially, this is a denylist of keys, not a tool whitelist — an unknown field on an unknown tool is still scanned, so a new path-bearing reader is covered the day it ships. Two keys are tool-dependent and handled per tool: grep.pattern is a regex (free text, exempt), but grep.glob and find.pattern are glob path-filters resolved relative to the search root — so grep({ path: '.', glob: 'workspace/**' }) and find({ path: '.', pattern: 'workspace/**' }) reach a hidden subtree and are scanned. path.resolve treats glob metacharacters as literal, so *.ts → /agent/*.ts (passes) while workspace/** → /agent/workspace/** (blocked) — no false positives.
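
A minimal sketch of the field-aware scan follows. It is hypothetical: the function names are invented, the free-text key list only includes the keys named above, and it uses lexical resolution for brevity, whereas the real guard resolves symlinks to real paths before matching.

```typescript
// Hypothetical sketch: names and the lexical matching are assumptions.
const FREE_TEXT_KEYS = new Set(["text", "query", "prompt", "oldText", "filename"]);

// Minimal POSIX-style lexical resolve; glob metacharacters stay literal.
function resolveLexical(base: string, p: string): string {
  const out: string[] = [];
  for (const part of (p.startsWith("/") ? p : `${base}/${p}`).split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

function isUnder(path: string, root: string): boolean {
  return path === root || path.startsWith(root + "/");
}

// Returns the first argument string that lands on a hidden path, if any.
function findHiddenReference(
  tool: string,
  args: unknown,
  agentDir: string,
  hidden: string[],
): string | undefined {
  const roots = hidden.map((h) => resolveLexical(agentDir, h));
  const visit = (value: unknown, key: string | undefined): string | undefined => {
    if (typeof value === "string") {
      if (key !== undefined && FREE_TEXT_KEYS.has(key)) return undefined;
      if (tool === "grep" && key === "pattern") return undefined; // regex, free text
      const resolved = resolveLexical(agentDir, value);
      return roots.some((r) => isUnder(resolved, r)) ? value : undefined;
    }
    if (Array.isArray(value)) {
      for (const item of value) {
        const hit = visit(item, key);
        if (hit !== undefined) return hit;
      }
      return undefined;
    }
    if (value !== null && typeof value === "object") {
      for (const [k, v] of Object.entries(value)) {
        const hit = visit(v, k);
        if (hit !== undefined) return hit;
      }
    }
    return undefined;
  };
  return visit(args, undefined);
}
```

Because the skip list names keys rather than tools, a string under an unrecognized key on an unrecognized tool is still resolved and checked.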

Both enforcement points — the bash masks and the privateSurfaceRead scan — live in the bundled security plugin's tool.before path and in the builtin-tool wrapper that the plugin's hooks install. The bundled security plugin is auto-loaded and unconditional, so the boundary is always present in a real agent run. A session constructed without that plugin (a bare test harness) has neither enforcement point; this is by construction, not a bypass available to a model. channel_send / channel_reply reading a role's hidden directory is blocked here, but the broader "a role that legitimately sees workspace/ forwards it to a public channel" remains the authn-shaped output-exfil concern this sandbox does not solve — use separate agents per trust domain.

This hiding is filesystem read/exec only. A guest turn is still recorded to memory normally — guest simply cannot read the private surface back through bash or the file tools.

Custom writable paths and symlinks

The built-in writable set covers the common scratch zones, but some CLIs insist on writing a fixed config dir under the agent root — e.g. a tool that rewrites <agentDir>/.foo-cli/config.json on every run. Under a low-trust role that dir is EROFS, and there was previously no supported way to allow it. Two sandbox config fields close that, both restart-required (read from the boot-time config snapshot, like realProc).

sandbox.writablePaths is an array of agent-root-relative directories added to the writable set on top of the built-ins:

{ "sandbox": { "writablePaths": [".foo-cli", "workspace/cache"] } }

Entries are validated at two layers. At parse time (relativeAgentPathSchema): relative-only (absolute container paths rejected), no .. segments, no null bytes. At resolve time (resolveWritableZones, src/sandbox/writable-zones.ts) entries are dropped — never thrown, so a stale config degrades one path instead of aborting sandboxing — unless they exist, are a real directory (not a file), have a non-symlink root, resolve inside agentDir (and are not the agent root itself), and do not land on a security-sensitive root: .git, .env, secrets.json, sessions, memory, .typeclaw, node_modules. Configured paths still pass through subtractMasked, so a role's hidden paths are never re-exposed read-write.
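
The parse-time rules can be sketched without touching the filesystem. The function names below are hypothetical, rejecting the empty string is an assumption, and treating any path whose first segment is a sensitive root as "landing on" it is also an assumption of this sketch. The existence, directory, and symlink checks done at resolve time are omitted.

```typescript
// Hypothetical sketch of the lexical checks only (no filesystem access).
const SENSITIVE_ROOTS = [
  ".git", ".env", "secrets.json", "sessions", "memory", ".typeclaw", "node_modules",
];

// Parse-time shape check: relative-only, no ".." segments, no null bytes.
function isValidRelativeAgentPath(p: string): boolean {
  if (p.length === 0) return false; // assumption: empty entries are rejected
  if (p.includes("\0")) return false;
  if (p.startsWith("/")) return false;
  return !p.split("/").includes("..");
}

// Resolve-time check: would this entry expose a security-sensitive root?
function landsOnSensitiveRoot(p: string): boolean {
  const first = p.split("/").filter((s) => s !== "" && s !== ".")[0];
  return first !== undefined && SENSITIVE_ROOTS.includes(first);
}

// Entries that fail are dropped rather than thrown, so one bad path
// does not abort sandboxing for the rest.
function filterWritablePaths(entries: string[]): string[] {
  return entries.filter((e) => isValidRelativeAgentPath(e) && !landsOnSensitiveRoot(e));
}
```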

sandbox.symlinks is the one-entry abstraction for the CLI case above: it creates the symlink and makes its target writable.

{ "sandbox": { "symlinks": [{ "from": "~/.foo-cli", "to": "workspace/.foo-cli" }] } }

from is the symlink location and is fully configurable — an absolute container path (/root/.foo-cli) or a ~/-prefixed path. to is agent-root-relative (reuses relativeAgentPathSchema) and is automatically folded into the writable set (getSandboxWritablePathSpecs), so the operator never lists it twice. from validation rejects null bytes, the filesystem root /, any path under /agent (a self-referential loop), and the kernel/virtual roots /proc, /sys, /dev, /run; /etc/... is allowed because the entrypoint's no-clobber guard already prevents overwriting an existing system file.
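
The `from` rules above might be expressed as the following lexical check. This is a sketch with a hypothetical function name; rejecting anything that is neither absolute nor `~/`-prefixed is inferred from the description rather than confirmed, and the sketch does not normalize `..` segments.

```typescript
// Hypothetical sketch of the `from` validation; not the real schema.
const FORBIDDEN_FROM_ROOTS = ["/agent", "/proc", "/sys", "/dev", "/run"];

function isValidSymlinkFrom(from: string): boolean {
  if (from.includes("\0")) return false;
  // A ~/-prefixed path is resolved against $HOME later, per trust tier.
  if (from.startsWith("~/")) return true;
  if (!from.startsWith("/")) return false;
  // Collapse repeated slashes and drop a trailing slash before matching.
  const normalized = from.replace(/\/+/g, "/").replace(/(.)\/$/, "$1");
  if (normalized === "/") return false;
  return !FORBIDDEN_FROM_ROOTS.some(
    (root) => normalized === root || normalized.startsWith(root + "/"),
  );
}
```

Note that `/etc/...` passes this check by design, since the entrypoint's no-clobber guard is what protects existing system files.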

The two trust tiers need the symlink in two different places, because $HOME differs by stage:

Trusted/owner (unsandboxed). Bash runs with the real container $HOME (/root) and real /agent. The entrypoint shim's link_configured_symlinks (src/init/dockerfile.ts) creates from -> /agent/<to> at boot, threaded in via the base64-JSON TYPECLAW_SANDBOX_SYMLINKS env that planStart emits only when symlinks are configured. The shim refuses to clobber an existing non-symlink at from and skips bad entries without failing boot.
Low-trust (sandboxed). Inside bwrap HOME=/tmp (the per-session tmp bind), so the entrypoint's /root symlink is irrelevant. applyBashSandbox instead resolves each from against the sandbox HOME and emits a bwrap --symlink /agent/<to> /tmp/<name> op (resolveSandboxSymlinks → policy.symlinks → build.ts appendSymlinks), rendered after the /tmp bind so last-op-wins keeps it. Only symlinks whose to actually survived as a writable dir are emitted, so a dropped/masked target never yields a dangling symlink onto an EROFS/hidden path.

Per-session /tmp. A sandboxed role's /tmp is not the bare anonymous --tmpfs /tmp for that role: applyBashSandbox binds a per-session scratch dir (/tmp/typeclaw-session/<sessionId>, created 0700 by src/sandbox/session-tmp.ts) over /tmp, emitted via policy.mounts after the hardcoded --tmpfs /tmp so last-op-wins makes it the live /tmp. This exists because the two enforcement layers must agree on what /tmp is: the path-based file tools (read/write/edit/grep/find/ls) run unsandboxed against the real container /tmp, while a sandboxed role's bash sees the bwrap mount. Without the bind, a guest/member that writes /tmp/review.json and then reads it via gh --input /tmp/review.json would hit two different files. So the same wrapper also redirects a sandboxed role's /tmp/* path on every one of those file tools to that session backing dir (the model still names /tmp/...; only the on-disk target moves) — so a read of /tmp/foo resolves to the same file sandboxed bash wrote, not the real container /tmp. The backing dir lives on the real container /tmp — outside the agent folder, so it is never force-committed like sessions/, and ephemeral with the container. The security delta is deliberate and bounded: bash calls within one session now share /tmp state (normal Unix /tmp), but a sandboxed role still cannot see the container's real /tmp root or any other session's scratch, and /tmp is never a project/secret surface. Unsandboxed roles (trusted/owner, empty masks) skip both the bind and the redirect — their write and bash already share the real container /tmp, and routing them through bwrap solely for /tmp would strip the real /proc and .git RW they rely on. The nonWorkspaceWrite guard allows /tmp/** for all roles for the same reason: it is virtual scratch, not a non-workspace write to police.
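
The file-tool side of that agreement, the `/tmp/*` redirect, could be sketched as below. The function names are hypothetical, and normalizing `..` segments before matching is an assumption added so that a path like `/tmp/../etc/passwd` is not treated as a `/tmp` path.

```typescript
// Hypothetical sketch of the file-tool /tmp redirect.
function sessionTmpDir(sessionId: string): string {
  return `/tmp/typeclaw-session/${sessionId}`;
}

// Lexically normalize an absolute POSIX path ("." and ".." segments).
function normalizeAbsolute(p: string): string {
  const out: string[] = [];
  for (const part of p.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

// Map a model-named /tmp path onto the session backing dir.
// Unsandboxed roles and non-/tmp paths are returned unchanged.
function redirectTmpPath(path: string, sessionId: string, sandboxed: boolean): string {
  if (!sandboxed || !path.startsWith("/")) return path;
  const normalized = normalizeAbsolute(path);
  if (normalized === "/tmp") return sessionTmpDir(sessionId);
  if (!normalized.startsWith("/tmp/")) return path;
  return sessionTmpDir(sessionId) + normalized.slice("/tmp".length);
}
```

With this mapping, a `write` to `/tmp/review.json` and a sandboxed `gh --input /tmp/review.json` both land on the same backing file.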

Package installs (bun add). A standalone bun add <pkg> / bun install is a documented sandboxed workflow (the bundled agent-messenger skills shell out to bunx agent-*, and adding a dependency that survives restart means writing package.json in the bind-mounted folder). But bun add writes more than package.json: it creates node_modules/ and saves the lockfile via a temp file (bun.lock.NNN.tmp, atomically renamed) directly under the agent root. Both are EROFS under the narrow carve-out model, and a file-level RW bind of bun.lock alone cannot help, because the temp file needs the parent directory writable. So applyBashSandbox detects the narrow command class (isPackageInstallCommand, src/sandbox/package-install.ts: a single standalone local bun add/bun install/bun i, no shell metacharacters, chaining, redirects, or substitution, and not -g/--global) and switches to a package-install mode that RW-binds the whole agent root via policy.writableRoot. Unlike the default writable carve-outs (rendered after the masks), writableRoot renders before the masks: last-op-wins then re-hides .env, secrets.json, memory/, and sessions/ on top of the broad RW root. resolvePackageInstallZones then re-binds everything else read-only by allowlist inversion, not a denylist: only node_modules/, package.json, bun.lock, and the scratch zones workspace/, public/, and mounts/ stay writable; every other existing root entry is RO-bound by enumerating the root (readdir), so src/, scripts/, the prompt-source files (AGENTS.md, SOUL.md, IDENTITY.md, USER.md), cron.json, typeclaw.json, and any unanticipated or newly-planted root file are read-only without a hardcoded list. node_modules/typeclaw (the live/symlinked runtime) is RO-re-bound nested under the writable node_modules, and the whole .git/ is RO (a bun add never needs git, which closes the hook / core.hooksPath escalation by construction).
A denylist of "executable surfaces" was rejected: the dangerous set is "anything the unsandboxed runtime later reads or executes", which is open-ended and fails open for new root entries (this was the PR #746 review finding). The allowlist fails closed. So a dependency's lifecycle scripts (bun runs them during bun add) can write node_modules/ and the lockfile but cannot overwrite src/ or scripts/, the agent's lifecycle scripts, or prompt-source files the unsandboxed runtime later executes or reads, and cannot read the masked secrets. The narrow detector is the load-bearing part: any metacharacter falls back to the default ro-root jail, so the broad RW root can never be piggybacked onto an attacker-chained second command. A symlinked agent root, node_modules, package.json, or bun.lock is rejected (an RW bind follows symlinks, so it would write outside the jail). trusted+ never reaches this path: it is unsandboxed and already has the RW root.
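
A narrow detector in the spirit of the one described above might look like this. It is a sketch, not the real src/sandbox/package-install.ts: the metacharacter set is a deliberately conservative guess (it also rejects quotes and glob characters), and the real detector's exact rules are not reproduced here.

```typescript
// Hypothetical sketch: the metacharacter set is an assumption.
const SHELL_META = /[;&|<>$`\\(){}\[\]*?~!#'"\n\r]/;

function isPackageInstallCommand(command: string): boolean {
  // Any metacharacter means chaining, redirects, or substitution may be
  // present, so the command falls back to the default read-only jail.
  if (SHELL_META.test(command)) return false;
  const tokens = command.trim().split(/\s+/);
  if (tokens[0] !== "bun") return false;
  const sub = tokens[1];
  if (sub !== "add" && sub !== "install" && sub !== "i") return false;
  // Global installs write outside the agent folder and are never eligible.
  return !tokens.slice(2).some((t) => t === "-g" || t === "--global");
}
```

The detector errs toward false: a rejected install still runs, but in the default jail where the agent root is read-only, so a false negative costs convenience rather than safety.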

Live role on the turn. For channel origins the cached system-prompt role block is a multi-speaker policy — it does not assert a concrete role or leak the opener's permission list, because a channel session is keyed by chat/thread and sees many speakers. The router re-resolves the role per turn for tool gating and surfaces it to the model as a <your-role authority="current-speaker">…</your-role> tag injected into the user turn (omitted for owner, the unconstrained default). The tag is marked authoritative and explicitly overrides the cached block, so an owner-opened session does not make a later guest speaker look like an owner. This is cache-free — the tag lives in the non-cacheable user-turn suffix, never the cached system-prompt prefix. The agent uses it to route output: shareable artifacts for an untrusted caller go to public/, and a denied by permissions block on a workspace/ write is the runtime fallback signal that the caller is untrusted.

system origin for runtime infrastructure. Memory logging and retrieval (memory-logger, memory-retrieval), backup, and dreaming are TypeClaw-owned infrastructure that operate on the operator's own state, not user-delegated work. They spawn under a system session origin that resolves to owner, so a guest channel turn that triggers memory logging does not demote the logger to guest and get its own sessions/ and memory/ access blocked by privateSurfaceRead. The triggering origin is preserved as triggeredBy for honest audit provenance (never a synthetic-TUI lie) and dropped from the persisted projection so channel author identity does not reach git-backed sessions/. The system kind is constructed only by runtime/bundled code (inbound channel/cron content can never produce it), so resolving it to owner is not a role-laundering vector. Role is authorization, not observability: guest turns remain eligible for memory logging; the confidentiality boundary lives at retrieval/output, not at logging.

Symlink handling. For bash, a symlink whose target lives under /agent (e.g. public/link -> /agent/.env) resolves inside the sandbox mount namespace to the masked overlay, so it cannot reach the real secret. For the non-bash file tools there is no mount overlay, so the privateSurfaceRead guard does the real-path resolution described above to catch a symlink that points from a visible dir into a hidden one. A symlink pointing outside /agent escapes the agent-folder bind entirely and is not covered by these masks; it is contained only by the always-on kernel-containment invariants below (cleared env, isolated network).

Known hardening (defense-in-depth, not live gaps). Two seams are correct today but rely on a current invariant rather than enforcing it directly. (1) The enforcement points are gated on the session having tool hooks; that holds because the security plugin is always bundled, but it couples a security control to plugin presence. (2) TUI session creation does not thread a permission service, which is safe only because the TUI origin resolves to owner (no masks); narrowing roles.owner.match away from TUI would silently leave TUI bash unsandboxed. Threading the permission service through the server and the default subagent factory would make both correct by construction instead of by invariant.

What the per-tool sandbox is NOT for

This sandbox limits what a single prompt-injected bash call can do within one agent's trust domain — hide secret-shaped reads, contain the network, mask the private surface. It is defense in depth against a hostile command, not a wall between trust domains.

Trust-domain problems come in two shapes, and they need different defenses. The sandbox addresses neither directly; naming both keeps "just sandbox it" from being mistaken for a complete answer.

Shape	Threat	What it wants	The defense
Authentication-shaped (confidentiality)	"Hide the private thing from the attacker." A public-facing review and a private repo share one agent; a public PR steers bash and reaches `.env` / `GH_TOKEN` / `/agent` / private source.	A coarse, whole-container / whole-credential boundary.	Separate agents per trust domain.
Authorization-shaped (action gating)	"We can both see the repo and both propose changes, but only I may act — merge, push to `main`." Nothing is secret; the question is purely who may perform the privileged action.	Fine, per-action authority that reflects the least-privileged actor in the chain.	Provenance + action gating — see Permissions.

Authentication-shaped: separate agents

Keeping a public-facing review away from private code and credentials is coarse — it wants a whole-container, whole-credential-set boundary — and typeclaw already provides it: one agent is one container, one .env, one GH_TOKEN, one /agent. The right boundary is operational.

Use separate agents for separate trust domains. Do not point one agent at both a private repo and public-facing review work (e.g. reviewing external contributors' pull requests). If a public PR can steer the agent's bash, the only things it can reach are that agent's own container and credentials — so that container must not also hold the private repo or its secrets. Splitting the work across agents makes the public-to-private pivot impossible by construction, and closes the subtler residual the sandbox cannot: a prompt-injected reviewer copying private source into review text that then gets posted to the public PR. With public and private in separate agents, that output channel has nothing private to leak. The role-based path hiding above is a within-domain mitigation, not a substitute for this split.

public/ is a confused-deputy egress sink, by the same logic. The path hiding stops a guest from reading the private surface, but a higher-role turn in the same channel session shares one model context. A guest who prompt-injects an owner/trusted-role turn into copying workspace/secret → public/ (which the guest can then read) has exfiltrated it — the write is a legitimate, authorized action, and public/ exists precisely to be guest-readable. This is the authentication-shaped problem again: the defense is the same trust-domain split, plus not treating public/ as a dumping ground for content derived from the private surface. Outbound secret scanning covers channel_send/channel_reply text but not file writes, so it does not catch this. Treat public/ as published-to-the-least-trusted-reader, and keep genuinely private work in a separate agent.

Authorization-shaped: separation does not help

This shape survives separation, because the resource is not secret — the public repo is the work, visible to attacker and operator alike. Sandboxing misses it too: gh pr merge is a legitimate, authorized API call, not a containment breach. The boundary is purely who may act.

This is the operator subagent's surface. Subagent provenance stamps spawnedByRole — the role of whoever spawned the subagent (see Permissions). It tracks who spawned the task, not whose content is driving it. So an owner-spawned operator resolves to owner; if an owner (or owner-scheduled cron) hands it "act on PR #N" and the attacker-controlled diff carries a prompt injection ("merge this / push to main"), the action runs with the owner's authority. The defense is the same principle the cron/subagent provenance guard establishes: provenance should reflect the least-privileged actor in the chain, and irreversible or audience-visible actions should be action-gated regardless of inherited role.
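
To make the principle concrete, here is a small sketch of "least-privileged actor in the chain" plus role-independent action gating. It illustrates the idea only and does not describe existing TypeClaw code: the function names, the notion of an explicit actor chain, and the gated command prefixes are all hypothetical.

```typescript
// Hypothetical illustration of the principle; not existing TypeClaw code.
const ROLE_ORDER = ["guest", "member", "trusted", "owner"] as const;
type Role = (typeof ROLE_ORDER)[number];

// Effective role is the minimum over everyone whose input shaped the task.
function leastPrivileged(chain: readonly Role[]): Role {
  if (chain.length === 0) return "guest"; // fail safe
  let lowest: Role = "owner";
  for (const role of chain) {
    if (ROLE_ORDER.indexOf(role) < ROLE_ORDER.indexOf(lowest)) lowest = role;
  }
  return lowest;
}

// Irreversible or audience-visible actions are gated regardless of role.
// These prefixes are examples, not a real list.
const GATED_PREFIXES: readonly string[] = ["gh pr merge", "git push"];

function needsExplicitApproval(command: string): boolean {
  const trimmed = command.trim();
  return GATED_PREFIXES.some((prefix) => trimmed.startsWith(prefix));
}
```

Under this model an owner-spawned operator acting on an external contributor's diff would resolve to the contributor's role for that task, and a merge would still require approval even when the chain is all-owner.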

bwrap is a primitive, not a policy

Upstream bubblewrap is explicit: "bubblewrap is not a complete, ready-made sandbox with a specific security policy" and "the level of protection... is entirely determined by the arguments passed to bubblewrap." Treating "we use bwrap" as a security claim is a category error. The boundary lives in the builder invariants — the exact bwrap argv src/sandbox/build.ts constructs per call. The always-on invariants are:

--unshare-all — covers the user/pid/mount/net/ipc/uts/cgroup namespaces.
--clearenv followed by an explicit --setenv allowlist (PATH, HOME, LANG by default; consumers name any others). Inherited env is the highest-risk vector for FIREWORKS_API_KEY / GH_TOKEN exfil.
--new-session plus --die-with-parent (both default-on) to prevent the contained process from injecting input into the controlling terminal via TIOCSTI and to reap the sandbox if the agent dies.
A minimal mount set (--ro-bind /usr, --ro-bind /etc, --dev /dev, --tmpfs /tmp, plus a /proc per the resolved strategy — default --ro-bind /proc /proc under proc-bind, see the /proc section). Every additional bind-mount a policy declares is a potential write/exec/read vector; everything mounted in can "potentially be used to escalate privileges" (bubblewrap docs).
The usr-merge root symlinks (--ro-bind-try /bin, /sbin, /lib, /lib64). On the Debian base these are root-level symlinks into /usr, so --ro-bind /usr exposes /usr/bin etc. but not the root entries themselves. Anything the kernel resolves by absolute path without consulting PATH then breaks: the ELF interpreter baked into a binary's PT_INTERP (/lib/ld-linux-aarch64.so.1 on arm64, /lib64/ld-linux-x86-64.so.2 on amd64; its absence surfaces as the misleading bwrap: execvp bash: No such file or directory, where the missing file is the loader, not bash), and #!/bin/sh / #!/bin/bash shebang lines (literal paths that skip PATH, so a script fails with cannot execute: required file not found even though /usr/bin/sh exists). --ro-bind-try, not --ro-bind: the set is arch- and base-dependent (arm64 oven/bun:1-slim ships /lib but no /lib64), and a hard bind of an absent source aborts bwrap; -try binds each only when present, keeping build.ts pure (no host filesystem probe). These are the loaders and interpreters the sandboxed workload needs; they are not a widening of the read surface beyond what /usr already exposes.
Network isolation by default (--unshare-all puts the sandbox in its own network namespace, with no outside connectivity); --share-net only when a policy explicitly sets network: 'inherit'.

These invariants are the always-on kernel containment boundary, applied to every command regardless of policy. Command-syntax restrictions — a prefix allowlist or shell-metacharacter rejection — are optional commandFilter knobs a narrow consumer (e.g. a read-only reviewer) can opt into; they are not part of the base boundary, because a general-purpose bash sandbox must allow pipes, &&, and $(). Reviewers of any consumer should audit the resolved policy and the argv construction, not just confirm that bwrap was invoked.
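
Pulling the always-on invariants together, a base argv could be assembled roughly as follows. This is a sketch, not the real src/sandbox/build.ts: `BasePolicy` and `baseBwrapArgs` are hypothetical names, the default PATH and LANG values are assumptions, and policy-specific mounts (agent folder, masks, session /tmp) are left out.

```typescript
// Hypothetical sketch of the base invariants; not the real builder.
interface BasePolicy {
  env?: Record<string, string>;      // extra --setenv entries, named explicitly
  network?: "isolated" | "inherit";  // default: isolated
}

function baseBwrapArgs(command: string, policy: BasePolicy = {}): string[] {
  // Assumed defaults; only PATH, HOME and LANG are set unless the policy adds more.
  const env: Record<string, string> = {
    PATH: "/usr/bin:/bin",
    HOME: "/tmp",
    LANG: "C.UTF-8",
    ...policy.env,
  };
  const args: string[] = ["bwrap", "--unshare-all"];
  if (policy.network === "inherit") args.push("--share-net");
  // Clear inherited env first, then allowlist.
  args.push("--clearenv");
  for (const [key, value] of Object.entries(env)) args.push("--setenv", key, value);
  args.push("--new-session", "--die-with-parent");
  // Minimal mount set, with /proc under the default proc-bind strategy.
  args.push(
    "--ro-bind", "/usr", "/usr",
    "--ro-bind", "/etc", "/etc",
    "--dev", "/dev",
    "--tmpfs", "/tmp",
    "--ro-bind", "/proc", "/proc",
  );
  // usr-merge root symlinks: bound only when present on this base image.
  for (const p of ["/bin", "/sbin", "/lib", "/lib64"]) args.push("--ro-bind-try", p, p);
  args.push("bash", "-c", command);
  return args;
}
```

Keeping the function pure (argv in, argv out, no filesystem probing) is what makes the resolved policy auditable: a reviewer can inspect the exact argument list a given policy produces.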

`--security-opt seccomp=unconfined`

planStart() in src/container/start.ts adds --security-opt seccomp=unconfined unconditionally. Without it, Docker's default seccomp profile blocks unshare(CLONE_NEWUSER) and clone(CLONE_NEWUSER) for non-privileged containers, and bwrap exits at startup with "setting up uid map: Permission denied."

The default profile is calibrated for multi-tenant container hosts (Kubernetes nodes, CI runners) where seccomp is one of several boundaries protecting tenant isolation. TypeClaw is single-tenant. The user owns the host, the agent folder is bind-mounted in, the secrets are in the env. The blast radius of an attacker already executing code inside the container is dominated by what they can do with those existing reads, not by the syscalls seccomp blocks. What dropping the filter gives up is protection designed for a different threat model.

`--cap-add=SYS_ADMIN` and `sandbox.realProc`

CAP_SYS_ADMIN looks narrower than seccomp=unconfined (one capability vs ~44 syscalls) but is broader in practice. It is called "the new root" because it grants ~38 distinct privileges: mount, swapon, sethostname, namespace ops, bpf, perf_event_open, more. With it the outer container can mount --bind / /proc/$$/root and escape its own filesystem view. With seccomp=unconfined alone, it can call mount(2) but the kernel rejects it for lack of CAP_SYS_ADMIN. The textbook "more caps is worse than more syscalls allowed" applies — so the default avoids it.

Running external package CLIs (bunx, bun add, bun run <pkg-bin>) inside the sandbox is a hard requirement — the bundled agent-messenger skills shell out to bunx agent-*, and under the bare --tmpfs /proc profile every such call aborts with Bun's error: An internal error occurred (NotDir) (see the OrbStack /proc section below). The fix does not need CAP_SYS_ADMIN: the default proc-bind strategy --ro-binds the container's already-real procfs into the sandbox, giving the runner's child a working /proc/self/{fd,maps} with no mount and no capability. So typeclaw.json#sandbox.realProc defaults to false and planStart() does not add --cap-add=SYS_ADMIN by default.

sandbox.realProc: true is an opt-in for the stricter real-proc strategy below, which mounts a fresh procfs in a new PID namespace for full PID isolation. That mount needs CAP_SYS_ADMIN, so planStart() grants --cap-add=SYS_ADMIN only then. On TypeClaw's single-tenant boundary the cap would be an accepted trade (the same reasoning that makes seccomp=unconfined the default — the user owns the host, the agent folder and secrets are already mounted in, so the inner per-tool bwrap user namespace is the load-bearing boundary, not the outer cap). But since proc-bind already unblocks the core external-package workflow without it, avoiding the grant is the better default: a smaller outer blast radius in exchange for the non-secret PID metadata proc-bind leaves visible (see below).
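
The flag logic described in this section reduces to a small decision, sketched below. The function and type names are hypothetical and this is not the real `planStart()`; only the two flags and the `realProc` default come from the text above.

```typescript
// Hypothetical sketch of the sandbox-related docker run flags.
interface SandboxConfig {
  realProc?: boolean; // typeclaw.json#sandbox.realProc, defaults to false
}

type ProcStrategy = "proc-bind" | "real-proc";

function sandboxDockerFlags(config: SandboxConfig = {}): string[] {
  // Always needed so bwrap can create a user namespace.
  const flags = ["--security-opt", "seccomp=unconfined"];
  // Only the opt-in real-proc strategy mounts a fresh procfs and needs the cap.
  if (config.realProc === true) flags.push("--cap-add=SYS_ADMIN");
  return flags;
}

function procStrategy(config: SandboxConfig = {}): ProcStrategy {
  return config.realProc === true ? "real-proc" : "proc-bind";
}
```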

Why not a custom seccomp profile

A profile based on Docker's default plus the namespace/setup syscalls our exact bwrap invocation needs would be narrower than seccomp=unconfined. The exact syscall delta is not pinned down in this PR — bwrap also calls mount, pivot_root, and related namespace-setup syscalls under the user-namespace path, and the full list depends on the specific --unshare-* / --proc / --dev combination the follow-up helper settles on. Trace it under strace -fc bwrap … against the helper's argv once it lands and pin the profile to that delta if and when this matters.

The reason we don't do that today is maintenance, not impossibility: a profile file has to be shipped, distributed, kept in sync with bwrap's evolving syscall use, and debugged when it silently breaks something. The cost only earns its keep when typeclaw goes multi-tenant. Single-tenant deployments don't need the precision; seccomp=unconfined is the right cost/benefit point until that changes.

bwrap in baseline

BASELINE_APT_PACKAGES in src/init/dockerfile.ts includes bubblewrap (~132KB). Per-tool sandboxing is a runtime concern decided by the agent, not by the agent author, so it ships unconditionally rather than behind a docker.file.bwrap toggle. The flag pair (seccomp=unconfined on docker run + bubblewrap in baseline) is load-bearing together — dropping either breaks the sandbox with no test signal beyond the bwrap helper failing at runtime.

OrbStack `/proc` quirk

OrbStack's hardened VM kernel blocks mount("proc", ...) from user namespaces regardless of caps. Bare-metal Linux and Docker Desktop allow it; OrbStack does not. This is why the real-proc strategy (which mounts a fresh procfs via unshare --mount-proc) cannot be the default: on the most common host it is rejected outright. The default proc-bind strategy below sidesteps that mount entirely: it --ro-binds the container's already-real procfs, which is a bind mount rather than a new procfs mount and works on every host.

The historically rejected alternative was --tmpfs /proc: the sandbox gets an empty /proc, which breaks ps, top, and /proc/cpuinfo readers; bash process substitution <(…), which resolves through /dev/fd/N; and, most importantly, the child process a JS package runner spawns, which reads /proc/self/fd/N and /proc/self/maps. That last case is why bunx <pkg>, bun add <pkg>, and bun run <pkg-bin> abort under a --tmpfs /proc sandbox with Bun's error: An internal error occurred (NotDir). The runner's own self-location can be patched with a /proc/self/exe symlink, but the spawned bin's /proc/self/fd lookup still hits the empty tmpfs. So --tmpfs /proc survives only as a last-resort fallback (no external packages); it is not a default.

`sandbox` default: `proc-bind` — real `/proc` without the leak, without the cap

The default per-tool strategy. It binds the container's already-real procfs straight into the sandbox:

bwrap --unshare-all … --ro-bind /proc /proc … bash -c <command>

--ro-bind /proc /proc gives the runner's child a real /proc/self/{fd,maps,exe}, so bunx/bun add/bun run <pkg-bin> stop aborting with NotDir. No unshare --mount-proc, no CAP_SYS_ADMIN — which is exactly why it works on OrbStack where real-proc is rejected.
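
The proc-bind shape can be sketched as a small argv builder. This is an illustration only: buildProcBindArgv is a hypothetical name, and the binds elided as "…" in the command above are passed in by the caller rather than modeled here; the real builder lives in src/sandbox/.

```typescript
// Sketch: assembles the proc-bind argv. Policy-derived binds (the "…" in the
// command above) are supplied by the caller and passed through unchanged.
function buildProcBindArgv(command: string, otherBinds: string[] = []): string[] {
  return [
    "bwrap",
    "--unshare-all", // bwrap's shorthand for unsharing every namespace it supports
    ...otherBinds,
    "--ro-bind", "/proc", "/proc", // bind the container's existing procfs read-only
    "bash", "-c", command,
  ];
}
```

Note what is absent: no unshare --mount-proc wrapper and nothing that requires CAP_SYS_ADMIN on the outer container.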

It does not leak the agent runtime's /proc/<agent>/environ (FIREWORKS_API_KEY, GH_TOKEN). --unshare-all puts the sandboxed bash in a child user namespace, mapped-root rather than real-root. The kernel's PTRACE_MODE_READ_FSCREDS check on /proc/<pid>/environ requires the reader's userns to be the target's userns or an ancestor of it; a child userns is neither, so the read fails with EACCES. The same userns boundary blocks kill()/ptrace against those pids (EPERM, no CAP_KILL in the parent userns). The residual is non-secret metadata: other pids' cmdline/status/stat stay visible, because the procfs that proc-bind binds is the container's own rather than one scoped to a fresh PID namespace (as under real-proc). On the single-tenant boundary that is an accepted trade: the API-key surface (environ, mem, ptrace, signals) is provably blocked, and the alternative (real-proc) buys PID-metadata hiding only at the cost of CAP_SYS_ADMIN on the outer container, a larger blast radius.

Because the no-leak property is a kernel fact rather than a typeclaw invariant, the consumer probes it before ever selecting proc-bind (canBindProcSafely, src/sandbox/availability.ts): it spawns a sentinel sibling in the parent userns holding a planted secret, enters the exact proc-bind bwrap shape, and asserts the sentinel's environ/maps are unreadable and it is unsignalable, while the sandbox's own /proc/self/{fd,maps} are readable. A host where any of those fail (an exotic runtime that preserves parent-userns creds) fails the probe and falls through to --tmpfs /proc — fail-closed, never a silent leak.
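
The probe's decision logic can be sketched as a pure classifier over what it observed. The ProbeObservation shape and the classifyProbe name are assumptions for this example. Mapping "the sandbox's own /proc/self is unreadable" to inconclusive rather than unsafe follows this document's statement that only a leak is a definitive verdict; the actual canBindProcSafely may be structured differently.

```typescript
type ProbeVerdict = "safe" | "unsafe" | "inconclusive";

// Hypothetical record of what one probe run observed from inside the sandbox.
interface ProbeObservation {
  completed: boolean;               // false on a timeout or a sentinel race
  sentinelEnvironReadable: boolean; // could the sandbox read the planted secret?
  sentinelMapsReadable: boolean;
  sentinelSignalable: boolean;
  selfFdReadable: boolean;          // the sandbox's own /proc/self/fd
  selfMapsReadable: boolean;        // the sandbox's own /proc/self/maps
}

function classifyProbe(o: ProbeObservation): ProbeVerdict {
  // A run that never finished proves nothing about the host.
  if (!o.completed) return "inconclusive";
  // Any cross-userns access to the sentinel is a real leak: fail closed.
  if (o.sentinelEnvironReadable || o.sentinelMapsReadable || o.sentinelSignalable) {
    return "unsafe";
  }
  // No leak, but the sandbox cannot use its own /proc: not proven safe either.
  if (!o.selfFdReadable || !o.selfMapsReadable) return "inconclusive";
  return "safe";
}
```

Only the safe verdict selects proc-bind; both other verdicts end at --tmpfs /proc, differing only in whether a retry is worthwhile.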

`sandbox.realProc`: opt-in PID isolation via a fresh procfs

typeclaw.json#sandbox.realProc (default false) opts into a two-phase form that adds full PID isolation on top of the environ guard:

unshare --pid --fork --mount --mount-proc -- \
  bwrap --unshare-user --unshare-ipc --unshare-uts --unshare-cgroup [--unshare-net] \
        … --ro-bind /proc /proc … bash -c <command>

The outer unshare creates a new PID + mount namespace as real root and mounts a fresh procfs scoped to that PID namespace. bwrap then runs inside it and must NOT re-unshare pid (a second PID ns with no matching procfs reintroduces the crash), so it unshares every namespace except pid explicitly instead of --unshare-all. Because the new procfs only contains the sandbox's own process tree, --ro-bind /proc /proc binds that scoped view — the agent runtime's pids are simply absent from the namespace (not merely unreadable, as under proc-bind), so even the non-secret PID metadata is hidden.
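
A sketch of the two-phase argv, to make the "every namespace except pid" rule concrete. buildRealProcArgv and its options object are hypothetical names for this example; the elided binds are again passed in by the caller.

```typescript
// Sketch: outer unshare creates the PID + mount namespace and mounts procfs;
// the inner bwrap must not unshare pid again, so it lists namespaces explicitly.
function buildRealProcArgv(
  command: string,
  opts: { unshareNet?: boolean; otherBinds?: string[] } = {},
): string[] {
  const bwrapNamespaces = ["--unshare-user", "--unshare-ipc", "--unshare-uts", "--unshare-cgroup"];
  if (opts.unshareNet === true) bwrapNamespaces.push("--unshare-net");
  return [
    "unshare", "--pid", "--fork", "--mount", "--mount-proc", "--",
    "bwrap", ...bwrapNamespaces,
    ...(opts.otherBinds ?? []),
    "--ro-bind", "/proc", "/proc", // binds the fresh, PID-namespace-scoped procfs
    "bash", "-c", command,
  ];
}
```

The builder never emits --unshare-all or --unshare-pid, which is the invariant the paragraph above describes.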

Two constraints make this opt-in rather than default. First, unshare --mount-proc needs real CAP_SYS_ADMIN (seccomp=unconfined only unblocks the unshare/clone syscalls; the kernel still rejects mount(2) of proc without the capability), so planStart() grants --cap-add=SYS_ADMIN only when realProc is set — a broader outer cap than the default proc-bind needs. Second, OrbStack rejects the mount even with the cap, so the container-side resolver probes canMountRealProc() and falls back to proc-bind there regardless. The net: realProc: true is worth it only on a host where the mount works AND hiding PID metadata from sandboxed bash is worth the CAP_SYS_ADMIN grant.

Strategy resolution order

applyBashSandbox (src/agent/plugin-tools.ts) picks the strategy in this order, normally once per container boot: real-proc if sandbox.realProc is set AND canMountRealProc() (the cap-mount works) → else proc-bind if canBindProcSafely() (the no-leak probe passes) → else --tmpfs /proc (degraded; external packages unavailable). Definitive probe results are cached process-globally (an inconclusive proc-bind verdict is never cached, so a later bash call can re-probe), and the strategy is read from the boot-time config snapshot (not live getConfig()), so it stays coherent with the boot-time --cap-add=SYS_ADMIN decision across a typeclaw reload.
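
The resolution order reduces to a short decision function. In this sketch the two probes are injected as callbacks and simplified to booleans (the real proc-bind probe also has an inconclusive outcome); pickProcStrategy and StrategyInputs are hypothetical names, not the repo's API.

```typescript
type ProcStrategy = "real-proc" | "proc-bind" | "tmpfs-proc";

interface StrategyInputs {
  realProcRequested: boolean;       // boot-time snapshot of sandbox.realProc
  canMountRealProc: () => boolean;  // does the fresh procfs mount work here?
  canBindProcSafely: () => boolean; // did the no-leak probe pass?
}

function pickProcStrategy(inputs: StrategyInputs): ProcStrategy {
  // real-proc only when it was opted into AND the mount works on this host.
  if (inputs.realProcRequested && inputs.canMountRealProc()) return "real-proc";
  // Otherwise the default, provided the safety probe passed.
  if (inputs.canBindProcSafely()) return "proc-bind";
  // Degraded fallback: empty /proc, external packages unavailable.
  return "tmpfs-proc";
}
```

Because of the short-circuit, the mount probe is never run unless realProc was requested.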

The proc-bind no-leak probe is a security check, not a capability check: it can return inconclusive (a probe timeout under a boot-time load spike, a sentinel race) which proves nothing about the host. resolveProcStrategy retries an inconclusive verdict on a staggered backoff (PROC_BIND_RETRY_BACKOFF_MS, src/sandbox/availability.ts) before degrading, so a transient hiccup on a capable host does not silently fall to --tmpfs /proc and break bun install. A definitive unsafe (a real cross-userns leak) still fails closed immediately with no retry.
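
The retry rule can be sketched as follows: only an inconclusive verdict is retried, each retry waits for the next backoff step, and a definitive verdict returns immediately. probeWithRetry is a hypothetical name, and the backoff values in the usage note are made up; the real schedule is PROC_BIND_RETRY_BACKOFF_MS.

```typescript
type Verdict = "safe" | "unsafe" | "inconclusive";

// Re-runs the probe while it stays inconclusive, sleeping between attempts.
// `sleep` is injectable so the schedule can be tested without real delays.
async function probeWithRetry(
  probe: () => Promise<Verdict>,
  backoffMs: readonly number[],
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<Verdict> {
  let verdict = await probe();
  for (const delay of backoffMs) {
    // "safe" and "unsafe" are definitive: stop without further attempts.
    if (verdict !== "inconclusive") break;
    await sleep(delay);
    verdict = await probe();
  }
  // Still inconclusive here means the retry budget is exhausted.
  return verdict;
}
```

For example, probeWithRetry(runProbe, [100, 300, 900]) makes at most four attempts, where runProbe stands for whatever function performs one probe run.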

When the strategy does resolve to --tmpfs /proc AND the command needs a real /proc (commandNeedsRealProc — a bun install/bunx/bun run), applyBashSandbox throws before running it, so the agent gets an actionable message instead of Bun's opaque NotDir deep in its install pipeline. Which error it throws depends on why it degraded — resolveProcStrategy returns the reason alongside the strategy:

definitive — the probe returned a proven unsafe (a real cross-userns leak). This is the only verdict the probe can establish as permanent, so it is the only definitive degrade: it fails closed for good and throws SandboxDegradedProcError ("this is an environment limit; retrying will not help"). It is also the only path that emits the once-per-process [sandbox] degraded /proc mode warning.
unverified — the safety probe never reached a definitive verdict within its retry budget. This covers both a transient boot-time CPU/IO storm (e.g. several containers starting at once, tripping the probe's own timeout) and a durably-incapable host (no usable namespaces, or a bwrap that starts but cannot set up its sandbox). The probe cannot prove a negative capability — only a leak is definitive — so an incapable host is not distinguishable from brief saturation and lands here too. This throws SandboxProcProbeUnverifiedError ("this is almost certainly temporary; retry the same command in a few seconds"), the opposite guidance. Because inconclusive is never cached, the next bash call re-probes from scratch: a capable-but-saturated host promotes to proc-bind once the spike passes, while a genuinely incapable one simply re-degrades each call (a cheap re-probe, never a permanent pin). Without this split, a single unlucky boot-storm probe degraded a fully-capable container and told the agent it was a permanent limit, so it gave up instead of retrying — leaving bunx/bun add broken until a manual typeclaw restart.
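
The split between the two degrade reasons can be sketched as a mapping from reason to error. The class names and the quoted guidance come from the text above; the class bodies, the DegradeReason type, and the degradedProcError helper are assumptions for this example, and the real messages are likely longer.

```typescript
type DegradeReason = "definitive" | "unverified";

// Permanent: the probe proved a cross-userns leak on this host.
class SandboxDegradedProcError extends Error {
  constructor() {
    super("this is an environment limit; retrying will not help");
    this.name = "SandboxDegradedProcError";
  }
}

// Possibly transient: the probe never reached a verdict within its retry budget.
class SandboxProcProbeUnverifiedError extends Error {
  constructor() {
    super("this is almost certainly temporary; retry the same command in a few seconds");
    this.name = "SandboxProcProbeUnverifiedError";
  }
}

// Picks the error whose guidance matches why the strategy degraded.
function degradedProcError(reason: DegradeReason): Error {
  return reason === "definitive"
    ? new SandboxDegradedProcError()
    : new SandboxProcProbeUnverifiedError();
}
```

Keeping the reason next to the strategy is what lets the caller give opposite guidance for the two cases instead of one generic failure.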

Re-exposing `/proc/self/exe` (the `--tmpfs /proc` fallback only)

Under the degraded --tmpfs /proc fallback (reached only when both real-proc and proc-bind are unavailable), one /proc entry is re-exposed: /proc/self/exe. JS package runners (bunx, and node's npx/pnpx) read it to locate their own interpreter — under an empty /proc that read fails and bunx panics in createFakeTemporaryNodeExecutable: error.FileNotFound. The sandbox resolves the running bun binary on the host (process.execPath, the concrete ELF /proc/self/exe would point at) and, after --tmpfs /proc, emits --ro-bind <bun> <bun> plus --symlink <bun> /proc/self/exe. Only that one symlink is restored; /proc/N/environ and the rest of /proc stay masked, so the FIREWORKS_API_KEY leak the --tmpfs /proc guard prevents is untouched.

A bare --ro-bind /proc/self/exe /proc/self/exe does not work and is the trap to avoid: bwrap resolves bind sources at setup time, when /proc/self is bwrap's own pid (it runs as PID 1 in the new namespace before forking the child), so the bind would capture the bwrap binary, not the runtime. The concrete-path + --symlink idiom is the only correct form. See src/sandbox/build.ts (the procSelfExe branch) and resolveProcSelfExe in src/sandbox/availability.ts.
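
The concrete-path plus --symlink idiom can be sketched as below. procSelfExeArgs is a hypothetical name and the runtime path is taken as a parameter to keep the function pure; per the text above, the real code resolves it from process.execPath.

```typescript
// Sketch: bwrap arguments for the degraded --tmpfs /proc fallback that restore
// only /proc/self/exe. `runtimePath` must be the concrete path of the runtime
// binary, never /proc/self/exe itself (bwrap would resolve that to its own binary).
function procSelfExeArgs(runtimePath: string): string[] {
  return [
    "--tmpfs", "/proc",                         // empty /proc: environ and the rest stay masked
    "--ro-bind", runtimePath, runtimePath,      // make the runtime binary visible at its real path
    "--symlink", runtimePath, "/proc/self/exe", // the single re-exposed /proc entry
  ];
}
```

A caller inside the container would pass process.execPath; the order matters, since the symlink has to be created after the tmpfs is mounted over /proc.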

The restored target is always the bun binary, and that is correct only because the container is bun-centric. The base image (oven/bun:1-slim) ships no real node: node is a symlink to bun, and bunx/npx/pnpx all resolve to bun via Bun's fake-node model (Bun creates a temp node shim pointing at itself and prepends it to PATH). So every process that reads /proc/self/exe inside the sandbox is bun — pointing it at bun is accurate, not a hack. The one case this would be wrong is a real node binary installed into the image and run without --bun: that process would read /proc/self/exe, get bun's path, and mis-resolve process.execPath (breaking child_process.fork, worker threads, and self-re-exec). The image does not include node today; if that ever changes, resolveProcSelfExe must resolve the actual interpreter per invocation instead of always returning bun.

Per-tool sandbox (bwrap)