Sandbox Guide

blq can enforce execution boundaries on registered commands using Linux namespace isolation. This prevents commands from accessing the network, writing outside the workspace, or spawning unconstrained processes.

Quick Start

# Register a command with a sandbox preset
blq commands register test "pytest" --sandbox test

# Or add sandbox to an existing command in .bird/commands.toml:
# [commands.test.sandbox]
# network = "none"
# filesystem = "readonly"

# Run — sandbox is automatically enforced
blq run test

# See what sandbox specs are in use
blq sandbox list

How It Works

When a command has a sandbox spec, blq wraps it in bubblewrap (bwrap), which creates Linux namespaces to isolate the command:

| Dimension | What it controls | Enforcement |
|---|---|---|
| `network` | Network access | `--unshare-net` (full isolation) |
| `filesystem` | File writes | `--ro-bind` / `--bind` mount strategy |
| `processes` | Process visibility | `--unshare-pid` |
| `tmpfs` | Scratch space | `--tmpfs` with `--size` limit |
| `timeout` | Wall-clock time | Subprocess timeout |
| `memory` | Peak memory | Cgroup limit (systemd engine) |
| `cpu` | CPU time | Cgroup limit (systemd engine) |

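The safety flags `--die-with-parent` and `--new-session` are always applied. The first ensures the sandboxed command is killed if its parent process dies; the second runs it in a new terminal session so it cannot inject input into the controlling terminal.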
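The mapping in the table can be sketched as a function that turns a spec into a bwrap argument list. This is a minimal illustration, not blq's actual implementation: the function name `build_bwrap_args`, its parameters, the choice to bind the whole host root, and mounting the scratch tmpfs at `/tmp` are all assumptions. Only the bwrap flags themselves come from the table above.

```python
from typing import Dict, List, Optional


def build_bwrap_args(
    spec: Dict[str, str],
    workspace: str,
    command: List[str],
    tmpfs_bytes: Optional[int] = None,
) -> List[str]:
    """Translate a sandbox spec into a bwrap argument list (hypothetical helper)."""
    # The two safety flags are applied regardless of the spec.
    args = ["bwrap", "--die-with-parent", "--new-session"]

    # filesystem: bind the host root read-only, then re-bind the workspace
    # writable when the spec allows workspace writes (assumed mount strategy).
    filesystem = spec.get("filesystem", "unrestricted")
    if filesystem == "unrestricted":
        args += ["--bind", "/", "/"]
    else:
        args += ["--ro-bind", "/", "/"]
        if filesystem == "workspace_only":
            args += ["--bind", workspace, workspace]

    # network: a fresh network namespace has no route to the outside.
    if spec.get("network") == "none":
        args += ["--unshare-net"]

    # processes: a fresh PID namespace with its own /proc hides host processes.
    if spec.get("processes") == "isolated":
        args += ["--unshare-pid", "--proc", "/proc"]

    # tmpfs: --size applies to the next --tmpfs argument (/tmp is an assumption).
    if tmpfs_bytes is not None:
        args += ["--size", str(tmpfs_bytes), "--tmpfs", "/tmp"]

    return args + command
```

Timeout, memory, and CPU do not appear here because, per the table, they are enforced outside bwrap (subprocess timeout and cgroup limits).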

Presets

Named presets cover common use cases:

| Preset | network | filesystem | timeout | memory | processes |
|---|---|---|---|---|---|
| `readonly` | none | readonly | 30s | 256m | isolated |
| `test` | none | readonly | 60s | 512m | isolated |
| `build` | none | workspace_only | 5m | 2g | isolated |
| `integration` | localhost | workspace_only | 10m | 4g | visible |
| `unrestricted` | unrestricted | unrestricted | 30m | - | visible |
| `none` | unrestricted | unrestricted | - | - | visible |

# Use a preset
blq commands register test "pytest" --sandbox test

# Or in commands.toml
[commands.test]
cmd = "pytest"
sandbox = "test"

Custom Specs

For fine-grained control, use a [commands.NAME.sandbox] section:

[commands.test]
cmd = "pytest tests/"

[commands.test.sandbox]
network = "none"
filesystem = "readonly"
timeout = "120s"
memory = "1g"
processes = "isolated"
tmpfs = "200m"
paths_hidden = ["/home", "/root"]
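Note that the suffix `m` means minutes in `timeout` but a memory size in `memory` and `tmpfs`. A hedged sketch of how such values could be parsed follows. The helper names `parse_duration` and `parse_size` are hypothetical, and the use of binary units (1m = 1024 * 1024 bytes) is an assumption the guide does not confirm.

```python
import re

# Unit tables are assumptions; blq's accepted suffixes may differ.
_DURATION_UNITS = {"s": 1, "m": 60, "h": 3600}
_SIZE_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def _parse(value: str, units: dict, kind: str) -> int:
    """Parse '<integer><unit>' using the given unit table."""
    match = re.fullmatch(r"(\d+)([a-z])", value.strip().lower())
    if match is None or match.group(2) not in units:
        raise ValueError(f"invalid {kind}: {value!r}")
    return int(match.group(1)) * units[match.group(2)]


def parse_duration(value: str) -> int:
    """Return a timeout such as '120s' or '5m' in seconds."""
    return _parse(value, _DURATION_UNITS, "duration")


def parse_size(value: str) -> int:
    """Return a size such as '200m' or '1g' in bytes."""
    return _parse(value, _SIZE_UNITS, "size")
```

With these helpers, `parse_duration("5m")` yields seconds while `parse_size("5m")` yields bytes, which is why the two dimensions need separate parsers.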

Grading

Each sandbox spec maps to a formal grade on the Ma framework's lattice:

World coupling (grade_w):

- sealed: no network, no reads beyond /usr
- pinhole: no network, readonly workspace
- scoped: no network, workspace writes only
- broad: localhost network access
- open: unrestricted

Effects ceiling:

- Level 2: readonly, no network (can read + compute only)
- Level 4: workspace writes, no network (can mutate files)
- Level 7: workspace writes + visible processes (can spawn daemons)
- Level 8: network access (can reach external services)

blq sandbox inspect test
# Grade W: pinhole
# Effects Ceiling: 2
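The grading rules above can be read as a small decision procedure. The sketch below is one possible reading, not blq's grading code: the function names are hypothetical, `sealed` is left out because the guide does not say which spec value selects it, and combinations the guide does not list (for example a readonly filesystem with visible processes) are resolved by assumption.

```python
def grade_w(network: str, filesystem: str) -> str:
    """Map network/filesystem settings to a world-coupling grade (sketch)."""
    # Assumption: any unrestricted dimension makes the whole spec "open".
    if network == "unrestricted" or filesystem == "unrestricted":
        return "open"
    if network == "localhost":
        return "broad"
    if network == "none" and filesystem == "workspace_only":
        return "scoped"
    if network == "none" and filesystem == "readonly":
        return "pinhole"
    raise ValueError(f"ungraded combination: {network!r}, {filesystem!r}")


def effects_ceiling(network: str, filesystem: str, processes: str) -> int:
    """Map a spec to its effects ceiling, checking the highest level first."""
    if network != "none":
        return 8  # network access
    if filesystem != "readonly" and processes == "visible":
        return 7  # workspace writes + visible processes
    if filesystem != "readonly":
        return 4  # workspace writes, no network
    return 2      # readonly, no network
```

Applied to the `test` preset (no network, readonly, isolated processes), this gives `pinhole` and ceiling 2, matching the `blq sandbox inspect test` output shown above.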

Discovery Workflow

The recommended workflow for adding sandbox specs to a project:

1. Profile

Discover what a command actually accesses:

blq sandbox profile test

This wraps the command in strace (one-time, 2-10x overhead) and reports files read/written, network connections, and subprocess spawns.
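To give a feel for what such a report is built from, here is a rough sketch that classifies lines of strace output into reads, writes, network connections, and spawned programs. It is illustrative only: the function `summarize_strace` and its return shape are hypothetical, the regular expressions cover only the common `open`/`openat`, `connect`, and `execve` line shapes, and how blq actually parses strace output is not described in this guide.

```python
import re

# Patterns for common strace line shapes (deliberately incomplete).
_OPEN = re.compile(r'\bopen(?:at)?\((?:\w+, )?"([^"]+)", ([A-Z_|]+)')
_EXEC = re.compile(r'\bexecve\("([^"]+)"')
_CONNECT = re.compile(r'\bconnect\(\d+, \{sa_family=(AF_INET6?)\b')


def summarize_strace(lines):
    """Classify strace lines into files read/written, connects, and execs."""
    summary = {"read": set(), "written": set(), "network": [], "spawned": set()}
    for line in lines:
        # A connect attempt counts even if it has not completed yet
        # (non-blocking sockets report EINPROGRESS).
        connect = _CONNECT.search(line)
        if connect:
            summary["network"].append(connect.group(1))
            continue
        if " = -1 " in line:
            continue  # ignore other failed syscalls
        opened = _OPEN.search(line)
        if opened:
            path, flags = opened.groups()
            writes = "O_WRONLY" in flags or "O_RDWR" in flags
            summary["written" if writes else "read"].add(path)
            continue
        spawned = _EXEC.search(line)
        if spawned:
            summary["spawned"].add(spawned.group(1))
    return summary
```

A profile in which `network` is empty and `written` only contains paths under the workspace would be consistent with a `network = "none"`, `filesystem = "workspace_only"` spec.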

2. Suggest

Combine the strace profile with observed resource metrics:

blq sandbox suggest test

This queries past run data for memory peak and CPU usage, then suggests a spec with headroom (2x memory, 3x timeout).

3. Declare

Add the suggested spec to commands.toml:

[commands.test.sandbox]
network = "none"
filesystem = "readonly"
timeout = "2m"
memory = "1g"

4. Enforce

Run normally — the sandbox is automatically applied:

blq run test

If the command fails due to sandbox restrictions, blq generates a structured info event with the sandbox context, queryable via blq events.

5. Tighten

After accumulating runs, auto-narrow the spec based on observed resource usage:

blq sandbox tighten test
# Tightening sandbox spec for 'test' (from 15 runs):
#   memory: 512m -> 256m
#   timeout: 1m -> 30s
#   cpu: 30s -> 15s
# Updated commands.toml

Use --dry-run to preview changes without writing:

blq sandbox tighten test --dry-run

Tightening only reduces bounds — it never loosens them. It applies headroom (2x memory, 2x CPU, 3x timeout) to observed maximums. Requires at least 3 runs for reliable data.
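The tightening rule can be sketched as follows. This is an illustration of the stated behavior under assumptions, not blq's code: the function name `tighten`, the plain-number representation of limits (all values for a dimension in the same unit), and the decision to leave dimensions without an existing bound untouched are all hypothetical.

```python
from typing import Dict, List

# Headroom factors and minimum run count as stated in the guide.
HEADROOM = {"memory": 2, "cpu": 2, "timeout": 3}
MIN_RUNS = 3


def tighten(current: Dict[str, float], runs: List[Dict[str, float]]) -> Dict[str, float]:
    """Return a spec whose bounds are narrowed toward observed usage."""
    if len(runs) < MIN_RUNS:
        return dict(current)  # not enough data; leave the spec unchanged
    tightened = dict(current)
    for dim, factor in HEADROOM.items():
        observed = [run[dim] for run in runs if dim in run]
        if dim not in current or not observed:
            continue  # assumption: never introduce a bound that was not set
        proposed = max(observed) * factor
        # min() guarantees a bound is only ever reduced, never loosened.
        tightened[dim] = min(current[dim], proposed)
    return tightened
```

For example, with current limits of 512 (memory), 60 (timeout), and 30 (cpu) and observed maximums of 128, 10, and 7.5, the result is 256, 30, and 15, which lines up with the sample output above.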

6. Query

Check sandbox status across all commands:

blq sandbox list          # overview of all specs and grades
blq sandbox inspect test  # detailed spec for one command

Auto-Detection

When using blq init --detect, detected commands get default sandbox presets:

| Command type | Default sandbox |
|---|---|
| test | `test` (readonly, no network) |
| build | `build` (workspace writes, no network) |
| lint | `readonly` (readonly, no network) |
| clean | `build` (needs to delete files) |
| format | `build` (modifies source files) |

Commands without a matching type (e.g., docker-build, configure) get no sandbox by default.

MCP Integration

AI agents can query and manage sandbox specs:

// Query sandbox info for one command
{"tool": "sandbox_info", "command": "test"}

// Query all commands
{"tool": "sandbox_info"}

// Register with sandbox preset
{"tool": "register_command", "name": "test", "cmd": "pytest", "sandbox": "test"}

The sandbox_info tool returns the spec, grades, and observed resource metrics (memory peak, CPU usage, average duration) when monitoring data is available.

Annotators

Annotators are plugins that enrich stored events with additional context. They run after events are written to the database and add structured annotations to the metadata JSON column.

Each annotation has:

- type: what kind of enrichment (source, provenance, diagnostic)
- display: when to show it: inline (always), detail (inspect only), hidden (queryable only)
- data: annotator-specific payload

Annotators declare whether they're eager (run during blq run) or deferred (run on demand). Eager annotators execute in Window 2 alongside event storage. Deferred annotators run when explicitly requested.

Annotators are discovered via Python entry points (blq.annotators group).
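As a rough sketch of the shape described above, an annotation can be modeled as a small record, with the display mode deciding where it is shown. The class `Annotation` and the helpers `visible_annotations` and `discover_annotators` are hypothetical names, not blq's API. Only the three fields, the display values, and the `blq.annotators` entry point group come from this guide. The `entry_points(group=...)` call requires Python 3.10 or later.

```python
from dataclasses import dataclass, field
from importlib.metadata import entry_points
from typing import Any, Dict, List


@dataclass
class Annotation:
    type: str     # e.g. "source", "provenance", "diagnostic"
    display: str  # "inline", "detail", or "hidden"
    data: Dict[str, Any] = field(default_factory=dict)


def visible_annotations(annotations: List[Annotation], inspect: bool) -> List[Annotation]:
    """Select annotations to show: inline always, detail only when inspecting."""
    shown = {"inline", "detail"} if inspect else {"inline"}
    return [a for a in annotations if a.display in shown]


def discover_annotators() -> list:
    """Load every object registered under the blq.annotators entry point group."""
    return [ep.load() for ep in entry_points(group="blq.annotators")]
```

Annotations marked `hidden` are never returned by `visible_annotations`; per the description above they remain queryable in the metadata column only.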

Requirements

- bwrap (bubblewrap) for namespace isolation: `sudo apt install bubblewrap`
- strace for profiling (optional): `sudo apt install strace`
- systemd for cgroup resource limits (optional, for memory/CPU enforcement)

Engines

blq uses multiple enforcement engines that compose together:

| Engine | Dimensions | Install |
|---|---|---|
| bwrap | network, filesystem, processes, tmpfs | `apt install bubblewrap` |
| systemd | memory, cpu | Built-in on systemd systems |

Engines are discovered via Python entry points and selected based on which spec dimensions need enforcement.
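Selection by dimension can be sketched as a set intersection between the dimensions a spec declares and the dimensions each engine covers. The names `ENGINE_DIMENSIONS` and `select_engines` are hypothetical, and treating every declared dimension as needing enforcement is a simplifying assumption. The engine-to-dimension mapping is taken from the table above.

```python
from typing import Dict, List

# Which dimensions each engine can enforce (from the table above).
ENGINE_DIMENSIONS = {
    "bwrap": {"network", "filesystem", "processes", "tmpfs"},
    "systemd": {"memory", "cpu"},
}


def select_engines(spec: Dict[str, str]) -> List[str]:
    """Return the engines needed to enforce the dimensions present in spec."""
    declared = set(spec)
    return [name for name, dims in ENGINE_DIMENSIONS.items() if dims & declared]
```

A spec that only sets `timeout` selects no engine in this sketch, since the timeout is enforced by the subprocess timeout rather than by bwrap or systemd.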