Harbor — the agent-evaluation framework and official Terminal-Bench harness (Python). Run agent evals and build RL environments: `harbor run --agent … --model …`. Ships with Python 3.12, git, and the `harbor` CLI.

    smolvm pack pull registry.smolmachines.com/library/harbor:latest

Verified steps, with the smolvm CLI installed:

    # Pull (Apple Silicon / arm64; use :amd64 on Intel/AMD)
    smolvm pack pull registry.smolmachines.com/library/harbor:arm64 -o harbor.smolmachine

    # Run a one-off command (ephemeral: the VM is discarded when it exits)
    smolvm pack run --sidecar harbor.smolmachine harbor --version

    # Or run it as a persistent machine: create, start, exec, stop
    smolvm machine create --name harbor --from harbor.smolmachine
    smolvm machine start --name harbor
    smolvm machine exec --name harbor -- harbor --version
    smolvm machine stop --name harbor

Harbor drives coding agents (Claude Code, Codex, OpenHands). Provide the relevant API keys and an execution backend (Docker / Daytona / Modal) at runtime.

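
The pull step above asks you to pick the `:arm64` or `:amd64` tag by hand. As a minimal sketch, that choice could be derived from `uname -m`. The helper names `harbor_tag_for` and `harbor_pull_cmd` are hypothetical and not part of smolvm or Harbor. The sketch assumes only the two tags named above exist, and it prints the pull command rather than running it.

```shell
#!/bin/sh
# Hypothetical helpers; not part of the smolvm or harbor CLIs.

# Map an architecture string (as printed by `uname -m`) to an image tag.
# Assumption: only the :arm64 and :amd64 tags mentioned above are used.
harbor_tag_for() {
  case "$1" in
    arm64|aarch64) echo "arm64" ;;   # macOS reports arm64, Linux reports aarch64
    x86_64|amd64)  echo "amd64" ;;   # Intel/AMD hosts
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Print the pull command for the given architecture.
# With no argument, the architecture of the current host is used.
harbor_pull_cmd() {
  arch="${1:-$(uname -m)}"
  tag="$(harbor_tag_for "$arch")" || return 1
  echo "smolvm pack pull registry.smolmachines.com/library/harbor:${tag} -o harbor.smolmachine"
}
```

Calling `harbor_pull_cmd` with no arguments prints the command for the current host. Printing instead of executing keeps the sketch safe to run on a machine where smolvm is not installed.
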
| tag | digest | size | platforms | created |
|---|---|---|---|---|
| latest | sha256:2e98b226edf2… | 299 MB | linux/amd64, linux/arm64 | 6/5/2026 |

python:3.12-slim · container · /bin/sh · / · 4 CPU, 8192 MB · linux/amd64 · multi · 0.9.0