Installation
Install CORAL and its dependencies.
Prerequisites
- Python 3.11+
- uv — fast Python package manager (install uv)
- git — for worktree management
- tmux — recommended for session management (CORAL auto-wraps in tmux by default)
You'll also need at least one supported agent runtime:
- Claude Code (default) — requires an Anthropic API key
- Codex — OpenAI's coding agent
- OpenCode — open-source alternative
For Harbor-based benchmarks (SWE-bench, terminal-bench), you additionally need:
- Docker — installed and running on the host
- Harbor CLI — invoked via
uvx harbor(no separate install needed if you haveuv)
Install from source
git clone https://github.com/yanyh528/CORAL.git
cd CORALBasic install
uv syncWith development tools
uv sync --extra dev # pytest, ruff, mypyEverything
uv sync --all-extras # All optional dependenciesVerify installation
coral --versionYou should see something like coral 0.2.0.
Optional dependencies
CORAL currently exposes two optional extras:
| Extra | Install | Purpose |
|---|---|---|
dev | uv sync --extra dev | Test and lint tools (pytest, ruff, mypy) |
ui | uv sync --extra ui | Web dashboard (Starlette + uvicorn) |
all | uv sync --extra all | Everything above |
Benchmark evaluators
Standardized coding benchmarks like SWE-bench Verified and terminal-bench are shipped as example tasks under examples/, not as Python extras. Their graders delegate evaluation to Harbor, which runs each instance inside its own Docker container.
Run them directly with coral start — no extra uv sync step is required:
# SWE-bench Verified (meta-solver optimization over 500 GitHub issues)
coral start -c examples/swebench-verified/task.yaml
# terminal-bench (meta-solver optimization over terminal/shell tasks)
coral start -c examples/terminal-bench/task.yamlEach task's seed includes a solve.py baseline (a Terminus2-based Harbor agent) and a grader that invokes uvx harbor run -d <dataset> with tiered evaluation (5 → 30 → all instances). See the Benchmarks guide for details.
Docker-in-Docker is not supported. Because Harbor spawns Docker containers, CORAL itself must run on the host machine (not inside a container) for these benchmarks. Other examples that don't use Harbor are unaffected.
Next steps
Once installed, head to the Quick Start to run your first task.
