Quick Start

Run your first CORAL task in 5 minutes.

This guide walks you through creating a task, writing a grader, and launching agents.

1. Scaffold a task

coral init my-task

This creates a self-contained task with a packaged grader:

my-task/
├── task.yaml                              # task config + grader entrypoint
├── seed/
│   └── solution.py                        # baseline the agent iterates on
└── grader/                                # standalone Python package
    ├── pyproject.toml                     # name: my-task-grader, deps: [coral]
    └── src/
        └── my_task_grader/
            ├── __init__.py                # re-exports Grader
            └── grader.py                  # class Grader(TaskGrader): ...

The grader ships as a real Python package so it gets its own isolated venv at run time (created from grader.setup in task.yaml).

2. Define your task

Edit my-task/task.yaml:

task:
  name: my-task
  description: "Optimize the function in solution.py to print a higher score."

grader:
  entrypoint: "my_task_grader.grader:Grader"
  setup:
    - "uv pip install -e ./grader"
  timeout: 300
  direction: maximize
  args:
    program_file: "solution.py"

agents:
  count: 2
  runtime: claude_code
  model: claude-sonnet-4-6

workspace:
  repo_path: "./seed"
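
The grader.entrypoint value uses the common "module.path:ClassName" form. As a rough illustration of how such a string is usually resolved (an assumption about the general convention, not a description of CORAL's internals), the sketch below splits on the colon and imports the module. load_entrypoint is a hypothetical helper name, and a standard-library class stands in for my_task_grader, which is only importable inside the grader's venv.

```python
from importlib import import_module


def load_entrypoint(spec: str):
    """Resolve a 'module.path:Name' string to the named attribute."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"Expected 'module.path:Name', got {spec!r}")
    module = import_module(module_name)
    return getattr(module, attr)


# Stand-in for "my_task_grader.grader:Grader", using a stdlib class instead.
decoder_cls = load_entrypoint("json.decoder:JSONDecoder")
print(decoder_cls.__name__)
```

If the entrypoint in task.yaml has a typo in the module path or class name, this style of lookup fails with ImportError or AttributeError, which is a useful first thing to check when the grader cannot be loaded.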

3. Write a grader

The scaffold already includes a working stub at my-task/grader/src/my_task_grader/grader.py that runs solution.py and parses a single float from stdout. Customise evaluate() for your scoring logic:

from coral.grader import TaskGrader
from coral.types import ScoreBundle


class Grader(TaskGrader):
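    def evaluate(self) -> float | ScoreBundle:
        # File name comes from grader.args in task.yaml; falls back to solution.py.
        program_file = self.args.get("program_file", "solution.py")
        result = self.run_program(program_file)

        # A non-zero exit code means the program crashed; report truncated stderr.
        if result.returncode != 0:
            return self.fail(f"Program crashed: {result.stderr[:200]}")

        # The program must print exactly one number to stdout.
        try:
            return float(result.stdout.strip())
        except ValueError:
            return self.fail("Could not parse output as a number")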
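
For the stub above to succeed, solution.py has to print a single number and nothing else. The following is a minimal sketch of what such a baseline could look like. The objective function and starting value are hypothetical, and the seed file that coral init actually generates may differ.

```python
# solution.py (hypothetical baseline; the scaffolded seed file may differ)


def objective(x: float) -> float:
    """Toy function whose maximum of 10.0 is reached at x = 3.0."""
    return 10.0 - (x - 3.0) ** 2


def main() -> None:
    x = 0.0  # deliberately poor starting guess that an agent can improve
    # Print exactly one number so float(stdout.strip()) in the grader succeeds.
    print(objective(x))


if __name__ == "__main__":
    main()
```

Any extra output on stdout (debug prints, progress messages) would make the float() call in the stub raise ValueError, so diagnostics are better sent to stderr.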

On self you have the attributes codebase_path (the agent's worktree), private_dir (.coral/private/, for hidden answer keys), args (the dict from grader.args), and timeout, plus the helpers run_program(filename), score(value, explanation=...), and fail(reason). See the Custom Grader guide for the full API.
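
As one way to use a hidden answer key, the sketch below compares program output against expected answers and computes an accuracy. The file name answers.json, its JSON-list format, and the helper names load_answers and accuracy are all assumptions made for illustration. A temporary directory stands in for self.private_dir so the code runs outside CORAL.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


def load_answers(private_dir: Path, filename: str = "answers.json") -> list[str]:
    """Read a JSON list of expected outputs (hypothetical file name and format)."""
    return json.loads((private_dir / filename).read_text())


def accuracy(predicted: list[str], expected: list[str]) -> float:
    """Fraction of expected answers matched position by position."""
    if not expected:
        return 0.0
    hits = sum(p.strip() == e.strip() for p, e in zip(predicted, expected))
    return hits / len(expected)


# Stand-in for self.private_dir so the sketch runs without CORAL installed.
with tempfile.TemporaryDirectory() as tmp:
    private_dir = Path(tmp)
    (private_dir / "answers.json").write_text(json.dumps(["4", "9", "16"]))
    expected = load_answers(private_dir)
    value = accuracy(["4", "9", "15"], expected)
    print(f"{value:.2f}")
```

Inside evaluate(), the predictions would come from result.stdout.splitlines(), and the result could be returned with self.score(value, explanation=...) so the agent sees why it received that score.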

If the grader needs extra dependencies (numpy, torch, etc.), add them to dependencies in my-task/grader/pyproject.toml. Both coral validate and the daemon install them into the grader's venv.

4. Validate the grader

Before launching agents, verify your grader works against the seed code:

coral validate my-task

This runs the grader once and shows you the score. Fix any issues before proceeding.

5. Launch agents

coral start -c my-task/task.yaml

CORAL will:

  1. Create a .coral/ shared state directory
  2. Create isolated git worktrees for each agent
  3. Generate a CORAL.md instruction file in each worktree
  4. Spawn the agents
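
Steps 4 and 5 can be chained so that agents launch only when validation passes. The sketch below uses only the two commands shown in this guide. It assumes that coral validate exits with a non-zero status on failure and that coral start returns control to the caller, neither of which is confirmed here. workflow_commands and validate_then_start are hypothetical helper names.

```python
from __future__ import annotations

import shutil
import subprocess


def workflow_commands(task_dir: str) -> list[list[str]]:
    """Build the validate and start commands from this guide for a task directory."""
    return [
        ["coral", "validate", task_dir],
        ["coral", "start", "-c", f"{task_dir}/task.yaml"],
    ]


def validate_then_start(task_dir: str) -> bool:
    """Run each command in order and stop at the first non-zero exit code."""
    if shutil.which("coral") is None:
        print("coral CLI not found on PATH; nothing was run")
        return False
    for cmd in workflow_commands(task_dir):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Assumption: a failed validation is signalled by a non-zero status.
            print(f"{' '.join(cmd)} exited with {result.returncode}; stopping")
            return False
    return True


if __name__ == "__main__":
    validate_then_start("my-task")
```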

6. Monitor progress

# View leaderboard
coral log

# Agent health and status
coral status

# Open web dashboard
coral ui

7. Stop when done

coral stop

What happens next?

Each agent autonomously:

  1. Reads the CORAL.md instructions
  2. Explores the codebase and researches approaches
  3. Makes changes and calls coral eval -m "description"
  4. Sees the score and feedback
  5. Iterates, shares notes, and builds skills
  6. Repeats until stopped
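
As a mental model only, the evaluate-and-iterate cycle above resembles a simple best-so-far search. The toy sketch below is not how CORAL or its agents are implemented. It only illustrates how the direction setting in task.yaml (maximize, or presumably minimize) decides which score counts as an improvement. is_better and iterate are hypothetical names, and the candidates are plain numbers standing in for code changes.

```python
from __future__ import annotations

from typing import Callable, Iterable


def is_better(new: float, best: float | None, direction: str = "maximize") -> bool:
    """Return True when a new score beats the best score seen so far."""
    if best is None:
        return True
    return new > best if direction == "maximize" else new < best


def iterate(
    candidates: Iterable[float],
    evaluate: Callable[[float], float],
    direction: str = "maximize",
) -> tuple[float | None, float | None]:
    """Score each candidate in turn and keep the best one."""
    best_candidate: float | None = None
    best_score: float | None = None
    for candidate in candidates:
        score = evaluate(candidate)  # stands in for one `coral eval` call
        if is_better(score, best_score, direction):
            best_candidate, best_score = candidate, score
    return best_candidate, best_score


def toy_score(x: float) -> float:
    """Toy objective with its peak of 10.0 at x = 3.0."""
    return 10.0 - (x - 3.0) ** 2


print(iterate([0.0, 2.0, 3.0, 5.0], toy_score))
```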

See Concepts for the full architecture, or the CLI Reference for all available commands.