Running the tests

1. Nix shell for testing
2. The host test entry point
3. The test entry point
4. Test fixtures
5. Dagger wrapper for batch execution

Each dagger call bash block in the org files tangles to a command file under tests/commands/. The expected output tangles to tests/expected/. The test runner iterates over command files, runs each one, and compares stdout to the expected output.

1. Nix shell for testing

test-host.sh expects pytest and pytest-asyncio to be available on the host. This shell.nix provides them via nix-shell so the test suite can run without polluting the system Python:

nix-shell --run "./test-host.sh"

{ pkgs ? import <nixpkgs> {} }:

pkgs.mkShell {
  packages = with pkgs; [
    python313
    python313Packages.pytest
    python313Packages.pytest-asyncio
  ];
}

2. The host test entry point

The test script runs pytest directly. Tests call dagger call as subprocesses — no SDK connection needed.

set -eu
cd "$(dirname "$0")"
for arg in "$@"; do
    if [ "$arg" = "-v" ]; then
        export DAGGER_VERBOSE=1
    fi
done
exec pytest tests/test_sdk.py "$@"

3. The test entry point

Dogfooding: run the test suite inside a DinD container using the project's own dind-run-tests function.

set -eu
cd "$(dirname "$0")"
exec dagger ${DAGGER_EXTRA_ARGS:-} call dind-run-tests stdout

4. Test fixtures

Each test is a pair: a command file in tests/commands/ and an expected output file in tests/expected/. The test runner runs each command with bash and compares stdout to the expected output.

Library tests (under tests/commands/) run from the project root. Example tests (under examples/*/tests/commands/) run from the example directory so that dagger call loads the example module.

import os
import subprocess
from pathlib import Path

import pytest

ROOT = Path(__file__).resolve().parents[1]
TESTS_DIR = Path(__file__).parent


def _collect(commands_dir, cwd):
    """Collect (cmd_file, expected_dir, cwd) triples from a commands dir."""
    if not commands_dir.exists():
        return []
    expected_dir = commands_dir.parent / "expected"
    return [
        (f, expected_dir, cwd)
        for f in sorted(commands_dir.glob("*"))
    ]


def _test_id(entry):
    cmd_file, _, cwd = entry
    if cwd == ROOT:
        return cmd_file.name
    return f"{cwd.name}/{cmd_file.name}"


lib_tests = _collect(TESTS_DIR / "commands", ROOT)
example_tests = [
    entry
    for example_dir in sorted(ROOT.glob("examples/*"))
    for entry in _collect(example_dir / "tests" / "commands", example_dir)
]
all_tests = lib_tests + example_tests


@pytest.mark.parametrize(
    "cmd_file,expected_dir,cwd",
    all_tests,
    ids=[_test_id(e) for e in all_tests],
)
def test_command(tmp_path, cmd_file, expected_dir, cwd):
    env = {**os.environ, "TMP": str(tmp_path), "ROOT": str(ROOT)}
    env["PATH"] = str(ROOT / "tests") + ":" + env.get("PATH", "")
    result = subprocess.run(
        ["bash", str(cmd_file)],
        capture_output=True,
        text=True,
        cwd=str(cwd),
        env=env,
    )
    assert result.returncode == 0, (
        f"command {cmd_file.name} failed (exit {result.returncode}):\n"
        f"--- stdout ---\n{result.stdout}\n"
        f"--- stderr ---\n{result.stderr}"
    )
    expected_file = expected_dir / cmd_file.name
    expected = expected_file.read_text().rstrip("\n") if expected_file.exists() else ""
    assert result.stdout.rstrip("\n") == expected, (
        f"{cmd_file.name}: got {result.stdout.rstrip(chr(10))!r},"
        f" expected {expected!r}"
    )

5. Dagger wrapper for batch execution

Dagger's TUI writes terminal escape sequences (\e]11;?) directly to /dev/tty, which breaks Emacs comint sessions. This wrapper finds the real dagger binary, sets DARK=1 to skip the terminal background query, and redirects stderr to suppress progress output.

When DAGGER_VERBOSE=1 (set by test-host.sh -v), the wrapper uses --progress=plain instead and preserves stderr so test failures are diagnosable.

In non-verbose mode, stderr is preserved so test failures are always diagnosable. Pytest captures it and only shows it when a test fails, so successful runs stay clean.

export DAGGER_NO_NAG=1
export DARK=1
DAGGER_PATH="$(which -a dagger|grep -v /tests/|head -1)"
args=("$@")
if [ "${DAGGER_VERBOSE:-}" = "1" ]; then
    exec "$DAGGER_PATH" --progress=plain --silent "${args[@]}"
else
    exec "$DAGGER_PATH" --silent "${args[@]}"
fi