Logos Testing Framework

Declarative, multi-node blockchain testing for the Logos network

The Logos Testing Framework enables you to test consensus, data availability, and transaction workloads across local processes, Docker Compose, and Kubernetes deployments—all with a unified scenario API.

Get Started


Core Concept

Everything in this framework is a Scenario.

A Scenario is a controlled experiment over time, composed of:

  • Topology — The cluster shape (validators, executors, network layout)
  • Workloads — Traffic and conditions that exercise the system (transactions, DA, chaos)
  • Expectations — Success criteria verified after execution (liveness, inclusion, recovery)
  • Duration — The time window for the experiment

This single abstraction makes tests declarative, portable, and composable.


How It Works

flowchart LR
    Build[Define Scenario] --> Deploy[Deploy Topology]
    Deploy --> Execute[Run Workloads]
    Execute --> Evaluate[Check Expectations]
    
    style Build fill:#e1f5ff
    style Deploy fill:#fff4e1
    style Execute fill:#ffe1f5
    style Evaluate fill:#e1ffe1

  1. Define Scenario — Describe your test: topology, workloads, and success criteria
  2. Deploy Topology — Launch validators and executors using host, compose, or k8s runners
  3. Run Workloads — Drive transactions, DA traffic, and chaos operations
  4. Check Expectations — Verify consensus liveness, inclusion, and system health

Key Features

Declarative API

  • Express scenarios as topology + workloads + expectations
  • Reuse the same test definition across different deployment targets
  • Compose complex tests from modular components

Multiple Deployment Modes

  • Host Runner: Local processes for fast iteration
  • Compose Runner: Containerized environments with node control
  • Kubernetes Runner: Production-like cluster testing

Built-in Workloads

  • Transaction submission with configurable rates
  • Data availability (DA) blob dispersal and sampling
  • Chaos testing with controlled node restarts

Comprehensive Observability

  • Real-time block feed for monitoring consensus progress
  • Prometheus/Grafana integration for metrics
  • Per-node log collection and debugging

Quick Example

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_core::scenario::Deployer as _;
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let mut scenario = ScenarioBuilder::topology_with(|t| {
        t.network_star()
            .validators(3)
            .executors(1)
    })
    .transactions_with(|tx| tx.rate(10).users(5))
    .expect_consensus_liveness()
    .with_run_duration(Duration::from_secs(60))
    .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&scenario).await?;
    runner.run(&mut scenario).await?;

    Ok(())
}

View complete examples


Choose Your Path

New to the Framework?

Start with the Quickstart Guide for a hands-on introduction that gets you running tests in minutes.

Ready to Write Tests?

Explore the User Guide to learn about authoring scenarios, workloads, expectations, and deployment strategies.

Setting Up CI/CD?

Jump to Operations & Deployment for prerequisites, environment configuration, and continuous integration patterns.

Extending the Framework?

Check the Developer Reference to implement custom workloads, expectations, and runners.


Project Context

Logos is a modular blockchain protocol composed of validators, executors, and a data-availability (DA) subsystem:

  • Validators participate in consensus and produce blocks
  • Executors are validators with the DA dispersal service enabled. They perform all validator functions plus submit blob data to the DA network
  • Data Availability (DA) ensures that blob data submitted via channel operations in transactions is published and retrievable by the network

These roles interact tightly, which is why meaningful testing must be performed in multi-node environments that include real networking, timing, and DA interaction.

The Logos Testing Framework provides the infrastructure to orchestrate these multi-node scenarios reliably across development, CI, and production-like environments.

Learn more about the protocol: Logos Project Documentation


Documentation Structure

| Section | Description |
| --- | --- |
| Foundations | Architecture, philosophy, and design principles |
| User Guide | Writing and running scenarios, workloads, and expectations |
| Developer Reference | Extending the framework with custom components |
| Operations & Deployment | Setup, CI integration, and environment configuration |
| Appendix | Quick reference, troubleshooting, FAQ, and glossary |


Ready to start? Head to the Quickstart

What You Will Learn

This book gives you a clear mental model for Logos multi-node testing, shows how to author scenarios that pair realistic workloads with explicit expectations, and guides you to run them across local, containerized, and cluster environments without changing the plan.

By the End of This Book, You Will Be Able To:

Understand the Framework

  • Explain the six-phase scenario lifecycle (Build, Deploy, Capture, Execute, Evaluate, Cleanup)
  • Describe how Deployers, Runners, Workloads, and Expectations work together
  • Navigate the crate architecture and identify extension points
  • Understand when to use each runner (Host, Compose, Kubernetes)

Author and Run Scenarios

  • Define multi-node topologies with validators and executors
  • Configure transaction and DA workloads with appropriate rates
  • Add consensus liveness and inclusion expectations
  • Run scenarios across all three deployment modes
  • Use BlockFeed to monitor block production in real-time
  • Implement chaos testing with node restarts

Operate in Production

  • Set up prerequisites and dependencies correctly
  • Configure environment variables for different runners
  • Integrate tests into CI/CD pipelines (GitHub Actions)
  • Troubleshoot common failure scenarios
  • Collect and analyze logs from multi-node runs
  • Optimize test durations and resource usage

Extend the Framework

  • Implement custom Workload traits for new traffic patterns
  • Create custom Expectation traits for domain-specific checks
  • Add new Deployer implementations for different backends
  • Contribute topology helpers and DSL extensions

Learning Path

Beginner (0-2 hours)

Intermediate (2-8 hours)

Advanced (8+ hours)

What This Book Does NOT Cover

  • Logos node internals — This book focuses on testing infrastructure, not the blockchain protocol implementation. See the Logos node repository (nomos-node) for protocol documentation.
  • Consensus algorithm theory — We assume familiarity with basic blockchain concepts (validators, blocks, transactions, data availability).
  • Rust language basics — Examples use Rust, but we don’t teach the language. See The Rust Book if you’re new to Rust.
  • Kubernetes administration — We show how to use the K8s runner, but don’t cover cluster setup, networking, or operations.
  • Docker fundamentals — We assume basic Docker/Compose knowledge for the Compose runner.

Quickstart

Get a working example running quickly.

From Scratch (Complete Setup)

If you’re starting from zero, here’s everything you need:

# 1. Install Rust nightly
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup default nightly

# 2. Clone the repository
git clone https://github.com/logos-blockchain/logos-blockchain-testing.git
cd logos-blockchain-testing

# 3. Run your first scenario (downloads dependencies automatically)
POL_PROOF_DEV_MODE=true scripts/run/run-examples.sh -t 60 -v 1 -e 1 host

First run takes 5-10 minutes (downloads ~120MB circuit assets, builds binaries).

Windows users: Use WSL2 (Windows Subsystem for Linux). Native Windows is not supported.


Prerequisites

If you already have the repository cloned:

  • Rust toolchain (nightly)
  • Unix-like system (tested on Linux and macOS)
  • For Docker Compose examples: Docker daemon running
  • For Docker Desktop on Apple silicon (compose/k8s): set NOMOS_BUNDLE_DOCKER_PLATFORM=linux/arm64 to avoid slow/fragile amd64 emulation builds
  • versions.env file at repository root (defines VERSION, NOMOS_NODE_REV, NOMOS_BUNDLE_VERSION)

Note: nomos-node binaries are built automatically on demand or can be provided via prebuilt bundles.

Important: The versions.env file is required by helper scripts. If missing, the scripts will fail with an error. The file should already exist in the repository root.

Your First Test

The framework ships with runnable example binaries in examples/src/bin/.

Recommended: Use the convenience script:

# From the logos-blockchain-testing directory
scripts/run/run-examples.sh -t 60 -v 1 -e 1 host

This handles circuit setup, binary building, and runs a complete scenario: 1 validator + 1 executor, mixed transaction + DA workload (5 tx/block + 1 channel + 1 blob), 60s duration.

Note: The DA workload attaches DaWorkloadExpectation, and channel/blob publishing is slower than tx submission. If you see DaWorkloadExpectation failures, rerun with a longer duration (e.g., -t 120), especially on CI or slower machines.

Alternative: Direct cargo run (requires manual setup):

# Requires circuits in place and NOMOS_NODE_BIN/NOMOS_EXECUTOR_BIN set
POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner

Core API Pattern (simplified example):

use std::time::Duration;

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

pub async fn run_local_demo() -> Result<()> {
    // Define the scenario (1 validator + 1 executor, tx + DA workload)
    let mut plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(1).executors(1))
        .wallets(1_000)
        .transactions_with(|txs| {
            txs.rate(5) // 5 transactions per block
                .users(500) // use 500 of the seeded wallets
        })
        .da_with(|da| {
            da.channel_rate(1) // 1 channel
                .blob_rate(1) // target 1 blob per block
                .headroom_percent(20) // default headroom when sizing channels
        })
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(60))
        .build();

    // Deploy and run
    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;

    Ok(())
}

Note: The examples are binaries with #[tokio::main], not test functions. If you want to write integration tests, wrap this pattern in #[tokio::test] functions in your own test suite.
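
For instance, assuming the run_local_demo function above lives in your own integration test crate, a thin wrapper is enough. This is a sketch, not part of the framework; the test name is illustrative:

// Sketch: reuse the `run_local_demo` function above inside an integration test.
// Run with: POL_PROOF_DEV_MODE=true cargo test
#[tokio::test]
async fn local_demo_as_test() -> anyhow::Result<()> {
    run_local_demo().await
}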

Important: POL_PROOF_DEV_MODE=true disables expensive Groth16 zero-knowledge proof generation for leader election. Without it, proof generation is CPU-intensive and tests will time out. It is required for all runners (local, compose, k8s) to keep test runs practical. Never use it in production.

What you should see:

  • Nodes spawn as local processes
  • Consensus starts producing blocks
  • Scenario runs for the configured duration
  • Node state/logs written under a temporary per-run directory in the current working directory (removed after the run unless NOMOS_TESTS_KEEP_LOGS=1)
  • To write per-node log files to a stable location: set NOMOS_LOG_DIR=/path/to/logs (files will have prefix like nomos-node-0*, may include timestamps)

What Just Happened?

Let’s unpack the code:

1. Topology Configuration

use testing_framework_core::scenario::ScenarioBuilder;

pub fn step_1_topology() -> testing_framework_core::scenario::Builder<()> {
    ScenarioBuilder::topology_with(|t| {
        t.network_star() // Star topology: all nodes connect to seed
            .validators(1) // 1 validator node
            .executors(1) // 1 executor node (validator + DA dispersal)
    })
}

This defines what your test network looks like.

2. Wallet Seeding

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn step_2_wallets() -> testing_framework_core::scenario::Builder<()> {
    ScenarioBuilder::with_node_counts(1, 1).wallets(1_000) // Seed 1,000 funded wallet accounts
}

Provides funded accounts for transaction submission.

3. Workloads

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn step_3_workloads() -> testing_framework_core::scenario::Builder<()> {
    ScenarioBuilder::with_node_counts(1, 1)
        .wallets(1_000)
        .transactions_with(|txs| {
            txs.rate(5) // 5 transactions per block
                .users(500) // Use 500 of the 1,000 wallets
        })
        .da_with(|da| {
            da.channel_rate(1) // 1 DA channel (more spawned with headroom)
                .blob_rate(1) // target 1 blob per block
                .headroom_percent(20) // default headroom when sizing channels
        })
}

Generates both transaction and DA traffic to stress both subsystems.

4. Expectation

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn step_4_expectation() -> testing_framework_core::scenario::Builder<()> {
    ScenarioBuilder::with_node_counts(1, 1).expect_consensus_liveness() // This says what success means: blocks must be produced continuously.
}

This says what success means: blocks must be produced continuously.

5. Run Duration

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;

pub fn step_5_run_duration() -> testing_framework_core::scenario::Builder<()> {
    ScenarioBuilder::with_node_counts(1, 1).with_run_duration(Duration::from_secs(60))
}

Run for 60 seconds (~27 blocks with default 2s slots, 0.9 coefficient). Framework ensures this is at least 2× the consensus slot duration. Adjust consensus timing via CONSENSUS_SLOT_TIME and CONSENSUS_ACTIVE_SLOT_COEFF.

6. Deploy and Execute

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;

pub async fn step_6_deploy_and_execute() -> Result<()> {
    let mut plan = ScenarioBuilder::with_node_counts(1, 1).build();

    let deployer = LocalDeployer::default(); // Use local process deployer
    let runner = deployer.deploy(&plan).await?; // Provision infrastructure
    let _handle = runner.run(&mut plan).await?; // Execute workloads & expectations

    Ok(())
}

Deployer provisions the infrastructure. Runner orchestrates execution.

Adjust the Topology

With run-examples.sh (recommended):

# Scale up to 3 validators + 2 executors, run for 2 minutes
scripts/run/run-examples.sh -t 120 -v 3 -e 2 host

With direct cargo run:

# Uses NOMOS_DEMO_* env vars (or legacy *_DEMO_* vars)
NOMOS_DEMO_VALIDATORS=3 \
NOMOS_DEMO_EXECUTORS=2 \
NOMOS_DEMO_RUN_SECS=120 \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin local_runner

Try Docker Compose

Use the same API with a different deployer for a reproducible containerized environment.

Recommended: Use the convenience script (handles everything):

scripts/run/run-examples.sh -t 60 -v 1 -e 1 compose

This automatically:

  • Fetches circuit assets (to testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params)
  • Builds/uses prebuilt binaries (via NOMOS_BINARIES_TAR if available)
  • Builds the Docker image
  • Runs the compose scenario

Alternative: Direct cargo run with manual setup:

# Option 1: Use prebuilt bundle (recommended for compose/k8s)
scripts/build/build-bundle.sh --platform linux  # Creates .tmp/nomos-binaries-linux-v0.3.1.tar.gz
export NOMOS_BINARIES_TAR=.tmp/nomos-binaries-linux-v0.3.1.tar.gz

# Option 2: Manual circuit/image setup (rebuilds during image build)
scripts/setup/setup-nomos-circuits.sh v0.3.1 /tmp/nomos-circuits
cp -r /tmp/nomos-circuits/* testing-framework/assets/stack/kzgrs_test_params/
scripts/build/build_test_image.sh

# Run with Compose
NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin compose_runner

Benefit: Reproducible containerized environment (Dockerized nodes, repeatable deployments).

Optional: Prometheus + Grafana

The runner can integrate with external observability endpoints. For a ready-to-run local stack:

scripts/setup/setup-observability.sh compose up
eval "$(scripts/setup/setup-observability.sh compose env)"

Then run your compose scenario as usual (the environment variables enable PromQL querying and node OTLP metrics export).

Note: Compose expects KZG parameters at /kzgrs_test_params/kzgrs_test_params inside containers (the directory name is repeated as the filename).

In code: Just swap the deployer:

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_compose::ComposeDeployer;

pub async fn run_with_compose_deployer() -> Result<()> {
    // ... same scenario definition ...
    let mut plan = ScenarioBuilder::with_node_counts(1, 1).build();

    let deployer = ComposeDeployer::default(); // Use Docker Compose
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;

    Ok(())
}

Next Steps

Now that you have a working test:

Part I — Foundations

Conceptual chapters that establish the mental model for the framework and how it approaches multi-node testing.

Introduction

The Logos Testing Framework is a purpose-built toolkit for exercising Logos in realistic, multi-node environments. It solves the gap between small, isolated tests and full-system validation by letting teams describe a cluster layout, drive meaningful traffic, and assert the outcomes in one coherent plan.

It is for protocol engineers, infrastructure operators, and QA teams who need repeatable confidence that validators, executors, and data-availability components work together under network and timing constraints.

Multi-node integration testing is required because many Logos behaviors—block progress, data availability, liveness under churn—only emerge when several roles interact over real networking and time. This framework makes those checks declarative, observable, and portable across environments.

A Scenario in 20 Lines

Here’s the conceptual shape of every test you’ll write:

// 1. Define the cluster
let scenario = ScenarioBuilder::topology_with(|t| {
    t.network_star()
        .validators(3)
        .executors(2)
})
// 2. Add workloads (traffic)
.transactions_with(|tx| tx.rate(10).users(5))
.da_with(|da| da.channel_rate(2).blob_rate(2))

// 3. Define success criteria
.expect_consensus_liveness()

// 4. Set experiment duration
.with_run_duration(Duration::from_secs(60))
.build();

// 5. Deploy and run
let runner = deployer.deploy(&scenario).await?;
runner.run(&mut scenario).await?;

This pattern—topology, workloads, expectations, duration—repeats across all scenarios in this book.

Learn more: For protocol-level documentation and node internals, see the Logos Project Documentation.

Architecture Overview

The framework follows a clear flow: Topology → Scenario → Deployer → Runner → Workloads → Expectations.

Core Flow

flowchart LR
    A(Topology<br/>shape cluster) --> B(Scenario<br/>plan)
    B --> C(Deployer<br/>provision & readiness)
    C --> D(Runner<br/>orchestrate execution)
    D --> E(Workloads<br/>drive traffic)
    E --> F(Expectations<br/>verify outcomes)

Crate Architecture

flowchart TB
    subgraph Examples["Runner Examples"]
        LocalBin[local_runner.rs]
        ComposeBin[compose_runner.rs]
        K8sBin[k8s_runner.rs]
        CucumberBin[cucumber_*.rs]
    end
    
    subgraph Workflows["Workflows (Batteries Included)"]
        DSL[ScenarioBuilderExt<br/>Fluent API]
        TxWorkload[Transaction Workload]
        DAWorkload[DA Workload]
        ChaosWorkload[Chaos Workload]
        Expectations[Built-in Expectations]
    end
    
    subgraph Core["Core Framework"]
        ScenarioModel[Scenario Model]
        Traits[Deployer + Runner Traits]
        BlockFeed[BlockFeed]
        NodeClients[Node Clients]
        Topology[Topology Generation]
    end
    
    subgraph Deployers["Runner Implementations"]
        LocalDeployer[LocalDeployer]
        ComposeDeployer[ComposeDeployer]
        K8sDeployer[K8sDeployer]
    end
    
    subgraph Support["Supporting Crates"]
        Configs[Configs & Topology]
        Nodes[Node API Clients]
        Cucumber[Cucumber Extensions]
    end
    
    Examples --> Workflows
    Examples --> Deployers
    Workflows --> Core
    Deployers --> Core
    Deployers --> Support
    Core --> Support
    Workflows --> Support
    
    style Examples fill:#e1f5ff
    style Workflows fill:#e1ffe1
    style Core fill:#fff4e1
    style Deployers fill:#ffe1f5
    style Support fill:#f0f0f0

Layer Responsibilities

Runner Examples (Entry Points)

  • Executable binaries that demonstrate framework usage
  • Wire together deployers, scenarios, and execution
  • Provide CLI interfaces for different modes

Workflows (High-Level API)

  • ScenarioBuilderExt trait provides fluent DSL
  • Built-in workloads (transactions, DA, chaos)
  • Common expectations (liveness, inclusion)
  • Simplifies scenario authoring

Core Framework (Foundation)

  • Scenario model and lifecycle orchestration
  • Deployer and Runner traits (extension points)
  • BlockFeed for real-time block observation
  • RunContext providing node clients and metrics
  • Topology generation and validation

Runner Implementations

  • LocalDeployer - spawns processes on host
  • ComposeDeployer - orchestrates Docker Compose
  • K8sDeployer - deploys to Kubernetes cluster
  • Each implements Deployer trait

Supporting Crates

  • configs - Topology configuration and generation
  • nodes - HTTP/RPC client for node APIs
  • cucumber - BDD/Gherkin integration

Extension Points

flowchart LR
    Custom[Your Code] -.implements.-> Workload[Workload Trait]
    Custom -.implements.-> Expectation[Expectation Trait]
    Custom -.implements.-> Deployer[Deployer Trait]
    
    Workload --> Core[Core Framework]
    Expectation --> Core
    Deployer --> Core
    
    style Custom fill:#ffe1f5
    style Core fill:#fff4e1

Extend by implementing:

  • Workload - Custom traffic generation patterns
  • Expectation - Custom success criteria
  • Deployer - Support for new deployment targets

See Extending the Framework for details.

Components

  • Topology describes the cluster: how many nodes, their roles, and the high-level network and data-availability parameters they should follow.
  • Scenario combines that topology with the activities to run and the checks to perform, forming a single plan.
  • Deployer provisions infrastructure on the chosen backend (local processes, Docker Compose, or Kubernetes), waits for readiness, and returns a Runner.
  • Runner orchestrates scenario execution: starts workloads, observes signals, evaluates expectations, and triggers cleanup.
  • Workloads generate traffic and conditions that exercise the system.
  • Expectations observe the run and judge success or failure once activity completes.

Each layer has a narrow responsibility so that cluster shape, deployment choice, traffic generation, and health checks can evolve independently while fitting together predictably.

Entry Points

The framework is consumed via runnable example binaries in examples/src/bin/:

  • local_runner.rs — Spawns nodes as host processes
  • compose_runner.rs — Deploys via Docker Compose (requires NOMOS_TESTNET_IMAGE built)
  • k8s_runner.rs — Deploys via Kubernetes Helm (requires cluster + image)

Recommended: Use the convenience script:

scripts/run/run-examples.sh -t <duration> -v <validators> -e <executors> <mode>
# mode: host, compose, or k8s

This handles circuit setup, binary building/bundling, image building, and execution.

Alternative: Direct cargo run (requires manual setup):

POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin <name>

Important: All runners require POL_PROOF_DEV_MODE=true to avoid expensive Groth16 proof generation that causes timeouts.

These binaries use the framework API (ScenarioBuilder) to construct and execute scenarios.

Builder API

Scenarios are defined using a fluent builder pattern:

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn scenario_plan() -> testing_framework_core::scenario::Scenario<()> {
    ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
        .wallets(50)
        .transactions_with(|txs| txs.rate(5).users(20))
        .da_with(|da| da.channel_rate(1).blob_rate(2))
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(90))
        .build()
}

Key API Points:

  • Topology uses .topology_with(|t| { t.validators(N).executors(M) }) closure pattern
  • Workloads are configured via _with closures (transactions_with, da_with, chaos_with)
  • Chaos workloads require .enable_node_control() and a compatible runner

Deployers

Three deployer implementations:

| Deployer | Backend | Prerequisites | Node Control |
| --- | --- | --- | --- |
| LocalDeployer | Host processes | Binaries (built on demand or via bundle) | No |
| ComposeDeployer | Docker Compose | Image with embedded assets/binaries | Yes |
| K8sDeployer | Kubernetes Helm | Cluster + image loaded | Not yet |

Compose-specific features:

  • Observability is external (set NOMOS_METRICS_QUERY_URL / NOMOS_METRICS_OTLP_INGEST_URL / NOMOS_GRAFANA_URL as needed)
  • Optional OTLP trace/metrics endpoints (NOMOS_OTLP_ENDPOINT, NOMOS_OTLP_METRICS_ENDPOINT)
  • Node control for chaos testing (restart validators/executors)

Assets and Images

Docker Image

Built via scripts/build/build_test_image.sh:

  • Embeds KZG circuit parameters and binaries from testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params
  • Includes runner scripts: run_nomos_node.sh, run_nomos_executor.sh
  • Tagged as NOMOS_TESTNET_IMAGE (default: logos-blockchain-testing:local)
  • Recommended: Use prebuilt bundle via scripts/build/build-bundle.sh --platform linux and set NOMOS_BINARIES_TAR before building image

Circuit Assets

KZG parameters required for DA workloads:

  • Host path: testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params (note repeated filename—directory contains file kzgrs_test_params)
  • Container path: /kzgrs_test_params/kzgrs_test_params (for compose/k8s)
  • Override: NOMOS_KZGRS_PARAMS_PATH=/custom/path/to/file (must point to file)
  • Fetch via: scripts/setup/setup-nomos-circuits.sh v0.3.1 /tmp/circuits or use scripts/run/run-examples.sh

Compose Stack

Templates and configs in testing-framework/runners/compose/assets/:

  • docker-compose.yml.tera — Stack template (validators, executors)
  • Cfgsync config: testing-framework/assets/stack/cfgsync.yaml
  • Monitoring assets (not deployed by the framework): testing-framework/assets/stack/monitoring/

Logging Architecture

Two separate logging pipelines:

| Component | Configuration | Output |
| --- | --- | --- |
| Runner binaries | RUST_LOG | Framework orchestration logs |
| Node processes | NOMOS_LOG_LEVEL, NOMOS_LOG_FILTER (+ NOMOS_LOG_DIR on host runner) | Consensus, DA, mempool logs |

Node logging:

  • Local runner: Writes to temporary directories by default (cleaned up). Set NOMOS_TESTS_TRACING=true + NOMOS_LOG_DIR for persistent files.
  • Compose runner: Default logs to container stdout/stderr (docker logs). To write per-node files, set tracing_settings.logger: !File in testing-framework/assets/stack/cfgsync.yaml (and mount a writable directory).
  • K8s runner: Logs to pod stdout/stderr (kubectl logs). To write per-node files, set tracing_settings.logger: !File in testing-framework/assets/stack/cfgsync.yaml (and mount a writable directory).

File naming: Per-node files use prefix nomos-node-{index} or nomos-executor-{index} (may include timestamps).

Observability

Prometheus-compatible metrics querying (optional):

  • The framework does not deploy Prometheus/Grafana.
  • Provide a Prometheus-compatible base URL (PromQL API) via NOMOS_METRICS_QUERY_URL.
  • Accessible in expectations when configured: ctx.telemetry().prometheus().map(|p| p.base_url())

Grafana dashboards (optional):

  • Dashboards live in testing-framework/assets/stack/monitoring/grafana/dashboards/ and can be imported into your Grafana.
  • If you set NOMOS_GRAFANA_URL, the deployer prints it in TESTNET_ENDPOINTS.

Node APIs:

  • HTTP endpoints per node for consensus info, network status, DA membership
  • Accessible in expectations: ctx.node_clients().validator_clients().get(0)

OTLP (optional):

  • Trace endpoint: NOMOS_OTLP_ENDPOINT=http://localhost:4317
  • Metrics endpoint: NOMOS_OTLP_METRICS_ENDPOINT=http://localhost:4318
  • Disabled by default (no noise if unset)

For detailed logging configuration, see Logging & Observability.

Testing Philosophy

This framework embodies specific principles that shape how you author and run scenarios. Understanding these principles helps you write effective tests and interpret results correctly.

Declarative over Imperative

Describe what you want to test, not how to orchestrate it:

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn declarative_over_imperative() {
    // Good: declarative
    let _plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(2).executors(1))
        .transactions_with(|txs| {
            txs.rate(5) // 5 transactions per block
        })
        .expect_consensus_liveness()
        .build();

    // Bad: imperative (framework doesn't work this way)
    // spawn_validator(); spawn_executor();
    // loop { submit_tx(); check_block(); }
}

Why it matters: The framework handles deployment, readiness, and cleanup. You focus on test intent, not infrastructure orchestration.

Protocol Time, Not Wall Time

Reason in blocks and consensus intervals, not wall-clock seconds.

Consensus defaults:

  • Slot duration: 2 seconds (NTP-synchronized, configurable via CONSENSUS_SLOT_TIME)
  • Active slot coefficient: 0.9 (90% block probability per slot, configurable via CONSENSUS_ACTIVE_SLOT_COEFF)
  • Expected rate: ~27 blocks per minute

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn protocol_time_not_wall_time() {
    // Good: protocol-oriented thinking
    let _plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(2).executors(1))
        .transactions_with(|txs| {
            txs.rate(5) // 5 transactions per block
        })
        .with_run_duration(Duration::from_secs(60)) // Let framework calculate expected blocks
        .expect_consensus_liveness() // "Did we produce the expected blocks?"
        .build();

    // Bad: wall-clock assumptions
    // "I expect exactly 30 blocks in 60 seconds"
    // This breaks on slow CI where slot timing might drift
}

Why it matters: Slot timing is fixed (2s by default, NTP-synchronized), so the expected number of blocks is predictable: ~27 blocks in 60s with the default 0.9 active slot coefficient. The framework calculates expected blocks from slot duration and run window, making assertions protocol-based rather than tied to specific wall-clock expectations. Assert on “blocks produced relative to slots” not “blocks produced in exact wall-clock seconds”.

Determinism First, Chaos When Needed

Default scenarios are repeatable:

  • Fixed topology
  • Predictable traffic rates
  • Deterministic checks

Chaos is opt-in:

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::{ChaosBuilderExt, ScenarioBuilderExt};

pub fn determinism_first() {
    // Separate: functional test (deterministic)
    let _plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(2).executors(1))
        .transactions_with(|txs| {
            txs.rate(5) // 5 transactions per block
        })
        .expect_consensus_liveness()
        .build();

    // Separate: chaos test (introduces randomness)
    let _chaos_plan =
        ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
            .enable_node_control()
            .chaos_with(|c| {
                c.restart()
                    .min_delay(Duration::from_secs(30))
                    .max_delay(Duration::from_secs(60))
                    .target_cooldown(Duration::from_secs(45))
                    .apply()
            })
            .transactions_with(|txs| {
                txs.rate(5) // 5 transactions per block
            })
            .expect_consensus_liveness()
            .build();
}

Why it matters: Mixing determinism with chaos creates noisy, hard-to-debug failures. Separate concerns make failures actionable.

Observable Health Signals

Prefer user-facing signals over internal state:

Good checks:

  • Blocks progressing at expected rate (liveness)
  • Transactions included within N blocks (inclusion)
  • DA blobs retrievable (availability)

Avoid internal checks:

  • Memory pool size
  • Internal service state
  • Cache hit rates

Why it matters: User-facing signals reflect actual system health. Internal state can be “healthy” while the system is broken from a user perspective.

Minimum Run Windows

Always run long enough for meaningful block production:

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn minimum_run_windows() {
    // Bad: too short (~2 blocks with default 2s slots, 0.9 coeff)
    let _too_short = ScenarioBuilder::with_node_counts(1, 0)
        .with_run_duration(Duration::from_secs(5))
        .expect_consensus_liveness()
        .build();

    // Good: enough blocks for assertions (~27 blocks with default 2s slots, 0.9
    // coeff)
    let _good = ScenarioBuilder::with_node_counts(1, 0)
        .with_run_duration(Duration::from_secs(60))
        .expect_consensus_liveness()
        .build();
}

Note: Block counts assume default consensus parameters:

  • Slot duration: 2 seconds (configurable via CONSENSUS_SLOT_TIME)
  • Active slot coefficient: 0.9 (90% block probability per slot, configurable via CONSENSUS_ACTIVE_SLOT_COEFF)
  • Formula: blocks ≈ (duration / slot_duration) × active_slot_coeff

If upstream changes these parameters, adjust your duration expectations accordingly.

The framework enforces minimum durations (at least 2× slot duration), but be explicit. Very short runs risk false confidence—one lucky block doesn’t prove liveness.
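
As a back-of-the-envelope check, the formula above can be computed directly. The helper below is purely illustrative (it is not a framework API) and assumes the default parameters stated above:

/// Illustrative helper (not a framework API): expected block count for a run
/// window, given slot duration and active slot coefficient.
fn expected_blocks(run_secs: u64, slot_secs: u64, active_slot_coeff: f64) -> f64 {
    (run_secs as f64 / slot_secs as f64) * active_slot_coeff
}

fn main() {
    // Defaults: 2-second slots, 0.9 coefficient -> ~27 blocks per 60 seconds.
    let blocks = expected_blocks(60, 2, 0.9);
    assert!((blocks - 27.0).abs() < 1e-9);
    println!("expected blocks in 60s: {blocks}");
}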

Summary

These principles keep scenarios:

  • Portable across environments (protocol time, declarative)
  • Debuggable (determinism, separation of concerns)
  • Meaningful (observable signals, sufficient duration)

When authoring scenarios, ask: “Does this test the protocol behavior or my local environment quirks?”

Scenario Lifecycle

A scenario progresses through six distinct phases, each with a specific responsibility:

flowchart TB
    subgraph Phase1["1. Build Phase"]
        Build[Define Scenario]
        BuildDetails["• Declare topology<br/>• Attach workloads<br/>• Add expectations<br/>• Set run duration"]
        Build --> BuildDetails
    end
    
    subgraph Phase2["2. Deploy Phase"]
        Deploy[Provision Environment]
        DeployDetails["• Launch nodes<br/>• Wait for readiness<br/>• Establish connectivity<br/>• Return Runner"]
        Deploy --> DeployDetails
    end
    
    subgraph Phase3["3. Capture Phase"]
        Capture[Baseline Metrics]
        CaptureDetails["• Snapshot initial state<br/>• Start BlockFeed<br/>• Initialize expectations"]
        Capture --> CaptureDetails
    end
    
    subgraph Phase4["4. Execution Phase"]
        Execute[Drive Workloads]
        ExecuteDetails["• Submit transactions<br/>• Disperse DA blobs<br/>• Trigger chaos events<br/>• Run for duration"]
        Execute --> ExecuteDetails
    end
    
    subgraph Phase5["5. Evaluation Phase"]
        Evaluate[Check Expectations]
        EvaluateDetails["• Verify liveness<br/>• Check inclusion<br/>• Validate outcomes<br/>• Aggregate results"]
        Evaluate --> EvaluateDetails
    end
    
    subgraph Phase6["6. Cleanup Phase"]
        Cleanup[Teardown]
        CleanupDetails["• Stop nodes<br/>• Remove containers<br/>• Collect logs<br/>• Release resources"]
        Cleanup --> CleanupDetails
    end
    
    Phase1 --> Phase2
    Phase2 --> Phase3
    Phase3 --> Phase4
    Phase4 --> Phase5
    Phase5 --> Phase6
    
    style Phase1 fill:#e1f5ff
    style Phase2 fill:#fff4e1
    style Phase3 fill:#f0ffe1
    style Phase4 fill:#ffe1f5
    style Phase5 fill:#e1ffe1
    style Phase6 fill:#ffe1e1

Phase Details

1. Build the Plan

Declare a topology, attach workloads and expectations, and set the run window. The plan is the single source of truth for what will happen.

Key actions:

  • Define cluster shape (validators, executors, network topology)
  • Configure workloads (transaction rate, DA traffic, chaos patterns)
  • Attach expectations (liveness, inclusion, custom checks)
  • Set timing parameters (run duration, cooldown period)

Output: Immutable Scenario plan

2. Deploy

Hand the plan to a deployer. It provisions the environment on the chosen backend, waits for nodes to signal readiness, and returns a runner.

Key actions:

  • Provision infrastructure (processes, containers, or pods)
  • Launch validator and executor nodes
  • Wait for readiness probes (HTTP endpoints respond)
  • Establish node connectivity and metrics endpoints
  • Spawn BlockFeed for real-time block observation

Output: Runner + RunContext (with node clients, metrics, control handles)

3. Capture Baseline

Expectations snapshot initial state before workloads begin.

Key actions:

  • Record starting block height
  • Initialize counters and trackers
  • Subscribe to BlockFeed
  • Capture baseline metrics

Output: Captured state for later comparison

4. Drive Workloads

The runner starts traffic and behaviors for the planned duration.

Key actions:

  • Submit transactions at configured rates
  • Disperse and sample DA blobs
  • Trigger chaos events (node restarts)
  • Run concurrently for the specified duration
  • Observe blocks and metrics in real-time

Note: Network partitions/peer blocking are not yet supported by node control; today chaos is restart-based. See RunContext: BlockFeed & Node Control.

Duration: Controlled by with_run_duration()

5. Evaluate Expectations

Once activity stops (and optional cooldown completes), the runner checks liveness and workload-specific outcomes.

Key actions:

  • Verify consensus liveness (minimum block production)
  • Check transaction inclusion rates
  • Validate DA dispersal and sampling
  • Assess system recovery after chaos events
  • Aggregate pass/fail results

Output: Success or detailed failure report

6. Cleanup

Tear down resources so successive runs start fresh and do not inherit leaked state.

Key actions:

  • Stop all node processes/containers/pods
  • Remove temporary directories and volumes
  • Collect and archive logs (if NOMOS_TESTS_KEEP_LOGS=1)
  • Release ports and network resources
  • Cleanup observability stack (if spawned)

Guarantee: Runs even on panic via CleanupGuard

Design Rationale

  • Modular crates keep configuration, orchestration, workloads, and runners decoupled so each can evolve without breaking the others.
  • Pluggable runners let the same scenario run on a laptop, a Docker host, or a Kubernetes cluster, making validation portable across environments.
  • Separated workloads and expectations clarify intent: what traffic to generate versus how to judge success. This simplifies review and reuse.
  • Declarative topology makes cluster shape explicit and repeatable, reducing surprise when moving between CI and developer machines.
  • Maintainability through predictability: a clear flow from plan to deployment to verification lowers the cost of extending the framework and interpreting failures.

Part II — User Guide

Practical guidance for shaping scenarios, combining workloads and expectations, and running them across different environments.

Workspace Layout

The workspace focuses on multi-node integration testing and sits alongside a nomos-node checkout. Its crates separate concerns to keep scenarios repeatable and portable:

  • Configs: prepares high-level node, network, tracing, and wallet settings used across test environments.
  • Core scenario orchestration: the engine that holds topology descriptions, scenario plans, runtimes, workloads, and expectations.
  • Workflows: ready-made workloads (transactions, data-availability, chaos) and reusable expectations assembled into a user-facing DSL.
  • Runners: deployment backends for local processes, Docker Compose, and Kubernetes, all consuming the same scenario plan.
  • Runner Examples (crate name: runner-examples, path: examples/): runnable binaries (examples/src/bin/local_runner.rs, examples/src/bin/compose_runner.rs, examples/src/bin/k8s_runner.rs) that demonstrate complete scenario execution with each deployer.

This split keeps configuration, orchestration, reusable traffic patterns, and deployment adapters loosely coupled while sharing one mental model for tests.

Annotated Tree

Directory structure with key paths annotated:

logos-blockchain-testing/
├─ testing-framework/           # Core library crates
│  ├─ configs/                  # Node config builders, topology generation, tracing/logging config
│  ├─ core/                     # Scenario model (ScenarioBuilder), runtime (Runner, Deployer), topology, node spawning
│  ├─ workflows/                # Workloads (transactions, DA, chaos), expectations (liveness), builder DSL extensions
│  ├─ runners/                  # Deployment backends
│  │  ├─ local/                 # LocalDeployer (spawns local processes)
│  │  ├─ compose/               # ComposeDeployer (Docker Compose + Prometheus)
│  │  └─ k8s/                   # K8sDeployer (Kubernetes Helm)
│  └─ assets/                   # Docker/K8s stack assets
│     └─ stack/
│        ├─ kzgrs_test_params/  # KZG circuit parameters directory
│        │  └─ kzgrs_test_params  # Actual proving key file (note repeated name)
│        ├─ monitoring/         # Prometheus config
│        ├─ scripts/            # Container entrypoints
│        └─ cfgsync.yaml        # Config sync server template
│
├─ examples/                    # PRIMARY ENTRY POINT: runnable binaries
│  └─ src/bin/
│     ├─ local_runner.rs        # Host processes demo (LocalDeployer)
│     ├─ compose_runner.rs      # Docker Compose demo (ComposeDeployer)
│     └─ k8s_runner.rs          # Kubernetes demo (K8sDeployer)
│
├─ scripts/                     # Helper utilities
│  ├─ run-examples.sh           # Convenience script (handles setup + runs examples)
│  ├─ build-bundle.sh           # Build prebuilt binaries+circuits bundle
│  ├─ setup-circuits-stack.sh  # Fetch KZG parameters (Linux + host)
│  └─ setup-nomos-circuits.sh  # Legacy circuit fetcher
│
└─ book/                        # This documentation (mdBook)

Key Directories Explained

testing-framework/

Core library crates providing the testing API.

| Crate | Purpose | Key Exports |
| --- | --- | --- |
| configs | Node configuration builders | Topology generation, tracing config |
| core | Scenario model & runtime | ScenarioBuilder, Deployer, Runner |
| workflows | Workloads & expectations | ScenarioBuilderExt, ChaosBuilderExt |
| runners/local | Local process deployer | LocalDeployer |
| runners/compose | Docker Compose deployer | ComposeDeployer |
| runners/k8s | Kubernetes deployer | K8sDeployer |

testing-framework/assets/stack/

Docker/K8s deployment assets:

  • kzgrs_test_params/kzgrs_test_params: Circuit parameters file (note repeated name; override via NOMOS_KZGRS_PARAMS_PATH)
  • monitoring/: Prometheus config
  • scripts/: Container entrypoints

scripts/

Convenience utilities:

  • run-examples.sh: All-in-one script for host/compose/k8s modes (recommended)
  • build-bundle.sh: Create prebuilt binaries+circuits bundle for compose/k8s
  • build_test_image.sh: Build the compose/k8s Docker image (bakes in assets)
  • setup-circuits-stack.sh: Fetch KZG parameters for both Linux and host
  • cfgsync.yaml: Configuration sync server template

examples/ (Start Here!)

Runnable binaries demonstrating framework usage:

  • local_runner.rs — Local processes
  • compose_runner.rs — Docker Compose (requires NOMOS_TESTNET_IMAGE built)
  • k8s_runner.rs — Kubernetes (requires cluster + image)

Run with: POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin <name>

All runners require POL_PROOF_DEV_MODE=true to avoid expensive proof generation.

scripts/

Helper utilities:

  • setup-nomos-circuits.sh: Fetch KZG parameters from releases

Observability

Compose runner includes:

  • Prometheus at http://localhost:9090 (metrics scraping)
  • Node metrics exposed per validator/executor
  • Access in expectations: ctx.telemetry().prometheus().map(|p| p.base_url())

Logging controlled by:

  • NOMOS_LOG_DIR — Write per-node log files
  • NOMOS_LOG_LEVEL — Global log level (error/warn/info/debug/trace)
  • NOMOS_LOG_FILTER — Target-specific filtering (e.g., cryptarchia=trace,nomos_da_sampling=debug)
  • NOMOS_TESTS_TRACING — Enable file logging for local runner

See Logging & Observability for details.

| To Do This | Go Here |
| --- | --- |
| Run an example | examples/src/bin/ → cargo run -p runner-examples --bin <name> |
| Write a custom scenario | testing-framework/core/ → Implement using ScenarioBuilder |
| Add a new workload | testing-framework/workflows/src/workloads/ → Implement Workload trait |
| Add a new expectation | testing-framework/workflows/src/expectations/ → Implement Expectation trait |
| Modify node configs | testing-framework/configs/src/topology/configs/ |
| Extend builder DSL | testing-framework/workflows/src/builder/ → Add trait methods |
| Add a new deployer | testing-framework/runners/ → Implement Deployer trait |

For detailed guidance, see Internal Crate Reference.

Authoring Scenarios

Creating a scenario is a declarative exercise. This page walks you through the core authoring loop with concrete examples, explains the units and timing model, and shows how to structure scenarios in Rust test suites.


The Core Authoring Loop

Every scenario follows the same pattern:

flowchart LR
    A[1. Topology] --> B[2. Workloads]
    B --> C[3. Expectations]
    C --> D[4. Duration]
    D --> E[5. Deploy & Run]

  1. Shape the topology — How many nodes, what roles, what network shape
  2. Attach workloads — What traffic to generate (transactions, blobs, chaos)
  3. Define expectations — What success looks like (liveness, inclusion, recovery)
  4. Set duration — How long to run the experiment
  5. Choose a runner — Where to execute (local, compose, k8s)

Hello Scenario: Your First Test

Let’s build a minimal consensus liveness test step-by-step.

Step 1: Shape the Topology

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

let scenario = ScenarioBuilder::topology_with(|t| {
    t.network_star()      // Star network (one gateway + nodes)
        .validators(3)     // 3 validator nodes
        .executors(1)      // 1 executor node
})

What goes in topology?

  • Node counts (validators, executors)
  • Network shape (network_star() is currently the only built-in layout)
  • Role split (validators vs. executors)

What does NOT go in topology?

  • Traffic rates (that’s workloads)
  • Success criteria (that’s expectations)
  • Runtime configuration (that’s duration/runner)

Step 2: Attach Workloads

.wallets(20) // Seed funded wallet accounts for transaction workloads
.transactions_with(|tx| {
    tx.rate(10)    // 10 transactions per block
      .users(5)    // distributed across 5 wallets
})

What goes in workloads?

  • Transaction traffic (rate, users)
  • DA traffic (channels, blobs)
  • Chaos injection (restarts, delays)

Units explained:

  • .rate(10) = 10 transactions per block (not per second!)
  • .users(5) = use 5 distinct wallet accounts
  • The framework adapts to block time automatically

Step 3: Define Expectations

.expect_consensus_liveness()

What goes in expectations?

  • Health checks that run after the scenario completes
  • Liveness (blocks produced)
  • Inclusion (workload activity landed on-chain)
  • Recovery (system survived chaos)

When do expectations run? After the duration window ends, during the evaluation phase of the scenario lifecycle.

Step 4: Set Duration

use std::time::Duration;

.with_run_duration(Duration::from_secs(60))

How long is enough?

  • Minimum: 2× the expected block time × number of blocks you want
  • For consensus liveness: 30-60 seconds
  • For transaction inclusion: 60-120 seconds
  • For chaos recovery: 2-5 minutes

What happens during this window?

  • Nodes are running
  • Workloads generate traffic
  • Metrics/logs are collected
  • BlockFeed broadcasts observations in real-time

Step 5: Build and Deploy

.build();

// Choose a runner
use testing_framework_core::scenario::Deployer;
use testing_framework_runner_local::LocalDeployer;

let deployer = LocalDeployer::default();
let runner = deployer.deploy(&scenario).await?;
let _result = runner.run(&mut scenario).await?;

Complete “Hello Scenario”

Putting it all together:

use std::time::Duration;

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

#[tokio::test]
async fn hello_consensus_liveness() -> Result<()> {
    let mut scenario = ScenarioBuilder::topology_with(|t| {
        t.network_star()
            .validators(3)
            .executors(1)
    })
    .wallets(20)
    .transactions_with(|tx| tx.rate(10).users(5))
    .expect_consensus_liveness()
    .with_run_duration(Duration::from_secs(60))
    .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&scenario).await?;
    runner.run(&mut scenario).await?;

    Ok(())
}

Run it:

POL_PROOF_DEV_MODE=true cargo test hello_consensus_liveness

Understanding Units & Timing

Transaction Rate: Per-Block, Not Per-Second

Wrong mental model: .rate(10) = 10 tx/second

Correct mental model: .rate(10) = 10 tx/block

Why? The blockchain produces blocks at variable rates depending on consensus timing. The framework submits the configured rate per block to ensure predictable load regardless of block time.

Example:

  • Block time = 2 seconds
  • .rate(10) → 10 tx/block → 5 tx/second average
  • Block time = 5 seconds
  • .rate(10) → 10 tx/block → 2 tx/second average
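
The conversion in the example above is straightforward arithmetic; the sketch below is illustrative only (the helper is not part of the framework):

/// Illustrative helper (not a framework API): average transactions per second
/// implied by a per-block rate and an observed block time.
fn avg_tx_per_second(rate_per_block: u64, block_time_secs: f64) -> f64 {
    rate_per_block as f64 / block_time_secs
}

fn main() {
    // .rate(10) with 2-second blocks -> 5 tx/second on average.
    assert_eq!(avg_tx_per_second(10, 2.0), 5.0);
    // .rate(10) with 5-second blocks -> 2 tx/second on average.
    assert_eq!(avg_tx_per_second(10, 5.0), 2.0);
}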

Duration: Wall-Clock Time

.with_run_duration(Duration::from_secs(60)) means the scenario runs for 60 seconds of real time, not 60 blocks.

How many blocks will be produced? That depends on consensus timing (slot time, active slot coefficient). With the defaults used throughout this book (2-second slots, 0.9 active slot coefficient):

Rule of thumb:

  • 60 seconds → ~27 blocks
  • 120 seconds → ~54 blocks

Structuring Scenarios in a Test Suite

Pattern 1: Integration Test Module

// tests/integration_test.rs
use std::time::Duration;

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

#[tokio::test]
async fn test_consensus_liveness() -> Result<()> {
    let mut scenario = ScenarioBuilder::topology_with(|t| {
        t.network_star().validators(3).executors(1)
    })
    .expect_consensus_liveness()
    .with_run_duration(Duration::from_secs(30))
    .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&scenario).await?;
    runner.run(&mut scenario).await?;
    Ok(())
}

#[tokio::test]
async fn test_transaction_inclusion() -> Result<()> {
    let mut scenario = ScenarioBuilder::topology_with(|t| {
        t.network_star().validators(2).executors(1)
    })
    .wallets(10)
    .transactions_with(|tx| tx.rate(5).users(5))
    .expect_consensus_liveness()
    .with_run_duration(Duration::from_secs(60))
    .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&scenario).await?;
    runner.run(&mut scenario).await?;
    Ok(())
}

Pattern 2: Shared Scenario Builders

Extract common topology patterns:

// tests/helpers.rs
use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn minimal_topology() -> testing_framework_core::scenario::Builder<()> {
    ScenarioBuilder::topology_with(|t| {
        t.network_star().validators(2).executors(1)
    })
}

pub fn production_like_topology() -> testing_framework_core::scenario::Builder<()> {
    ScenarioBuilder::topology_with(|t| {
        t.network_star().validators(7).executors(3)
    })
}

// tests/consensus_tests.rs
mod helpers;

use std::time::Duration;

use helpers::*;
use testing_framework_workflows::ScenarioBuilderExt;

#[tokio::test]
async fn small_cluster_liveness() -> anyhow::Result<()> {
    let mut scenario = minimal_topology()
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(30))
        .build();
    // ... deploy and run
    Ok(())
}

#[tokio::test]
async fn large_cluster_liveness() -> anyhow::Result<()> {
    let mut scenario = production_like_topology()
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(60))
        .build();
    // ... deploy and run
    Ok(())
}

Pattern 3: Parameterized Scenarios

Test the same behavior across different scales:

use std::time::Duration;

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

async fn test_liveness_with_topology(validators: usize, executors: usize) -> Result<()> {
    let mut scenario = ScenarioBuilder::topology_with(|t| {
        t.network_star()
            .validators(validators)
            .executors(executors)
    })
    .expect_consensus_liveness()
    .with_run_duration(Duration::from_secs(60))
    .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&scenario).await?;
    runner.run(&mut scenario).await?;
    Ok(())
}

#[tokio::test]
async fn liveness_small() -> Result<()> {
    test_liveness_with_topology(2, 1).await
}

#[tokio::test]
async fn liveness_medium() -> Result<()> {
    test_liveness_with_topology(5, 2).await
}

#[tokio::test]
async fn liveness_large() -> Result<()> {
    test_liveness_with_topology(10, 3).await
}

What Belongs Where?

Topology

Do include:

  • Node counts (.validators(3), .executors(1))
  • Network shape (.network_star())
  • Role split (validators vs. executors)

Don’t include:

  • Traffic rates (workload concern)
  • Expected outcomes (expectation concern)
  • Runtime behavior (runner/duration concern)

Workloads

Do include:

  • Transaction traffic (.transactions_with(|tx| ...))
  • DA traffic (.da_with(|da| ...))
  • Chaos injection (.with_workload(RandomRestartWorkload::new(...)))
  • Rates, users, timing

Don’t include:

  • Node configuration (topology concern)
  • Success criteria (expectation concern)

Expectations

Do include:

  • Health checks (.expect_consensus_liveness())
  • Inclusion verification (built-in to workloads)
  • Custom assertions (.with_expectation(MyExpectation::new()))

Don’t include:

  • Traffic generation (workload concern)
  • Cluster shape (topology concern)

Best Practices

  1. Keep scenarios focused: One scenario = one behavior under test
  2. Start small: 2-3 validators, 1 executor, 30-60 seconds
  3. Use descriptive names: test_consensus_survives_validator_restart not test_1
  4. Extract common patterns: Shared topology builders, helper functions
  5. Document intent: Add comments explaining what you’re testing and why
  6. Mind the units: .rate(N) is per-block, .with_run_duration() is wall-clock
  7. Set realistic durations: Allow enough time for multiple blocks + workload effects

Next Steps

Core Content: Workloads & Expectations

Workloads describe the activity a scenario generates; expectations describe the signals that must hold when that activity completes. This page is the canonical reference for all built-in workloads and expectations, including configuration knobs, defaults, prerequisites, and debugging guidance.


Overview

flowchart TD
    I[Inputs<br/>topology + wallets + rates] --> Init[Workload init]
    Init --> Drive[Drive traffic]
    Drive --> Collect[Collect signals]
    Collect --> Eval[Expectations evaluate]

Key concepts:

  • Workloads run during the execution phase (generate traffic)
  • Expectations run during the evaluation phase (check health signals)
  • Each workload can attach its own expectations automatically
  • Expectations can also be added explicitly

Built-in Workloads

1. Transaction Workload

Submits user-level transactions at a configurable rate to exercise transaction processing and inclusion paths.

Import:

use testing_framework_workflows::workloads::transaction::Workload;

Configuration

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| rate | u64 | Required | Transactions per block (not per second!) |
| users | Option<usize> | All wallets | Number of distinct wallet accounts to use |

DSL Usage

use testing_framework_workflows::ScenarioBuilderExt;

ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(1))
    .wallets(20)  // Seed 20 wallet accounts
    .transactions_with(|tx| {
        tx.rate(10)   // 10 transactions per block
          .users(5)   // Use only 5 of the 20 wallets
    })
    .with_run_duration(Duration::from_secs(60))
    .build();

Direct Instantiation

use testing_framework_workflows::workloads::transaction;

let tx_workload = transaction::Workload::with_rate(10)
    .expect("transaction rate must be non-zero");

ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(1))
    .wallets(20)
    .with_workload(tx_workload)
    .with_run_duration(Duration::from_secs(60))
    .build();

Prerequisites

  1. Wallet accounts must be seeded:

    .wallets(N)  // Before .transactions_with()

    The workload will fail during init() if no wallets are configured.

  2. Proof generation must be fast:

    export POL_PROOF_DEV_MODE=true
    

    Without this, proof generation takes ~30-60 seconds per transaction, causing timeouts.

  3. Circuit artifacts must be available:

    • Automatically staged by scripts/run/run-examples.sh
    • Or manually via scripts/setup/setup-circuits-stack.sh (recommended) / scripts/setup/setup-nomos-circuits.sh

Attached Expectation

TxInclusionExpectation — Verifies that submitted transactions were included in blocks.

What it checks:

  • At least N transactions were included on-chain (where N = rate × expected block count)
  • Uses BlockFeed to count transactions across all observed blocks

Failure modes:

  • “Expected >= X transactions, observed Y” (Y < X)
  • Common causes: proof generation timeouts, node crashes, insufficient duration

What Failure Looks Like

Error: Expectation failed: TxInclusionExpectation
  Expected: >= 600 transactions (10 tx/block × 60 blocks)
  Observed: 127 transactions
  
  Possible causes:
  - POL_PROOF_DEV_MODE not set (proof generation too slow)
  - Duration too short (nodes still syncing)
  - Node crashes (check logs for panics/OOM)
  - Wallet accounts not seeded (check topology config)

How to debug:

  1. Check logs for proof generation timing:
    grep "proof generation" $NOMOS_LOG_DIR/executor-0/*.log
    
  2. Verify POL_PROOF_DEV_MODE=true was set
  3. Increase duration: .with_run_duration(Duration::from_secs(120))
  4. Reduce rate: .rate(5) instead of .rate(10)

2. Data Availability (DA) Workload

Drives blob and channel activity to exercise data availability paths and storage.

Import:

use testing_framework_workflows::workloads::da::Workload;

Configuration

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| blob_rate_per_block | NonZeroU64 | Required | Blobs to publish per block |
| channel_rate_per_block | NonZeroU64 | Required | Channels to create per block |
| headroom_percent | u64 | 20 | Extra capacity for channel planning (avoids saturation) |

DSL Usage

use testing_framework_workflows::ScenarioBuilderExt;

ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
    .da_with(|da| {
        da.channel_rate(2)  // 2 channels per block
          .blob_rate(4)     // 4 blobs per block
    })
    .with_run_duration(Duration::from_secs(120))
    .build();

Direct Instantiation

use std::num::NonZeroU64;
use testing_framework_workflows::workloads::da;

let da_workload = da::Workload::with_rate(
    NonZeroU64::new(4).unwrap(),   // blob_rate_per_block
    NonZeroU64::new(2).unwrap(),   // channel_rate_per_block
    20,                            // headroom_percent
);

ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
    .with_workload(da_workload)
    .with_run_duration(Duration::from_secs(120))
    .build();

Prerequisites

  1. Executors must be present:

    .executors(N)  // At least 1 executor

    DA workload requires executor nodes to handle blob publishing.

  2. Sufficient duration: Channel creation and blob publishing are slower than transaction submission. Allow 120+ seconds.

  3. Circuit artifacts: Same as transaction workload (POL_PROOF_DEV_MODE, circuits staged).

Attached Expectation

DaWorkloadExpectation — Verifies blobs and channels were created and published.

What it checks:

  • At least N channels were created (where N = channel_rate × expected blocks)
  • At least M blobs were published (where M = blob_rate × expected blocks × headroom)
  • Uses BlockFeed and executor API to verify

Failure modes:

  • “Expected >= X channels, observed Y” (Y < X)
  • “Expected >= X blobs, observed Y” (Y < X)
  • Common causes: executor crashes, insufficient duration, DA saturation

What Failure Looks Like

Error: Expectation failed: DaWorkloadExpectation
  Expected: >= 60 channels (2 channels/block × 30 blocks)
  Observed: 23 channels
  
  Possible causes:
  - Executors crashed or restarted (check executor logs)
  - Duration too short (channels still being created)
  - Blob publishing failed (check executor API errors)
  - Network issues (check validator/executor connectivity)

How to debug:

  1. Check executor logs:
    grep "channel\|blob" $NOMOS_LOG_DIR/executor-0/*.log
    
  2. Verify executors stayed running:
    grep "panic\|killed" $NOMOS_LOG_DIR/executor-*/*.log
    
  3. Increase duration: .with_run_duration(Duration::from_secs(180))
  4. Reduce rates: .channel_rate(1).blob_rate(2)

3. Chaos Workload (Random Restart)

Triggers controlled node restarts to test resilience and recovery behaviors.

Import:

use testing_framework_workflows::workloads::chaos::RandomRestartWorkload;

Configuration

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| min_delay | Duration | Required | Minimum time between restart attempts |
| max_delay | Duration | Required | Maximum time between restart attempts |
| target_cooldown | Duration | Required | Minimum time before restarting the same node again |
| include_validators | bool | Required | Whether to restart validators |
| include_executors | bool | Required | Whether to restart executors |

Usage

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::{ScenarioBuilderExt, workloads::chaos::RandomRestartWorkload};

let scenario = ScenarioBuilder::topology_with(|t| {
    t.network_star().validators(3).executors(2)
})
.enable_node_control()  // REQUIRED for chaos
.with_workload(RandomRestartWorkload::new(
    Duration::from_secs(45),   // min_delay
    Duration::from_secs(75),   // max_delay
    Duration::from_secs(120),  // target_cooldown
    true,                      // include_validators
    true,                      // include_executors
))
.expect_consensus_liveness()
.with_run_duration(Duration::from_secs(180))
.build();

Prerequisites

  1. Node control must be enabled:

    .enable_node_control()

    This adds NodeControlCapability to the scenario.

  2. Runner must support node control:

    • Compose runner: Supported
    • Local runner: Not supported
    • K8s runner: Not yet implemented
  3. Sufficient topology:

    • For validators: Need >1 validator (workload skips if only 1)
    • For executors: Can restart all executors
  4. Realistic timing:

    • Total duration should be 2-3× the max_delay + cooldown
    • Example: max_delay=75s, cooldown=120s → duration >= 180s

Attached Expectation

None. You must explicitly add expectations (typically .expect_consensus_liveness()).

Why? Chaos workloads are about testing recovery under disruption. The appropriate expectation depends on what you’re testing:

  • Consensus survives restarts → .expect_consensus_liveness()
  • Height converges after chaos → Custom expectation checking BlockFeed

What Failure Looks Like

Error: Workload failed: chaos_restart
  Cause: NodeControlHandle not available
  
  Possible causes:
  - Forgot .enable_node_control() in scenario builder
  - Using local runner (doesn't support node control)
  - Using k8s runner (doesn't support node control)

Or:

Error: Expectation failed: ConsensusLiveness
  Expected: >= 20 blocks
  Observed: 8 blocks
  
  Possible causes:
  - Restart frequency too high (nodes can't recover)
  - Consensus timing too slow (increase duration)
  - Too many validators restarted simultaneously
  - Nodes crashed after restart (check logs)

How to debug:

  1. Check restart events in logs:
    grep "restarting\|restart complete" $NOMOS_LOG_DIR/*/*.log
    
  2. Verify node control is enabled:
    grep "NodeControlHandle" $NOMOS_LOG_DIR/*/*.log
    
  3. Increase cooldown: Duration::from_secs(180)
  4. Reduce restart scope: include_validators = false (test executors only)
  5. Increase duration: .with_run_duration(Duration::from_secs(300))

Built-in Expectations

1. Consensus Liveness

Verifies the system continues to produce blocks during the execution window.

Import:

use testing_framework_workflows::ScenarioBuilderExt;

DSL Usage

ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(1))
    .expect_consensus_liveness()
    .with_run_duration(Duration::from_secs(60))
    .build();

What It Checks

  • At least N blocks were produced (where N = duration / expected_block_time)
  • Uses BlockFeed to count observed blocks
  • Compares against a minimum threshold (typically 50% of theoretical max)
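
To make the threshold concrete, a back-of-envelope calculation. The 1-second block time and the 50% factor below are assumptions for illustration, taken from the description above rather than exact framework parameters:

// Rough liveness threshold: half of the theoretical maximum block count.
fn min_expected_blocks(run_secs: u64, assumed_block_time_secs: u64) -> u64 {
    (run_secs / assumed_block_time_secs) / 2
}

// e.g. a 60 s run with ~1 s blocks -> expect at least 30 blocks
// min_expected_blocks(60, 1) == 30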

Failure Modes

Error: Expectation failed: ConsensusLiveness
  Expected: >= 30 blocks
  Observed: 3 blocks
  
  Possible causes:
  - Nodes crashed or never started (check logs)
  - Consensus timing misconfigured (CONSENSUS_SLOT_TIME too high)
  - Insufficient validators (need >= 2 for BFT consensus)
  - Duration too short (nodes still syncing)

How to Debug

  1. Check if nodes started:
    grep "node started\|listening on" $NOMOS_LOG_DIR/*/*.log
    
  2. Check block production:
    grep "block.*height" $NOMOS_LOG_DIR/validator-*/*.log
    
  3. Check consensus participation:
    grep "consensus.*slot\|proposal" $NOMOS_LOG_DIR/validator-*/*.log
    
  4. Increase duration: .with_run_duration(Duration::from_secs(120))
  5. Check env vars: echo $CONSENSUS_SLOT_TIME $CONSENSUS_ACTIVE_SLOT_COEFF

2. Workload-Specific Expectations

Each workload automatically attaches its own expectation:

| Workload | Expectation | What It Checks |
|----------|-------------|----------------|
| Transaction | TxInclusionExpectation | Transactions were included in blocks |
| DA | DaWorkloadExpectation | Blobs and channels were created/published |
| Chaos | (None) | Add .expect_consensus_liveness() explicitly |

These expectations are added automatically when using the DSL (.transactions_with(), .da_with()).


Configuration Quick Reference

Transaction Workload

.wallets(20)
.transactions_with(|tx| tx.rate(10).users(5))

| What | Value | Unit |
|------|-------|------|
| Rate | 10 | tx/block |
| Users | 5 | wallet accounts |
| Wallets | 20 | total seeded |

DA Workload

.da_with(|da| da.channel_rate(2).blob_rate(4))

| What | Value | Unit |
|------|-------|------|
| Channel rate | 2 | channels/block |
| Blob rate | 4 | blobs/block |
| Headroom | 20 | percent |

Chaos Workload

.enable_node_control()
.with_workload(RandomRestartWorkload::new(
    Duration::from_secs(45),   // min
    Duration::from_secs(75),   // max
    Duration::from_secs(120),  // cooldown
    true,  // validators
    true,  // executors
))

Common Patterns

Pattern 1: Multiple Workloads

ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
    .wallets(20)
    .transactions_with(|tx| tx.rate(5).users(10))
    .da_with(|da| da.channel_rate(2).blob_rate(2))
    .expect_consensus_liveness()
    .with_run_duration(Duration::from_secs(120))
    .build();

All workloads run concurrently. Expectations for each workload run after the execution window ends.

Pattern 2: Custom Expectation

use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, Expectation, RunContext};

struct MyCustomExpectation;

#[async_trait]
impl Expectation for MyCustomExpectation {
    async fn evaluate(&self, ctx: &RunContext) -> Result<(), DynError> {
        // Access BlockFeed, metrics, topology, etc.
        let block_count = ctx.block_feed()?.count();
        if block_count < 10 {
            return Err("Not enough blocks".into());
        }
        Ok(())
    }
}

ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(1))
    .with_expectation(MyCustomExpectation)
    .with_run_duration(Duration::from_secs(60))
    .build();

Debugging Checklist

When a workload or expectation fails:

  1. Check logs: $NOMOS_LOG_DIR/*/ or docker compose logs or kubectl logs
  2. Verify environment variables: POL_PROOF_DEV_MODE, NOMOS_NODE_BIN, etc.
  3. Check prerequisites: wallets, executors, node control, circuits
  4. Increase duration: Double the run duration and retry
  5. Reduce rates: Half the traffic rates and retry
  6. Check metrics: Prometheus queries for block height, tx count, DA stats
  7. Reproduce locally: Use local runner for faster iteration

Core Content: ScenarioBuilderExt Patterns

When should I read this? After writing 2-3 scenarios. This page documents patterns that emerge from real usage—come back when you’re refactoring or standardizing your test suite.

Patterns that keep scenarios readable and reusable:

  • Topology-first: start by shaping the cluster (counts, layout) so later steps inherit a clear foundation.
  • Bundle defaults: use the DSL helpers to attach common expectations (like liveness) whenever you add a matching workload, reducing forgotten checks.
  • Intentional rates: express traffic in per-block terms to align with protocol timing rather than wall-clock assumptions.
  • Opt-in chaos: enable restart patterns only in scenarios meant to probe resilience; keep functional smoke tests deterministic.
  • Wallet clarity: seed only the number of actors you need; it keeps transaction scenarios deterministic and interpretable.

These patterns make scenario definitions self-explanatory while staying aligned with the framework’s block-oriented timing model.
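
A sketch of these patterns applied together — topology first, per-block rates, liveness bundled alongside the workloads, chaos left out, and only the wallets the test needs (the values are illustrative):

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
    .wallets(10)                                   // seed only the actors needed
    .transactions_with(|tx| tx.rate(2).users(10))  // per-block rate, not per second
    .da_with(|da| da.channel_rate(1).blob_rate(2))
    .expect_consensus_liveness()                   // bundled health check
    .with_run_duration(Duration::from_secs(120))
    .build();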

Best Practices

This page collects proven patterns for authoring, running, and maintaining test scenarios that are reliable, maintainable, and actionable.

Scenario Design

State your intent

  • Document the goal of each scenario (throughput, DA validation, resilience) so expectation choices are obvious
  • Use descriptive variable names that explain topology purpose (e.g., star_topology_3val_2exec vs topology)
  • Add comments explaining why specific rates or durations were chosen

Keep runs meaningful

  • Choose durations that allow multiple blocks and make timing-based assertions trustworthy
  • Use FAQ: Run Duration Calculator to estimate minimum duration
  • Avoid runs shorter than 30 seconds unless testing startup behavior specifically

Separate concerns

  • Start with deterministic workloads for functional checks
  • Add chaos in dedicated resilience scenarios to avoid noisy failures
  • Don’t mix high transaction load with aggressive chaos in the same test (hard to debug)

Start small, scale up

  • Begin with minimal topology (1-2 validators) to validate scenario logic
  • Gradually increase topology size and workload rates
  • Use Host runner for fast iteration, then validate on Compose before production

Code Organization

Reuse patterns

  • Standardize on shared topology and workload presets so results are comparable across environments and teams
  • Extract common topology builders into helper functions
  • Create workspace-level constants for standard rates and durations

Example: Topology preset

pub fn standard_da_topology() -> GeneratedTopology {
    TopologyBuilder::new()
        .network_star()
        .validators(3)
        .executors(2)
        .generate()
}

Example: Shared constants

use std::time::Duration;

pub const STANDARD_TX_RATE: u64 = 10;
pub const STANDARD_DA_CHANNEL_RATE: u64 = 2;
pub const SHORT_RUN_DURATION: Duration = Duration::from_secs(60);
pub const LONG_RUN_DURATION: Duration = Duration::from_secs(300);

Debugging & Observability

Observe first, tune second

  • Rely on liveness and inclusion signals to interpret outcomes before tweaking rates or topology
  • Enable detailed logging (RUST_LOG=debug, NOMOS_LOG_LEVEL=debug) only after initial failure
  • Use NOMOS_TESTS_KEEP_LOGS=1 to persist logs when debugging failures

Use BlockFeed effectively

  • Subscribe to BlockFeed in expectations for real-time block monitoring
  • Track block production rate to detect liveness issues early
  • Use block statistics (block_feed.stats().total_transactions()) to verify inclusion, as sketched below
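
For example, an expectation's evaluation step can compare the feed's running totals against a minimum. A sketch assuming the stats() accessor described on the Workloads & Expectations page (the integer return type is assumed):

use testing_framework_core::scenario::{DynError, RunContext};

// Illustrative helper (not part of the framework): check observed transaction
// totals from BlockFeed statistics against a minimum.
fn check_min_transactions(ctx: &RunContext, min_txs: u64) -> Result<(), DynError> {
    let observed = ctx.block_feed().stats().total_transactions();
    if observed < min_txs {
        return Err(format!("expected >= {} transactions, observed {}", min_txs, observed).into());
    }
    Ok(())
}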

Collect metrics

  • Set up Prometheus/Grafana via scripts/setup/setup-observability.sh compose up for visualizing node behavior
  • Use metrics to identify bottlenecks before adding more load
  • Monitor mempool size, block size, and consensus timing

Environment & Runner Selection

Environment fit

  • Pick runners that match the feedback loop you need:
    • Host: Fast iteration during development, quick CI smoke tests
    • Compose: Reproducible environments (recommended for CI), chaos testing
    • K8s: Production-like fidelity, large topologies (10+ nodes)

Runner-specific considerations

| Runner | When to Use | When to Avoid |
|--------|-------------|---------------|
| Host | Development iteration, fast feedback | Chaos testing, container-specific issues |
| Compose | CI pipelines, chaos tests, reproducibility | Very large topologies (>10 nodes) |
| K8s | Production-like testing, cluster behaviors | Local development, fast iteration |

Minimal surprises

  • Seed only necessary wallets and keep configuration deltas explicit when moving between CI and developer machines
  • Use versions.env to pin node versions consistently across environments
  • Document non-default environment variables in scenario comments or README

CI/CD Integration

Use matrix builds

strategy:
  matrix:
    runner: [host, compose]
    topology: [small, medium]

Cache aggressively

  • Cache Rust build artifacts (target/)
  • Cache circuit parameters (assets/stack/kzgrs_test_params/)
  • Cache Docker layers (use BuildKit cache)

Collect logs on failure

- name: Collect logs on failure
  if: failure()
  run: |
    mkdir -p test-logs
    find /tmp -name "nomos-*.log" -exec cp {} test-logs/ \;
- uses: actions/upload-artifact@v3
  if: failure()
  with:
    name: test-logs-${{ matrix.runner }}
    path: test-logs/

Time limits

  • Set job timeout to prevent hung runs: timeout-minutes: 30
  • Use shorter durations in CI (60s) vs local testing (300s)
  • Run expensive tests (k8s, large topologies) only on main branch or release tags

See also: CI Integration for complete workflow examples

Anti-Patterns to Avoid

DON’T: Run without POL_PROOF_DEV_MODE

# BAD: Will hang/timeout on proof generation
cargo run -p runner-examples --bin local_runner

# GOOD: Fast mode for testing
POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner

DON’T: Use tiny durations

// BAD: Not enough time for blocks to propagate
.with_run_duration(Duration::from_secs(5))

// GOOD: Allow multiple consensus rounds
.with_run_duration(Duration::from_secs(60))

DON’T: Ignore cleanup failures

// BAD: Next run inherits leaked state
runner.run(&mut scenario).await?;
// forgot to call cleanup or use CleanupGuard

// GOOD: Cleanup via guard (automatic on panic)
let _cleanup = CleanupGuard::new(runner.clone());
runner.run(&mut scenario).await?;

DON’T: Mix concerns in one scenario

// BAD: Hard to debug when it fails
.transactions_with(|tx| tx.rate(50).users(100))  // high load
.chaos_with(|c| c.restart().min_delay(...))        // AND chaos
.da_with(|da| da.channel_rate(10).blob_rate(20))  // AND DA stress

// GOOD: Separate tests for each concern
// Test 1: High transaction load only
// Test 2: Chaos resilience only
// Test 3: DA stress only

DON’T: Hardcode paths or ports

// BAD: Breaks on different machines
let path = PathBuf::from("/home/user/circuits/kzgrs_test_params");
let port = 9000; // might conflict

// GOOD: Use env vars and dynamic allocation
let path = std::env::var("NOMOS_KZGRS_PARAMS_PATH")
    .unwrap_or_else(|_| "assets/stack/kzgrs_test_params/kzgrs_test_params".to_string());
let port = get_available_tcp_port();

DON’T: Ignore resource limits

# BAD: Large topology without checking resources
scripts/run/run-examples.sh -v 20 -e 10 compose
# (might OOM or exhaust ulimits)

# GOOD: Scale gradually and monitor resources
scripts/run/run-examples.sh -v 3 -e 2 compose  # start small
docker stats  # monitor resource usage
# then increase if resources allow

Scenario Design Heuristics

Minimal viable topology

  • Consensus: 3 validators (minimum for Byzantine fault tolerance)
  • DA: 2+ executors (test dispersal and sampling)
  • Network: Star topology (simplest for debugging)

Workload rate selection

  • Start with a low transaction rate (1-5 per block), then increase
  • DA: 1-2 channels and 1-3 blobs per block initially
  • Chaos: 30s+ intervals between restarts (allow recovery)

Duration guidelines

| Test Type | Minimum Duration | Typical Duration |
|-----------|------------------|------------------|
| Smoke test | 30s | 60s |
| Integration test | 60s | 120s |
| Load test | 120s | 300s |
| Resilience test | 120s | 300s |
| Soak test | 600s (10m) | 3600s (1h) |

Expectation selection

| Test Goal | Expectations |
|-----------|--------------|
| Basic functionality | expect_consensus_liveness() |
| Transaction handling | expect_consensus_liveness() + custom inclusion check |
| DA correctness | expect_consensus_liveness() + DA dispersal/sampling checks |
| Resilience | expect_consensus_liveness() + recovery time measurement |

Testing the Tests

Validate scenarios before committing

  1. Run on Host runner first (fast feedback)
  2. Run on Compose runner (reproducibility check)
  3. Check logs for warnings or errors
  4. Verify cleanup (no leaked processes/containers)
  5. Run 2-3 times to check for flakiness

Handling flaky tests

  • Increase run duration (timing-sensitive assertions need longer runs)
  • Reduce workload rates (might be saturating nodes)
  • Check resource limits (CPU/RAM/ulimits)
  • Add debugging output to identify race conditions
  • Consider if test is over-specified (too strict expectations)

Usage Patterns

  • Shape a topology, pick a runner: choose local for quick iteration, compose for reproducible multi-node stacks with observability, or k8s for cluster-grade validation.
  • Compose workloads deliberately: pair transactions and data-availability traffic for end-to-end coverage; add chaos only when assessing recovery and resilience.
  • Align expectations with goals: use liveness-style checks to confirm the system keeps up with planned activity, and add workload-specific assertions for inclusion or availability.
  • Reuse plans across environments: keep the scenario constant while swapping runners to compare behavior between developer machines and CI clusters.
  • Iterate with clear signals: treat expectation outcomes as the primary pass/fail indicator, and adjust topology or workloads based on what those signals reveal.

Examples

Concrete scenario shapes that illustrate how to combine topologies, workloads, and expectations.

View Complete Source Code:

Runnable examples: The repo includes complete binaries in examples/src/bin/:

  • local_runner.rs — Host processes (local)
  • compose_runner.rs — Docker Compose (requires image built)
  • k8s_runner.rs — Kubernetes (requires cluster access and image loaded)

Recommended: Use scripts/run/run-examples.sh -t <duration> -v <validators> -e <executors> <mode> where mode is host, compose, or k8s.

Alternative: Direct cargo run: POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin <name>

All runners require POL_PROOF_DEV_MODE=true to avoid expensive proof generation.

Code patterns below show how to build scenarios. Wrap these in #[tokio::test] functions for integration tests, or #[tokio::main] for binaries.
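
For instance, the simple_consensus pattern below can be wrapped in a test like this (a minimal sketch; the function name comes from the example that follows):

// Integration-test wrapper around a pattern function defined below.
#[tokio::test]
async fn simple_consensus_smoke() -> anyhow::Result<()> {
    simple_consensus().await
}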

Simple consensus liveness

Minimal test that validates basic block production:

use std::time::Duration;

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

pub async fn simple_consensus() -> Result<()> {
    let mut plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(0))
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(30))
        .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;

    Ok(())
}

When to use: smoke tests for consensus on minimal hardware.

Transaction workload

Test consensus under transaction load:

use std::time::Duration;

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

pub async fn transaction_workload() -> Result<()> {
    let mut plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(2).executors(0))
        .wallets(20)
        .transactions_with(|txs| txs.rate(5).users(10))
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(60))
        .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;

    Ok(())
}

When to use: validate transaction submission and inclusion.

DA + transaction workload

Combined test stressing both transaction and DA layers:

use std::time::Duration;

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

pub async fn da_and_transactions() -> Result<()> {
    let mut plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
        .wallets(30)
        .transactions_with(|txs| txs.rate(5).users(15))
        .da_with(|da| da.channel_rate(2).blob_rate(2))
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(90))
        .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;

    Ok(())
}

When to use: end-to-end coverage of transaction and DA layers.

Chaos resilience

Test system resilience under node restarts:

use std::time::Duration;

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_compose::ComposeDeployer;
use testing_framework_workflows::{ChaosBuilderExt, ScenarioBuilderExt};

pub async fn chaos_resilience() -> Result<()> {
    let mut plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(4).executors(2))
        .enable_node_control()
        .wallets(20)
        .transactions_with(|txs| txs.rate(3).users(10))
        .chaos_with(|c| {
            c.restart()
                .min_delay(Duration::from_secs(20))
                .max_delay(Duration::from_secs(40))
                .target_cooldown(Duration::from_secs(30))
                .apply()
        })
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(120))
        .build();

    let deployer = ComposeDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;

    Ok(())
}

When to use: resilience validation and operational readiness drills.

Note: Chaos tests require ComposeDeployer or another runner with node control support.

Advanced Examples

When should I read this? Skim now to see what’s possible, revisit later when you need load testing, chaos scenarios, or custom extensions. Start with basic examples first.

Realistic advanced scenarios demonstrating framework capabilities for production testing.

Summary

| Example | Topology | Workloads | Deployer | Key Feature |
|---------|----------|-----------|----------|-------------|
| Load Progression | 3 validators + 2 executors | Increasing tx rate | Compose | Dynamic load testing |
| Sustained Load | 4 validators + 2 executors | High tx + DA rate | Compose | Stress testing |
| Aggressive Chaos | 4 validators + 2 executors | Frequent restarts + traffic | Compose | Resilience validation |

Load Progression Test

Test consensus under progressively increasing transaction load:

use std::time::Duration;

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_compose::ComposeDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

pub async fn load_progression_test() -> Result<()> {
    for rate in [5, 10, 20, 30] {
        println!("Testing with rate: {}", rate);

        let mut plan =
            ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
                .wallets(50)
                .transactions_with(|txs| txs.rate(rate).users(20))
                .expect_consensus_liveness()
                .with_run_duration(Duration::from_secs(60))
                .build();

        let deployer = ComposeDeployer::default();
        let runner = deployer.deploy(&plan).await?;
        let _handle = runner.run(&mut plan).await?;
    }

    Ok(())
}

When to use: Finding the maximum sustainable transaction rate for a given topology.

Sustained Load Test

Run high transaction and DA load for extended duration:

use std::time::Duration;

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_compose::ComposeDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

pub async fn sustained_load_test() -> Result<()> {
    let mut plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(4).executors(2))
        .wallets(100)
        .transactions_with(|txs| txs.rate(15).users(50))
        .da_with(|da| da.channel_rate(2).blob_rate(3))
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(300))
        .build();

    let deployer = ComposeDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;

    Ok(())
}

When to use: Validating stability under continuous high load over extended periods.

Aggressive Chaos Test

Frequent node restarts with active traffic:

use std::time::Duration;

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_compose::ComposeDeployer;
use testing_framework_workflows::{ChaosBuilderExt, ScenarioBuilderExt};

pub async fn aggressive_chaos_test() -> Result<()> {
    let mut plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(4).executors(2))
        .enable_node_control()
        .wallets(50)
        .transactions_with(|txs| txs.rate(10).users(20))
        .chaos_with(|c| {
            c.restart()
                .min_delay(Duration::from_secs(10))
                .max_delay(Duration::from_secs(20))
                .target_cooldown(Duration::from_secs(15))
                .apply()
        })
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(180))
        .build();

    let deployer = ComposeDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;

    Ok(())
}

When to use: Validating recovery and liveness under aggressive failure conditions.

Note: Requires ComposeDeployer for node control support.

Extension Ideas

These scenarios require custom implementations but demonstrate framework extensibility:

Mempool & Transaction Handling

Transaction Propagation & Inclusion Test

Concept: Submit the same batch of independent transactions to different nodes in randomized order/offsets, then verify all transactions are included and final state matches across nodes.

Requirements:

  • Custom workload: Generates a fixed batch of transactions and submits the same set to different nodes via ctx.node_clients(), with randomized submission order and timing offsets per node
  • Custom expectation: Verifies all transactions appear in blocks (order may vary), final state matches across all nodes (compare balances or state roots), and no transactions are dropped

Why useful: Exercises mempool propagation, proposer fairness, and transaction inclusion guarantees under realistic race conditions. Tests that the protocol maintains consistency regardless of which node receives transactions first.

Implementation notes: Requires both a custom Workload implementation (to submit same transactions to multiple nodes with jitter) and a custom Expectation implementation (to verify inclusion and state consistency).

Cross-Validator Mempool Divergence & Convergence

Concept: Drive different transaction subsets into different validators (or differing arrival orders) to create temporary mempool divergence, then verify mempools/blocks converge to contain the union (no permanent divergence).

Requirements:

  • Custom workload: Targets specific nodes via ctx.node_clients() with disjoint or jittered transaction batches
  • Custom expectation: After a convergence window, verifies that all transactions appear in blocks (order may vary) or that mempool contents converge across nodes
  • Run normal workloads during convergence period

Expectations:

  • Temporary mempool divergence is acceptable (different nodes see different transactions initially)
  • After convergence window, all transactions appear in blocks or mempools converge
  • No transactions are permanently dropped despite initial divergence
  • Mempool gossip/reconciliation mechanisms work correctly

Why useful: Exercises mempool gossip and reconciliation under uneven input or latency. Ensures no node “drops” transactions seen elsewhere, validating that mempool synchronization mechanisms correctly propagate transactions across the network even when they arrive at different nodes in different orders.

Implementation notes: Requires both a custom Workload implementation (to inject disjoint/jittered batches per node) and a custom Expectation implementation (to verify mempool convergence or block inclusion). Uses existing ctx.node_clients() capability—no new infrastructure needed.

Adaptive Mempool Pressure Test

Concept: Ramp transaction load over time to observe mempool growth, fee prioritization/eviction, and block saturation behavior, detecting performance regressions and ensuring backpressure/eviction work under increasing load.

Requirements:

  • Custom workload: Steadily increases transaction rate over time (optional: use fee tiers)
  • Custom expectation: Monitors mempool size, evictions, and throughput (blocks/txs per slot), flagging runaway growth or stalls
  • Run for extended duration to observe pressure buildup

Expectations:

  • Mempool size grows predictably with load (not runaway growth)
  • Fee prioritization/eviction mechanisms activate under pressure
  • Block saturation behavior is acceptable (blocks fill appropriately)
  • Throughput (blocks/txs per slot) remains stable or degrades gracefully
  • No stalls or unbounded mempool growth

Why useful: Detects performance regressions in mempool management. Ensures backpressure and eviction mechanisms work correctly under increasing load, preventing memory exhaustion or unbounded growth. Validates that fee prioritization correctly selects high-value transactions when mempool is full.

Implementation notes: Can be built with current workload model (ramping rate). Requires custom Expectation implementation that reads mempool metrics (via node HTTP APIs or Prometheus) and monitors throughput to judge behavior. No new infrastructure needed—uses existing observability capabilities.

Invalid Transaction Fuzzing

Concept: Submit malformed transactions and verify they’re rejected properly.

Implementation approach:

  • Custom workload that generates invalid transactions (bad signatures, insufficient funds, malformed structure)
  • Expectation verifies mempool rejects them and they never appear in blocks
  • Test mempool resilience and filtering

Why useful: Ensures mempool doesn’t crash or include invalid transactions under fuzzing.
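
A rough skeleton of such a fuzzing workload, assuming the Workload trait shape shown in the RunContext chapter; transaction construction and submission are left as comments because they depend on node client APIs not covered here:

use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, RunContext, Workload};

// Hypothetical workload: submits deliberately invalid transactions; a paired
// expectation would verify none of them ever appear in blocks.
struct InvalidTxFuzzWorkload;

#[async_trait]
impl Workload for InvalidTxFuzzWorkload {
    fn name(&self) -> &str {
        "invalid_tx_fuzzing"
    }

    async fn start(&self, _ctx: &RunContext) -> Result<(), DynError> {
        // 1. Build transactions with bad signatures, insufficient funds, or a
        //    malformed structure.
        // 2. Submit them through the scenario's node clients (exact API omitted).
        // 3. Record what was submitted so the paired expectation can check that
        //    nothing invalid was included on-chain.
        Ok(())
    }
}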

Network & Gossip

Gossip Latency Gradient Scenario

Concept: Test consensus robustness under skewed gossip delays by partitioning nodes into latency tiers (tier A ≈10ms, tier B ≈100ms, tier C ≈300ms) and observing propagation lag, fork rate, and eventual convergence.

Requirements:

  • Partition nodes into three groups (tiers)
  • Apply per-group network delay via chaos: netem/iptables in compose; NetworkPolicy + netem sidecar in k8s
  • Run standard workload (transactions/block production)
  • Optional: Remove delays at end to check recovery

Expectations:

  • Propagation: Messages reach all tiers within acceptable bounds
  • Safety: No divergent finalized heads; fork rate stays within tolerance
  • Liveness: Chain keeps advancing; convergence after delays relaxed (if healed)

Why useful: Real networks have heterogeneous latency. This stress-tests proposer selection and fork resolution when some peers are “far” (high latency), validating that consensus remains safe and live under realistic network conditions.

Current blocker: Runner support for per-group delay injection (network delay via netem/iptables) is not present today. Would require new chaos plumbing in compose/k8s deployers to inject network delays per node group.

Byzantine Gossip Flooding (libp2p Peer)

Concept: Spin up a custom workload/sidecar that runs a libp2p host, joins the cluster’s gossip mesh, and publishes a high rate of syntactically valid but useless/stale messages to selected topics, testing gossip backpressure, scoring, and queue handling under a “malicious” peer.

Requirements:

  • Custom workload/sidecar that implements a libp2p host
  • Join the cluster’s gossip mesh as a peer
  • Publish high-rate syntactically valid but useless/stale messages to selected gossip topics
  • Run alongside normal workloads (transactions/block production)

Expectations:

  • Gossip backpressure mechanisms prevent message flooding from overwhelming nodes
  • Peer scoring correctly identifies and penalizes the malicious peer
  • Queue handling remains stable under flood conditions
  • Normal consensus operation continues despite malicious peer

Why useful: Tests Byzantine behavior (malicious peer) which is critical for consensus protocol robustness. More realistic than RPC spam since it uses the actual gossip protocol. Validates that gossip backpressure, peer scoring, and queue management correctly handle adversarial peers without disrupting consensus.

Current blocker: Requires adding gossip-capable helper (libp2p integration) to the framework. Would need a custom workload/sidecar implementation that can join the gossip mesh and inject messages. The rest of the scenario can use existing runners/workloads.

Network Partition Recovery

Concept: Test consensus recovery after network partitions.

Requirements:

  • Needs block_peer() / unblock_peer() methods in NodeControlHandle
  • Partition subsets of validators, wait, then restore connectivity
  • Verify chain convergence after partition heals

Why useful: Tests the most realistic failure mode in distributed systems.

Current blocker: Node control doesn’t yet support network-level actions (only process restarts).

Time & Timing

Time-Shifted Blocks (Clock Skew Test)

Concept: Test consensus and timestamp handling when nodes run with skewed clocks (e.g., +1s, −1s, +200ms jitter) to surface timestamp validation issues, reorg sensitivity, and clock drift handling.

Requirements:

  • Assign per-node time offsets (e.g., +1s, −1s, +200ms jitter)
  • Run normal workload (transactions/block production)
  • Observe whether blocks are accepted/propagated and the chain stays consistent

Expectations:

  • Blocks with skewed timestamps are handled correctly (accepted or rejected per protocol rules)
  • Chain remains consistent across nodes despite clock differences
  • No unexpected reorgs or chain splits due to timestamp validation issues

Why useful: Clock skew is a common real-world issue in distributed systems. This validates that consensus correctly handles timestamp validation and maintains safety/liveness when nodes have different clock offsets, preventing timestamp-based attacks or failures.

Current blocker: Runner ability to skew per-node clocks (e.g., privileged containers with libfaketime/chrony or time-offset netns) is not available today. Would require a new chaos/time-skew hook in deployers to inject clock offsets per node.

Block Timing Consistency

Concept: Verify block production intervals stay within expected bounds.

Implementation approach:

  • Custom expectation that consumes BlockFeed
  • Collect block timestamps during run
  • Assert intervals are within (slot_duration * active_slot_coeff) ± tolerance

Why useful: Validates consensus timing under various loads.

Topology & Membership

Dynamic Topology (Churn) Scenario

Concept: Nodes join and leave mid-run (new identities/addresses added; some nodes permanently removed) to exercise peer discovery, bootstrapping, reputation, and load balancing under churn.

Requirements:

  • Runner must be able to spin up new nodes with fresh keys/addresses at runtime
  • Update peer lists and bootstraps dynamically as nodes join/leave
  • Optionally tear down nodes permanently (not just restart)
  • Run normal workloads (transactions/block production) during churn

Expectations:

  • New nodes successfully discover and join the network
  • Peer discovery mechanisms correctly handle dynamic topology changes
  • Reputation systems adapt to new/removed peers
  • Load balancing adjusts to changing node set
  • Consensus remains safe and live despite topology churn

Why useful: Real networks experience churn (nodes joining/leaving). Unlike restarts (which preserve topology), churn changes the actual topology size and peer set, testing how the protocol handles dynamic membership. This exercises peer discovery, bootstrapping, reputation systems, and load balancing under realistic conditions.

Current blocker: Runner support for dynamic node addition/removal at runtime is not available today. Chaos today only restarts existing nodes; churn would require the ability to spin up new nodes with fresh identities/addresses, update peer lists/bootstraps dynamically, and permanently remove nodes. Would need new topology management capabilities in deployers.

API & External Interfaces

API DoS/Stress Test

Concept: Adversarial workload floods node HTTP/WS APIs with high QPS and malformed/bursty requests; expectation checks nodes remain responsive or rate-limit without harming consensus.

Requirements:

  • Custom workload: Targets node HTTP/WS API endpoints with mixed valid/invalid requests at high rate
  • Custom expectation: Monitors error rates, latency, and confirms block production/liveness unaffected
  • Run alongside normal workloads (transactions/block production)

Expectations:

  • Nodes remain responsive or correctly rate-limit under API flood
  • Error rates/latency are acceptable (rate limiting works)
  • Block production/liveness unaffected by API abuse
  • Consensus continues normally despite API stress

Why useful: Validates API hardening under abuse and ensures control/telemetry endpoints don’t destabilize the node. Tests that API abuse is properly isolated from consensus operations, preventing DoS attacks on API endpoints from affecting blockchain functionality.

Implementation notes: Requires custom Workload implementation that directs high-QPS traffic to node APIs (via ctx.node_clients() or direct HTTP clients) and custom Expectation implementation that monitors API responsiveness metrics and consensus liveness. Uses existing node API access—no new infrastructure needed.

State & Correctness

Wallet Balance Verification

Concept: Track wallet balances and verify state consistency.

Description: After transaction workload completes, query all wallet balances via node API and verify total supply is conserved. Requires tracking initial state, submitted transactions, and final balances. Validates that the ledger maintains correctness under load (no funds lost or created). This is a state assertion expectation that checks correctness, not just liveness.
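
A skeleton of what such a state assertion could look like, assuming the Expectation trait shape used elsewhere in these docs; the balance queries are left as comments since they depend on node API details not described on this page:

use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, Expectation, RunContext};

// Hypothetical expectation: verifies that the sum of final wallet balances
// equals the expected total supply (conservation check).
struct BalanceConservationExpectation {
    expected_total_supply: u64,
}

#[async_trait]
impl Expectation for BalanceConservationExpectation {
    fn name(&self) -> &'static str {
        "balance_conservation"
    }

    async fn evaluate(&mut self, _ctx: &RunContext) -> Result<(), DynError> {
        // 1. Query each seeded wallet's final balance via the node HTTP API.
        // 2. Sum the balances into `observed_total`.
        // 3. Compare against the expected total supply:
        let observed_total: u64 = todo!("sum wallet balances via the node API");
        if observed_total != self.expected_total_supply {
            return Err(format!(
                "total supply not conserved: expected {}, observed {}",
                self.expected_total_supply, observed_total
            )
            .into());
        }
        Ok(())
    }
}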

Cucumber/BDD Interface

The Logos testing repo includes a small Cucumber (Gherkin) harness for “smoke” scenarios. It is useful when you want readable acceptance-style checks, but it intentionally exposes a limited surface area compared to Rust scenarios.


What Exists Today

  • Step definitions live in testing-framework/cucumber.
  • The runnable entrypoints are binaries in examples (crate runner-examples):
    • cucumber_host (local/host deployer)
    • cucumber_compose (compose deployer)
  • Feature files live in examples/cucumber/features/.
  • Supported deployers: local and compose (no k8s runner integration in Cucumber yet).

Example Feature (Matches Current Steps)

This is the shape used by the repo’s smoke features:

Feature: Testing Framework - Local Runner

  Scenario: Run a local smoke scenario (tx + DA + liveness)
    Given deployer is "local"
    And topology has 1 validators and 1 executors
    And run duration is 60 seconds
    And wallets total funds is 1000000000 split across 50 users
    And transactions rate is 1 per block
    And data availability channel rate is 1 per block and blob rate is 1 per block
    And expect consensus liveness
    When run scenario
    Then scenario should succeed

Running The Smoke Features

Local runner smoke:

POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin cucumber_host

Compose runner smoke:

POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin cucumber_compose

Available Steps (Current)

Topology / runner selection:

  • Given deployer is "local"|"compose"
  • Given topology has <validators> validators and <executors> executors

Run configuration:

  • Given run duration is <seconds> seconds
  • Given wallets total funds is <funds> split across <users> users

Workloads:

  • Given transactions rate is <rate> per block
  • Given transactions rate is <rate> per block using <users> users
  • Given data availability channel rate is <channel_rate> per block and blob rate is <blob_rate> per block

Expectations:

  • Given expect consensus liveness
  • Given consensus liveness lag allowance is <blocks>

Execution + assertion:

  • When run scenario
  • Then scenario should succeed

Notes

  • The Cucumber harness builds scenarios using the same core + workflow builder APIs as the Rust examples, so the same prerequisites apply (notably POL_PROOF_DEV_MODE=true for practical runs).
  • If you need more flexibility (custom workloads/expectations, richer checks, node control/chaos), write Rust scenarios instead: see Examples and Extending the Framework.

Running Scenarios

This page focuses on how scenarios are executed (deploy → run → evaluate → cleanup), what artifacts you get back, and how that differs across runners.

For “just run something that works” commands, see Running Examples.


Execution Flow (High Level)

When you run a built scenario via a deployer, the run follows the same shape:

flowchart TD
    Build[Scenario built] --> Deploy[Deploy]
    Deploy --> Capture[Capture]
    Capture --> Execute[Execute]
    Execute --> Evaluate[Evaluate]
    Evaluate --> Cleanup[Cleanup]

  • Deploy: provision infrastructure and start nodes (processes/containers/pods)
  • Capture: establish clients/observability and capture initial state
  • Execute: run workloads for the configured wall-clock duration
  • Evaluate: run expectations (after the execution window ends)
  • Cleanup: stop resources and finalize artifacts

The Core API

use std::time::Duration;

use testing_framework_core::scenario::{Deployer as _, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

async fn run_once() -> anyhow::Result<()> {
    let mut scenario = ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(1))
        .wallets(20)
        .transactions_with(|tx| tx.rate(1).users(5))
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(60))
        .build();

    let runner = LocalDeployer::default().deploy(&scenario).await?;
    runner.run(&mut scenario).await?;

    Ok(())
}

Notes:

  • with_run_duration(...) is wall-clock time, not “number of blocks”.
  • .transactions_with(...) rates are per-block.
  • Most users should run scenarios via scripts/run/run-examples.sh unless they are embedding the framework in their own test crate.

Runner Differences

Local (Host) Runner

  • Best for: fast iteration and debugging
  • Logs/state: stored under a temporary run directory unless you set NOMOS_TESTS_KEEP_LOGS=1 and/or NOMOS_LOG_DIR=...
  • Limitations: no node-control capability (chaos workflows that require node control won’t work here)

Run the built-in local examples:

POL_PROOF_DEV_MODE=true \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 host

Compose Runner

  • Best for: reproducible multi-node environments and node control
  • Logs: primarily via docker compose logs (and any node-level log configuration you apply)
  • Debugging: set COMPOSE_RUNNER_PRESERVE=1 to keep the environment up after a run

Run the built-in compose examples:

POL_PROOF_DEV_MODE=true \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 compose

K8s Runner

  • Best for: production-like behavior, cluster scheduling/networking
  • Logs: kubectl logs ...
  • Debugging: set K8S_RUNNER_PRESERVE=1 and K8S_RUNNER_NAMESPACE=... to keep resources around

Run the built-in k8s examples:

POL_PROOF_DEV_MODE=true \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 k8s

Artifacts & Where to Look

  • Node logs: configure via NOMOS_LOG_DIR, NOMOS_LOG_LEVEL, NOMOS_LOG_FILTER (see Logging & Observability)
  • Runner logs: controlled by RUST_LOG (runner process only)
  • Keep run directories: set NOMOS_TESTS_KEEP_LOGS=1
  • Compose environment preservation: set COMPOSE_RUNNER_PRESERVE=1
  • K8s environment preservation: set K8S_RUNNER_PRESERVE=1

Runners

Runners turn a scenario plan into a live environment while keeping the plan unchanged. Choose based on feedback speed, reproducibility, and fidelity. For environment and operational considerations, see Operations Overview.

Important: All runners require POL_PROOF_DEV_MODE=true to avoid expensive Groth16 proof generation that causes timeouts.

Host runner (local processes)

  • Launches node processes directly on the host (via LocalDeployer).
  • Binary: local_runner.rs, script mode: host
  • Fastest feedback loop and minimal orchestration overhead.
  • Best for development-time iteration and debugging.
  • Can run in CI for fast smoke tests.
  • Node control: Not supported (chaos workloads not available)

Run with: scripts/run/run-examples.sh -t 60 -v 1 -e 1 host

Docker Compose runner

  • Starts nodes in containers to provide a reproducible multi-node stack on a single machine (via ComposeDeployer).
  • Binary: compose_runner.rs, script mode: compose
  • Discovers service ports and wires observability for convenient inspection.
  • Good balance between fidelity and ease of setup.
  • Recommended for CI pipelines (isolated environment, reproducible).
  • Node control: Supported (can restart nodes for chaos testing)

Run with: scripts/run/run-examples.sh -t 60 -v 1 -e 1 compose

Kubernetes runner

  • Deploys nodes onto a cluster for higher-fidelity, longer-running scenarios (via K8sDeployer).
  • Binary: k8s_runner.rs, script mode: k8s
  • Suits CI with cluster access or shared test environments where cluster behavior and scheduling matter.
  • Node control: Not supported yet (chaos workloads not available)

Run with: scripts/run/run-examples.sh -t 60 -v 1 -e 1 k8s

Common expectations

  • All runners require at least one validator and, for transaction scenarios, access to seeded wallets.
  • Readiness probes gate workload start so traffic begins only after nodes are reachable.
  • Environment flags can relax timeouts or increase tracing when diagnostics are needed.

Runner Comparison

flowchart TB
    subgraph Host["Host Runner (Local)"]
        H1["Speed: Fast"]
        H2["Isolation: Shared host"]
        H3["Setup: Minimal"]
        H4["Chaos: Not supported"]
        H5["CI: Quick smoke tests"]
    end
    
    subgraph Compose["Compose Runner (Docker)"]
        C1["Speed: Medium"]
        C2["Isolation: Containerized"]
        C3["Setup: Image build required"]
        C4["Chaos: Supported"]
        C5["CI: Recommended"]
    end
    
    subgraph K8s["K8s Runner (Cluster)"]
        K1["Speed: Slower"]
        K2["Isolation: Pod-level"]
        K3["Setup: Cluster + image"]
        K4["Chaos: Not yet supported"]
        K5["CI: Large-scale tests"]
    end
    
    Decision{Choose Based On}
    Decision -->|Fast iteration| Host
    Decision -->|Reproducibility| Compose
    Decision -->|Production-like| K8s
    
    style Host fill:#e1f5ff
    style Compose fill:#e1ffe1
    style K8s fill:#ffe1f5

Detailed Feature Matrix

| Feature | Host | Compose | K8s |
|---------|------|---------|-----|
| Speed | Fastest | Medium | Slowest |
| Setup Time | < 1 min | 2-5 min | 5-10 min |
| Isolation | Process-level | Container | Pod + namespace |
| Node Control | No | Yes | Not yet |
| Observability | Basic | External stack | Cluster-wide |
| CI Integration | Smoke tests | Recommended | Heavy tests |
| Resource Usage | Low | Medium | High |
| Reproducibility | Environment-dependent | High | Highest |
| Network Fidelity | Localhost only | Virtual network | Real cluster |
| Parallel Runs | Port conflicts | Isolated | Namespace isolation |

Decision Guide

flowchart TD
    Start[Need to run tests?] --> Q1{Local development?}
    Q1 -->|Yes| Q2{Testing chaos?}
    Q1 -->|No| Q5{Have cluster access?}
    
    Q2 -->|Yes| UseCompose[Use Compose]
    Q2 -->|No| Q3{Need isolation?}
    
    Q3 -->|Yes| UseCompose
    Q3 -->|No| UseHost[Use Host]
    
    Q5 -->|Yes| Q6{Large topology?}
    Q5 -->|No| Q7{CI pipeline?}
    
    Q6 -->|Yes| UseK8s[Use K8s]
    Q6 -->|No| UseCompose
    
    Q7 -->|Yes| Q8{Docker available?}
    Q7 -->|No| UseHost
    
    Q8 -->|Yes| UseCompose
    Q8 -->|No| UseHost
    
    style UseHost fill:#e1f5ff
    style UseCompose fill:#e1ffe1
    style UseK8s fill:#ffe1f5

Quick Recommendations

Use Host Runner when:

  • Iterating rapidly during development
  • Running quick smoke tests
  • Testing on a laptop with limited resources
  • Don’t need chaos testing

Use Compose Runner when:

  • Need reproducible test environments
  • Testing chaos scenarios (node restarts)
  • Running in CI pipelines
  • Want containerized isolation

Use K8s Runner when:

  • Testing large-scale topologies (10+ nodes)
  • Need production-like environment
  • Have cluster access in CI
  • Testing cluster-specific behaviors

RunContext: BlockFeed & Node Control

The deployer supplies a RunContext that workloads and expectations share. It provides:

  • Topology descriptors (GeneratedTopology)
  • Client handles (NodeClients / ClusterClient) for HTTP/RPC calls
  • Metrics (RunMetrics, Metrics) and block feed
  • Optional NodeControlHandle for managing nodes

BlockFeed: Observing Block Production

The BlockFeed is a broadcast stream of block observations that allows workloads and expectations to monitor blockchain progress in real-time. It polls a validator node continuously and broadcasts new blocks to all subscribers.

What BlockFeed Provides

Real-time block stream:

  • Subscribe to receive BlockRecord notifications as blocks are produced
  • Each record includes the block header (HeaderId) and full block payload
  • Backed by a background task that polls node storage every second

Block statistics:

  • Track total transactions across all observed blocks
  • Access via block_feed.stats().total_transactions()

Broadcast semantics:

  • Multiple subscribers can receive the same blocks independently
  • Late subscribers start receiving from current block (no history replay)
  • Lagged subscribers skip missed blocks automatically

Accessing BlockFeed

BlockFeed is available through RunContext:

let block_feed = ctx.block_feed();

Usage in Expectations

Expectations typically use BlockFeed to verify block production and inclusion of transactions/data.

Example: Counting blocks during a run

use std::sync::{
    Arc,
    atomic::{AtomicU64, Ordering},
};

use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, Expectation, RunContext};

struct MinimumBlocksExpectation {
    min_blocks: u64,
    captured_blocks: Option<Arc<AtomicU64>>,
}

#[async_trait]
impl Expectation for MinimumBlocksExpectation {
    fn name(&self) -> &'static str {
        "minimum_blocks"
    }

    async fn start_capture(&mut self, ctx: &RunContext) -> Result<(), DynError> {
        let block_count = Arc::new(AtomicU64::new(0));
        let block_count_task = Arc::clone(&block_count);
        
        // Subscribe to block feed
        let mut receiver = ctx.block_feed().subscribe();
        
        // Spawn a task to count blocks
        tokio::spawn(async move {
            loop {
                match receiver.recv().await {
                    Ok(_record) => {
                        block_count_task.fetch_add(1, Ordering::Relaxed);
                    }
                    Err(tokio::sync::broadcast::error::RecvError::Lagged(skipped)) => {
                        tracing::debug!(skipped, "receiver lagged, skipping blocks");
                    }
                    Err(tokio::sync::broadcast::error::RecvError::Closed) => {
                        tracing::debug!("block feed closed");
                        break;
                    }
                }
            }
        });
        
        self.captured_blocks = Some(block_count);
        Ok(())
    }

    async fn evaluate(&mut self, ctx: &RunContext) -> Result<(), DynError> {
        let blocks = self.captured_blocks
            .as_ref()
            .expect("start_capture must be called first")
            .load(Ordering::Relaxed);
        
        if blocks < self.min_blocks {
            return Err(format!(
                "expected at least {} blocks, observed {}",
                self.min_blocks, blocks
            ).into());
        }
        
        tracing::info!(blocks, min = self.min_blocks, "minimum blocks expectation passed");
        Ok(())
    }
}

Example: Inspecting block contents

use testing_framework_core::scenario::{DynError, RunContext};

async fn start_capture(ctx: &RunContext) -> Result<(), DynError> {
    let mut receiver = ctx.block_feed().subscribe();
    
    tokio::spawn(async move {
        loop {
            match receiver.recv().await {
                Ok(record) => {
                    // Access block header
                    let header_id = &record.header;
                    
                    // Access full block
                    let tx_count = record.block.transactions().len();
                    
                    tracing::debug!(
                        ?header_id,
                        tx_count,
                        "observed block"
                    );
                    
                    // Process transactions, DA blobs, etc.
                }
                Err(tokio::sync::broadcast::error::RecvError::Closed) => break,
                Err(_) => continue,
            }
        }
    });
    
    Ok(())
}

Usage in Workloads

Workloads can use BlockFeed to coordinate timing or wait for specific conditions before proceeding.

Example: Wait for N blocks before starting

use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, RunContext, Workload};

struct DelayedWorkload {
    wait_blocks: usize,
}

#[async_trait]
impl Workload for DelayedWorkload {
    fn name(&self) -> &str {
        "delayed_workload"
    }

    async fn start(&self, ctx: &RunContext) -> Result<(), DynError> {
        tracing::info!(wait_blocks = self.wait_blocks, "waiting for blocks before starting");
        
        // Subscribe to block feed
        let mut receiver = ctx.block_feed().subscribe();
        let mut count = 0;
        
        // Wait for N blocks
        while count < self.wait_blocks {
            match receiver.recv().await {
                Ok(_) => count += 1,
                Err(tokio::sync::broadcast::error::RecvError::Lagged(_)) => continue,
                Err(tokio::sync::broadcast::error::RecvError::Closed) => {
                    return Err("block feed closed before reaching target".into());
                }
            }
        }
        
        tracing::info!("warmup complete, starting actual workload");
        
        // Now do the actual work
        // ...
        
        Ok(())
    }
}

Example: Rate limiting based on block production

use testing_framework_core::scenario::{DynError, RunContext};

async fn generate_request() -> Option<()> {
    None
}

async fn start(ctx: &RunContext) -> Result<(), DynError> {
    let clients = ctx.node_clients().validator_clients();
    let mut receiver = ctx.block_feed().subscribe();
    let mut pending_requests: Vec<()> = Vec::new();

    loop {
        tokio::select! {
            // Issue a batch on each new block.
            Ok(_record) = receiver.recv() => {
                if !pending_requests.is_empty() {
                    tracing::debug!(count = pending_requests.len(), "issuing requests on new block");
                    for _req in pending_requests.drain(..) {
                        let _info = clients[0].consensus_info().await?;
                    }
                }
            }

            // Generate work continuously.
            Some(req) = generate_request() => {
                pending_requests.push(req);
            }
        }
    }
}

BlockFeed vs Direct Polling

Use BlockFeed when:

  • You need to react to blocks as they’re produced
  • Multiple components need to observe the same blocks
  • You want automatic retry/reconnect logic
  • You’re tracking statistics across many blocks

Use direct polling when:

  • You need to query specific historical blocks
  • You’re checking final state after workloads complete
  • You need transaction receipts or other indexed data
  • You’re implementing a one-time health check

Example direct polling in expectations:

use testing_framework_core::scenario::{DynError, RunContext};

async fn evaluate(ctx: &RunContext) -> Result<(), DynError> {
    let client = &ctx.node_clients().validator_clients()[0];
    
    // Poll current height once
    let info = client.consensus_info().await?;
    tracing::info!(height = info.height, "final block height");
    
    // This is simpler than BlockFeed for one-time checks
    Ok(())
}

Block Statistics

Access aggregated statistics without subscribing to the feed:

use testing_framework_core::scenario::{DynError, RunContext};

async fn evaluate(ctx: &RunContext, expected_min: u64) -> Result<(), DynError> {
    let stats = ctx.block_feed().stats();
    let total_txs = stats.total_transactions();
    
    tracing::info!(total_txs, "transactions observed across all blocks");
    
    if total_txs < expected_min {
        return Err(format!(
            "expected at least {} transactions, observed {}",
            expected_min, total_txs
        ).into());
    }
    
    Ok(())
}

Important Notes

Subscription timing:

  • Subscribe in start_capture() for expectations
  • Subscribe in start() for workloads
  • Late subscribers miss historical blocks (no replay)

Lagged receivers:

  • If your subscriber is too slow, it may lag behind
  • Handle RecvError::Lagged(skipped) gracefully
  • Consider increasing processing speed or reducing block rate

Feed lifetime:

  • BlockFeed runs for the entire scenario duration
  • Automatically cleaned up when the run completes
  • Closed channels signal graceful shutdown

Performance:

  • BlockFeed polls nodes every 1 second
  • Broadcasts to all subscribers with minimal overhead
  • Suitable for scenarios with hundreds of blocks

Real-World Examples

The framework’s built-in expectations use BlockFeed extensively:

  • ConsensusLiveness: Doesn’t directly subscribe but uses block feed stats to verify progress
  • DataAvailabilityExpectation: Subscribes to inspect DA blobs in each block and track inscription/dispersal
  • TransactionInclusion: Subscribes to find specific transactions in blocks

See Examples and Workloads & Expectations for more patterns.


Current Chaos Capabilities and Limitations

The framework currently supports process-level chaos (node restarts) for resilience testing:

Supported:

  • Restart validators (restart_validator)
  • Restart executors (restart_executor)
  • Random restart workload via .chaos().restart()

Not Yet Supported:

  • Network partitions (blocking peers, packet loss)
  • Resource constraints (CPU throttling, memory limits)
  • Byzantine behavior injection (invalid blocks, bad signatures)
  • Selective peer blocking/unblocking

For network partition testing, see Extension Ideas which describes the proposed block_peer/unblock_peer API (not yet implemented).

Accessing node control in workloads/expectations

Check for control support and use it conditionally:

use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, RunContext, Workload};

struct RestartWorkload;

#[async_trait]
impl Workload for RestartWorkload {
    fn name(&self) -> &str {
        "restart_workload"
    }

    async fn start(&self, ctx: &RunContext) -> Result<(), DynError> {
        if let Some(control) = ctx.node_control() {
            // Restart the first validator (index 0) if supported.
            control.restart_validator(0).await?;
        }
        Ok(())
    }
}

When chaos workloads need control, require enable_node_control() in the scenario builder and deploy with a runner that supports it.

Current API surface

The NodeControlHandle trait currently provides:

use async_trait::async_trait;
use testing_framework_core::scenario::DynError;

#[async_trait]
pub trait NodeControlHandle: Send + Sync {
    async fn restart_validator(&self, index: usize) -> Result<(), DynError>;
    async fn restart_executor(&self, index: usize) -> Result<(), DynError>;
}

Future extensions may include peer blocking/unblocking or other control operations. For now, focus on restart-based chaos patterns as shown in the chaos workload examples.

Considerations

  • Always guard control usage: not all runners expose NodeControlHandle.
  • Treat control as best-effort: failures should surface as test failures, but workloads should degrade gracefully when control is absent.
  • Combine control actions with expectations (e.g., restart then assert height convergence) to keep scenarios meaningful.

Chaos Workloads

When should I read this? You don’t need chaos testing to be productive with the framework. Focus on basic scenarios first—chaos is for resilience validation and operational readiness drills once your core tests are stable.

Chaos in the framework uses node control to introduce failures and validate recovery. The built-in restart workload lives in testing_framework_workflows::workloads::chaos::RandomRestartWorkload.

How it works

  • Requires NodeControlCapability (enable_node_control() in the scenario builder) and a runner that provides a NodeControlHandle.
  • Randomly selects nodes (validators, executors) to restart based on your include/exclude flags.
  • Respects min/max delay between restarts and a target cooldown to avoid flapping the same node too frequently.
  • Runs alongside other workloads; expectations should account for the added disruption.
  • Support varies by runner: node control is not provided by the local runner and is not yet implemented for the k8s runner. Use a runner that advertises NodeControlHandle support (e.g., compose) for chaos workloads.

Usage

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::{ScenarioBuilderExt, workloads::chaos::RandomRestartWorkload};

pub fn random_restart_plan() -> testing_framework_core::scenario::Scenario<
    testing_framework_core::scenario::NodeControlCapability,
> {
    ScenarioBuilder::topology_with(|t| t.network_star().validators(2).executors(1))
        .enable_node_control()
        .with_workload(RandomRestartWorkload::new(
            Duration::from_secs(45),  // min delay
            Duration::from_secs(75),  // max delay
            Duration::from_secs(120), // target cooldown
            true,                     // include validators
            true,                     // include executors
        ))
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(150))
        .build()
}
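
To execute this plan, deploy it with a runner that advertises node control (currently the compose runner). The sketch below is illustrative: the testing_framework_runner_compose crate path, the ComposeDeployer::default() constructor, and the runner.run(...) call follow the pattern of the other runner examples and are assumptions, not a verbatim quote of the compose runner's API.

use testing_framework_core::scenario::Deployer as _;
use testing_framework_runner_compose::ComposeDeployer; // assumed crate path

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Node-control plan defined above.
    let mut scenario = random_restart_plan();

    // The compose runner provides a NodeControlHandle, so the restart workload can act.
    let deployer = ComposeDeployer::default(); // assumed constructor
    let runner = deployer.deploy(&scenario).await?;
    runner.run(&mut scenario).await?;

    Ok(())
}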

Expectations to pair

  • Consensus liveness: ensure blocks keep progressing despite restarts.
  • Height convergence: optionally check all nodes converge after the chaos window.
  • Any workload-specific inclusion checks if you’re also driving tx/DA traffic.

Best practices

  • Keep delays/cooldowns realistic; avoid back-to-back restarts that would never happen in production.
  • Limit chaos scope: toggle validators vs executors based on what you want to test.
  • Combine with observability: monitor metrics/logs to explain failures.

Topology & Chaos Patterns

This page focuses on cluster manipulation: node control, chaos patterns, and what the tooling supports today.

Node control availability

  • Supported: restart control via NodeControlHandle (compose runner).
  • Not supported: local runner does not expose node control; k8s runner does not support it yet.
  • Not yet supported: peer blocking/unblocking and network partitions.

See also: RunContext: BlockFeed & Node Control for the current node-control API surface and limitations.

Chaos patterns to consider

  • Restarts: random restarts with minimum delay/cooldown to test recovery.
  • Partitions (planned): block/unblock peers to simulate partial isolation, then assert height convergence after healing.
  • Validator churn (planned): stop one validator and start another (new key) mid-run to test membership changes; expect convergence.
  • Load SLOs: push tx/DA rates and assert inclusion/availability budgets instead of only liveness.
  • API probes: poll HTTP/RPC endpoints during chaos to ensure external contracts stay healthy (shape + latency).

Expectations to pair

  • Liveness/height convergence after chaos windows (a convergence-check sketch follows this list).
  • SLO checks: inclusion latency, DA responsiveness, API latency/shape.
  • Recovery checks: ensure nodes that were isolated or restarted catch up to cluster height within a timeout.
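
A minimal sketch of such a height-convergence check. It reuses the consensus_info() client call from earlier examples; the HeightConvergence name and the max_spread threshold are illustrative assumptions:

use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, Expectation, RunContext};

pub struct HeightConvergence {
    // Maximum allowed height gap between the slowest and fastest validator.
    max_spread: u64,
}

#[async_trait]
impl Expectation for HeightConvergence {
    fn name(&self) -> &str {
        "height_convergence"
    }

    async fn evaluate(&mut self, ctx: &RunContext) -> Result<(), DynError> {
        // Collect the current height from every validator.
        let mut heights = Vec::new();
        for client in ctx.node_clients().validator_clients() {
            let info = client.consensus_info().await?;
            heights.push(info.height);
        }

        let min = heights.iter().copied().min().ok_or("no validators")?;
        let max = heights.iter().copied().max().ok_or("no validators")?;

        if max - min > self.max_spread {
            return Err(format!(
                "validator heights diverge: min={min}, max={max}, allowed spread={}",
                self.max_spread
            )
            .into());
        }
        Ok(())
    }
}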

Guidance

  • Keep chaos realistic: avoid flapping or patterns you wouldn’t operate in prod.
  • Scope chaos: choose validators vs executors intentionally; don’t restart all nodes at once unless you’re testing full outages.
  • Combine chaos with observability: capture block feed/metrics and API health so failures are diagnosable.

Part III — Developer Reference

Deep dives for contributors who extend the framework, evolve its abstractions, or maintain the crate set.

Scenario Model (Developer Level)

The scenario model defines clear, composable responsibilities:

  • Topology: a declarative description of the cluster—how many nodes, their roles, and the broad network and data-availability characteristics. It represents the intended shape of the system under test.
  • Scenario: a plan combining topology, workloads, expectations, and a run window. Building a scenario validates prerequisites (like seeded wallets) and ensures the run lasts long enough to observe meaningful block progression.
  • Workloads: asynchronous tasks that generate traffic or conditions. They use shared context to interact with the deployed cluster and may bundle default expectations.
  • Expectations: post-run assertions. They can capture baselines before workloads start and evaluate success once activity stops.
  • Runtime: coordinates workloads and expectations for the configured duration, enforces cooldowns when control actions occur, and ensures cleanup so runs do not leak resources.

Developers extending the model should keep these boundaries strict: topology describes, scenarios assemble, deployers provision, runners orchestrate, workloads drive, and expectations judge outcomes. For guidance on adding new capabilities, see Extending the Framework.

API Levels: Builder DSL vs. Direct Instantiation

The framework supports two styles for constructing scenarios:

  1. High-level Builder DSL (recommended): fluent helper methods (e.g. .transactions_with(...))
  2. Low-level direct instantiation: construct workload/expectation types explicitly, then attach them

Both styles produce the same runtime behavior because they ultimately call the same core builder APIs.

The DSL is implemented as extension traits (primarily testing_framework_workflows::ScenarioBuilderExt) on the core scenario builder.

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

let plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
    .wallets(5)
    .transactions_with(|txs| txs.rate(5).users(3))
    .da_with(|da| da.channel_rate(1).blob_rate(1).headroom_percent(20))
    .expect_consensus_liveness()
    .with_run_duration(Duration::from_secs(60))
    .build();

When to use:

  • Most test code (smoke, regression, CI)
  • When you want sensible defaults and minimal boilerplate

Low-Level Direct Instantiation

Direct instantiation gives you explicit control over the concrete types you attach:

use std::{
    num::{NonZeroU64, NonZeroUsize},
    time::Duration,
};

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::{
    expectations::ConsensusLiveness,
    workloads::{da, transaction},
};

let tx_workload = transaction::Workload::with_rate(5)
    .expect("transaction rate must be non-zero")
    .with_user_limit(NonZeroUsize::new(3));

let da_workload = da::Workload::with_rate(
    NonZeroU64::new(1).unwrap(),  // blob rate per block
    NonZeroU64::new(1).unwrap(),  // channel rate per block
    da::Workload::default_headroom_percent(),
);

let plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
    .wallets(5)
    .with_workload(tx_workload)
    .with_workload(da_workload)
    .with_expectation(ConsensusLiveness::default())
    .with_run_duration(Duration::from_secs(60))
    .build();

When to use:

  • Custom workload/expectation implementations
  • Reusing preconfigured workload instances across multiple scenarios
  • Debugging / exploring the underlying workload types

Method Correspondence

High-Level DSL → Low-Level Direct:

  • .transactions_with(|txs| txs.rate(5).users(3)) → .with_workload(transaction::Workload::with_rate(5).expect(...).with_user_limit(...))
  • .da_with(|da| da.blob_rate(1).channel_rate(1)) → .with_workload(da::Workload::with_rate(...))
  • .expect_consensus_liveness() → .with_expectation(ConsensusLiveness::default())

Bundled Expectations (Important)

Workloads can bundle expectations by implementing Workload::expectations().

These bundled expectations are attached automatically whenever you call .with_workload(...) (including when you use the DSL), because the core builder expands workload expectations during attachment.

Mixing Both Styles

Mixing is common: use the DSL for built-ins, and direct instantiation for custom pieces.

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::{ScenarioBuilderExt, workloads::transaction};

let tx_workload = transaction::Workload::with_rate(5)
    .expect("transaction rate must be non-zero");

let plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
    .wallets(5)
    .with_workload(tx_workload)          // direct instantiation
    .expect_consensus_liveness()         // DSL
    .with_run_duration(Duration::from_secs(60))
    .build();

Implementation Detail (How the DSL Works)

The DSL methods are thin wrappers. For example:

builder.transactions_with(|txs| txs.rate(5).users(3))

is roughly equivalent to:

builder.transactions().rate(5).users(3).apply()

Troubleshooting

DSL method not found

  • Ensure the extension traits are in scope, e.g. use testing_framework_workflows::ScenarioBuilderExt;
  • Cross-check method names in Builder API Quick Reference

See Also

Extending the Framework

This guide shows how to extend the framework with custom workloads, expectations, runners, and topology helpers. Each section includes the trait outline and a minimal code example.

Adding a Workload

Steps:

  1. Implement testing_framework_core::scenario::Workload
  2. Provide a name and any bundled expectations
  3. Use init to derive inputs from topology/metrics; fail fast if prerequisites are missing
  4. Use start to drive async traffic using RunContext clients
  5. Expose from testing-framework/workflows and optionally add a DSL helper

Trait outline:

use async_trait::async_trait;
use testing_framework_core::scenario::{
    DynError, Expectation, RunContext, RunMetrics, Workload,
};
use testing_framework_core::topology::generation::GeneratedTopology;

struct MyExpectation;

#[async_trait]
impl Expectation for MyExpectation {
    fn name(&self) -> &str {
        "my_expectation"
    }

    async fn evaluate(&mut self, _ctx: &RunContext) -> Result<(), DynError> {
        Ok(())
    }
}

pub struct MyWorkload {
    // Configuration fields
    target_rate: u64,
}

impl MyWorkload {
    pub fn new(target_rate: u64) -> Self {
        Self { target_rate }
    }
}

#[async_trait]
impl Workload for MyWorkload {
    fn name(&self) -> &str {
        "my_workload"
    }

    fn expectations(&self) -> Vec<Box<dyn Expectation>> {
        // Return bundled expectations that should run with this workload
        vec![Box::new(MyExpectation)]
    }

    fn init(
        &mut self,
        topology: &GeneratedTopology,
        _run_metrics: &RunMetrics,
    ) -> Result<(), DynError> {
        // Validate prerequisites (e.g., enough nodes, wallet data present)
        if topology.validators().is_empty() {
            return Err("no validators available".into());
        }
        Ok(())
    }

    async fn start(&self, ctx: &RunContext) -> Result<(), DynError> {
        // Drive async activity: submit transactions, query nodes, etc.
        let clients = ctx.node_clients().validator_clients();
        
        for client in clients {
            let info = client.consensus_info().await?;
            tracing::info!(height = info.height, "workload queried node");
        }
        
        Ok(())
    }
}

Key points:

  • name() identifies the workload in logs
  • expectations() bundles default checks (can be empty)
  • init() validates topology before run starts
  • start() executes concurrently with other workloads; it should complete before the run duration expires

See Example: New Workload & Expectation for a complete, runnable example.

Adding an Expectation

Steps:

  1. Implement testing_framework_core::scenario::Expectation
  2. Use start_capture to snapshot baseline metrics (optional)
  3. Use evaluate to assert outcomes after workloads finish
  4. Return descriptive errors; the runner aggregates them
  5. Export from testing-framework/workflows if reusable

Trait outline:

use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, Expectation, RunContext};

pub struct MyExpectation {
    expected_value: u64,
    captured_baseline: Option<u64>,
}

impl MyExpectation {
    pub fn new(expected_value: u64) -> Self {
        Self {
            expected_value,
            captured_baseline: None,
        }
    }
}

#[async_trait]
impl Expectation for MyExpectation {
    fn name(&self) -> &str {
        "my_expectation"
    }

    async fn start_capture(&mut self, ctx: &RunContext) -> Result<(), DynError> {
        // Optional: capture baseline state before workloads start
        let client = ctx.node_clients().validator_clients().first()
            .ok_or("no validators")?;
        
        let info = client.consensus_info().await?;
        self.captured_baseline = Some(info.height);
        
        tracing::info!(baseline = self.captured_baseline, "captured baseline");
        Ok(())
    }

    async fn evaluate(&mut self, ctx: &RunContext) -> Result<(), DynError> {
        // Assert the expected condition holds after workloads finish
        let client = ctx.node_clients().validator_clients().first()
            .ok_or("no validators")?;
        
        let info = client.consensus_info().await?;
        let final_height = info.height;
        
        let baseline = self.captured_baseline.unwrap_or(0);
        let delta = final_height.saturating_sub(baseline);
        
        if delta < self.expected_value {
            return Err(format!(
                "expected at least {} blocks, got {}",
                self.expected_value, delta
            ).into());
        }
        
        tracing::info!(delta, "expectation passed");
        Ok(())
    }
}

Key points:

  • name() identifies the expectation in logs
  • start_capture() runs before workloads start (optional)
  • evaluate() runs after workloads finish; return descriptive errors
  • Expectations run sequentially; keep them fast

Adding a Runner (Deployer)

Steps:

  1. Implement testing_framework_core::scenario::Deployer<Caps> for your capability type
  2. Deploy infrastructure and return a Runner
  3. Construct NodeClients and spawn a BlockFeed
  4. Build a RunContext and provide a CleanupGuard for teardown

Trait outline:

use async_trait::async_trait;
use testing_framework_core::scenario::{
    CleanupGuard, Deployer, DynError, Metrics, NodeClients, RunContext, Runner, Scenario,
    spawn_block_feed,
};
use testing_framework_core::topology::deployment::Topology;

pub struct MyDeployer {
    // Configuration: cluster connection details, etc.
}

impl MyDeployer {
    pub fn new() -> Self {
        Self {}
    }
}

#[async_trait]
impl Deployer<()> for MyDeployer {
    type Error = DynError;

    async fn deploy(&self, scenario: &Scenario<()>) -> Result<Runner, Self::Error> {
        // 1. Launch nodes using scenario.topology()
        // 2. Wait for readiness (e.g., consensus info endpoint responds)
        // 3. Build NodeClients for validators/executors
        // 4. Spawn a block feed for expectations (optional but recommended)
        // 5. Create NodeControlHandle if you support restarts (optional)
        // 6. Return a Runner wrapping RunContext + CleanupGuard

        tracing::info!("deploying scenario with MyDeployer");

        let topology: Option<Topology> = None; // Some(topology) if you spawned one
        let node_clients = NodeClients::default(); // Or NodeClients::from_topology(...)

        let client = node_clients
            .any_client()
            .ok_or("no api clients available")?
            .clone();
        let (block_feed, block_feed_guard) = spawn_block_feed(client).await?;

        let telemetry = Metrics::empty(); // or Metrics::from_prometheus(...)
        let node_control = None; // or Some(Arc<dyn NodeControlHandle>)

        let context = RunContext::new(
            scenario.topology().clone(),
            topology,
            node_clients,
            scenario.duration(),
            telemetry,
            block_feed,
            node_control,
        );

        // If you also have other resources to clean up (containers/pods/etc),
        // wrap them in your own CleanupGuard implementation and call
        // CleanupGuard::cleanup(Box::new(block_feed_guard)) inside it.
        Ok(Runner::new(context, Some(Box::new(block_feed_guard))))
    }
}

Key points:

  • deploy() must return a fully prepared Runner
  • Block until nodes are ready before returning (avoid false negatives)
  • Use a CleanupGuard to tear down resources on failure (and on RunHandle drop)
  • If you want chaos workloads, also provide a NodeControlHandle via RunContext

Adding Topology Helpers

Steps:

  1. Extend testing_framework_core::topology::config::TopologyBuilder with new layouts
  2. Keep defaults safe: ensure at least one participant, clamp dispersal factors
  3. Consider adding configuration presets for specialized parameters

Example:

use testing_framework_core::topology::{
    config::TopologyBuilder,
    configs::network::Libp2pNetworkLayout,
};

pub trait TopologyBuilderExt {
    fn network_full(self) -> Self;
}

impl TopologyBuilderExt for TopologyBuilder {
    fn network_full(self) -> Self {
        self.with_network_layout(Libp2pNetworkLayout::Full)
    }
}

Key points:

  • Maintain method chaining (consume and return Self, as the example above does)
  • Validate inputs: clamp factors, enforce minimums
  • Document assumptions (e.g., “requires at least 4 nodes”)

Adding a DSL Helper

To expose your custom workload through the high-level DSL, add a trait extension:

use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, RunContext, ScenarioBuilder, Workload};

#[derive(Default)]
pub struct MyWorkloadBuilder {
    target_rate: u64,
    some_option: bool,
}

impl MyWorkloadBuilder {
    pub const fn target_rate(mut self, target_rate: u64) -> Self {
        self.target_rate = target_rate;
        self
    }

    pub const fn some_option(mut self, some_option: bool) -> Self {
        self.some_option = some_option;
        self
    }

    pub const fn build(self) -> MyWorkload {
        MyWorkload {
            target_rate: self.target_rate,
            some_option: self.some_option,
        }
    }
}

pub struct MyWorkload {
    target_rate: u64,
    some_option: bool,
}

#[async_trait]
impl Workload for MyWorkload {
    fn name(&self) -> &str {
        "my_workload"
    }

    async fn start(&self, _ctx: &RunContext) -> Result<(), DynError> {
        Ok(())
    }
}

pub trait MyWorkloadDsl {
    fn my_workload_with(
        self,
        f: impl FnOnce(MyWorkloadBuilder) -> MyWorkloadBuilder,
    ) -> Self;
}

impl MyWorkloadDsl for ScenarioBuilder {
    fn my_workload_with(
        self,
        f: impl FnOnce(MyWorkloadBuilder) -> MyWorkloadBuilder,
    ) -> Self {
        let builder = f(MyWorkloadBuilder::default());
        self.with_workload(builder.build())
    }
}

Users can then call:

ScenarioBuilder::topology_with(|t| t.network_star().validators(1).executors(1))
    .my_workload_with(|w| {
        w.target_rate(10)
         .some_option(true)
    })
    .build()

See Also

Example: New Workload & Expectation (Rust)

A minimal, end-to-end illustration of adding a custom workload and matching expectation. This shows the shape of the traits and where to plug into the framework; expand the logic to fit your real test.

Workload: simple reachability probe

Key ideas:

  • name: identifies the workload in logs.
  • expectations: workloads can bundle defaults so callers don’t forget checks.
  • init: derive inputs from the generated topology (e.g., pick a target node).
  • start: drive async activity using the shared RunContext.

use async_trait::async_trait;
use testing_framework_core::{
    scenario::{DynError, Expectation, RunContext, RunMetrics, Workload},
    topology::generation::GeneratedTopology,
};

pub struct ReachabilityWorkload {
    target_idx: usize,
}

impl ReachabilityWorkload {
    pub fn new(target_idx: usize) -> Self {
        Self { target_idx }
    }
}

#[async_trait]
impl Workload for ReachabilityWorkload {
    fn name(&self) -> &str {
        "reachability_workload"
    }

    fn expectations(&self) -> Vec<Box<dyn Expectation>> {
        vec![Box::new(
            crate::custom_workload_example_expectation::ReachabilityExpectation::new(
                self.target_idx,
            ),
        )]
    }

    fn init(
        &mut self,
        topology: &GeneratedTopology,
        _run_metrics: &RunMetrics,
    ) -> Result<(), DynError> {
        if topology.validators().get(self.target_idx).is_none() {
            return Err(Box::new(std::io::Error::new(
                std::io::ErrorKind::Other,
                "no validator at requested index",
            )));
        }
        Ok(())
    }

    async fn start(&self, ctx: &RunContext) -> Result<(), DynError> {
        let client = ctx
            .node_clients()
            .validator_clients()
            .get(self.target_idx)
            .ok_or_else(|| {
                Box::new(std::io::Error::new(
                    std::io::ErrorKind::Other,
                    "missing target client",
                )) as DynError
            })?;

        // Lightweight API call to prove reachability.
        client
            .consensus_info()
            .await
            .map(|_| ())
            .map_err(|e| e.into())
    }
}

Expectation: confirm the target stayed reachable

Key ideas:

  • start_capture: snapshot baseline if needed (not used here).
  • evaluate: assert the condition after workloads finish.

use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, Expectation, RunContext};

pub struct ReachabilityExpectation {
    target_idx: usize,
}

impl ReachabilityExpectation {
    pub fn new(target_idx: usize) -> Self {
        Self { target_idx }
    }
}

#[async_trait]
impl Expectation for ReachabilityExpectation {
    fn name(&self) -> &str {
        "target_reachable"
    }

    async fn evaluate(&mut self, ctx: &RunContext) -> Result<(), DynError> {
        let client = ctx
            .node_clients()
            .validator_clients()
            .get(self.target_idx)
            .ok_or_else(|| {
                Box::new(std::io::Error::new(
                    std::io::ErrorKind::Other,
                    "missing target client",
                )) as DynError
            })?;

        client
            .consensus_info()
            .await
            .map(|_| ())
            .map_err(|e| e.into())
    }
}

How to wire it

  • Build your scenario as usual and call .with_workload(ReachabilityWorkload::new(0)); a wiring sketch follows this list.
  • The bundled expectation is attached automatically; you can add more with .with_expectation(...) if needed.
  • Keep the logic minimal and fast for smoke tests; grow it into richer probes for deeper scenarios.
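
A minimal wiring sketch using the same builder calls shown elsewhere in this book. The reachability_plan name, the one-validator topology, and the Scenario<()> return type (matching the Deployer<()> signature in Extending the Framework) are illustrative assumptions:

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

// Assumes ReachabilityWorkload from the example above is in scope.
pub fn reachability_plan() -> testing_framework_core::scenario::Scenario<()> {
    ScenarioBuilder::topology_with(|t| t.network_star().validators(1).executors(0))
        // The bundled ReachabilityExpectation is attached automatically.
        .with_workload(ReachabilityWorkload::new(0))
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(60))
        .build()
}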

Internal Crate Reference

High-level roles of the crates that make up the framework:

  • Configs (testing-framework/configs/): Prepares reusable configuration primitives for nodes, networking, tracing, data availability, and wallets, shared by all scenarios and runners. Includes topology generation and circuit asset resolution.

  • Core scenario orchestration (testing-framework/core/): Houses the topology and scenario model, runtime coordination, node clients, and readiness/health probes. Defines Deployer and Runner traits, ScenarioBuilder, and RunContext.

  • Workflows (testing-framework/workflows/): Packages workloads (transaction, DA, chaos) and expectations (consensus liveness) into reusable building blocks. Offers fluent DSL extensions (ScenarioBuilderExt, ChaosBuilderExt).

  • Runners (testing-framework/runners/{local,compose,k8s}/): Implements deployment backends (local host, Docker Compose, Kubernetes) that all consume the same scenario plan. Each provides a Deployer implementation (LocalDeployer, ComposeDeployer, K8sDeployer).

  • Runner Examples (crate name: runner-examples, path: examples/): Runnable binaries demonstrating framework usage and serving as living documentation. These are the primary entry point for running scenarios (examples/src/bin/local_runner.rs, examples/src/bin/compose_runner.rs, examples/src/bin/k8s_runner.rs).

Where to Add New Capabilities

| What You're Adding | Where It Goes | Examples |
| --- | --- | --- |
| Node config parameter | testing-framework/configs/src/topology/configs/ | Slot duration, log levels, DA params |
| Topology feature | testing-framework/core/src/topology/ | New network layouts, node roles |
| Scenario capability | testing-framework/core/src/scenario/ | New capabilities, context methods |
| Workload | testing-framework/workflows/src/workloads/ | New traffic generators |
| Expectation | testing-framework/workflows/src/expectations/ | New success criteria |
| Builder API | testing-framework/workflows/src/builder/ | DSL extensions, fluent methods |
| Deployer | testing-framework/runners/ | New deployment backends |
| Example scenario | examples/src/bin/ | Demonstration binaries |

Extension Workflow

Adding a New Workload

  1. Define the workload in testing-framework/workflows/src/workloads/your_workload.rs:
use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, RunContext, Workload};

pub struct YourWorkload;

#[async_trait]
impl Workload for YourWorkload {
    fn name(&self) -> &'static str {
        "your_workload"
    }

    async fn start(&self, _ctx: &RunContext) -> Result<(), DynError> {
        // implementation
        Ok(())
    }
}

  2. Add builder extension in testing-framework/workflows/src/builder/mod.rs:
pub struct YourWorkloadBuilder;

impl YourWorkloadBuilder {
    pub fn some_config(self) -> Self {
        self
    }
}

pub trait ScenarioBuilderExt: Sized {
    fn your_workload(self) -> YourWorkloadBuilder;
}

  3. Use in examples in examples/src/bin/your_scenario.rs:
use testing_framework_core::scenario::ScenarioBuilder;

pub struct YourWorkloadBuilder;

impl YourWorkloadBuilder {
    pub fn some_config(self) -> Self {
        self
    }
}

pub trait YourWorkloadDslExt: Sized {
    fn your_workload_with<F>(self, configurator: F) -> Self
    where
        F: FnOnce(YourWorkloadBuilder) -> YourWorkloadBuilder;
}

impl<Caps> YourWorkloadDslExt for testing_framework_core::scenario::Builder<Caps> {
    fn your_workload_with<F>(self, configurator: F) -> Self
    where
        F: FnOnce(YourWorkloadBuilder) -> YourWorkloadBuilder,
    {
        let _ = configurator(YourWorkloadBuilder);
        self
    }
}

pub fn use_in_examples() {
    let _plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(0))
        .your_workload_with(|w| w.some_config())
        .build();
}

Adding a New Expectation

  1. Define the expectation in testing-framework/workflows/src/expectations/your_expectation.rs:
use async_trait::async_trait;
use testing_framework_core::scenario::{DynError, Expectation, RunContext};

pub struct YourExpectation;

#[async_trait]
impl Expectation for YourExpectation {
    fn name(&self) -> &'static str {
        "your_expectation"
    }

    async fn evaluate(&mut self, _ctx: &RunContext) -> Result<(), DynError> {
        // implementation
        Ok(())
    }
}

  2. Add builder extension in testing-framework/workflows/src/builder/mod.rs:
use testing_framework_core::scenario::ScenarioBuilder;

pub trait YourExpectationDslExt: Sized {
    fn expect_your_condition(self) -> Self;
}

impl<Caps> YourExpectationDslExt for testing_framework_core::scenario::Builder<Caps> {
    fn expect_your_condition(self) -> Self {
        self
    }
}

pub fn use_in_examples() {
    let _plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(0))
        .expect_your_condition()
        .build();
}

Adding a New Deployer

  1. Implement Deployer trait in testing-framework/runners/your_runner/src/deployer.rs:
use async_trait::async_trait;
use testing_framework_core::scenario::{Deployer, Runner, Scenario};

#[derive(Debug)]
pub struct YourError;

pub struct YourDeployer;

#[async_trait]
impl Deployer for YourDeployer {
    type Error = YourError;

    async fn deploy(&self, _scenario: &Scenario<()>) -> Result<Runner, Self::Error> {
        // Provision infrastructure
        // Wait for readiness
        // Return Runner
        todo!()
    }
}

  2. Provide cleanup and handle node control if supported.

  3. Add example in examples/src/bin/your_runner.rs.

For detailed examples, see Extending the Framework and Custom Workload Example.

Part IV — Operations & Deployment

This section covers operational aspects of running the testing framework: prerequisites, deployment configuration, continuous integration, and observability.

What You’ll Learn

  • Prerequisites & Setup: Required files, binaries, circuit assets, and environment configuration
  • Running Examples: How to execute scenarios across host, compose, and k8s runners
  • CI Integration: Automating tests in continuous integration pipelines with caching and matrix testing
  • Environment Variables: Complete reference of all configuration variables
  • Logging & Observability: Log collection strategies, metrics integration, and debugging techniques

Who This Section Is For

  • Operators setting up the framework for the first time
  • DevOps Engineers integrating tests into CI/CD pipelines
  • Developers debugging test failures or performance issues
  • Platform Engineers deploying across different environments (local, Docker, Kubernetes)

This section is organized for progressive depth:

  1. Start with Operations Overview for the big picture
  2. Follow Prerequisites & Setup to prepare your environment
  3. Use Running Examples to execute your first scenarios
  4. Integrate with CI Integration for automated testing
  5. Reference Environment Variables for complete configuration options
  6. Debug with Logging & Observability when issues arise

Key Principles

Operational Hygiene: Assets present, prerequisites satisfied, observability reachable

Environment Fit: Choose the right deployment target based on isolation, reproducibility, and resource needs

Clear Signals: Verify runners report node readiness before starting workloads

Failure Triage: Map failures to specific causes—missing prerequisites, platform issues, or unmet expectations


Ready to get started? Begin with Operations Overview

Operations & Deployment Overview

Operational readiness focuses on prerequisites, environment fit, and clear signals that ensure your test scenarios run reliably across different deployment targets.

Core Principles

  • Prerequisites First: Ensure all required files, binaries, and assets are in place before attempting to run scenarios
  • Environment Fit: Choose the right deployment target (host, compose, k8s) based on your isolation, reproducibility, and resource needs
  • Clear Signals: Verify runners report node readiness before starting workloads to avoid false negatives
  • Failure Triage: Map failures to specific causes—missing prerequisites, platform issues, or unmet expectations

Key Operational Concerns

Prerequisites:

  • versions.env file at repository root (required by helper scripts)
  • Node binaries (nomos-node, nomos-executor) available or built on demand
  • Platform requirements met (Docker for compose, cluster access for k8s)
  • Circuit assets for DA workloads

Artifacts:

  • KZG parameters (circuit assets) for Data Availability scenarios
  • Docker images for compose/k8s deployments
  • Binary bundles for reproducible builds

Environment Configuration:

  • POL_PROOF_DEV_MODE=true is REQUIRED for all runners to avoid expensive proof generation
  • Logging configured via NOMOS_LOG_* variables
  • Observability endpoints (Prometheus, Grafana) optional but useful

Readiness & Health:

  • Runners verify node readiness before starting workloads
  • Health checks prevent premature workload execution
  • Consensus liveness expectations validate basic operation

Runner-Agnostic Design

The framework is intentionally runner-agnostic: the same scenario plan runs across all deployment targets. Understanding which operational concerns apply to each runner helps you choose the right fit.

| Concern | Host | Compose | Kubernetes |
| --- | --- | --- | --- |
| Topology | Full support | Full support | Full support |
| Workloads | All workloads | All workloads | All workloads |
| Expectations | All expectations | All expectations | All expectations |
| Chaos / Node Control | Not supported | Supported | Not yet |
| Metrics / Observability | Manual setup | External stack | Cluster-wide |
| Log Collection | Temp files | Container logs | Pod logs |
| Isolation | Process-level | Container | Pod + namespace |
| Setup Time | < 1 min | 2-5 min | 5-10 min |
| CI Recommended? | Smoke tests | Primary | Large-scale only |

Key insight: Operational concerns (prerequisites, environment variables) are largely consistent across runners, while deployment-specific concerns (isolation, chaos support) vary by backend.

Operational Workflow

flowchart LR
    Setup[Prerequisites & Setup] --> Run[Run Scenarios]
    Run --> Monitor[Monitor & Observe]
    Monitor --> Debug{Success?}
    Debug -->|No| Triage[Failure Triage]
    Triage --> Setup
    Debug -->|Yes| Done[Complete]
  1. Setup: Verify prerequisites, configure environment, prepare assets
  2. Run: Execute scenarios using appropriate runner (host/compose/k8s)
  3. Monitor: Collect logs, metrics, and observability signals
  4. Triage: When failures occur, map to root causes and fix prerequisites

Documentation Structure

This Operations & Deployment section covers:

Philosophy: Treat operational hygiene—assets present, prerequisites satisfied, observability reachable—as the first step to reliable scenario outcomes.

Prerequisites & Setup

This page covers everything you need before running your first scenario.

Required Files

versions.env (Required)

All helper scripts require a versions.env file at the repository root:

VERSION=v0.3.1
NOMOS_NODE_REV=abc123def456789
NOMOS_BUNDLE_VERSION=v1

What it defines:

  • VERSION — Circuit release tag for KZG parameters
  • NOMOS_NODE_REV — Git revision of nomos-node to build/fetch
  • NOMOS_BUNDLE_VERSION — Bundle schema version

Where it’s used:

  • scripts/run/run-examples.sh
  • scripts/build/build-bundle.sh
  • scripts/setup/setup-nomos-circuits.sh
  • CI workflows

Error if missing:

ERROR: versions.env not found at repository root
This file is required and should define:
  VERSION=<circuit release tag>
  NOMOS_NODE_REV=<nomos-node git revision>
  NOMOS_BUNDLE_VERSION=<bundle schema version>

Fix: Ensure you’re in the repository root. The file should already exist in the checked-out repo.

Node Binaries

Scenarios need compiled nomos-node and nomos-executor binaries.

Option 1: Helper Script (Recommended)

scripts/run/run-examples.sh -t 60 -v 3 -e 1 host

This automatically:

  • Clones/updates nomos-node checkout
  • Builds required binaries
  • Sets NOMOS_NODE_BIN / NOMOS_EXECUTOR_BIN

Option 2: Manual Build

If you have a sibling nomos-node checkout:

cd ../nomos-node
cargo build --release --bin nomos-node --bin nomos-executor

# Set environment variables
export NOMOS_NODE_BIN=$PWD/target/release/nomos-node
export NOMOS_EXECUTOR_BIN=$PWD/target/release/nomos-executor

# Return to testing framework
cd ../nomos-testing

Option 3: Prebuilt Bundles (CI)

CI workflows use prebuilt artifacts:

- name: Download nomos binaries
  uses: actions/download-artifact@v3
  with:
    name: nomos-binaries-linux
    path: .tmp/

- name: Extract bundle
  run: |
    tar -xzf .tmp/nomos-binaries-linux-*.tar.gz -C .tmp/
    export NOMOS_NODE_BIN=$PWD/.tmp/nomos-node
    export NOMOS_EXECUTOR_BIN=$PWD/.tmp/nomos-executor

Circuit Assets (KZG Parameters)

Data Availability (DA) workloads require KZG cryptographic parameters.

Asset Location

Default path: testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params

Note: The directory kzgrs_test_params/ contains a file named kzgrs_test_params. This is the proving key file (~120MB).

Container path (compose/k8s): /kzgrs_test_params/kzgrs_test_params

Getting Assets

Option 1: Use helper script (recommended):

# Fetch circuits
scripts/setup/setup-nomos-circuits.sh v0.3.1 /tmp/nomos-circuits

# Copy to default location
mkdir -p testing-framework/assets/stack/kzgrs_test_params
cp -r /tmp/nomos-circuits/* testing-framework/assets/stack/kzgrs_test_params/

# Verify (should be ~120MB)
ls -lh testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params

Option 2: Let run-examples.sh handle it:

scripts/run/run-examples.sh -t 60 -v 3 -e 1 host

This automatically fetches and places assets.

Override Path

Set NOMOS_KZGRS_PARAMS_PATH to use a custom location:

NOMOS_KZGRS_PARAMS_PATH=/custom/path/to/kzgrs_test_params \
cargo run -p runner-examples --bin local_runner

When Are Assets Needed?

| Runner | When Required |
| --- | --- |
| Host (local) | Always (for DA workloads) |
| Compose | During image build (baked into image) |
| K8s | During image build + mounted via hostPath |

Error without assets:

Error: Custom { kind: NotFound, error: "Circuit file not found at: testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params" }

Platform Requirements

Host Runner (Local Processes)

Requires:

  • Rust nightly toolchain
  • Node binaries built
  • KZG circuit assets (for DA workloads)
  • Available ports (18080+, 3100+, etc.)

No Docker required.

Best for:

  • Quick iteration
  • Development
  • Smoke tests

Compose Runner (Docker Compose)

Requires:

  • Docker daemon running
  • Docker image built: logos-blockchain-testing:local
  • KZG assets baked into image
  • Docker Desktop (macOS) or Docker Engine (Linux)

Platform notes (macOS / Apple silicon):

  • Prefer NOMOS_BUNDLE_DOCKER_PLATFORM=linux/arm64 for native performance
  • Use linux/amd64 only if targeting amd64 environments (slower via emulation)

Best for:

  • Reproducible environments
  • CI testing
  • Chaos workloads (node control support)

K8s Runner (Kubernetes)

Requires:

  • Kubernetes cluster (Docker Desktop K8s, minikube, kind, or remote)
  • kubectl configured
  • Docker image built and loaded/pushed
  • KZG assets baked into image + mounted via hostPath

Local cluster setup:

# Docker Desktop: Enable Kubernetes in settings

# OR: Use kind
kind create cluster
kind load docker-image logos-blockchain-testing:local

# OR: Use minikube
minikube start
minikube image load logos-blockchain-testing:local

Remote cluster: Push image to registry and set NOMOS_TESTNET_IMAGE.

Best for:

  • Production-like testing
  • Resource isolation
  • Large topologies

Critical Environment Variable

POL_PROOF_DEV_MODE=true is REQUIRED for ALL runners!

Without this, proof generation uses expensive Groth16 proving, causing:

  • Tests “hang” for minutes
  • CPU spikes to 100%
  • Timeouts and failures

Always set:

POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner
POL_PROOF_DEV_MODE=true scripts/run/run-examples.sh -t 60 -v 3 -e 1 compose
# etc.

Or add to your shell profile:

# ~/.bashrc or ~/.zshrc
export POL_PROOF_DEV_MODE=true

Quick Setup Check

Run this checklist before your first scenario:

# 1. Verify versions.env exists
cat versions.env

# 2. Check circuit assets (for DA workloads)
ls -lh testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params

# 3. Verify POL_PROOF_DEV_MODE is set
echo $POL_PROOF_DEV_MODE  # Should print: true

# 4. For compose/k8s: verify Docker is running
docker ps

# 5. For compose/k8s: verify image exists
docker images | grep logos-blockchain-testing

# 6. For host runner: verify node binaries (if not using scripts)
$NOMOS_NODE_BIN --version
$NOMOS_EXECUTOR_BIN --version

The easiest path is to let the helper scripts handle everything:

# Host runner
scripts/run/run-examples.sh -t 60 -v 3 -e 1 host

# Compose runner
scripts/run/run-examples.sh -t 60 -v 3 -e 1 compose

# K8s runner
scripts/run/run-examples.sh -t 60 -v 3 -e 1 k8s

These scripts:

  • Verify versions.env exists
  • Clone/build nomos-node if needed
  • Fetch circuit assets if missing
  • Build Docker images (compose/k8s)
  • Load images into cluster (k8s)
  • Run the scenario with proper environment

Next Steps:

Running Examples

The framework provides three runner modes: host (local processes), compose (Docker Compose), and k8s (Kubernetes).

Use scripts/run/run-examples.sh for all modes—it handles all setup automatically:

# Host mode (local processes)
scripts/run/run-examples.sh -t 60 -v 3 -e 1 host

# Compose mode (Docker Compose)
scripts/run/run-examples.sh -t 60 -v 3 -e 1 compose

# K8s mode (Kubernetes)
scripts/run/run-examples.sh -t 60 -v 3 -e 1 k8s

Parameters:

  • -t 60 — Run duration in seconds
  • -v 3 — Number of validators
  • -e 1 — Number of executors
  • host|compose|k8s — Deployment mode

This script handles:

  • Circuit asset setup
  • Binary building/bundling
  • Image building (compose/k8s)
  • Image loading into cluster (k8s)
  • Execution with proper environment

Note: For k8s runs against non-local clusters (e.g. EKS), the cluster pulls images from a registry. In that case, build + push your image separately (see scripts/build/build_test_image.sh) and set NOMOS_TESTNET_IMAGE to the pushed reference.

Quick Smoke Matrix

For a small “does everything still run?” matrix across all runners:

scripts/run/run-test-matrix.sh -t 120 -v 1 -e 1

This runs host, compose, and k8s modes with various image-build configurations. Useful after making runner/image/script changes. Forwards --metrics-* options through to scripts/run/run-examples.sh.

Common options:

  • --modes host,compose,k8s — Restrict which modes run
  • --no-clean — Skip scripts/ops/clean.sh step
  • --no-bundles — Skip scripts/build/build-bundle.sh (reuses existing .tmp tarballs)
  • --no-image-build — Skip the “rebuild image” variants in the matrix (compose/k8s)
  • --allow-nonzero-progress — Soft-pass expectation failures if logs show non-zero progress (local iteration only)
  • --force-k8s-image-build — Allow the k8s image-build variant even on non-docker-desktop clusters

Environment overrides:

  • VERSION=v0.3.1 — Circuit version
  • NOMOS_NODE_REV=<commit> — nomos-node git revision
  • NOMOS_BINARIES_TAR=path/to/bundle.tar.gz — Use prebuilt bundle
  • NOMOS_SKIP_IMAGE_BUILD=1 — Skip image rebuild inside run-examples.sh (compose/k8s)
  • NOMOS_BUNDLE_DOCKER_PLATFORM=linux/arm64|linux/amd64 — Docker platform for bundle builds (macOS/Windows)
  • COMPOSE_CIRCUITS_PLATFORM=linux-aarch64|linux-x86_64 — Circuits platform for image builds
  • SLOW_TEST_ENV=true — Doubles built-in readiness timeouts (useful in CI / constrained laptops)
  • TESTNET_PRINT_ENDPOINTS=1 — Print TESTNET_ENDPOINTS / TESTNET_PPROF lines during deploy

Dev Workflow: Updating nomos-node Revision

The repo pins a nomos-node revision in versions.env for reproducible builds. To update it or point to a local checkout:

# Pin to a new git revision (updates versions.env + Cargo.toml git revs)
scripts/ops/update-nomos-rev.sh --rev <git_sha>

# Use a local nomos-node checkout instead (for development)
scripts/ops/update-nomos-rev.sh --path /path/to/nomos-node

# If Cargo.toml was marked skip-worktree, clear it
scripts/ops/update-nomos-rev.sh --unskip-worktree

Notes:

  • Don’t commit absolute NOMOS_NODE_PATH values; prefer --rev for shared history/CI
  • After changing rev/path, expect Cargo.lock to update on the next cargo build/cargo test

Cleanup Helper

If you hit Docker build failures, I/O errors, or disk space issues:

scripts/ops/clean.sh

For extra Docker cache cleanup:

scripts/ops/clean.sh --docker

Host Runner (Direct Cargo Run)

For manual control, run the local_runner binary directly:

POL_PROOF_DEV_MODE=true \
NOMOS_NODE_BIN=/path/to/nomos-node \
NOMOS_EXECUTOR_BIN=/path/to/nomos-executor \
cargo run -p runner-examples --bin local_runner

Host Runner Environment Variables

| Variable | Default | Effect |
| --- | --- | --- |
| NOMOS_DEMO_VALIDATORS | 1 | Number of validators (legacy: LOCAL_DEMO_VALIDATORS) |
| NOMOS_DEMO_EXECUTORS | 1 | Number of executors (legacy: LOCAL_DEMO_EXECUTORS) |
| NOMOS_DEMO_RUN_SECS | 60 | Run duration in seconds (legacy: LOCAL_DEMO_RUN_SECS) |
| NOMOS_NODE_BIN | | Path to nomos-node binary (required) |
| NOMOS_EXECUTOR_BIN | | Path to nomos-executor binary (required) |
| NOMOS_LOG_DIR | None | Directory for per-node log files |
| NOMOS_TESTS_KEEP_LOGS | 0 | Keep per-run temporary directories (useful for debugging/CI) |
| NOMOS_TESTS_TRACING | false | Enable debug tracing preset |
| NOMOS_LOG_LEVEL | info | Global log level: error, warn, info, debug, trace |
| NOMOS_LOG_FILTER | None | Fine-grained module filtering (e.g., cryptarchia=trace,nomos_da_sampling=debug) |
| POL_PROOF_DEV_MODE | | REQUIRED: Set to true for all runners |

Note: Requires circuit assets and host binaries. Use scripts/run/run-examples.sh host to handle setup automatically.


Compose Runner (Direct Cargo Run)

For manual control, run the compose_runner binary directly. Compose requires a Docker image with embedded assets.

Option 1: Build with Bundle (Recommended)

# 1. Build a Linux bundle (includes binaries + circuits)
scripts/build/build-bundle.sh --platform linux
# Creates .tmp/nomos-binaries-linux-v0.3.1.tar.gz

# 2. Build image (embeds bundle assets)
export NOMOS_BINARIES_TAR=.tmp/nomos-binaries-linux-v0.3.1.tar.gz
scripts/build/build_test_image.sh

# 3. Run
NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin compose_runner

Option 2: Manual Circuit/Image Setup

# Fetch and copy circuits
scripts/setup/setup-nomos-circuits.sh v0.3.1 /tmp/nomos-circuits
cp -r /tmp/nomos-circuits/* testing-framework/assets/stack/kzgrs_test_params/

# Build image
scripts/build/build_test_image.sh

# Run
NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin compose_runner

Platform Note (macOS / Apple Silicon)

  • Docker Desktop runs a linux/arm64 engine by default
  • For native performance: NOMOS_BUNDLE_DOCKER_PLATFORM=linux/arm64 (recommended for local testing)
  • For amd64 targets: NOMOS_BUNDLE_DOCKER_PLATFORM=linux/amd64 (slower via emulation)

Compose Runner Environment Variables

| Variable | Default | Effect |
|----------|---------|--------|
| NOMOS_TESTNET_IMAGE | | Image tag (required, must match built image) |
| POL_PROOF_DEV_MODE | | REQUIRED: Set to true for all runners |
| NOMOS_DEMO_VALIDATORS | 1 | Number of validators |
| NOMOS_DEMO_EXECUTORS | 1 | Number of executors |
| NOMOS_DEMO_RUN_SECS | 60 | Run duration in seconds |
| COMPOSE_NODE_PAIRS | | Alternative topology format: “validators×executors” (e.g., 3x2) |
| NOMOS_METRICS_QUERY_URL | None | Prometheus-compatible base URL for runner to query |
| NOMOS_METRICS_OTLP_INGEST_URL | None | Full OTLP HTTP ingest URL for node metrics export |
| NOMOS_GRAFANA_URL | None | Grafana base URL for printing/logging |
| COMPOSE_RUNNER_HOST | 127.0.0.1 | Host address for port mappings |
| COMPOSE_RUNNER_PRESERVE | 0 | Keep containers running after test |
| NOMOS_LOG_LEVEL | info | Node log level (stdout/stderr) |
| NOMOS_LOG_FILTER | None | Fine-grained module filtering |
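
For example, COMPOSE_NODE_PAIRS offers a compact way to set the topology when invoking the compose example directly (a sketch reusing the run command above):

COMPOSE_NODE_PAIRS=3x2 \
NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin compose_runner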

Config file option: testing-framework/assets/stack/cfgsync.yaml (tracing_settings.logger) — Switch node logs between stdout/stderr and file output

Compose-Specific Features

  • Node control support: Only runner that supports chaos testing (.enable_node_control() + chaos workloads)
  • External observability: Set NOMOS_METRICS_* / NOMOS_GRAFANA_URL to enable telemetry links and querying
    • Quickstart: scripts/setup/setup-observability.sh compose up then scripts/setup/setup-observability.sh compose env

Important:

  • Containers expect KZG parameters at /kzgrs_test_params/kzgrs_test_params (note the repeated filename)
  • Use scripts/run/run-examples.sh compose to handle all setup automatically

K8s Runner (Direct Cargo Run)

For manual control, run the k8s_runner binary directly. K8s requires the same image setup as Compose.

Prerequisites

  1. Kubernetes cluster with kubectl configured
  2. Test image built (same as Compose, preferably with prebuilt bundle)
  3. Image available in cluster (loaded or pushed to registry)

Build and Load Image

# 1. Build image with bundle (recommended)
scripts/build/build-bundle.sh --platform linux
export NOMOS_BINARIES_TAR=.tmp/nomos-binaries-linux-v0.3.1.tar.gz
scripts/build/build_test_image.sh

# 2. Load into cluster (choose one)
export NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local

# For kind:
kind load docker-image logos-blockchain-testing:local

# For minikube:
minikube image load logos-blockchain-testing:local

# For remote cluster (push to registry):
docker tag logos-blockchain-testing:local your-registry/logos-blockchain-testing:latest
docker push your-registry/logos-blockchain-testing:latest
export NOMOS_TESTNET_IMAGE=your-registry/logos-blockchain-testing:latest

Run the Example

export NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local
export POL_PROOF_DEV_MODE=true
cargo run -p runner-examples --bin k8s_runner

K8s Runner Environment Variables

| Variable | Default | Effect |
|----------|---------|--------|
| NOMOS_TESTNET_IMAGE | | Image tag (required) |
| POL_PROOF_DEV_MODE | | REQUIRED: Set to true for all runners |
| NOMOS_DEMO_VALIDATORS | 1 | Number of validators |
| NOMOS_DEMO_EXECUTORS | 1 | Number of executors |
| NOMOS_DEMO_RUN_SECS | 60 | Run duration in seconds |
| NOMOS_METRICS_QUERY_URL | None | Prometheus-compatible base URL for runner to query (PromQL) |
| NOMOS_METRICS_OTLP_INGEST_URL | None | Full OTLP HTTP ingest URL for node metrics export |
| NOMOS_GRAFANA_URL | None | Grafana base URL for printing/logging |
| K8S_RUNNER_NAMESPACE | Random | Kubernetes namespace (pin for debugging) |
| K8S_RUNNER_RELEASE | Random | Helm release name (pin for debugging) |
| K8S_RUNNER_NODE_HOST | | NodePort host resolution for non-local clusters |
| K8S_RUNNER_DEBUG | 0 | Log Helm stdout/stderr for install commands |
| K8S_RUNNER_PRESERVE | 0 | Keep namespace/release after run (for debugging) |

K8s + Observability (Optional)

export NOMOS_METRICS_QUERY_URL=http://your-prometheus:9090
# Prometheus OTLP receiver example:
export NOMOS_METRICS_OTLP_INGEST_URL=http://your-prometheus:9090/api/v1/otlp/v1/metrics
# Optional: print Grafana link in TESTNET_ENDPOINTS
export NOMOS_GRAFANA_URL=http://your-grafana:3000
cargo run -p runner-examples --bin k8s_runner

Notes:

  • NOMOS_METRICS_QUERY_URL must be reachable from the runner process (often via kubectl port-forward)
  • NOMOS_METRICS_OTLP_INGEST_URL must be reachable from nodes (pods/containers) and is backend-specific
    • Quickstart installer: scripts/setup/setup-observability.sh k8s install then scripts/setup/setup-observability.sh k8s env
    • Optional dashboards: scripts/setup/setup-observability.sh k8s dashboards

Alternatively, pass the URLs to the helper script directly:

scripts/run/run-examples.sh -t 60 -v 3 -e 1 k8s \
  --metrics-query-url http://your-prometheus:9090 \
  --metrics-otlp-ingest-url http://your-prometheus:9090/api/v1/otlp/v1/metrics

In Code (Optional)

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ObservabilityBuilderExt as _;

let plan = ScenarioBuilder::with_node_counts(1, 1)
    .with_metrics_query_url_str("http://your-prometheus:9090")
    .with_metrics_otlp_ingest_url_str("http://your-prometheus:9090/api/v1/otlp/v1/metrics")
    .build();

Important K8s Notes

  • K8s runner mounts testing-framework/assets/stack/kzgrs_test_params as a hostPath volume
  • File path inside pods: /kzgrs_test_params/kzgrs_test_params
  • No node control support yet: Chaos workloads (.enable_node_control()) will fail
  • Optimized for local clusters (Docker Desktop K8s / minikube / kind)
    • Remote clusters require additional setup (registry push, PV/CSI for assets, etc.)
  • Use scripts/run/run-examples.sh k8s to handle all setup automatically

CI Integration

Both LocalDeployer and ComposeDeployer work well in CI environments. Choose based on the trade-offs below.

Runner Comparison for CI

LocalDeployer (Host Runner):

  • Faster startup (no Docker overhead)
  • Good for quick smoke tests
  • Trade-off: Less isolation (processes share host resources)

ComposeDeployer (Recommended for CI):

  • Better isolation (containerized)
  • Reproducible environment
  • Can integrate with external Prometheus/Grafana (optional)
  • Trade-offs: Slower startup (Docker image build), requires Docker daemon

K8sDeployer:

  • Production-like environment
  • Full resource isolation
  • Trade-offs: Slowest (cluster setup + image loading), requires cluster access
  • Best for nightly/weekly runs or production validation

Existing Examples:

See .github/workflows/lint.yml (jobs: host_smoke, compose_smoke) for CI examples running the demo scenarios in this repository.

Complete CI Workflow Example

Here’s a comprehensive GitHub Actions workflow demonstrating host and compose runners with caching, matrix testing, and log collection:

name: Testing Framework CI

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

env:
  POL_PROOF_DEV_MODE: true
  CARGO_TERM_COLOR: always
  RUST_BACKTRACE: 1

jobs:
  # Quick smoke test with host runner (no Docker)
  host_smoke:
    name: Host Runner Smoke Test
    runs-on: ubuntu-latest
    timeout-minutes: 15
    
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      
      - name: Set up Rust toolchain
        uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: nightly
          override: true
      
      - name: Cache Rust dependencies
        uses: actions/cache@v3
        with:
          path: |
            ~/.cargo/bin/
            ~/.cargo/registry/index/
            ~/.cargo/registry/cache/
            ~/.cargo/git/db/
            target/
          key: ${{ runner.os }}-cargo-host-${{ hashFiles('**/Cargo.lock') }}
          restore-keys: |
            ${{ runner.os }}-cargo-host-
      
      - name: Cache nomos-node build
        uses: actions/cache@v3
        with:
          path: |
            ../nomos-node/target/release/nomos-node
            ../nomos-node/target/release/nomos-executor
          key: ${{ runner.os }}-nomos-${{ hashFiles('../nomos-node/**/Cargo.lock') }}
          restore-keys: |
            ${{ runner.os }}-nomos-
      
      - name: Run host smoke test
        run: |
          # Use run-examples.sh which handles setup automatically
          scripts/run/run-examples.sh -t 120 -v 3 -e 1 host
      
      - name: Upload logs on failure
        if: failure()
        uses: actions/upload-artifact@v3
        with:
          name: host-runner-logs
          path: |
            .tmp/
            *.log
          retention-days: 7

  # Compose runner matrix (with Docker)
  compose_matrix:
    name: Compose Runner (${{ matrix.topology }})
    runs-on: ubuntu-latest
    timeout-minutes: 25
    
    strategy:
      fail-fast: false
      matrix:
        topology:
          - "3v1e"
          - "5v1e"
    
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      
      - name: Set up Rust toolchain
        uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: nightly
          override: true
      
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      
      - name: Cache Rust dependencies
        uses: actions/cache@v3
        with:
          path: |
            ~/.cargo/bin/
            ~/.cargo/registry/index/
            ~/.cargo/registry/cache/
            ~/.cargo/git/db/
            target/
          key: ${{ runner.os }}-cargo-compose-${{ hashFiles('**/Cargo.lock') }}
          restore-keys: |
            ${{ runner.os }}-cargo-compose-
      
      - name: Cache Docker layers
        uses: actions/cache@v3
        with:
          path: /tmp/.buildx-cache
          key: ${{ runner.os }}-buildx-${{ hashFiles('Dockerfile', 'scripts/build/build_test_image.sh') }}
          restore-keys: |
            ${{ runner.os }}-buildx-
      
      - name: Run compose test
        env:
          TOPOLOGY: ${{ matrix.topology }}
        run: |
          # Parse "<validators>v<executors>e" (handles multi-digit counts like 10v2e)
          V=${TOPOLOGY%%v*}; E=${TOPOLOGY#*v}; E=${E%e}
          scripts/run/run-examples.sh -t 120 -v "$V" -e "$E" compose
      
      - name: Collect Docker logs on failure
        if: failure()
        run: |
          mkdir -p logs
          for container in $(docker ps -a --filter "name=nomos-compose-" -q); do
            docker logs $container > logs/$(docker inspect --format='{{.Name}}' $container).log 2>&1
          done
      
      - name: Upload logs and artifacts
        if: failure()
        uses: actions/upload-artifact@v3
        with:
          name: compose-${{ matrix.topology }}-logs
          path: |
            logs/
            .tmp/
          retention-days: 7
      
      - name: Clean up Docker resources
        if: always()
        run: |
          docker compose down -v 2>/dev/null || true
          docker ps -a --filter "name=nomos-compose-" -q | xargs -r docker rm -f

  # Cucumber/BDD integration tests (if enabled)
  cucumber_tests:
    name: Cucumber BDD Tests
    runs-on: ubuntu-latest
    timeout-minutes: 20
    
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      
      - name: Set up Rust toolchain
        uses: actions-rs/toolchain@v1
        with:
          profile: minimal
          toolchain: nightly
          override: true
      
      - name: Cache dependencies
        uses: actions/cache@v3
        with:
          path: |
            ~/.cargo/bin/
            ~/.cargo/registry/index/
            ~/.cargo/registry/cache/
            ~/.cargo/git/db/
            target/
          key: ${{ runner.os }}-cargo-cucumber-${{ hashFiles('**/Cargo.lock') }}
          restore-keys: |
            ${{ runner.os }}-cargo-cucumber-
      
      - name: Run Cucumber tests
        run: |
          # Build prerequisites
          scripts/build/build-bundle.sh --platform linux
          export NOMOS_BINARIES_TAR=$(ls -t .tmp/nomos-binaries-linux-*.tar.gz | head -1)
          
          # Run Cucumber tests (host runner)
          cargo test -p runner-examples --bin cucumber_host
      
      - name: Upload test report
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: cucumber-report
          path: |
            target/cucumber-reports/
          retention-days: 14

  # Summary job (requires all tests to pass)
  ci_success:
    name: CI Success
    needs: [host_smoke, compose_matrix, cucumber_tests]
    runs-on: ubuntu-latest
    if: always()
    
    steps:
      - name: Check all jobs
        run: |
          if [[ "${{ needs.host_smoke.result }}" != "success" ]] || \
             [[ "${{ needs.compose_matrix.result }}" != "success" ]] || \
             [[ "${{ needs.cucumber_tests.result }}" != "success" ]]; then
            echo "One or more CI jobs failed"
            exit 1
          fi
          echo "All CI jobs passed!"

Workflow Features

  1. Matrix Testing: Runs compose tests with different topologies (3v1e, 5v1e)
  2. Caching: Caches Rust dependencies, Docker layers, and nomos-node builds for faster runs
  3. Log Collection: Automatically uploads logs and artifacts when tests fail
  4. Timeout Protection: Reasonable timeouts prevent jobs from hanging indefinitely
  5. Cucumber Integration: Shows how to integrate BDD tests into CI
  6. Clean Teardown: Ensures Docker resources are cleaned up even on failure

Customization Points

Topology Matrix:

Add more topologies for comprehensive testing:

matrix:
  topology:
    - "3v1e"
    - "5v1e"
    - "10v2e"  # Larger scale

Timeout Adjustments:

Increase timeout-minutes for longer-running scenarios or slower environments:

timeout-minutes: 30  # Instead of 15

Artifact Retention:

Change retention-days based on your storage needs:

retention-days: 14  # Keep logs for 2 weeks

Conditional Execution:

Run expensive tests only on merge to main:

if: github.event_name == 'push' && github.ref == 'refs/heads/main'

Best Practices

Required: Set POL_PROOF_DEV_MODE

Always set POL_PROOF_DEV_MODE=true globally in your workflow env:

env:
  POL_PROOF_DEV_MODE: true  # REQUIRED!

Without this, tests will hang due to expensive proof generation.

Use Helper Scripts

Prefer scripts/run/run-examples.sh which handles all setup automatically:

scripts/run/run-examples.sh -t 120 -v 3 -e 1 host

This is more reliable than manual cargo run commands.

Cache Aggressively

Cache Rust dependencies, nomos-node builds, and Docker layers to speed up CI:

- name: Cache Rust dependencies
  uses: actions/cache@v3
  with:
    path: |
      ~/.cargo/bin/
      ~/.cargo/registry/index/
      ~/.cargo/registry/cache/
      ~/.cargo/git/db/
      target/
    key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}

Collect Logs on Failure

Always upload logs when tests fail for easier debugging:

- name: Upload logs on failure
  if: failure()
  uses: actions/upload-artifact@v3
  with:
    name: test-logs
    path: |
      .tmp/
      *.log
    retention-days: 7

Split Workflows for Faster Iteration

For large projects, split host/compose/k8s into separate workflow files:

  • .github/workflows/test-host.yml — Fast smoke tests
  • .github/workflows/test-compose.yml — Reproducible integration tests
  • .github/workflows/test-k8s.yml — Production-like validation (nightly)

Run K8s Tests Less Frequently

K8s tests are slower. Consider running them only on main branch or scheduled:

on:
  push:
    branches: [main]
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM

Platform-Specific Notes

Ubuntu Runners

  • Docker pre-installed and running
  • Best for compose/k8s runners
  • Most common choice

macOS Runners

  • Docker Desktop not installed by default
  • Slower and more expensive
  • Use only if testing macOS-specific issues

Self-Hosted Runners

  • Cache Docker images locally for faster builds
  • Set resource limits (SLOW_TEST_ENV=true if needed)
  • Ensure cleanup scripts run (docker system prune)

Debugging CI Failures

Enable Debug Logging

Add debug environment variables temporarily:

env:
  RUST_LOG: debug
  NOMOS_LOG_LEVEL: debug

Preserve Containers (Compose)

Set COMPOSE_RUNNER_PRESERVE=1 to keep containers running for inspection:

- name: Run compose test (preserve on failure)
  env:
    COMPOSE_RUNNER_PRESERVE: 1
  run: scripts/run/run-examples.sh -t 120 -v 3 -e 1 compose

Access Artifacts

Download uploaded artifacts from the GitHub Actions UI to inspect logs locally.

Environment Variables Reference

Complete reference of environment variables used by the testing framework, organized by category.

Critical Variables

These MUST be set for successful test runs:

| Variable | Required | Default | Effect |
|----------|----------|---------|--------|
| POL_PROOF_DEV_MODE | YES | | REQUIRED for all runners. Set to true to use fast dev-mode proving instead of expensive Groth16. Without this, tests will hang/timeout. |

Example:

export POL_PROOF_DEV_MODE=true

Or add to your shell profile (~/.bashrc, ~/.zshrc):

# Required for nomos-testing framework
export POL_PROOF_DEV_MODE=true

Runner Selection & Topology

Control which runner to use and the test topology:

| Variable | Default | Effect |
|----------|---------|--------|
| NOMOS_DEMO_VALIDATORS | 1 | Number of validators (all runners) |
| NOMOS_DEMO_EXECUTORS | 1 | Number of executors (all runners) |
| NOMOS_DEMO_RUN_SECS | 60 | Run duration in seconds (all runners) |
| LOCAL_DEMO_VALIDATORS | | Legacy: Number of validators (host runner only) |
| LOCAL_DEMO_EXECUTORS | | Legacy: Number of executors (host runner only) |
| LOCAL_DEMO_RUN_SECS | | Legacy: Run duration (host runner only) |
| COMPOSE_NODE_PAIRS | | Compose-specific topology format: “validators×executors” (e.g., 3x2) |

Example:

# Run with 5 validators, 2 executors, for 120 seconds
NOMOS_DEMO_VALIDATORS=5 \
NOMOS_DEMO_EXECUTORS=2 \
NOMOS_DEMO_RUN_SECS=120 \
scripts/run/run-examples.sh -t 120 -v 5 -e 2 host

Node Binaries (Host Runner)

Required for host runner when not using helper scripts:

| Variable | Required | Default | Effect |
|----------|----------|---------|--------|
| NOMOS_NODE_BIN | Yes (host) | | Path to nomos-node binary |
| NOMOS_EXECUTOR_BIN | Yes (host) | | Path to nomos-executor binary |
| NOMOS_NODE_PATH | No | | Path to nomos-node git checkout (dev workflow) |

Example:

export NOMOS_NODE_BIN=/path/to/nomos-node/target/release/nomos-node
export NOMOS_EXECUTOR_BIN=/path/to/nomos-node/target/release/nomos-executor
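
For the dev workflow, NOMOS_NODE_PATH can point at a local nomos-node checkout; a sketch that pairs it with the scripts/ops/update-nomos-rev.sh helper shown earlier (paths illustrative):

# Point at a local checkout (avoid committing absolute paths)
export NOMOS_NODE_PATH=/path/to/nomos-node
scripts/ops/update-nomos-rev.sh --path "$NOMOS_NODE_PATH"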

Docker Images (Compose / K8s)

Required for compose and k8s runners:

| Variable | Required | Default | Effect |
|----------|----------|---------|--------|
| NOMOS_TESTNET_IMAGE | Yes (compose/k8s) | logos-blockchain-testing:local | Docker image tag for node containers |
| NOMOS_TESTNET_IMAGE_PULL_POLICY | No | IfNotPresent (local) / Always (ECR) | K8s imagePullPolicy used by the runner |
| NOMOS_BINARIES_TAR | No | | Path to prebuilt bundle (.tar.gz) for image build |
| NOMOS_SKIP_IMAGE_BUILD | No | 0 | Skip image rebuild (compose/k8s); assumes image already exists |
| NOMOS_FORCE_IMAGE_BUILD | No | 0 | Force rebuilding the image even when the script would normally skip it (e.g. non-local k8s) |

Example:

# Using prebuilt bundle
export NOMOS_BINARIES_TAR=.tmp/nomos-binaries-linux-v0.3.1.tar.gz
export NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local
scripts/build/build_test_image.sh

# Using pre-existing image (skip build)
export NOMOS_SKIP_IMAGE_BUILD=1
scripts/run/run-examples.sh -t 60 -v 3 -e 1 compose
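
Conversely, NOMOS_FORCE_IMAGE_BUILD forces a rebuild when the script would normally skip it (a sketch, e.g. for a non-local k8s target):

# Force the image rebuild even if the script would skip it
export NOMOS_FORCE_IMAGE_BUILD=1
scripts/run/run-examples.sh -t 60 -v 3 -e 1 k8s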

Circuit Assets (KZG Parameters)

Circuit asset configuration for DA workloads:

| Variable | Default | Effect |
|----------|---------|--------|
| NOMOS_KZGRS_PARAMS_PATH | testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params | Path to KZG proving key file |
| NOMOS_KZG_DIR_REL | testing-framework/assets/stack/kzgrs_test_params | Directory containing KZG assets (relative to workspace root) |
| NOMOS_KZG_FILE | kzgrs_test_params | Filename of the proving key within NOMOS_KZG_DIR_REL |
| NOMOS_KZG_CONTAINER_PATH | /kzgrs_test_params/kzgrs_test_params | File path where the node expects KZG params inside containers |
| NOMOS_KZG_MODE | Runner-specific | K8s only: hostPath (mount from host) or inImage (embed into image) |
| NOMOS_KZG_IN_IMAGE_PARAMS_PATH | /opt/nomos/kzg-params/kzgrs_test_params | K8s inImage mode: where the proving key is stored inside the image |
| VERSION | From versions.env | Circuit release tag (used by helper scripts) |
| NOMOS_CIRCUITS | | Directory containing fetched circuit bundles (set by scripts/setup/setup-circuits-stack.sh) |
| NOMOS_CIRCUITS_VERSION | | Legacy alias for VERSION (supported by some build scripts) |
| NOMOS_CIRCUITS_PLATFORM | Auto-detected | Override circuits platform (e.g. linux-x86_64, macos-aarch64) |
| NOMOS_CIRCUITS_HOST_DIR_REL | .tmp/nomos-circuits-host | Output dir for host circuits bundle (relative to repo root) |
| NOMOS_CIRCUITS_LINUX_DIR_REL | .tmp/nomos-circuits-linux | Output dir for linux circuits bundle (relative to repo root) |
| NOMOS_CIRCUITS_NONINTERACTIVE | 0 | Set to 1 to overwrite outputs without prompting in setup scripts |
| NOMOS_CIRCUITS_REBUILD_RAPIDSNARK | 0 | Set to 1 to force rebuilding rapidsnark (host bundle only) |

Example:

# Use custom circuit assets
NOMOS_KZGRS_PARAMS_PATH=/custom/path/to/kzgrs_test_params \
cargo run -p runner-examples --bin local_runner
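
For k8s, NOMOS_KZG_MODE selects how the proving key reaches the pods. A hedged sketch of the inImage mode, assuming the variable is simply exported before a k8s run:

# Embed the KZG params into the image instead of mounting a hostPath (k8s only)
NOMOS_KZG_MODE=inImage \
POL_PROOF_DEV_MODE=true \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 k8s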

Node Logging

Control node log output (not framework runner logs):

| Variable | Default | Effect |
|----------|---------|--------|
| NOMOS_LOG_LEVEL | info | Global log level: error, warn, info, debug, trace |
| NOMOS_LOG_FILTER | | Fine-grained module filtering (e.g., cryptarchia=trace,nomos_da_sampling=debug) |
| NOMOS_LOG_DIR | | Host runner: directory for per-node log files (persistent). Compose/k8s: use cfgsync.yaml for file logging. |
| NOMOS_TESTS_KEEP_LOGS | 0 | Keep per-run temporary directories (useful for debugging/CI artifacts) |
| NOMOS_TESTS_TRACING | false | Enable debug tracing preset (combine with NOMOS_LOG_DIR unless external tracing backends are configured) |

Important: Node logging ignores RUST_LOG; use NOMOS_LOG_LEVEL and NOMOS_LOG_FILTER for node logs.

Example:

# Debug logging to files
NOMOS_LOG_DIR=/tmp/test-logs \
NOMOS_LOG_LEVEL=debug \
NOMOS_LOG_FILTER="cryptarchia=trace,nomos_da_sampling=debug" \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin local_runner

# Inspect logs
ls /tmp/test-logs/
# nomos-node-0.2024-12-18T14-30-00.log
# nomos-node-1.2024-12-18T14-30-00.log

Common filter targets:

| Target Prefix | Subsystem |
|---------------|-----------|
| cryptarchia | Consensus (Cryptarchia) |
| nomos_da_sampling | DA sampling service |
| nomos_da_dispersal | DA dispersal service |
| nomos_da_verifier | DA verification |
| nomos_blend | Mix network/privacy layer |
| chain_service | Chain service (node APIs/state) |
| chain_network | P2P networking |
| chain_leader | Leader election |

Observability & Metrics

Optional observability integration:

| Variable | Default | Effect |
|----------|---------|--------|
| NOMOS_METRICS_QUERY_URL | | Prometheus-compatible base URL for runner to query (e.g., http://localhost:9090) |
| NOMOS_METRICS_OTLP_INGEST_URL | | Full OTLP HTTP ingest URL for node metrics export (e.g., http://localhost:9090/api/v1/otlp/v1/metrics) |
| NOMOS_GRAFANA_URL | | Grafana base URL for printing/logging (e.g., http://localhost:3000) |
| NOMOS_OTLP_ENDPOINT | | OTLP trace endpoint (optional) |
| NOMOS_OTLP_METRICS_ENDPOINT | | OTLP metrics endpoint (optional) |

Example:

# Enable Prometheus querying
export NOMOS_METRICS_QUERY_URL=http://localhost:9090
export NOMOS_METRICS_OTLP_INGEST_URL=http://localhost:9090/api/v1/otlp/v1/metrics
export NOMOS_GRAFANA_URL=http://localhost:3000

scripts/run/run-examples.sh -t 60 -v 3 -e 1 compose

Compose Runner Specific

Variables specific to Docker Compose deployment:

| Variable | Default | Effect |
|----------|---------|--------|
| COMPOSE_RUNNER_HOST | 127.0.0.1 | Host address for port mappings |
| COMPOSE_RUNNER_PRESERVE | 0 | Keep containers running after test (for debugging) |
| COMPOSE_RUNNER_HTTP_TIMEOUT_SECS | | Override HTTP readiness timeout (seconds) |
| COMPOSE_RUNNER_HOST_GATEWAY | host.docker.internal:host-gateway | Controls extra_hosts entry injected into compose (set to disable to omit) |
| TESTNET_RUNNER_PRESERVE | | Alias for COMPOSE_RUNNER_PRESERVE |

Example:

# Keep containers after test for debugging
COMPOSE_RUNNER_PRESERVE=1 \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 compose

# Containers remain running
docker ps --filter "name=nomos-compose-"
docker logs <container-id>
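
If the injected extra_hosts entry causes problems in your environment, COMPOSE_RUNNER_HOST_GATEWAY can be set to disable to omit it (a sketch using the value from the table above):

# Omit the host.docker.internal extra_hosts entry
COMPOSE_RUNNER_HOST_GATEWAY=disable \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 compose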

K8s Runner Specific

Variables specific to Kubernetes deployment:

| Variable | Default | Effect |
|----------|---------|--------|
| K8S_RUNNER_NAMESPACE | Random UUID | Kubernetes namespace (pin for debugging) |
| K8S_RUNNER_RELEASE | Random UUID | Helm release name (pin for debugging) |
| K8S_RUNNER_NODE_HOST | | NodePort host resolution for non-local clusters |
| K8S_RUNNER_DEBUG | 0 | Log Helm stdout/stderr for install commands |
| K8S_RUNNER_PRESERVE | 0 | Keep namespace/release after run (for debugging) |
| K8S_RUNNER_DEPLOYMENT_TIMEOUT_SECS | | Override deployment readiness timeout |
| K8S_RUNNER_HTTP_TIMEOUT_SECS | | Override HTTP readiness timeout (port-forwards) |
| K8S_RUNNER_HTTP_PROBE_TIMEOUT_SECS | | Override HTTP readiness timeout (NodePort probes) |
| K8S_RUNNER_PROMETHEUS_HTTP_TIMEOUT_SECS | | Override Prometheus readiness timeout |
| K8S_RUNNER_PROMETHEUS_HTTP_PROBE_TIMEOUT_SECS | | Override Prometheus NodePort probe timeout |

Example:

# Pin namespace for debugging
K8S_RUNNER_NAMESPACE=nomos-test-debug \
K8S_RUNNER_PRESERVE=1 \
K8S_RUNNER_DEBUG=1 \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 k8s

# Inspect resources
kubectl get pods -n nomos-test-debug
kubectl logs -n nomos-test-debug -l nomos/logical-role=validator
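
On slower or remote clusters, the readiness timeouts from the table above can be raised; the values here are illustrative:

# Allow more time for deployments and port-forwards to become ready
K8S_RUNNER_DEPLOYMENT_TIMEOUT_SECS=600 \
K8S_RUNNER_HTTP_TIMEOUT_SECS=120 \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 k8s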

Platform & Build Configuration

Platform-specific build configuration:

| Variable | Default | Effect |
|----------|---------|--------|
| NOMOS_BUNDLE_DOCKER_PLATFORM | Host arch | Docker platform for bundle builds: linux/arm64 or linux/amd64 (macOS/Windows hosts) |
| NOMOS_BIN_PLATFORM | | Legacy alias for NOMOS_BUNDLE_DOCKER_PLATFORM |
| COMPOSE_CIRCUITS_PLATFORM | Host arch | Circuits platform for image builds: linux-aarch64 or linux-x86_64 |
| NOMOS_EXTRA_FEATURES | | Extra cargo features to enable when building bundles (used by scripts/build/build-bundle.sh) |

macOS / Apple Silicon:

# Native performance (recommended for local testing)
export NOMOS_BUNDLE_DOCKER_PLATFORM=linux/arm64

# Or target amd64 (slower via emulation)
export NOMOS_BUNDLE_DOCKER_PLATFORM=linux/amd64

Timeouts & Performance

Timeout and performance tuning:

| Variable | Default | Effect |
|----------|---------|--------|
| SLOW_TEST_ENV | false | Doubles built-in readiness timeouts (useful in CI / constrained laptops) |
| TESTNET_PRINT_ENDPOINTS | 0 | Print TESTNET_ENDPOINTS / TESTNET_PPROF lines during deploy (set automatically by scripts/run/run-examples.sh) |
| NOMOS_DISPERSAL_TIMEOUT_SECS | 20 | DA dispersal timeout (seconds) |
| NOMOS_RETRY_COOLDOWN_SECS | 3 | Cooldown between retries (seconds) |
| NOMOS_GRACE_PERIOD_SECS | 1200 | Grace period before enforcing strict time-based expectations (seconds) |
| NOMOS_PRUNE_DURATION_SECS | 30 | Prune step duration (seconds) |
| NOMOS_PRUNE_INTERVAL_SECS | 5 | Interval between prune cycles (seconds) |
| NOMOS_SHARE_DURATION_SECS | 5 | Share duration (seconds) |
| NOMOS_COMMITMENTS_WAIT_SECS | 1 | Commitments wait duration (seconds) |
| NOMOS_SDP_TRIGGER_DELAY_SECS | 5 | SDP trigger delay (seconds) |

Example:

# Increase timeouts for slow environments
SLOW_TEST_ENV=true \
scripts/run/run-examples.sh -t 120 -v 5 -e 2 compose
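
Individual timeouts can be raised as well; for instance, a sketch that extends the DA dispersal timeout (the value is illustrative):

# Give DA dispersal more time on constrained machines
NOMOS_DISPERSAL_TIMEOUT_SECS=60 \
scripts/run/run-examples.sh -t 120 -v 5 -e 2 compose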

Node Configuration (Advanced)

Node-level configuration passed through to nomos-node/nomos-executor:

| Variable | Default | Effect |
|----------|---------|--------|
| CONSENSUS_SLOT_TIME | | Consensus slot time (seconds) |
| CONSENSUS_ACTIVE_SLOT_COEFF | | Active slot coefficient (0.0-1.0) |
| NOMOS_USE_AUTONAT | Unset | If set, use AutoNAT instead of a static loopback address for libp2p NAT settings |
| NOMOS_CFGSYNC_PORT | 4400 | Port used for cfgsync service inside the stack |
| NOMOS_TIME_BACKEND | monotonic | Select time backend (used by compose/k8s stack scripts and deployers) |

Example:

# Faster block production
CONSENSUS_SLOT_TIME=5 \
CONSENSUS_ACTIVE_SLOT_COEFF=0.9 \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin local_runner

Framework Runner Logging (Not Node Logs)

Control framework runner process logs (uses RUST_LOG, not NOMOS_*):

| Variable | Default | Effect |
|----------|---------|--------|
| RUST_LOG | | Framework runner log level (e.g., debug, info) |
| RUST_BACKTRACE | | Enable Rust backtraces on panic (1 or full) |
| CARGO_TERM_COLOR | | Cargo output color (always, never, auto) |

Example:

# Debug framework runner (not nodes)
RUST_LOG=debug \
RUST_BACKTRACE=1 \
cargo run -p runner-examples --bin local_runner

Helper Script Variables

Variables used by helper scripts (scripts/run/run-examples.sh, etc.):

| Variable | Default | Effect |
|----------|---------|--------|
| NOMOS_NODE_REV | From versions.env | nomos-node git revision to build/fetch |
| NOMOS_BUNDLE_VERSION | From versions.env | Bundle schema version |
| NOMOS_IMAGE_SELECTION | | Internal: image selection mode set by run-examples.sh (local/ecr/auto) |
| NOMOS_NODE_APPLY_PATCHES | 1 | Set to 0 to disable applying local patches when building bundles |
| NOMOS_NODE_PATCH_DIR | patches/nomos-node | Patch directory applied to nomos-node checkout during bundle builds |
| NOMOS_NODE_PATCH_LEVEL | | Patch application level (all or an integer) for bundle builds |
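
Example (a sketch; the revision value is illustrative):

# Build a Linux bundle from a pinned nomos-node revision, without applying local patches
NOMOS_NODE_REV=0123abcd \
NOMOS_NODE_APPLY_PATCHES=0 \
scripts/build/build-bundle.sh --platform linux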

Quick Reference Examples

Minimal Host Run

POL_PROOF_DEV_MODE=true \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 host

Debug Logging (Host)

POL_PROOF_DEV_MODE=true \
NOMOS_LOG_DIR=/tmp/logs \
NOMOS_LOG_LEVEL=debug \
NOMOS_LOG_FILTER="cryptarchia=trace" \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 host

Compose with Observability

POL_PROOF_DEV_MODE=true \
NOMOS_METRICS_QUERY_URL=http://localhost:9090 \
NOMOS_GRAFANA_URL=http://localhost:3000 \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 compose

K8s with Debug

POL_PROOF_DEV_MODE=true \
K8S_RUNNER_NAMESPACE=nomos-debug \
K8S_RUNNER_DEBUG=1 \
K8S_RUNNER_PRESERVE=1 \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 k8s

CI Environment

env:
  POL_PROOF_DEV_MODE: true
  RUST_BACKTRACE: 1
  NOMOS_TESTS_KEEP_LOGS: 1

Logging & Observability

Comprehensive guide to log collection, metrics, and debugging across all runners.

Node Logging vs Framework Logging

Critical distinction: Node logs and framework logs use different configuration mechanisms.

| Component | Controlled By | Purpose |
|-----------|---------------|---------|
| Framework binaries (cargo run -p runner-examples --bin local_runner) | RUST_LOG | Runner orchestration, deployment logs |
| Node processes (validators, executors spawned by runner) | NOMOS_LOG_LEVEL, NOMOS_LOG_FILTER (+ NOMOS_LOG_DIR on host runner) | Consensus, DA, mempool, network logs |

Common mistake: Setting RUST_LOG=debug only increases verbosity of the runner binary itself. Node logs remain at their default level unless you also set NOMOS_LOG_LEVEL=debug.

Example:

# This only makes the RUNNER verbose, not the nodes:
RUST_LOG=debug cargo run -p runner-examples --bin local_runner

# This makes the NODES verbose:
NOMOS_LOG_LEVEL=debug cargo run -p runner-examples --bin local_runner

# Both verbose (typically not needed):
RUST_LOG=debug NOMOS_LOG_LEVEL=debug cargo run -p runner-examples --bin local_runner

Logging Environment Variables

See Environment Variables Reference for complete details. Quick summary:

| Variable | Default | Effect |
|----------|---------|--------|
| NOMOS_LOG_DIR | None (console only) | Host runner: directory for per-node log files. Compose/k8s: use cfgsync.yaml |
| NOMOS_LOG_LEVEL | info | Global log level: error, warn, info, debug, trace |
| NOMOS_LOG_FILTER | None | Fine-grained target filtering (e.g., cryptarchia=trace,nomos_da_sampling=debug) |
| NOMOS_TESTS_TRACING | false | Enable debug tracing preset |
| NOMOS_OTLP_ENDPOINT | None | OTLP trace endpoint (optional) |
| NOMOS_OTLP_METRICS_ENDPOINT | None | OTLP metrics endpoint (optional) |

Example: Full debug logging to files:

NOMOS_TESTS_TRACING=true \
NOMOS_LOG_DIR=/tmp/test-logs \
NOMOS_LOG_LEVEL=debug \
NOMOS_LOG_FILTER="cryptarchia=trace,nomos_da_sampling=debug,nomos_da_dispersal=debug,nomos_da_verifier=debug" \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin local_runner

Per-Node Log Files

When NOMOS_LOG_DIR is set, each node writes logs to separate files:

File naming pattern:

  • Validators: Prefix nomos-node-0, nomos-node-1, etc. (may include timestamp suffix)
  • Executors: Prefix nomos-executor-0, nomos-executor-1, etc. (may include timestamp suffix)

Example filenames:

  • nomos-node-0.2024-12-18T14-30-00.log
  • nomos-node-1.2024-12-18T14-30-00.log
  • nomos-executor-0.2024-12-18T14-30-00.log

Local runner note: The local runner uses per-run temporary directories under the current working directory and removes them after the run unless NOMOS_TESTS_KEEP_LOGS=1. Use NOMOS_LOG_DIR=/path/to/logs to write per-node log files to a stable location.

Filter Target Names

Common target prefixes for NOMOS_LOG_FILTER:

| Target Prefix | Subsystem |
|---------------|-----------|
| cryptarchia | Consensus (Cryptarchia) |
| nomos_da_sampling | DA sampling service |
| nomos_da_dispersal | DA dispersal service |
| nomos_da_verifier | DA verification |
| nomos_blend | Mix network/privacy layer |
| chain_service | Chain service (node APIs/state) |
| chain_network | P2P networking |
| chain_leader | Leader election |

Example filter:

NOMOS_LOG_FILTER="cryptarchia=trace,nomos_da_sampling=debug,chain_service=info,chain_network=info"

Accessing Logs by Runner

Local Runner (Host Processes)

Default (temporary directories, auto-cleanup):

POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner
# Logs written to temporary directories in working directory
# Automatically cleaned up after test completes

Persistent file output:

NOMOS_LOG_DIR=/tmp/local-logs \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin local_runner

# After test completes:
ls /tmp/local-logs/
# Files with prefix: nomos-node-0*, nomos-node-1*, nomos-executor-0*
# May include timestamps in filename

Tip: Use NOMOS_LOG_DIR for persistent per-node log files, and NOMOS_TESTS_KEEP_LOGS=1 if you want to keep the per-run temporary directories (configs/state) for post-mortem inspection.

Compose Runner (Docker Containers)

Via Docker logs (default, recommended):

# List containers (note the UUID prefix in names)
docker ps --filter "name=nomos-compose-"

# Stream logs from specific container
docker logs -f <container-id-or-name>

# Or use name pattern matching:
docker logs -f $(docker ps --filter "name=nomos-compose-.*-validator-0" -q | head -1)

# Show last 100 lines
docker logs --tail 100 <container-id>

Via file collection (advanced):

To write per-node log files inside containers, set tracing_settings.logger: !File in testing-framework/assets/stack/cfgsync.yaml (and ensure the directory is writable). To access them, you must either:

  1. Copy files out after the run:
# Ensure cfgsync.yaml is configured to log to /logs
NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin compose_runner

# After test, copy files from containers:
docker ps --filter "name=nomos-compose-"
docker cp <container-id>:/logs/node* /tmp/
  2. Mount a host volume (requires modifying compose template):
volumes:
  - /tmp/host-logs:/logs  # Add to docker-compose.yml.tera

Recommendation: Use docker logs by default. File collection inside containers is complex and rarely needed.

Keep containers for debugging:

COMPOSE_RUNNER_PRESERVE=1 \
NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local \
cargo run -p runner-examples --bin compose_runner
# Containers remain running after test—inspect with docker logs or docker exec

Compose debugging variables:

  • COMPOSE_RUNNER_HOST=127.0.0.1 — host used for readiness probes
  • COMPOSE_RUNNER_HOST_GATEWAY=host.docker.internal:host-gateway — controls extra_hosts entry (set to disable to omit)
  • TESTNET_RUNNER_PRESERVE=1 — alias for COMPOSE_RUNNER_PRESERVE=1
  • COMPOSE_RUNNER_HTTP_TIMEOUT_SECS=<secs> — override HTTP readiness timeout

Note: Container names follow pattern nomos-compose-{uuid}-validator-{index}-1 where {uuid} changes per run.

K8s Runner (Kubernetes Pods)

Via kubectl logs (use label selectors):

# List pods
kubectl get pods

# Stream logs using label selectors (recommended)
# Helm chart labels:
# - nomos/logical-role=validator|executor
# - nomos/validator-index / nomos/executor-index
kubectl logs -l nomos/logical-role=validator -f
kubectl logs -l nomos/logical-role=executor -f

# Stream logs from specific pod
kubectl logs -f nomos-validator-0

# Previous logs from crashed pods
kubectl logs --previous -l nomos/logical-role=validator

Download logs for offline analysis:

# Using label selectors
kubectl logs -l nomos/logical-role=validator --tail=1000 > all-validators.log
kubectl logs -l nomos/logical-role=executor --tail=1000 > all-executors.log

# Specific pods
kubectl logs nomos-validator-0 > validator-0.log
kubectl logs nomos-executor-1 > executor-1.log

K8s debugging variables:

  • K8S_RUNNER_DEBUG=1 — logs Helm stdout/stderr for install commands
  • K8S_RUNNER_PRESERVE=1 — keep namespace/release after run
  • K8S_RUNNER_NODE_HOST=<ip|hostname> — override NodePort host resolution
  • K8S_RUNNER_NAMESPACE=<name> / K8S_RUNNER_RELEASE=<name> — pin namespace/release (useful for debugging)

Specify namespace (if not using default):

kubectl logs -n my-namespace -l nomos/logical-role=validator -f

Note: K8s runner is optimized for local clusters (Docker Desktop K8s, minikube, kind). Remote clusters require additional setup.


OTLP and Telemetry

OTLP exporters are optional. If you see errors about unreachable OTLP endpoints, it’s safe to ignore them unless you’re actively collecting traces/metrics.

To enable OTLP:

NOMOS_OTLP_ENDPOINT=http://localhost:4317 \
NOMOS_OTLP_METRICS_ENDPOINT=http://localhost:4318 \
cargo run -p runner-examples --bin local_runner

To silence OTLP errors: Simply leave these variables unset (the default).


Observability: Prometheus and Node APIs

Runners expose metrics and node HTTP endpoints for expectation code and debugging.

Prometheus-Compatible Metrics Querying (Optional)

  • Runners do not provision Prometheus automatically
  • For a ready-to-run stack, use scripts/setup/setup-observability.sh:
    • Compose: scripts/setup/setup-observability.sh compose up then scripts/setup/setup-observability.sh compose env
    • K8s: scripts/setup/setup-observability.sh k8s install then scripts/setup/setup-observability.sh k8s env
  • Provide NOMOS_METRICS_QUERY_URL (PromQL base URL) to enable ctx.telemetry() queries
  • Access from expectations when configured: ctx.telemetry().prometheus().map(|p| p.base_url())

Example:

# Start observability stack (Compose)
scripts/setup/setup-observability.sh compose up

# Get environment variables
eval $(scripts/setup/setup-observability.sh compose env)

# Run scenario with metrics
POL_PROOF_DEV_MODE=true \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 compose

Grafana (Optional)

  • Runners do not provision Grafana automatically (but scripts/setup/setup-observability.sh can)
  • If you set NOMOS_GRAFANA_URL, the deployer prints it in TESTNET_ENDPOINTS
  • Dashboards live in testing-framework/assets/stack/monitoring/grafana/dashboards/ (the bundled stack auto-provisions them)

Example:

# Bring up the bundled Prometheus+Grafana stack (optional)
scripts/setup/setup-observability.sh compose up
eval $(scripts/setup/setup-observability.sh compose env)

export NOMOS_GRAFANA_URL=http://localhost:3000
POL_PROOF_DEV_MODE=true scripts/run/run-examples.sh -t 60 -v 3 -e 1 compose

Default bundled Grafana login: admin / admin (see scripts/observability/compose/docker-compose.yml).

Node APIs

  • Access from expectations: ctx.node_clients().validator_clients().get(0)
  • Endpoints: consensus info, network info, DA membership, etc.
  • See testing-framework/core/src/nodes/api_client.rs for available methods

Example usage in expectations:

use testing_framework_core::scenario::{DynError, RunContext};

async fn evaluate(ctx: &RunContext) -> Result<(), DynError> {
    let client = &ctx.node_clients().validator_clients()[0];

    let info = client.consensus_info().await?;
    tracing::info!(height = info.height, "consensus info from validator 0");

    Ok(())
}

Observability Flow

flowchart TD
    Expose[Runner exposes endpoints/ports] --> Collect[Runtime collects block/health signals]
    Collect --> Consume[Expectations consume signals<br/>decide pass/fail]
    Consume --> Inspect[Operators inspect logs/metrics<br/>when failures arise]

Quick Reference

Debug Logging (Host)

NOMOS_LOG_DIR=/tmp/logs \
NOMOS_LOG_LEVEL=debug \
NOMOS_LOG_FILTER="cryptarchia=trace" \
POL_PROOF_DEV_MODE=true \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 host

Compose with Observability

# Start observability stack
scripts/setup/setup-observability.sh compose up
eval $(scripts/setup/setup-observability.sh compose env)

# Run with metrics
POL_PROOF_DEV_MODE=true \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 compose

# Access Grafana at http://localhost:3000

K8s with Debug

K8S_RUNNER_NAMESPACE=nomos-debug \
K8S_RUNNER_DEBUG=1 \
K8S_RUNNER_PRESERVE=1 \
POL_PROOF_DEV_MODE=true \
scripts/run/run-examples.sh -t 60 -v 3 -e 1 k8s

# Inspect logs
kubectl logs -n nomos-debug -l nomos/logical-role=validator

Part V — Appendix

Quick reference materials, troubleshooting guides, and supplementary information.

Contents

  • Builder API Quick Reference: Cheat sheet for DSL methods
  • Troubleshooting Scenarios: Common issues and their solutions, including “What Failure Looks Like” with realistic examples
  • FAQ: Frequently asked questions
  • Glossary: Terminology reference

When to Use This Section

  • Quick lookups: Find DSL method signatures without reading full guides
  • Debugging failures: Match symptoms to known issues and fixes
  • Clarifying concepts: Look up unfamiliar terms in the glossary
  • Common questions: Check FAQ before asking for help

This section complements the main documentation with practical reference materials that you’ll return to frequently during development and operations.


Builder API Quick Reference

Quick reference for the scenario builder DSL. All methods are chainable.

Imports

use std::time::Duration;

use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_compose::ComposeDeployer;
use testing_framework_runner_k8s::K8sDeployer;
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::{ChaosBuilderExt, ScenarioBuilderExt};

Topology

use testing_framework_core::scenario::{Builder, ScenarioBuilder};

pub fn topology() -> Builder<()> {
    ScenarioBuilder::topology_with(|t| {
        t.network_star() // Star topology (all connect to seed node)
            .validators(3) // Number of validator nodes
            .executors(2) // Number of executor nodes
    })
}

Wallets

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn wallets_plan() -> testing_framework_core::scenario::Scenario<()> {
    ScenarioBuilder::topology_with(|t| t.network_star().validators(1).executors(0))
        .wallets(50) // Seed 50 funded wallet accounts
        .build()
}

Transaction Workload

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn transactions_plan() -> testing_framework_core::scenario::Scenario<()> {
    ScenarioBuilder::topology_with(|t| t.network_star().validators(1).executors(0))
        .wallets(50)
        .transactions_with(|txs| {
            txs.rate(5) // 5 transactions per block
                .users(20) // Use 20 of the seeded wallets
        }) // Finish transaction workload config
        .build()
}

DA Workload

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn da_plan() -> testing_framework_core::scenario::Scenario<()> {
    ScenarioBuilder::topology_with(|t| t.network_star().validators(1).executors(1))
        .wallets(50)
        .da_with(|da| {
            da.channel_rate(1) // number of DA channels to run
                .blob_rate(2) // target 2 blobs per block (headroom applied)
                .headroom_percent(20) // optional headroom when sizing channels
        }) // Finish DA workload config
        .build()
}

Chaos Workload (Requires enable_node_control())

use std::time::Duration;

use testing_framework_core::scenario::{NodeControlCapability, ScenarioBuilder};
use testing_framework_workflows::{ChaosBuilderExt, ScenarioBuilderExt};

pub fn chaos_plan() -> testing_framework_core::scenario::Scenario<NodeControlCapability> {
    ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
        .enable_node_control() // Enable node control capability
        .chaos_with(|c| {
            c.restart() // Random restart chaos
                .min_delay(Duration::from_secs(30)) // Min time between restarts
                .max_delay(Duration::from_secs(60)) // Max time between restarts
                .target_cooldown(Duration::from_secs(45)) // Cooldown after restart
                .apply() // Required for chaos configuration
        })
        .build()
}

Expectations

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn expectations_plan() -> testing_framework_core::scenario::Scenario<()> {
    ScenarioBuilder::topology_with(|t| t.network_star().validators(1).executors(0))
        .expect_consensus_liveness() // Assert blocks are produced continuously
        .build()
}

Run Duration

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn run_duration_plan() -> testing_framework_core::scenario::Scenario<()> {
    ScenarioBuilder::topology_with(|t| t.network_star().validators(1).executors(0))
        .with_run_duration(Duration::from_secs(120)) // Run for 120 seconds
        .build()
}

Build

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

pub fn build_plan() -> testing_framework_core::scenario::Scenario<()> {
    ScenarioBuilder::topology_with(|t| t.network_star().validators(1).executors(0)).build() // Construct the final Scenario
}

Deployers

use testing_framework_runner_compose::ComposeDeployer;
use testing_framework_runner_k8s::K8sDeployer;
use testing_framework_runner_local::LocalDeployer;

pub fn deployers() {
    // Local processes
    let _deployer = LocalDeployer::default();

    // Docker Compose
    let _deployer = ComposeDeployer::default();

    // Kubernetes
    let _deployer = K8sDeployer::default();
}

Execution

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

pub async fn execution() -> Result<()> {
    let mut plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(1).executors(0))
        .expect_consensus_liveness()
        .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;

    Ok(())
}

Complete Example

use std::time::Duration;

use anyhow::Result;
use testing_framework_core::scenario::{Deployer, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

pub async fn run_test() -> Result<()> {
    let mut plan = ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(2))
        .wallets(50)
        .transactions_with(|txs| {
            txs.rate(5) // 5 transactions per block
                .users(20)
        })
        .da_with(|da| {
            da.channel_rate(1) // number of DA channels
                .blob_rate(2) // target 2 blobs per block
                .headroom_percent(20) // optional channel headroom
        })
        .expect_consensus_liveness()
        .with_run_duration(Duration::from_secs(90))
        .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&plan).await?;
    let _handle = runner.run(&mut plan).await?;

    Ok(())
}

Troubleshooting Scenarios

Prerequisites for All Runners:

  • versions.env file at repository root (required by helper scripts)
  • POL_PROOF_DEV_MODE=true MUST be set for all runners (host, compose, k8s) to avoid expensive Groth16 proof generation that causes timeouts
  • KZG circuit assets must be present at testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params (note the repeated filename) for DA workloads

Platform/Environment Notes:

  • macOS + Docker Desktop (Apple silicon): prefer NOMOS_BUNDLE_DOCKER_PLATFORM=linux/arm64 for local compose/k8s runs to avoid slow/fragile amd64 emulation builds.
  • Disk space: bundle/image builds are storage-heavy. If you see I/O errors or Docker build failures, check free space and prune old artifacts (.tmp/, target/, and Docker build cache) before retrying.
    • Quick cleanup: scripts/ops/clean.sh (and scripts/ops/clean.sh --docker if needed).
    • Destructive cleanup (last resort): scripts/ops/clean.sh --docker-system --dangerous (add --volumes if you also want to prune Docker volumes).
  • K8s runner scope: the default Helm chart mounts KZG params via hostPath and uses a local image tag (logos-blockchain-testing:local). This is intended for local clusters (Docker Desktop / minikube / kind), not remote managed clusters without additional setup.

Recommended: Use scripts/run/run-examples.sh which handles all setup automatically.

Quick Symptom Guide

Common symptoms and likely causes:

  • No or slow block progression: missing POL_PROOF_DEV_MODE=true, missing KZG circuit assets (/kzgrs_test_params/kzgrs_test_params file) for DA workloads, too-short run window, port conflicts, or resource exhaustion—set required env vars, verify assets exist, extend duration, check node logs for startup errors.
  • Transactions not included: unfunded or misconfigured wallets (check .wallets(N) vs .users(M)), or a transaction rate that exceeds block capacity—reduce the rate, increase the wallet count, and verify wallet setup in the logs.
  • Chaos stalls or never runs: chaos (node control) only works with ComposeDeployer; the host runner (LocalDeployer) and K8sDeployer don’t support it and simply can’t execute chaos workloads. With compose, an aggressive restart cadence can prevent consensus from recovering—widen the restart intervals.
  • Observability gaps: metrics or logs unreachable because ports clash or services are not exposed—adjust observability ports and confirm runner wiring.
  • Flaky behavior across runs: mixing chaos with functional smoke tests or inconsistent topology between environments—separate deterministic and chaos scenarios and standardize topology presets.

What Failure Looks Like

This section shows what you’ll actually see when common issues occur. Each example includes realistic console output and the fix.

1. Missing POL_PROOF_DEV_MODE=true (Most Common!)

Symptoms:

  • Test “hangs” with no visible progress
  • CPU usage spikes to 100%
  • Eventually hits timeout after several minutes
  • Nodes appear to start but blocks aren’t produced

What you’ll see:

$ cargo run -p runner-examples --bin local_runner
    Finished dev [unoptimized + debuginfo] target(s) in 0.48s
     Running `target/debug/local_runner`
[INFO  runner_examples::local_runner] Starting local runner scenario
[INFO  testing_framework_runner_local] Launching 3 validators
[INFO  testing_framework_runner_local] Waiting for node readiness...
(hangs here for 5+ minutes, CPU at 100%)
thread 'main' panicked at 'readiness timeout expired'

Root Cause: Groth16 proof generation is extremely slow without dev mode. The system tries to compute real cryptographic proofs, which can take minutes per block.

Fix:

POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner

Prevention: Set this in your shell profile or .env file so you never forget it.


2. Missing versions.env File

Symptoms:

  • Helper scripts fail immediately
  • Error about missing file at repo root
  • Scripts can’t determine which circuit/node versions to use

What you’ll see:

$ scripts/run/run-examples.sh -t 60 -v 1 -e 1 host
ERROR: versions.env not found at repository root
This file is required and should define:
  VERSION=<circuit release tag>
  NOMOS_NODE_REV=<nomos-node git revision>
  NOMOS_BUNDLE_VERSION=<bundle schema version>

Root Cause: Helper scripts need versions.env to know which versions to build/fetch.

Fix: Ensure you’re in the repository root directory. The versions.env file should already exist—verify it’s present:

cat versions.env
# Should show:
# VERSION=v0.3.1
# NOMOS_NODE_REV=abc123def456
# NOMOS_BUNDLE_VERSION=v1

3. Missing KZG Circuit Assets (DA Workloads)

Symptoms:

  • DA workload tests fail
  • Error messages about missing circuit files
  • Nodes crash during DA operations

What you’ll see:

$ POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner
[INFO  testing_framework_runner_local] Starting DA workload
[ERROR nomos_da_dispersal] Failed to load KZG parameters
Error: Custom { kind: NotFound, error: "Circuit file not found at: testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params" }
thread 'main' panicked at 'workload init failed'

Root Cause: DA (Data Availability) workloads require KZG cryptographic parameters. The file must exist at: testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params (note the repeated filename).

Fix (recommended):

# Use run-examples.sh which handles setup automatically
scripts/run/run-examples.sh -t 60 -v 1 -e 1 host

Fix (manual):

# Fetch circuits
scripts/setup/setup-nomos-circuits.sh v0.3.1 /tmp/nomos-circuits

# Copy to expected location
mkdir -p testing-framework/assets/stack/kzgrs_test_params
cp -r /tmp/nomos-circuits/* testing-framework/assets/stack/kzgrs_test_params/

# Verify (should be ~120MB)
ls -lh testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params

4. Node Binaries Not Found

Symptoms:

  • Error about missing nomos-node or nomos-executor binary
  • “file not found” or “no such file or directory”
  • Environment variables NOMOS_NODE_BIN / NOMOS_EXECUTOR_BIN not set

What you’ll see:

$ POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner
[INFO  testing_framework_runner_local] Spawning validator 0
Error: Os { code: 2, kind: NotFound, message: "No such file or directory" }
thread 'main' panicked at 'failed to spawn nomos-node process'

Root Cause: The local runner needs compiled nomos-node and nomos-executor binaries, but doesn’t know where they are.

Fix (recommended):

# Use run-examples.sh which builds binaries automatically
scripts/run/run-examples.sh -t 60 -v 1 -e 1 host

Fix (manual - set paths explicitly):

# Build binaries first
cd ../nomos-node  # or wherever your nomos-node checkout is
cargo build --release --bin nomos-node --bin nomos-executor

# Set environment variables
export NOMOS_NODE_BIN=$PWD/target/release/nomos-node
export NOMOS_EXECUTOR_BIN=$PWD/target/release/nomos-executor

# Return to testing framework
cd ../nomos-testing
POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner

5. Docker Daemon Not Running (Compose)

Symptoms:

  • Compose tests fail immediately
  • “Cannot connect to Docker daemon”
  • Docker commands don’t work

What you’ll see:

$ scripts/run/run-examples.sh -t 60 -v 1 -e 1 compose
[INFO  runner_examples::compose_runner] Starting compose deployment
Error: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
thread 'main' panicked at 'compose deployment failed'

Root Cause: Docker Desktop isn’t running, or your user doesn’t have permission to access Docker.

Fix:

# macOS: Start Docker Desktop application
open -a Docker

# Linux: Start Docker daemon
sudo systemctl start docker

# Verify Docker is working
docker ps

# If permission denied, add your user to docker group (Linux)
sudo usermod -aG docker $USER
# Then log out and log back in

6. Image Not Found (Compose/K8s)

Symptoms:

  • Compose/K8s tests fail during deployment
  • “Image not found: logos-blockchain-testing:local”
  • Containers fail to start

What you’ll see:

$ POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin compose_runner
[INFO  testing_framework_runner_compose] Starting compose deployment
Error: Failed to pull image 'logos-blockchain-testing:local': No such image
thread 'main' panicked at 'compose deployment failed'

Root Cause: The Docker image hasn’t been built yet, or was pruned.

Fix (recommended):

# Use run-examples.sh which builds the image automatically
scripts/run/run-examples.sh -t 60 -v 1 -e 1 compose

Fix (manual):

# 1. Build Linux bundle
scripts/build/build-bundle.sh --platform linux

# 2. Set bundle path
export NOMOS_BINARIES_TAR=$(ls -t .tmp/nomos-binaries-linux-*.tar.gz | head -1)

# 3. Build Docker image
scripts/build/build_test_image.sh

# 4. Verify image exists
docker images | grep logos-blockchain-testing

# 5. For kind/minikube: load image into cluster
kind load docker-image logos-blockchain-testing:local
# OR: minikube image load logos-blockchain-testing:local

7. Port Conflicts

Symptoms:

  • “Address already in use” errors
  • Tests fail during node startup
  • Observability stack (Prometheus/Grafana) won’t start

What you’ll see:

$ POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner
[INFO  testing_framework_runner_local] Launching validator 0 on port 18080
Error: Os { code: 48, kind: AddrInUse, message: "Address already in use" }
thread 'main' panicked at 'failed to bind port 18080'

Root Cause: Previous test didn’t clean up properly, or another service is using the port.

Fix:

# Find processes using the port
lsof -i :18080   # macOS/Linux
netstat -ano | findstr :18080  # Windows

# Kill orphaned nomos processes
pkill nomos-node
pkill nomos-executor

# For compose: ensure containers are stopped
docker compose down
docker ps -a --filter "name=nomos-compose-" -q | xargs docker rm -f

# Check if port is now free
lsof -i :18080  # Should return nothing

For Observability Stack Port Conflicts:

# Edit ports in observability compose file
vim scripts/observability/compose/docker-compose.yml

# Change conflicting port mappings:
# ports:
#   - "9090:9090"  # Prometheus - change to "19090:9090" if needed
#   - "3000:3000"  # Grafana - change to "13000:3000" if needed

8. Wallet Seeding Failed (Insufficient Funds)

Symptoms:

  • Transaction workload reports wallet issues
  • “Insufficient funds” errors
  • Transactions aren’t being submitted

What you’ll see:

$ POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner
[INFO  testing_framework_workflows] Starting transaction workload with 10 users
[ERROR testing_framework_workflows] Wallet seeding failed: requested 10 users but only 3 wallets available
thread 'main' panicked at 'workload init failed: insufficient wallets'

Root Cause: The topology provides fewer funded wallets than the workload needs: the transaction workload requests .users(M) but the topology only configures .wallets(N), with N < M.

Fix:

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

let scenario = ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(1))
    .wallets(20) // ← Increase wallet count
    .transactions_with(|tx| {
        tx.users(10) // ← Must be ≤ wallets(20)
            .rate(5)
    })
    .build();

9. Resource Exhaustion (OOM / CPU)

Symptoms:

  • Nodes crash randomly
  • “OOM Killed” messages
  • Test becomes flaky under load
  • Docker containers restart repeatedly

What you’ll see:

$ docker ps --filter "name=nomos-compose-"
CONTAINER ID   STATUS
abc123def456   Restarting (137) 30 seconds ago  # 137 = OOM killed

$ docker logs abc123def456
[INFO  nomos_node] Starting validator
[INFO  consensus] Processing block
Killed  # ← OOM killer terminated the process

Root Cause: Too many nodes, too much workload traffic, or insufficient Docker resources.

Fix:

# 1. Reduce topology size
# In your scenario:
#   .topology(Topology::preset_3v1e())  # Instead of preset_10v2e()

# 2. Reduce workload rates
#   .workload(TransactionWorkload::new().rate(5.0))  # Instead of rate(100.0)

# 3. Increase Docker resources (Docker Desktop)
# Settings → Resources → Memory: 8GB minimum (12GB+ recommended for large topologies)
# Settings → Resources → CPUs: 4+ cores recommended

# 4. Increase file descriptor limits (Linux/macOS)
ulimit -n 4096

# 5. Close other heavy applications (browsers, IDEs, etc.)

10. Logs Disappear After Run

Symptoms:

  • Test completes but no logs on disk
  • Can’t debug failures because logs are gone
  • Temporary directories cleaned up automatically

What you’ll see:

$ POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner
[INFO  runner_examples] Test complete, cleaning up
[INFO  testing_framework_runner_local] Removing temporary directories
$ ls .tmp/
# Empty or missing

Root Cause: Framework cleans up temporary directories by default to avoid disk bloat.

Fix:

# Persist logs to a specific directory
NOMOS_LOG_DIR=/tmp/test-logs \
NOMOS_TESTS_KEEP_LOGS=1 \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin local_runner

# Logs persist after run
ls /tmp/test-logs/
# nomos-node-0.2024-12-18T14-30-00.log
# nomos-node-1.2024-12-18T14-30-00.log
# ...

11. Consensus Timing Too Tight / Run Duration Too Short

Symptoms:

  • “Consensus liveness expectation failed”
  • Only 1-2 blocks produced (or zero)
  • Nodes appear healthy but not making progress

What you’ll see:

$ POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner
[INFO  testing_framework_core] Starting workloads
[INFO  testing_framework_core] Run window: 10 seconds
[INFO  testing_framework_core] Evaluating expectations
[ERROR testing_framework_core] Consensus liveness expectation failed: expected min 5 blocks, got 1
thread 'main' panicked at 'expectations failed'

Root Cause: The run duration is too short for the configured consensus timing. If CONSENSUS_SLOT_TIME=20s but the run window is only 10s, there is barely time to produce a single block, let alone the expected minimum.

Fix:

use std::time::Duration;

use testing_framework_core::scenario::ScenarioBuilder;
use testing_framework_workflows::ScenarioBuilderExt;

// Increase run duration to allow more blocks.
let scenario = ScenarioBuilder::topology_with(|t| t.network_star().validators(3).executors(1))
    .expect_consensus_liveness()
    .with_run_duration(Duration::from_secs(120)) // ← Give more time
    .build();

Or adjust consensus timing (if you control node config):

# Faster block production (shorter slot time)
CONSENSUS_SLOT_TIME=5 \
CONSENSUS_ACTIVE_SLOT_COEFF=0.9 \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin local_runner

Summary: Quick Checklist for Failed Runs

When a test fails, check these in order:

  1. POL_PROOF_DEV_MODE=true is set (REQUIRED for all runners)
  2. versions.env exists at repo root
  3. KZG circuit assets present (for DA workloads): testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params
  4. Node binaries available (NOMOS_NODE_BIN / NOMOS_EXECUTOR_BIN set, or using run-examples.sh)
  5. Docker daemon running (for compose/k8s)
  6. Docker image built (logos-blockchain-testing:local exists for compose/k8s)
  7. No port conflicts (lsof -i :18080, kill orphaned processes)
  8. Sufficient wallets (.wallets(N).users(M))
  9. Enough resources (Docker memory 8GB+, ulimit -n 4096)
  10. Run duration appropriate (long enough for consensus timing)
  11. Logs persisted (NOMOS_LOG_DIR + NOMOS_TESTS_KEEP_LOGS=1 if needed)

Still stuck? Check node logs (see Where to Find Logs) for the actual error.

Where to Find Logs

Log Location Quick Reference

  • Host (local): Default output goes to per-run temporary directories under the current working directory (removed unless NOMOS_TESTS_KEEP_LOGS=1). With NOMOS_LOG_DIR set, per-node files are written with prefix nomos-node-{index}. Access: cat $NOMOS_LOG_DIR/nomos-node-0*
  • Compose: Default output is Docker container stdout/stderr. For file logging, set tracing_settings.logger: !File in testing-framework/assets/stack/cfgsync.yaml (and mount a writable directory). Access: docker ps, then docker logs <container-id>
  • K8s: Default output is pod stdout/stderr. For file logging, set tracing_settings.logger: !File in testing-framework/assets/stack/cfgsync.yaml (and mount a writable directory). Access: kubectl logs -l nomos/logical-role=validator

Important Notes:

  • Host runner (local processes): Per-run temporary directories are created under the current working directory and removed after the run unless NOMOS_TESTS_KEEP_LOGS=1. To write per-node log files to a stable location, set NOMOS_LOG_DIR=/path/to/logs.
  • Compose/K8s: Node log destination is controlled by testing-framework/assets/stack/cfgsync.yaml (tracing_settings.logger). By default, rely on docker logs or kubectl logs.
  • File naming: Log files use prefix nomos-node-{index}* or nomos-executor-{index}* with timestamps, e.g., nomos-node-0.2024-12-01T10-30-45.log (NOT just .log suffix).
  • Container names: Compose containers include a per-run project UUID, e.g., nomos-compose-<uuid>-validator-0-1, where <uuid> is randomly generated for each run.

Accessing Node Logs by Runner

Local Runner

Console output (default):

POL_PROOF_DEV_MODE=true cargo run -p runner-examples --bin local_runner 2>&1 | tee test.log

Persistent file output:

NOMOS_LOG_DIR=/tmp/debug-logs \
NOMOS_LOG_LEVEL=debug \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin local_runner

# Inspect logs (note: filenames include timestamps):
ls /tmp/debug-logs/
# Example: nomos-node-0.2024-12-01T10-30-45.log
tail -f /tmp/debug-logs/nomos-node-0*  # Use wildcard to match timestamp

Compose Runner

Stream live logs:

# List running containers (note the UUID prefix in names)
docker ps --filter "name=nomos-compose-"

# Find your container ID or name from the list, then:
docker logs -f <container-id>

# Or filter by name pattern:
docker logs -f $(docker ps --filter "name=nomos-compose-.*-validator-0" -q | head -1)

# Show last 100 lines
docker logs --tail 100 <container-id>

Keep containers for post-mortem debugging:

COMPOSE_RUNNER_PRESERVE=1 \
NOMOS_TESTNET_IMAGE=logos-blockchain-testing:local \
POL_PROOF_DEV_MODE=true \
cargo run -p runner-examples --bin compose_runner

# OR: Use run-examples.sh (handles setup automatically)
COMPOSE_RUNNER_PRESERVE=1 scripts/run/run-examples.sh -t 60 -v 1 -e 1 compose

# After test failure, containers remain running:
docker ps --filter "name=nomos-compose-"
docker exec -it <container-id> /bin/sh
docker logs <container-id> > debug.log

Note: Container names follow the pattern nomos-compose-{uuid}-validator-{index}-1 or nomos-compose-{uuid}-executor-{index}-1, where {uuid} is randomly generated per run.

K8s Runner

Important: Always verify your namespace and use label selectors instead of assuming pod names.

Stream pod logs (use label selectors):

# Check your namespace first
kubectl config view --minify | grep namespace

# All validator pods (add -n <namespace> if not using default)
kubectl logs -l nomos/logical-role=validator -f

# All executor pods
kubectl logs -l nomos/logical-role=executor -f

# Specific pod by name (find exact name first)
kubectl get pods -l nomos/logical-role=validator  # Find the exact pod name
kubectl logs -f <actual-pod-name>        # Then use it

# With explicit namespace
kubectl logs -n my-namespace -l nomos/logical-role=validator -f

Download logs from crashed pods:

# Previous logs from crashed pod
kubectl get pods -l nomos/logical-role=validator  # Find crashed pod name first
kubectl logs --previous <actual-pod-name> > crashed-validator.log

# Or use label selector for all crashed validators
for pod in $(kubectl get pods -l nomos/logical-role=validator -o name); do
  kubectl logs --previous $pod > $(basename $pod)-previous.log 2>&1
done

Access logs from all pods:

# All pods in current namespace
for pod in $(kubectl get pods -o name); do
  echo "=== $pod ==="
  kubectl logs $pod
done > all-logs.txt

# Or use label selectors (recommended)
kubectl logs -l nomos/logical-role=validator --tail=500 > validators.log
kubectl logs -l nomos/logical-role=executor --tail=500 > executors.log

# With explicit namespace
kubectl logs -n my-namespace -l nomos/logical-role=validator --tail=500 > validators.log

Debugging Workflow

When a test fails, follow this sequence:

1. Check Framework Output

Start with the test harness output—did expectations fail? Was there a deployment error?

Look for:

  • Expectation failure messages
  • Timeout errors
  • Deployment/readiness failures

2. Verify Node Readiness

Ensure all nodes started successfully and became ready before workloads began.

Commands:

# Local: check process list
ps aux | grep nomos

# Compose: check container status (note UUID in names)
docker ps -a --filter "name=nomos-compose-"

# K8s: check pod status (use label selectors, add -n <namespace> if needed)
kubectl get pods -l nomos/logical-role=validator
kubectl get pods -l nomos/logical-role=executor
kubectl describe pod <actual-pod-name>  # Get name from above first

3. Inspect Node Logs

Focus on the first node that exhibited problems or the node with the highest index (often the last to start).

Common error patterns:

  • “ERROR: versions.env missing” → missing required versions.env file at repository root
  • “Failed to bind address” → port conflict
  • “Connection refused” → peer not ready or network issue
  • “Proof verification failed” or “Proof generation timeout” → missing POL_PROOF_DEV_MODE=true (REQUIRED for all runners)
  • “Failed to load KZG parameters” or “Circuit file not found” → missing KZG circuit assets at testing-framework/assets/stack/kzgrs_test_params/
  • “Insufficient funds” → wallet seeding issue (increase .wallets(N) or reduce .users(M))

4. Check Log Levels

If logs are too sparse, increase verbosity:

NOMOS_LOG_LEVEL=debug \
NOMOS_LOG_FILTER="cryptarchia=trace,nomos_da_sampling=debug" \
cargo run -p runner-examples --bin local_runner

If metric updates are polluting your logs (fields like counter.* / gauge.*), move those events to a dedicated tracing target (e.g. target: "nomos_metrics") and set NOMOS_LOG_FILTER="nomos_metrics=off,..." so they don’t get formatted into log output.

5. Verify Observability Endpoints

If expectations report observability issues:

Prometheus (Compose):

curl http://localhost:9090/-/healthy

Node HTTP APIs:

curl http://localhost:18080/consensus/info  # Adjust port per node

6. Compare with Known-Good Scenario

Run a minimal baseline test (e.g., 2 validators, consensus liveness only). If it passes, the issue is in your workload or topology configuration.
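
A minimal baseline might look like the following sketch. The calls mirror the examples earlier in this guide; the 2-validator/1-executor shape and 30-second window are illustrative choices, not required values.

use std::time::Duration;

use testing_framework_core::scenario::{Deployer as _, ScenarioBuilder};
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Baseline: small topology, no workloads, consensus liveness only.
    let mut scenario = ScenarioBuilder::topology_with(|t| {
        t.network_star().validators(2).executors(1)
    })
    .expect_consensus_liveness()
    .with_run_duration(Duration::from_secs(30))
    .build();

    let deployer = LocalDeployer::default();
    let runner = deployer.deploy(&scenario).await?;
    runner.run(&mut scenario).await?;
    Ok(())
}

If this baseline passes but your real scenario fails, diff the two definitions one change at a time (workload rates, topology size, duration).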

Common Error Messages

“Consensus liveness expectation failed”

  • Cause: Not enough blocks produced during the run window, missing POL_PROOF_DEV_MODE=true (causes slow proof generation), or missing KZG assets for DA workloads.
  • Fix:
    1. Verify POL_PROOF_DEV_MODE=true is set (REQUIRED for all runners).
    2. Verify KZG assets exist at testing-framework/assets/stack/kzgrs_test_params/ (for DA workloads).
    3. Extend with_run_duration() to allow more blocks.
    4. Check node logs for proof generation or DA errors.
    5. Reduce transaction/DA rate if nodes are overwhelmed.

“Wallet seeding failed”

  • Cause: Topology doesn’t have enough funded wallets for the workload.
  • Fix: Increase .wallets(N) count or reduce .users(M) in the transaction workload (ensure N ≥ M).

“Node control not available”

  • Cause: Runner doesn’t support node control (only ComposeDeployer does), or enable_node_control() wasn’t called.
  • Fix:
    1. Use ComposeDeployer for chaos tests (LocalDeployer and K8sDeployer don’t support node control).
    2. Ensure .enable_node_control() is called in the scenario before .chaos() (see the sketch below).
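
A hedged sketch of the required ordering is shown below. It assumes ComposeDeployer is exported from the testing_framework_runner_compose crate and constructed via Default, mirroring LocalDeployer; the exact chaos builder arguments are not shown here, so check the workloads guide for the real signature.

use std::time::Duration;

use testing_framework_core::scenario::{Deployer as _, ScenarioBuilder};
use testing_framework_runner_compose::ComposeDeployer; // assumed crate path/export
use testing_framework_workflows::ScenarioBuilderExt;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let mut scenario = ScenarioBuilder::topology_with(|t| {
        t.network_star().validators(3).executors(1)
    })
    .enable_node_control() // must be enabled before the chaos workload
    .chaos()               // exact chaos configuration API: see the workloads guide
    .expect_consensus_liveness()
    .with_run_duration(Duration::from_secs(120))
    .build();

    // Only the Compose runner exposes node control; Local and K8s runners do not.
    let deployer = ComposeDeployer::default();
    let runner = deployer.deploy(&scenario).await?;
    runner.run(&mut scenario).await?;
    Ok(())
}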

“Readiness timeout”

  • Cause: Nodes didn’t become responsive within expected time (often due to missing prerequisites).
  • Fix:
    1. Verify POL_PROOF_DEV_MODE=true is set (REQUIRED for all runners—without it, proof generation is too slow).
    2. Check node logs for startup errors (port conflicts, missing assets).
    3. Verify network connectivity between nodes.
    4. For DA workloads, ensure KZG circuit assets are present.

“ERROR: versions.env missing”

  • Cause: Helper scripts (run-examples.sh, build-bundle.sh, setup-circuits-stack.sh) require the versions.env file at the repository root.
  • Fix: Ensure you’re running from the repository root directory. The versions.env file should already exist and contain:
  VERSION=<circuit release tag>
  NOMOS_NODE_REV=<nomos-node git revision>
  NOMOS_BUNDLE_VERSION=<bundle schema version>

Use the checked-in versions.env at the repository root as the source of truth.

“Port already in use”

  • Cause: Previous test didn’t clean up, or another process holds the port.
  • Fix: Kill orphaned processes (pkill nomos-node), wait for Docker cleanup (docker compose down), or restart Docker.

“Image not found: logos-blockchain-testing:local”

  • Cause: Docker image not built for Compose/K8s runners, or KZG assets not baked into the image.
  • Fix (recommended): Use run-examples.sh which handles everything:
    scripts/run/run-examples.sh -t 60 -v 1 -e 1 compose
    
  • Fix (manual):
    1. Build bundle: scripts/build/build-bundle.sh --platform linux
    2. Set bundle path: export NOMOS_BINARIES_TAR=.tmp/nomos-binaries-linux-v0.3.1.tar.gz
    3. Build image: scripts/build/build_test_image.sh
    4. kind/minikube: load the image into the cluster nodes (e.g. kind load docker-image logos-blockchain-testing:local, or minikube image load ...), or push to a registry and set NOMOS_TESTNET_IMAGE accordingly.

“Failed to load KZG parameters” or “Circuit file not found”

  • Cause: DA workload requires KZG circuit assets. The file testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params (note repeated filename) must exist. Inside containers, it’s at /kzgrs_test_params/kzgrs_test_params.
  • Fix (recommended): Use run-examples.sh which handles setup:
    scripts/run/run-examples.sh -t 60 -v 1 -e 1 <mode>
    
  • Fix (manual):
    1. Fetch assets: scripts/setup/setup-nomos-circuits.sh v0.3.1 /tmp/nomos-circuits
    2. Copy to expected path: cp -r /tmp/nomos-circuits/* testing-framework/assets/stack/kzgrs_test_params/
    3. Verify file exists: ls -lh testing-framework/assets/stack/kzgrs_test_params/kzgrs_test_params
    4. For Compose/K8s: rebuild image with assets baked in

For detailed logging configuration and observability setup, see Logging & Observability.

FAQ

Why block-oriented timing?
Slots advance at a fixed rate (NTP-synchronized, 2s by default), so reasoning about blocks and consensus intervals keeps assertions aligned with protocol behavior rather than arbitrary wall-clock durations.

Can I reuse the same scenario across runners?
Yes. The plan stays the same; swap runners (local, compose, k8s) to target different environments.
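
For example, the same scenario value can be handed to different deployers; only the deployer changes. In the sketch below, the compose deployer’s crate path and Default constructor are assumptions mirroring the local runner.

use std::time::Duration;

use testing_framework_core::scenario::{Deployer as _, ScenarioBuilder};
use testing_framework_runner_compose::ComposeDeployer; // assumed export
use testing_framework_runner_local::LocalDeployer;
use testing_framework_workflows::ScenarioBuilderExt;

async fn run(use_compose: bool) -> anyhow::Result<()> {
    // The plan is identical regardless of where it runs.
    let mut scenario = ScenarioBuilder::topology_with(|t| {
        t.network_star().validators(3).executors(1)
    })
    .transactions_with(|tx| tx.rate(10).users(5))
    .expect_consensus_liveness()
    .with_run_duration(Duration::from_secs(60))
    .build();

    // Swap the deployer to change the target environment.
    if use_compose {
        let runner = ComposeDeployer::default().deploy(&scenario).await?;
        runner.run(&mut scenario).await?;
    } else {
        let runner = LocalDeployer::default().deploy(&scenario).await?;
        runner.run(&mut scenario).await?;
    }
    Ok(())
}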

When should I enable chaos workloads?
Only when testing resilience or operational recovery; keep functional smoke tests deterministic.

How long should runs be?
The framework enforces a minimum of 2× the slot duration (4 seconds with the default 2s slots), but the practical recommendations are:

  • Smoke tests: 30s minimum (~14 blocks with default 2s slots, 0.9 coefficient)
  • Transaction workloads: 60s+ (~27 blocks) to observe inclusion patterns
  • DA workloads: 90s+ (~40 blocks) to account for dispersal and sampling
  • Chaos tests: 120s+ (~54 blocks) to allow recovery after restarts

Very short runs (< 30s) risk false confidence—one or two lucky blocks don’t prove liveness.
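
The block counts above follow from a simple estimate: expected blocks ≈ run duration ÷ slot time × active-slot coefficient. A quick sketch of that arithmetic (plain Rust, not a framework API), using the defaults stated above of 2s slots and a 0.9 coefficient:

// Rough estimate only: actual block production is probabilistic (slot lottery).
fn expected_blocks(run_secs: f64, slot_secs: f64, active_slot_coeff: f64) -> f64 {
    run_secs / slot_secs * active_slot_coeff
}

fn main() {
    for run_secs in [30.0, 60.0, 90.0, 120.0] {
        // With 2s slots and a 0.9 coefficient: ~14, ~27, ~40, ~54 blocks.
        println!("{run_secs}s run ≈ {:.0} blocks", expected_blocks(run_secs, 2.0, 0.9));
    }
}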

Do I always need seeded wallets?
Only for transaction scenarios. Data-availability or pure chaos scenarios may not require them, but liveness checks still need validators producing blocks.

What if expectations fail but workloads “look fine”?
Trust expectations first—they capture the intended success criteria. Use the observability signals and runner logs to pinpoint why the system missed the target.

Glossary

  • Validator: node role responsible for participating in consensus and block production.
  • Executor: a validator node with the DA dispersal service enabled. Executors can submit transactions and disperse blob data to the DA network, in addition to performing all validator functions.
  • DA (Data Availability): subsystem ensuring blobs or channel data are published and retrievable for validation.
  • Deployer: component that provisions infrastructure (spawns processes, creates containers, or launches pods), waits for readiness, and returns a Runner. Examples: LocalDeployer, ComposeDeployer, K8sDeployer.
  • Runner: component returned by deployers that orchestrates scenario execution—starts workloads, observes signals, evaluates expectations, and triggers cleanup.
  • Workload: traffic or behavior generator that exercises the system during a scenario run.
  • Expectation: post-run assertion that judges whether the system met the intended success criteria.
  • Topology: declarative description of the cluster shape, roles, and high-level parameters for a scenario.
  • Scenario: immutable plan combining topology, workloads, expectations, and run duration.
  • Blockfeed: stream of block observations used for liveness or inclusion signals during a run.
  • Control capability: the ability for a runner to start, stop, or restart nodes, used by chaos workloads.
  • Slot duration: time interval between consensus rounds in Cryptarchia. Blocks are produced at multiples of the slot duration based on lottery outcomes.
  • Block cadence: observed rate of block production in a live network, measured in blocks per second or seconds per block.
  • Cooldown: waiting period after a chaos action (e.g., node restart) before triggering the next action, allowing the system to stabilize.
  • Run window: total duration a scenario executes, specified via with_run_duration(). Framework auto-extends to at least 2× slot duration.
  • Readiness probe: health check performed by runners to ensure nodes are reachable and responsive before starting workloads. Prevents false negatives from premature traffic.
  • Liveness: property that the system continues making progress (producing blocks) under specified conditions. Contrasts with safety/correctness which verifies that state transitions are accurate.
  • State assertion: expectation that verifies specific values in the system state (e.g., wallet balances, UTXO sets) rather than just progress signals. Also called “correctness expectations.”
  • Mantle transaction: transaction type in Logos that can contain UTXO transfers (LedgerTx) and operations (Op), including channel data (ChannelBlob).
  • Channel: logical grouping for DA blobs; each blob belongs to a channel and references a parent blob in the same channel, creating a chain of related data.
  • POL_PROOF_DEV_MODE: environment variable that disables expensive Groth16 zero-knowledge proof generation for leader election. Required for all runners (local, compose, k8s) for practical testing—without it, proof generation causes timeouts. Should never be used in production environments.

External Resources