Testing
The workspace runs six tiers of tests in CI. Five tiers gate merges, and PRs are expected to pass each of them; the coverage tier is advisory.
At a glance
| Tier | Where it lives | How to run | Gated by |
|---|---|---|---|
| Unit | src/ modules with #[cfg(test)] mod tests | cargo test --workspace --all-features --locked | test job on Linux, macOS, Windows. |
| Integration (in-process) | crates/{rsigma-parser,rsigma-eval,rsigma-convert,rsigma-runtime}/tests/*.rs | Same. | Same. |
| End-to-end (binary + containers) | crates/rsigma-cli/tests/cli_*.rs, crates/rsigma-runtime/tests/nats_e2e.rs, crates/rsigma-runtime/tests/nats_integration.rs, crates/rsigma-convert/tests/postgres_integration.rs | Same; testcontainers-based tests skip when Docker is unavailable. | Same; container-backed tests run only on the Linux entry. |
| Snapshot / golden | crates/rsigma-{parser,eval,convert}/tests/snapshots/, tests/fixtures/dynamic-pipelines/golden/ | cargo test plus the SigmaHQ-corpus job for the dynamic-pipelines goldens. | test and sigma-corpus jobs. |
| SigmaHQ corpus | .github/workflows/ci.yml -> sigma-corpus | cargo build --release --all-features --locked -p rsigma then target/release/rsigma rule validate /tmp/sigma/rules/ --verbose | sigma-corpus job, on every PR. |
| Coverage | cargo-llvm-cov (Linux) | cargo llvm-cov --workspace --all-features --lcov --output-path lcov.info | coverage job (advisory, not gating). |
Unit tests
Unit tests live inside the crate modules they test, following the usual Rust convention:
```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn parses_minimal_rule() {
        // A minimal document should yield exactly one parsed rule.
        let collection = parse_sigma_yaml(MINIMAL_YAML).unwrap();
        assert_eq!(collection.rules.len(), 1);
    }
}
```
Bias toward unit tests for pure-functional logic (parsers, matchers, formatters). Bias toward integration tests for end-to-end shapes (CLI invocations, daemon HTTP round-trips, dynamic source resolution).
Integration tests (in-process)
These tests link directly against the crate as a library and exercise multi-component flows without spawning the compiled binary.
| Crate | Files | Tests | What they cover |
|---|---|---|---|
| rsigma-parser | ast_snapshots.rs, parse_errors.rs (+ snapshots/ for insta) | ~30 | Multi-document parsing, malformed YAML, directory parsing; insta-locked AST snapshots. |
| rsigma-eval | integration.rs, correlation_edge.rs, error_paths.rs, pipeline_errors.rs, regression_eval.rs, state_snapshot.rs (+ shared helpers/) | ~56 | Full rule-eval pipelines, correlation edge cases, snapshot replay, pipeline error semantics. |
| rsigma-convert | golden_postgres.rs, golden_lynxdb.rs (+ golden/ for committed expected outputs) | (golden) | Backend query generation for every --format (default, view, timescaledb, continuous_aggregate, sliding_window, minimal). |
| rsigma-runtime | integration.rs, evtx_integration.rs, sources_integration.rs (the nats_*.rs files live in the E2E section below) | ~40 | Streaming runtime; EVTX file parsing against the committed security.evtx fixture; dynamic source resolution (HTTP, file, command, in-process mocks) with TTL, refresh, and template expansion. |
Helpers (test rule fixtures, common test pipelines) live in crates/<crate>/tests/helpers/mod.rs or crates/<crate>/tests/common/mod.rs. Reuse them; do not duplicate.
Do not duplicate unit-level assertions in integration tests. Integration tests own the boundaries, the multi-component chains, and the error paths.
End-to-end tests
E2E tests cross the binary boundary or stand up real external services through containers. They are the highest-confidence layer and the longest to run.
CLI E2E (crates/rsigma-cli/tests/cli_*.rs)
The 12 cli_*.rs files contain 175 tests (the sum of the per-file counts below) that invoke the freshly built rsigma binary via assert_cmd. They exercise stdin, stdout, stderr, exit codes, and (for the daemon tests) the full HTTP, NATS, and OTLP wire surface.
| File | Tests | What it covers |
|---|---|---|
| cli_convert.rs | 14 | backend convert against every shipped backend and output format. |
| cli_daemon.rs | 20 | Long-running daemon (stdin input), hot-reload, health, shutdown. |
| cli_daemon_dynamic.rs | 16 | Dynamic-pipeline source resolution end-to-end via the daemon's POST /api/v1/sources/resolve. |
| cli_daemon_http.rs | 10 | HTTP input mode, POST /api/v1/events, OTLP HTTP. |
| cli_daemon_nats.rs | 8 | NATS input + sink against a NATS JetStream container (see Container E2E below). |
| cli_daemon_otlp.rs | 9 | OTLP HTTP and gRPC ingest, with the metric-label assertions added in PR #115. |
| cli_deprecation.rs | 17 | Every deprecated flat alias still works and prints the correct stderr migration warning. |
| cli_eval.rs | 32 | engine eval: inline events, @file, stdin, jq / JSONPath, fail-on-detection, exit codes. |
| cli_fields.rs | 16 | rule fields extraction across detection items, correlation, filters; --no-filters, --json. |
| cli_lint.rs | 21 | rule lint, .rsigma-lint.yml, # rsigma-disable suppressions, --fix. |
| cli_parse.rs | 8 | rule parse exit-code and structured-error contract. |
| cli_validate.rs | 4 | rule validate against good and bad rule sets. |
The shared harness in crates/rsigma-cli/tests/common/mod.rs is the canonical reference for spawning a long-running daemon under test: it drains stdout in a background thread to prevent pipe stalls, forwards stderr lines via mpsc, probes the actual TCP socket with TcpStream::connect_timeout before returning a handle, and wraps the Child in a ChildGuard RAII type that kills it on drop. PR #115 hardened this against macOS-under-load flakes by replacing every std::thread::sleep wait with a poll_until retry loop that polls the actual observable condition (HTTP status, metric counter) every 50 ms up to a 5 s deadline. Use it for any new daemon-level test; do not roll your own.
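As a rough illustration of the two ideas in that paragraph, the sketch below shows an RAII guard that kills a child process on drop and a retry loop that polls an observable condition instead of sleeping for a fixed time. It uses only the standard library. The names mirror the ones mentioned above, but the signatures and bodies are assumptions made for illustration, not the harness's actual code; consult common/mod.rs for the real helpers.

```rust
use std::process::Child;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical RAII wrapper: kills and reaps the child when dropped,
/// so a panicking test cannot leak a daemon process.
#[allow(dead_code)]
struct ChildGuard(Child);

impl Drop for ChildGuard {
    fn drop(&mut self) {
        // Ignore errors: the process may already have exited.
        let _ = self.0.kill();
        let _ = self.0.wait();
    }
}

/// Hypothetical retry helper: re-evaluates `condition` every `interval`
/// until it returns true or `deadline` has elapsed. Returns whether the
/// condition was observed to hold.
fn poll_until(
    deadline: Duration,
    interval: Duration,
    mut condition: impl FnMut() -> bool,
) -> bool {
    let start = Instant::now();
    loop {
        if condition() {
            return true;
        }
        if start.elapsed() >= deadline {
            return false;
        }
        thread::sleep(interval);
    }
}

/// Usage sketch: the closure stands in for a real probe such as an HTTP
/// status check; here it simply succeeds on the third call.
#[allow(dead_code)]
fn example_wait() -> (bool, u32) {
    let mut attempts = 0;
    let ready = poll_until(Duration::from_secs(5), Duration::from_millis(50), || {
        attempts += 1;
        attempts >= 3
    });
    (ready, attempts)
}
```

Polling the condition itself means a fast machine returns as soon as the daemon is ready, while a loaded one gets the full deadline before the test fails.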
Container E2E (NATS and Postgres via testcontainers)
Four files spin up real services in Docker containers via testcontainers. Together they cover 29 tests, all guarded by a can_run_linux_containers() probe that shells out to docker info and checks that the daemon reports a Linux OS type. If Docker is missing or only provides Windows containers, the tests print "Skipping" and return successfully.
| File | Tests | Container | What it covers |
|---|---|---|---|
| crates/rsigma-runtime/tests/nats_e2e.rs | 6 | NATS JetStream | Replay-from-offset, replay-from-timestamp, JetStream-based DLQ, consumer groups (the highest-rigor NATS surface). |
| crates/rsigma-runtime/tests/nats_integration.rs | 7 | NATS JetStream | Connection auth (token, NKey, JWT), TLS round-trips, ack semantics, source / sink fan-out. |
| crates/rsigma-cli/tests/cli_daemon_nats.rs | 8 | NATS JetStream | The full rsigma engine daemon --input nats ... shape: spawn the binary, point it at the container, assert against published detection matches. |
| crates/rsigma-convert/tests/postgres_integration.rs | 8 | PostgreSQL | Convert real Sigma rules to SQL with convert_collection, execute the generated queries against a live PostgreSQL container, assert match counts against the Okta cross-tenant impersonation chain from the detection-layer-on-postgres companion project. This is the only place where the documented PostgreSQL backend output formats (default, view, timescaledb, continuous_aggregate, sliding_window) are tested as SQL the database actually accepts, rather than just as text matching a golden file. |
The skip_without_docker!() macro pattern is identical in all four:
```rust
macro_rules! skip_without_docker {
    () => {
        if !can_run_linux_containers() {
            eprintln!("Skipping: Docker with Linux container support is not available");
            return;
        }
    };
}
```
Use the same skip_without_docker!() pattern for any new test that requires an external service via testcontainers. CI runs these on the Linux matrix entry; macOS and Windows entries skip them.
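For readers who have not seen the probe, here is a minimal sketch of what a can_run_linux_containers() function could look like, based on the description above (shell out to docker info, check the reported OS type). The body is an assumption, including the use of the `--format '{{.OSType}}'` template and the helper name is_linux_os_type; the repository's version may differ.

```rust
use std::process::Command;

/// Hypothetical pure helper: true when the daemon reported a Linux OS type.
fn is_linux_os_type(reported: &str) -> bool {
    reported.trim().eq_ignore_ascii_case("linux")
}

/// Sketch of the probe: asks the Docker daemon for its OS type.
/// Returns false if the `docker` binary is missing, the daemon is not
/// reachable, or the daemon only provides Windows containers.
fn can_run_linux_containers() -> bool {
    match Command::new("docker")
        .args(["info", "--format", "{{.OSType}}"])
        .output()
    {
        Ok(out) if out.status.success() => {
            is_linux_os_type(&String::from_utf8_lossy(&out.stdout))
        }
        // Spawn failure or non-zero exit: treat Docker as unavailable.
        _ => false,
    }
}
```

Keeping the string check separate from the process call makes the decision logic testable on machines without Docker.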
What "e2e" means here
- Goal: cross every internal boundary the binary has, so a regression in the dispatch / IO / metric / exit-code surface fails CI rather than escaping to a user.
- Scope: the compiled binary; the HTTP API; NATS JetStream wiring (via testcontainers, 21 tests across three files); the OTLP HTTP and gRPC handlers; and the PostgreSQL backend's generated SQL (via testcontainers, 8 tests).
- Out of scope (today): LynxDB, Splunk, Elastic, and KQL backends only have golden-text coverage, not live-query e2e. The Kubernetes deployment path has no e2e coverage yet (covered by the Helm Chart roadmap item when it lands).
Golden tests
The dynamic-pipelines suite under tests/fixtures/dynamic-pipelines/ is the canonical golden-file harness:
```text
tests/fixtures/dynamic-pipelines/
├── pipelines/   # inputs (one *.yml per scenario)
├── sources/     # mock source bodies (HTTP, file, command output)
└── golden/      # expected `rsigma pipeline resolve --pretty` output
```
The CI loop in the sigma-corpus job iterates pipelines/*.yml, runs rsigma pipeline resolve --pretty, and diffs against golden/${name}.json. To run the same check locally:
```bash
cargo build --release --all-features --locked -p rsigma
for pipeline in tests/fixtures/dynamic-pipelines/pipelines/*.yml; do
  name=$(basename "$pipeline" .yml)
  golden="tests/fixtures/dynamic-pipelines/golden/${name}.json"
  diff -u "$golden" <(./target/release/rsigma pipeline resolve --pipeline "$pipeline" --pretty) \
    || echo "FAIL: $name"
done
```
To regenerate a golden after an intentional behaviour change:
```bash
./target/release/rsigma pipeline resolve --pipeline tests/fixtures/dynamic-pipelines/pipelines/<name>.yml --pretty \
  > tests/fixtures/dynamic-pipelines/golden/<name>.json
```
Then git diff the resulting golden file; if the diff matches your intent, commit it along with the code change. Otherwise revert and investigate.
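If the shell loop is inconvenient (for example on Windows, where process substitution is unavailable), the same comparison could be sketched in Rust with the standard library alone. Everything below is illustrative: the function names are hypothetical, and the only facts taken from this page are the directory layout and the `pipeline resolve --pipeline <file> --pretty` invocation.

```rust
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Maps `pipelines/<name>.yml` to `<golden_dir>/<name>.json`.
fn golden_path(pipeline: &Path, golden_dir: &Path) -> Option<PathBuf> {
    let stem = pipeline.file_stem()?.to_str()?;
    Some(golden_dir.join(format!("{stem}.json")))
}

/// Returns the 1-based number of the first line that differs, or None when
/// both texts have identical lines. Uses `str::lines`, so a trailing
/// newline difference alone is not reported.
fn first_mismatch(expected: &str, actual: &str) -> Option<usize> {
    let mut exp = expected.lines();
    let mut act = actual.lines();
    let mut line_no = 1;
    loop {
        match (exp.next(), act.next()) {
            (None, None) => return None,
            (e, a) if e == a => line_no += 1,
            _ => return Some(line_no),
        }
    }
}

/// Runs the built binary the same way the shell loop does and captures
/// its stdout as text.
#[allow(dead_code)]
fn resolve_pretty(binary: &Path, pipeline: &Path) -> io::Result<String> {
    let out = Command::new(binary)
        .args(["pipeline", "resolve", "--pipeline"])
        .arg(pipeline)
        .arg("--pretty")
        .output()?;
    Ok(String::from_utf8_lossy(&out.stdout).into_owned())
}
```

A driver would iterate the pipelines directory, read the file returned by golden_path, and report first_mismatch against resolve_pretty for each scenario.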
SigmaHQ corpus regression
CI clones SigmaHQ/sigma at main and runs three checks (see .github/workflows/ci.yml, job sigma-corpus):
```bash
# 1. Every rule must parse and compile.
./target/release/rsigma rule validate /tmp/sigma/rules/ --verbose

# 2. The dynamic-pipelines fixtures must still resolve cleanly against
#    the live corpus, validating that the field-mapping and include
#    expansion stay compatible with rules in the wild.
./target/release/rsigma rule validate /tmp/sigma/rules/ \
  --pipeline tests/fixtures/dynamic-pipelines/pipelines/field_mapping.yml \
  --pipeline tests/fixtures/dynamic-pipelines/pipelines/allowlist.yml \
  --pipeline tests/fixtures/dynamic-pipelines/pipelines/multi_format.yml \
  --pipeline tests/fixtures/dynamic-pipelines/pipelines/extract_languages.yml \
  --pipeline tests/fixtures/dynamic-pipelines/pipelines/include_expansion.yml \
  --resolve-sources --verbose

# 3. The dynamic-pipelines goldens must match (the diff loop shown above).
```
A regression in any of those steps fails the PR. Locally:
```bash
cargo build --release --all-features --locked -p rsigma
git clone --depth 1 https://github.com/SigmaHQ/sigma /tmp/sigma
./target/release/rsigma rule validate /tmp/sigma/rules/ --verbose
```
This is the only place we run "the real corpus". Keep it green.
Coverage
The coverage job runs cargo llvm-cov --workspace --all-features --lcov on Linux and uploads lcov.info. It is advisory, not gating; there are no per-crate thresholds enforced today. Drops of more than a couple of percentage points warrant a comment on the PR.
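To put a number on a coverage drop, one option is to total the line counters in two lcov.info files and compare the percentages. The sketch below assumes the standard lcov tracefile records, where each source file contributes an `LF:` (lines found) and an `LH:` (lines hit) entry; the function name is hypothetical and this is not part of the CI job.

```rust
/// Sums the `LF:` (lines found) and `LH:` (lines hit) entries of an lcov
/// tracefile and returns overall line coverage as a percentage.
/// Returns None if a counter is malformed or no lines were instrumented.
fn line_coverage_percent(lcov: &str) -> Option<f64> {
    let (mut found, mut hit) = (0u64, 0u64);
    for line in lcov.lines() {
        if let Some(n) = line.strip_prefix("LF:") {
            found += n.trim().parse::<u64>().ok()?;
        } else if let Some(n) = line.strip_prefix("LH:") {
            hit += n.trim().parse::<u64>().ok()?;
        }
    }
    if found == 0 {
        None
    } else {
        Some(100.0 * hit as f64 / found as f64)
    }
}
```

Running it on the lcov.info from the base branch and from the PR branch and subtracting the two results gives the change in percentage points.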
Performance regressions
Criterion benchmarks live under crates/<crate>/benches/. Run them manually:
```bash
cargo bench -p rsigma-eval -- eval
cargo bench -p rsigma-parser -- parse
cargo bench -p rsigma-runtime -- runtime_throughput
```
Benchmarks are not gated in CI. The numbers in Benchmarks come from a manual run on the development workstation; if a PR makes a hot-path change, attach a before/after Criterion summary in the PR description.
Tips
- Run only the failing test first. `cargo test -p rsigma-runtime --test nats_e2e test_replay_from_offset -- --nocapture` is much faster than `--workspace`.
- Run feature-gated tests once with the feature off. A `#[cfg(feature = "nats")] fn test_x()` is silently skipped when the feature is disabled; CI catches that. Locally, `cargo test --no-default-features -p rsigma-runtime` is a useful smoke test.
- The OTLP tests in `crates/rsigma-cli/tests/cli_daemon_otlp.rs` run without external infrastructure.
- The container-backed NATS tests (`crates/rsigma-runtime/tests/nats_e2e.rs`, `crates/rsigma-runtime/tests/nats_integration.rs`, and `crates/rsigma-cli/tests/cli_daemon_nats.rs`) need Docker. On a Mac, `colima start` or Docker Desktop is the easiest local setup.
- CLI tests use `assert_cmd`. They invoke the compiled `rsigma` binary, so the first run is slow because it triggers a full build. Subsequent runs reuse the cache.
See also: Fuzzing, Benchmarks, Contributing.