Adding an input format🔗
The streaming runtime accepts events as raw text lines and dispatches them through parse_line(line, format) -> Option<EventInputDecoded>. Each variant of InputFormat corresponds to one adapter under crates/rsigma-runtime/src/input/. This page walks through adding a new one (a binary protocol, a vendor-specific log shape, a new structured-text format), wiring it into the CLI, and gating it behind a feature flag.
Pick the right EventInputDecoded shape🔗
EventInputDecoded is a three-arm enum that wraps a typed event payload:
| Variant | Backing type | Use when... |
|---|---|---|
Json | rsigma_eval::JsonEvent | The format maps cleanly onto a JSON object (CEF extensions, key/value with nested structure, GELF). |
Kv | rsigma_eval::KvEvent | The format is a flat Vec<(String, String)> of string fields (logfmt, classic key=value lines). |
Plain | rsigma_eval::PlainEvent | No structure available; only keyword matching makes sense (raw /var/log/messages). |
Pick the shape that minimises copies. If your format already produces a serde_json::Value, use Json and bypass the conversion cost. If it produces a flat keyed string list, use Kv and the engine consumes it directly.
If you need a fourth shape (a structured non-JSON tree, a typed proto message), open an issue first; adding a new EventInputDecoded variant changes every match arm in this module.
Walkthrough: adding Lecf (a hypothetical vendor format)🔗
Step 1: scaffold the adapter module.
crates/rsigma-runtime/src/input/
├── auto.rs
├── cef.rs
├── evtx.rs
├── json.rs
├── lecf.rs ← new
├── logfmt.rs
├── mod.rs ← register the new module here
├── plain.rs
└── syslog.rs
Step 2: write the parser. Convention: one public function parse_<format>(line: &str) -> Option<EventInputDecoded>. Return None if the line cannot be parsed as your format.
// crates/rsigma-runtime/src/input/lecf.rs
use rsigma_eval::JsonEvent;
use serde_json::Value;
use crate::input::EventInputDecoded;
pub fn parse_lecf(line: &str) -> Option<EventInputDecoded> {
if !line.starts_with("LECF:") {
return None;
}
let body = line.strip_prefix("LECF:")?.trim();
let mut map = serde_json::Map::new();
for pair in body.split('|') {
let mut parts = pair.splitn(2, '=');
if let (Some(k), Some(v)) = (parts.next(), parts.next()) {
map.insert(k.trim().to_string(), Value::String(v.trim().to_string()));
}
}
let value = Value::Object(map);
Some(EventInputDecoded::Json(JsonEvent::owned(value)))
}
Keep the parser allocation-light on the happy path. CEF and logfmt are the existing high-throughput references; their parsers do not clone the input line.
Step 3: register the module and re-export the public parser.
// crates/rsigma-runtime/src/input/mod.rs
#[cfg(feature = "lecf")]
mod lecf;
#[cfg(feature = "lecf")]
pub use self::lecf::parse_lecf;
Step 4: extend the InputFormat enum.
pub enum InputFormat {
Auto(SyslogConfig),
Json,
Syslog(SyslogConfig),
Plain,
#[cfg(feature = "logfmt")]
Logfmt,
#[cfg(feature = "cef")]
Cef,
#[cfg(feature = "lecf")]
Lecf, // ← add
}
Step 5: extend the parse_line dispatch.
pub fn parse_line(line: &str, format: &InputFormat) -> Option<EventInputDecoded> {
if line.trim().is_empty() {
return None;
}
Some(match format {
InputFormat::Auto(c) => auto_detect(line, c),
InputFormat::Json => parse_json(line)?,
InputFormat::Syslog(c) => parse_syslog(line, c),
InputFormat::Plain => parse_plain(line),
#[cfg(feature = "logfmt")]
InputFormat::Logfmt => parse_logfmt(line),
#[cfg(feature = "cef")]
InputFormat::Cef => parse_cef(line)?,
#[cfg(feature = "lecf")]
InputFormat::Lecf => parse_lecf(line)?, // ← add
})
}
Step 6: optionally extend auto_detect so users on --input auto (the default) hit your format. Keep it cheap: a single byte / prefix check before the more expensive JSON parse. If your format does not have a cheap fingerprint, do not add it to auto; ship it as an explicit --input <name> opt-in.
Gate it behind a feature🔗
In crates/rsigma-runtime/Cargo.toml:
If your parser pulls in a new crate, add it as an optional dependency and list it after the = sign:
[dependencies]
lecf-parser = { version = "0.3", optional = true }
[features]
lecf = ["dep:lecf-parser"]
Then in crates/rsigma-cli/Cargo.toml:
[features]
default = []
daemon = ["dep:tokio", "dep:axum", "rsigma-runtime/logfmt", "rsigma-runtime/cef", "rsigma-runtime/lecf"]
The Docker image and release archives are built with --all-features; opt-in users that build from source can keep their binary lean by omitting the feature.
Wire it into the CLI🔗
The --input flag is parsed in crates/rsigma-cli/src/daemon/. The current value-parser already accepts strings; add a "lecf" arm in the match that turns the string into an InputFormat. Look for the existing "cef" / "logfmt" arms; same pattern.
Test it🔗
Two test layers:
- Unit tests in
crates/rsigma-runtime/src/input/lecf.rs:
#[cfg(test)]
mod tests {
use super::*;
use rsigma_eval::{Event, EventValue};
#[test]
fn parses_basic_lecf() {
let evt = parse_lecf("LECF: host=server1|action=denied").unwrap();
let EventInputDecoded::Json(e) = evt else { panic!() };
assert_eq!(e.get_field("host"), Some(EventValue::Str("server1".into())));
}
#[test]
fn rejects_non_lecf() {
assert!(parse_lecf("just some text").is_none());
}
}
- Integration test in
crates/rsigma-runtime/tests/:
#[cfg(feature = "lecf")]
#[test]
fn lecf_end_to_end_match() {
// build a tiny RuntimeEngine, parse_line with InputFormat::Lecf,
// assert a rule matches.
}
- Fuzz harness. Add
fuzz/fuzz_targets/fuzz_lecf.rsmirroring the existingfuzz_input_formats.rs; register it infuzz/Cargo.tomland.github/workflows/fuzz.yml. See Fuzzing.
Document it🔗
Three places:
- User guide at
docs/guide/input-formats.md. Add a section with the format description, a sample input line, and any caveats. - CLI reference at
docs/cli/engine/daemon.md(the--inputflag accepts a new value). - Feature flags reference at
docs/reference/feature-flags.md. Addlecfto thersigma-runtimeandrsigma-clirows.
Checklist🔗
-
crates/rsigma-runtime/src/input/<name>.rsadapter module. - Module registered and (if applicable) public parser re-exported from
input/mod.rs. -
InputFormat::<Name>variant added (feature-gated if optional). -
parse_linedispatch extended. -
auto_detectextended (only if your format has a cheap fingerprint). - Feature flag added to
rsigma-runtime/Cargo.tomland forwarded byrsigma-cli. - CLI
--inputvalue-parser extended. - Unit + integration tests in
rsigma-runtime. - Fuzz harness for the new parser (if it ingests untrusted bytes).
- Guide, CLI reference, and feature-flags reference updated.
- CHANGELOG entry.
See also🔗
rsigma-runtimeinput/mod.rs for the existing adapters as reference.- Input Formats guide for the user-facing description.
- Fuzzing for the harness conventions.