rsigma engine classify🔗
Report which schema each event matches, recognized from the event's content.
Synopsis🔗
Description🔗
Reads events and prints, per event, the schema RSigma recognizes it as (or unknown), plus a per-schema summary. Recognition is content-based: it keys off marker fields and values, not the wire format, so it tells ECS, flat Sysmon, rendered Windows Event Log, CEF, and OCSF apart even when they all arrive as JSON.
This is a diagnostic for understanding a mixed dataset and for tuning schema signatures before wiring them into a pipeline. It does not load rules or evaluate detections. For the live equivalent on a running daemon, see the GET /api/v1/schemas endpoint and the --observe-schemas flag on engine daemon.
Flags🔗
| Flag | Default | Description |
|---|---|---|
-e, --event <EVENT> | stdin | A single event as a JSON string, or @path to read NDJSON from a file. Without this flag, reads NDJSON from stdin. |
--schema-config <PATH> | unset | YAML file of user-defined schema signatures (and optional routing: bindings), merged over the built-ins. |
--explain | off | Show, per event, why it classified as it did: the matched signature's per-predicate pass/fail, or for an unknown event the closest near-miss. |
--check | off | Validate the --schema-config and exit (does not read events). Reports unreachable signatures, unknown or duplicate routing bindings, and missing pipeline files; exits non-zero on any finding. |
The global --output-format flag selects json, ndjson, table, csv, or tsv.
Built-in schemas🔗
| Schema | Recognized by |
|---|---|
ecs_windows | ecs.version present plus a Windows marker (winlog.*, host.os.type: windows). Implies product: windows; routes as ecs. |
ecs_linux | ecs.version present plus a Linux marker (host.os.type: linux, host.os.kernel). Implies product: linux; routes as ecs. |
ecs | ecs.version present |
ocsf | class_uid and metadata.version present |
windows_eventlog | Event.System.EventID or Event.System.Provider present (rendered EVTX) |
sysmon | the Sysmon channel/provider, or flat EventID + ProcessGuid + Image/CommandLine |
cef | deviceVendor, deviceProduct, and signatureId present |
generic_json | any structured event matching no specific schema |
An event that matches no signature (for example a field-less object or non-JSON line) is reported as unknown, which is the signal for an unsupported schema.
User signatures🔗
--schema-config loads additional signatures. Each is a name, an optional specificity (higher wins on overlap; default 50), and a match list of conditions that must all hold:
schemas:
- name: my_vendor
specificity: 70
match:
- field_present: vendor.product
- equals:
field: event_type
value: alert
- any_of: [user.name, user.id]
- matches:
field: message
value: "^CEF:\\d"
- field_absent: ecs.version
- gte: { field: severity, value: 3 }
- in: { field: action, values: [alert, block] }
- any:
- field_present: winlog.channel
- equals: { field: host.os.type, value: windows }
The full grammar, every predicate form (including the numeric, set-membership, cross-field, and not/any/all group forms), and their exact semantics are in the Schema Signatures reference.
Explaining and validating🔗
--explain prints, per event, why it classified as it did, the matched signature's predicates or the closest near-miss for an unknown event:
--check statically validates a config (unreachable signatures, unknown or duplicate routing bindings, missing pipeline files) and exits non-zero on findings, for CI:
When the config has a routing: section, classify also shows, per event, which pipeline-set it would route to, a dry-run of schema routing without loading rules.
Examples🔗
Classify a single event:
Classify an NDJSON stream and see the per-schema table:
Tune custom signatures against a sample:
Output🔗
The structured report carries a summary (total_events, classified, unknown, parse_errors, and by_schema counts) and an events array with the index, schema, and specificity of each event. In table, csv, and tsv formats the per-event rows go to stdout and the summary line to stderr.
Exit codes🔗
| Code | Meaning |
|---|---|
0 | Success |
2 | Bad input (invalid inline JSON, unreadable file) |
3 | Bad schema config |
See Exit Codes for the full scheme.