rstix
rstix is the rsigma workspace crate for STIX 2.1 (and future TAXII 2.1 client work). It provides typed Rust objects for all 42 built-in STIX types, bundle ingestion, extension round-trip, and a semantic validation pipeline.
Canonical API reference: docs.rs/rstix. Contributor-facing detail: crate README.
Phase status
| Phase | Status |
|---|---|
| Core Foundation (`core`, `id`, `vocab`) | Complete |
| Data Model + Serialization (`model`, `Bundle`, `parse_reader`, `Bundle::validate`) | Complete |
| Pattern Engine (`pattern`: `Pattern::parse`, Levels 1–3 parse + type-check + evaluation) | Parse, type-check, and evaluation complete; printer and indicator wiring deferred |
| Graph + Marking + Store | Planned |
| TAXII Client | Planned |
Quick start
```rust
use std::fs::File;
use std::io::BufReader;
use rstix::model::{Bundle, ValidationCode};
use rstix::parse_bundle;

// String parse (small bundles)
let bundle = parse_bundle(json_str)?;

// Streaming parse (large bundles, e.g. MITRE ATT&CK ~50 MiB)
let file = File::open("enterprise-attack.json")?;
let bundle = Bundle::parse_reader(BufReader::new(file))?;

// MUST rules enforced at parse; SHOULD rules as warnings
let report = bundle.validate();
for warning in report.warnings_with_code(ValidationCode::StixW0031TlpV1Encoding) {
    eprintln!("{}: {}", warning.object_id.as_deref().unwrap_or("?"), warning.message);
}

// Round-trip
let out = serde_json::to_string(&bundle)?;
```
Pattern Engine (STIX §9)
The optional `pattern` feature adds STIX patterning: parse, type-check, and evaluate Levels 1–3; canonical printer and Indicator AST wiring follow in later Pattern Engine work.
```mermaid
flowchart LR
    SRC["Pattern string"] --> LEX["lexer.rs"]
    LEX --> PAR["parser.rs<br/>Levels 1–3 AST"]
    PAR --> TCK["typeck.rs<br/>SCO schema + extensions"]
    TCK --> PAT["Pattern::parse"]
    PAT --> AST["PatternAst"]
    AST --> EVAL["eval.rs<br/>Levels 1–3"]
    CTX["context.rs<br/>ObservationContext"] --> EVAL
    EVAL --> OUT["bool match"]
    PAT -.->|"deferred"| PRINT["print.rs"]
    IND["IndicatorPattern::Stix"] -.->|"deferred"| PAT
```

| Module | Role | Status |
|---|---|---|
| `pattern/lexer.rs` | Tokenizer; 64 KiB input cap | Done |
| `pattern/parser.rs` | Recursive-descent parser; dict keys, ref-list `[*]`, custom SCO types | Done |
| `pattern/typeck.rs` | Property paths, `extensions.'…'`, `_ref.type`, `ISSUBSET` on CIDR strings | Done |
| `pattern/eval.rs` | Level 1–3 evaluation, `matches_single`, `matches_single_with_bundle`, `evaluate_observed_data` | Done |
| `pattern/context.rs` | `ObservationContext`, `TimestampedObservation`, observed-data context builder | Done |
| `pattern/security.rs` | Regex compile size limit for `MATCHES` | Done |
| `pattern/path.rs` | Object-path resolution, CIDR helpers, `_ref` via bundle | Done |
| `pattern/print.rs` | Canonical pattern printer | Planned |
```rust
use rstix::Pattern;
use rstix::pattern::{ObservationContext, TimestampedObservation};

let pattern = Pattern::parse("[ipv4-addr:value = '198.51.100.1/32']")?;
assert_eq!(pattern.observed_types().len(), 1);

// Level 1: single SCO
let sco = /* ... */;
assert!(pattern.matches_single(&sco)?);

// Levels 2–3: timestamped observations
let ctx = ObservationContext::from_scos(&observations);
assert!(pattern.evaluate(&ctx)?);

// Custom SCO types (STIX §9.8) appear in observed_type_names(), not observed_types().
let custom = Pattern::parse("[x-usb-device:usbdrive.serial_number = '1']")?;
assert_eq!(custom.observed_type_names(), vec!["x-usb-device"]);
```
Build with cargo build -p rstix --features pattern.
In scope (parse + type-check + evaluation)
Lexer, Level 1–3 parser, SCO schema type-checker (18 built-in + custom types), `Pattern::parse`, `Pattern::evaluate`, `matches_single`, `matches_single_with_bundle`, `evaluate_observed_data`, `ObservationContext`, full §9 comparison and temporal semantics (including `MATCHES`, hex/binary constants), manifest-driven SCO field tests (`tests/pattern_eval_sco_fields.rs`, 276 cases), per-operator and error-path integration tests, spec §9.8 fixture tests under `tests/fixtures/pattern/`.
Remaining Pattern Engine work (next slice)
| Item | Notes |
|---|---|
| Canonical printer + AST round-trip | `print.rs` |
| `IndicatorPattern::Stix { ast }` serde | Indicator integration |
| `fuzz_stix_pattern` | Fuzz target |
Grammar authority: STIX Specification §9. Internal storage uses `PatternAst` after type-check.
Evaluation notes (STIX §9):
- `TimestampedObservation::at`: `Option<StixTimestamp>`; temporal patterns return `MissingTimestamp` when any observation lacks a timestamp.
- `matches_single_with_bundle`: pass a bundle when Level 1 patterns dereference `_ref` paths.
- Custom SCO types: vendor types (e.g. `x-usb-device`) deserialize as `CustomSco` and evaluate nested property paths.
- `process:name`: resolved from `image_ref` → file name when a bundle is present, otherwise from the executable token in `command_line`.
- `file:created`: alias for `ctime`.
- `network-traffic:dst_ref.type`: `_ref` dereference, then `type` on the target SCO.
- `file:hashes.MD5`: dictionary dot-key syntax per §9.7.3.
- `extensions.'…'`: predefined SCO extension paths (e.g. `windows-pebinary-ext.sections[*].entropy`).
- `ISSUBSET` / `ISSUPERSET` on string: IP/CIDR subset checks per §9.6.
- Custom SCO types (`x-usb-device`, …): parsed and type-checked permissively (leaf properties as string).
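To make the `ISSUBSET` / `ISSUPERSET` note concrete, here is a minimal std-only sketch of an IPv4 CIDR subset check. It is an illustration under stated assumptions, not the code in `pattern/path.rs`: the helper names `parse_ipv4_cidr`, `prefix_mask`, `is_subset`, and `is_superset` are hypothetical, a bare address is treated as a /32, and IPv6 is left out.

```rust
use std::net::Ipv4Addr;

/// Network mask for a prefix length in 0..=32.
fn prefix_mask(len: u8) -> u32 {
    if len == 0 {
        0
    } else {
        u32::MAX << (32 - u32::from(len))
    }
}

/// Parse "a.b.c.d" or "a.b.c.d/len" into (network address, prefix length).
/// Assumption: a bare address is treated as a /32.
fn parse_ipv4_cidr(s: &str) -> Option<(u32, u8)> {
    let (addr, len) = match s.split_once('/') {
        Some((a, l)) => (a, l.parse::<u8>().ok()?),
        None => (s, 32),
    };
    if len > 32 {
        return None;
    }
    let ip: Ipv4Addr = addr.parse().ok()?;
    Some((u32::from(ip) & prefix_mask(len), len))
}

/// `lhs ISSUBSET rhs`: every address in `lhs` also lies in `rhs`.
/// Returns None when either side is not an IPv4 address or CIDR block.
fn is_subset(lhs: &str, rhs: &str) -> Option<bool> {
    let (l_net, l_len) = parse_ipv4_cidr(lhs)?;
    let (r_net, r_len) = parse_ipv4_cidr(rhs)?;
    // lhs must be at least as specific as rhs and fall inside rhs's network.
    Some(l_len >= r_len && (l_net & prefix_mask(r_len)) == r_net)
}

/// `lhs ISSUPERSET rhs` is the same check with the operands swapped.
fn is_superset(lhs: &str, rhs: &str) -> Option<bool> {
    is_subset(rhs, lhs)
}
```

Under these assumptions, `is_subset("198.51.100.1/32", "198.51.100.0/24")` holds, while swapping the operands does not.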
Tests: tests/fixtures/pattern/ (STIX §9.8), tests/fixtures/pattern/sco-fields/ (SCO field manifest), tests/pattern_parse.rs, tests/pattern_eval.rs, tests/pattern_spec_eval.rs, tests/pattern_eval_operators.rs, tests/pattern_eval_sco_fields.rs, tests/pattern_eval_errors.rs, unit modules pattern::parser::level1, level23, not, pattern::typeck::, pattern::eval, pattern::security.
Later workspace phases (Graph + Marking + Store, TAXII Client) may index indicators by Pattern::observed_types() but do not reimplement pattern grammar.
Public API surface
Crate root (rstix)
| Symbol | Role |
|---|---|
| `parse_bundle(&str)` | Parse a bundle JSON string with default `ParseOptions`. |
| `Bundle` | Typed container; navigation, serialize, `validate()`. |
| `StixObject` | Top-level enum: SDO / SCO / SRO / Meta / Custom. |
| `ParseOptions`, `TypeRegistry` | Limits, custom type registration. |
| `ValidationReport`, `ValidationCode`, `ValidationFinding` | Semantic validation output. |
| `ParseError`, `model::ModelError` | Parse-time failures (MUST rules). |
| `Pattern`, `PatternAst`, `PatternScoType`, `PatternError`, `PatternMatchError`, `ObservationContext`, `TimestampedObservation` | STIX pattern parse, type-check, and evaluation (`pattern` feature). |
core
StixId, 42 typed ID wrappers, StixObjectKind, StixTimestamp, TaxiiTimestamp, Confidence, SpecVersion, LanguageTag, QueryableStixObject, QueryValue.
model
| Submodule | Contents |
|---|---|
| `common` | `SdoSroCommonProps`, `ScoCommonProps`, `ExternalReference`, `GranularMarking`, `ExtensionMap`, `KillChainPhase` |
| `meta` | `MarkingDefinition`, `ExtensionDefinition`, `LanguageContent`, TLP UUID constants |
| `sdo` | All 19 SDOs, `SdoObject`, `IndicatorPattern`, `ObservedDataForm`, `ObservedDataEmbeddedObject` |
| `sro` | `Relationship`, `Sighting`, `SroObject` |
| `sco` | All 18 SCOs, `ScoObject`, typed ref unions, 12 predefined extensions under `sco::extensions` |
| `validate` | Shared MUST validators (used at deserialize and bundle ref checks) |
| `validation` | `Bundle::validate()` implementation and `ValidationCode` enum |
id
Deterministic SCO UUIDv5: select_id_contributing_properties, JCS canonicalization, generate_sco_id, verify_sco_deterministic_id.
vocab
Closed enums (hash algorithms, encryption algorithms, opinion values) and open vocabularies (REGION_OV, malware types, etc.).
Bundle parsing
Methods
| Method | Use when |
|---|---|
| `Bundle::parse(&str)` | Entire JSON is in memory. |
| `Bundle::parse_with_options(&str, &ParseOptions)` | Custom types or stricter limits. |
| `Bundle::parse_reader(R: Read)` | Large files; uses serde_json streaming reader with byte cap. |
| `Bundle::parse_reader_with_options(R, &ParseOptions)` | Streaming + options. |
Default ParseOptions
| Field | Default | Purpose |
|---|---|---|
| `max_nesting_depth` | 64 | Reject deeply nested JSON (DoS guard). |
| `max_string_length` | 1_048_576 (1 MiB) | Max length of any JSON string value. |
| `max_bundle_bytes` | 256 MiB | Max bytes read from stream / checked for string parse. |
| `max_object_count` | `usize::MAX` | Max objects in one bundle. |
| `allow_custom` | `false` | Unknown `type` → error unless registered or allowed. |
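As an illustration of what a nesting-depth limit such as `max_nesting_depth` guards against, the sketch below scans JSON text and fails as soon as the container depth exceeds a limit. This is a hypothetical standalone helper (`check_nesting_depth` is not an rstix function), and it says nothing about how rstix enforces the limit internally; it only counts `{` and `[` outside string literals and does not validate the JSON.

```rust
/// Scan JSON text and return the deepest container nesting seen,
/// or Err(depth) as soon as `max_depth` is exceeded.
/// Brackets inside string literals are ignored.
fn check_nesting_depth(json: &str, max_depth: usize) -> Result<usize, usize> {
    let mut depth = 0usize;
    let mut deepest = 0usize;
    let mut in_string = false;
    let mut escaped = false;

    for byte in json.bytes() {
        if in_string {
            if escaped {
                // The previous byte was a backslash; this byte is literal.
                escaped = false;
            } else if byte == b'\\' {
                escaped = true;
            } else if byte == b'"' {
                in_string = false;
            }
            continue;
        }
        match byte {
            b'"' => in_string = true,
            b'{' | b'[' => {
                depth += 1;
                if depth > max_depth {
                    // Stop early: no need to read the rest of a hostile input.
                    return Err(depth);
                }
                deepest = deepest.max(depth);
            }
            b'}' | b']' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    Ok(deepest)
}
```

With the default of 64 from the table, an input that opens 65 arrays in a row would be rejected at the 65th bracket without reading further.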
Navigation
| Method | Description |
|---|---|
| `bundle.objects()` | All objects in document order. |
| `bundle.get(&StixId)` | Untyped lookup by id. |
| `bundle.get_typed::<T>(&StixId)` | Typed lookup (`Malware`, custom types, …). |
| `bundle.objects_of_type::<T>()` | Iterator over all objects of type `T`. |
| `bundle.extra_properties(&StixId)` | Top-level `x_*` and hoisted extension keys peeled at parse. |
| `bundle.validate_refs()` | Re-run MUST ref resolution (normally called during parse). |
The plan's API name `get::<T>()` is implemented as `get_typed::<T>()` to avoid clashing with the untyped `get`.
Custom STIX types
Register extension SDOs per `ParseOptions` instance (not global):
```rust
use rstix::model::{Bundle, BundleObjectCast, ParseOptions, StixObject};

#[derive(serde::Deserialize, serde::Serialize)]
struct XMySdo { /* ... */ }

impl BundleObjectCast for XMySdo {
    fn cast_from(object: &StixObject) -> Option<&Self> {
        match object {
            StixObject::Custom(c) => c.downcast_typed(),
            _ => None,
        }
    }
}

let opts = ParseOptions::new().register_custom_type::<XMySdo>("x-my-sdo");
let bundle = Bundle::parse_with_options(json, &opts)?;
```
Semantic validation (Bundle::validate)
Parse enforces STIX MUST rules (hard errors). `Bundle::validate()` collects SHOULD-level and advisory findings without rejecting the bundle.
| ValidationCode | Meaning |
|---|---|
| `StixW0031TlpV1Encoding` | Legacy TLP 1.x marking encoding or TLP1 marking ref (STIX-W0031). |
| `ScoDeterministicIdMismatch` | SCO id does not match UUIDv5 from id-contributing properties. |
| `GranularSelectorSemanticInvalid` | Granular-marking selector does not resolve on the object. |
| `LanguageContentFieldUnknown` | Translation field is not a property on the target object. |
| `LanguageContentValueMismatch` | Translation type or list length does not mirror the target property. |
| `LanguageContentObjectModifiedMismatch` | `object_modified` does not match target `modified`. |
| `LocationCountryNotIso3166` | `country` is not ISO 3166-1 alpha-2. |
| `LocationRegionNotInOpenVocab` | `region` is not in STIX `region-ov`. |
| `InvalidCapecExternalReference` | CAPEC `external_id` shape (attack-pattern). |
| `InvalidCveExternalReference` | CVE `external_id` shape (vulnerability). |
| `RelationshipEndpointMatrixInvalid` | Relationship source/target types outside STIX 2.1 matrix. |
| `EncryptionAlgorithmInvalid` | Artifact `encryption_algorithm` not in closed vocabulary. |
There is no strict parse flag: permissive parse + explicit validate() is the supported workflow (see maintainer direction on issue #267).
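For a sense of what an `external_id` "shape" check can look like, here is a std-only sketch for the CAPEC and CVE rows of the table. The function names are hypothetical, and the exact rules behind `InvalidCapecExternalReference` and `InvalidCveExternalReference` in rstix may differ. The sketch assumes the usual `CAPEC-<digits>` form and the CVE ID syntax `CVE-YYYY-NNNN` with four or more sequence digits.

```rust
/// True when `s` is non-empty and consists only of ASCII digits.
fn all_digits(s: &str) -> bool {
    !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit())
}

/// Assumed CAPEC shape: the literal prefix "CAPEC-" followed by digits.
fn has_capec_shape(external_id: &str) -> bool {
    match external_id.strip_prefix("CAPEC-") {
        Some(id) => all_digits(id),
        None => false,
    }
}

/// Assumed CVE shape: "CVE-", a 4-digit year, "-", then 4 or more digits.
fn has_cve_shape(external_id: &str) -> bool {
    let mut parts = external_id.splitn(3, '-');
    let fields = (parts.next(), parts.next(), parts.next());
    matches!(
        fields,
        (Some("CVE"), Some(year), Some(seq))
            if year.len() == 4 && all_digits(year) && seq.len() >= 4 && all_digits(seq)
    )
}
```

Because findings of this kind are SHOULD-level, a caller following the permissive-parse workflow would record a failed shape check as a warning rather than reject the bundle.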
Wire-format validation (pragmatic vs full spec)
STIX SHOULD cite full Internet standards for some string fields. rstix uses lightweight structural checks at parse time, enough to reject obvious garbage without pulling in full IDNA/email parsers.
| Field | STIX reference | rstix today | Full standard (not implemented) |
|---|---|---|---|
| `domain-name.value` | RFC 1034 / 5890 | Label structure, no empty labels, no `..` | IDNA: Unicode domain → Punycode (`xn--…`), full UTS #46 |
| `email-addr.value` | RFC 5322 | Non-empty `local@domain` with dot in domain, no whitespace | RFC 5322: full addr-spec grammar (quoted strings, comments, IP literals) |
| `url.value` | Valid URL | `http://`, `https://`, or `ftp://` prefix | WHATWG URL parser, IDNA in host, normalization |
Why full IDNA / RFC 5322 are not in Data Model + Serialization: they are large, locale-sensitive parsers unrelated to STIX object typing. Basic checks catch malformed CTI early; strict compliance belongs in an optional validation profile or a dedicated dependency (idna, mail-parser, etc.) if a downstream consumer requires it.
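The "rstix today" column can be pictured with a few std-only predicates. This is a sketch of checks in the spirit of the table, not the crate's validators. The function names are hypothetical, and details the table does not state are assumptions: whitespace is rejected in domain labels, the URL prefix match is case-sensitive, and the email address is split at its last `@`.

```rust
/// Domain: dot-separated labels, none empty (which also rules out "..").
/// Assumption: whitespace inside a label is rejected as well.
fn looks_like_domain(value: &str) -> bool {
    !value.is_empty()
        && value
            .split('.')
            .all(|label| !label.is_empty() && !label.chars().any(char::is_whitespace))
}

/// Email: non-empty local part, '@', non-empty domain containing a dot,
/// and no whitespace anywhere.
fn looks_like_email(value: &str) -> bool {
    if value.chars().any(char::is_whitespace) {
        return false;
    }
    match value.rsplit_once('@') {
        Some((local, domain)) => {
            !local.is_empty() && !domain.is_empty() && domain.contains('.')
        }
        None => false,
    }
}

/// URL: one of the three accepted scheme prefixes.
/// Assumption: the comparison is case-sensitive.
fn looks_like_url(value: &str) -> bool {
    ["http://", "https://", "ftp://"]
        .iter()
        .any(|prefix| value.starts_with(*prefix))
}
```

Checks like these accept many strings that the full standards would reject (for example `a@b.c`), which is the trade-off the table describes.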
Extensions and round-trip
- Top-level `x_*` keys are peeled before typed deserialize → `Bundle::extra_properties()`, merged back on serialize.
- `toplevel-property-extension` keys are hoisted from `extensions` the same way.
- Standalone leaf deserialize stores unknown keys in `common.extra` (SDO/SRO/SCO) or `MarkingDefinition.extra`, drained into `extra_properties` during bundle parse.
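The peel-and-merge idea for top-level `x_*` keys can be sketched with a plain map. This is a toy model, not the rstix implementation: `Props`, `peel_custom_keys`, and `merge_back` are hypothetical names, and property values are kept as raw strings instead of JSON values to keep the example std-only.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a JSON object: property name -> raw value text.
type Props = BTreeMap<String, String>;

/// Split an object into (properties for typed deserialize, peeled `x_*` extras).
fn peel_custom_keys(object: Props) -> (Props, Props) {
    object
        .into_iter()
        .partition(|(key, _)| !key.starts_with("x_"))
}

/// Merge the peeled extras back in before serializing, restoring the input.
fn merge_back(mut typed: Props, extra: Props) -> Props {
    typed.extend(extra);
    typed
}
```

Peeling and then merging returns a map equal to the input, which is the round-trip property the list above describes.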
Testing
| Layer | Location |
|---|---|
| Wire round-trip | tests/spec.rs, tests/fixtures/spec/ |
| Bundle integration | tests/bundle.rs |
| Semantic validation | tests/validation.rs, tests/fixtures/validation/ |
| Streaming + custom types + ATT&CK | tests/integration.rs |
| Pattern parse + type-check + evaluation | tests/pattern_parse.rs, tests/pattern_eval.rs, tests/pattern_spec_eval.rs, tests/pattern_eval_operators.rs, tests/pattern_eval_sco_fields.rs, tests/pattern_eval_errors.rs, tests/fixtures/pattern/, tests/fixtures/pattern/sco-fields/ (requires pattern feature) |
| Fuzz | fuzz/fuzz_targets/fuzz_rstix_parse_bundle.rs |
Run crate tests:
Local MITRE ATT&CK corpus (not in git)
The full ATT&CK STIX bundle (~50 MiB) is not committed. CI uses a synthetic 5,000-object streaming test. For local verification, download a bundle (for example MITRE ATT&CK 19.1) and point the integration test at it:
```bash
RSTIX_ATTCK_BUNDLE=/path/to/enterprise-attack-19.1.json \
cargo test -p rstix --features serde attck_corpus_roundtrip_when_present -- --nocapture
```
This runs `parse_reader` → serialize → reparse and asserts object count stability. Verified against enterprise-attack-19.1.json (~53 MiB) locally.
STIX version vs TLP marking encoding
Three independent ideas; do not mix them:
| | STIX object model | TLP v1 encoding (legacy) | TLP v2 encoding (current) |
|---|---|---|---|
| JSON | `"spec_version": "2.1"` | `"definition_type":"tlp", "definition":{"tlp":"white"}` | `"extensions":{…,"tlp_2_0":"clear"}` |
| Meaning | Object follows STIX 2.1 rules | Old TLP label wire format (deprecated for new markings) | Current TLP label wire format |
| rstix constants | `SpecVersion::V2_1` | `TLP1_WHITE_ID` … `TLP1_RED_ID` | `TLP2_CLEAR_ID` … `TLP2_RED_ID` |

A STIX 2.1 bundle can contain marking-definition objects that still use the legacy TLP v1 encoding; that is normal (ATT&CK references the predefined v1 UUIDs).
Full developer guide: crate README, "STIX version vs TLP marking encoding".
Model invariants (summary)
Full table: crate README, "Model invariant decisions".
- MUST at parse: id/type match, ref resolution in bundle, extension routing, SCO forbidden common props, SDO/SRO time ordering, and type-specific MUST rules documented in `ModelError`.
- SHOULD via `validate()`: relationship matrix, CAPEC/CVE, encryption algorithm, TLP v1 warnings, granular selector semantics, language-content rules, location country/region vocabularies, SCO deterministic id.
Feature flags
| Feature | Purpose |
|---|---|
| `serde` (default) | Bundle parsing, serialization, validation. |
| `pattern` | STIX pattern lexer, Level 1–3 parser, type-checker, and evaluator (`Pattern::parse`, `Pattern::evaluate`). |