Diagnostic snapshot capture and traversal.
Test scenarios use Op::CaptureSnapshot
to request a host-side diagnostic capture mid-run. The capture
result — a crate::monitor::dump::FailureDumpReport — is keyed by the name argument
and stored on the scenario’s SnapshotBridge, where downstream
test code reaches it via Snapshot for typed traversal of
BTF-rendered map values, per-CPU entries, and scalar variables.
§Lifecycle
- Wire-up. Before execute_steps runs, host orchestration installs a SnapshotBridge in the current thread via SnapshotBridge::set_thread_local. The bridge owns the storage map and a callable that performs the capture.
- Capture. When the executor reaches Op::CaptureSnapshot { name }, it invokes SnapshotBridge::capture with the name. The closure performs the freeze rendezvous (request/reply with the freeze coordinator), builds a crate::monitor::dump::FailureDumpReport, and returns it; the bridge stores it under the name.
- Inspection. After the scenario completes, the test author pulls captured reports out via SnapshotBridge::drain and constructs Snapshot views to assert against rendered values: snapshot.var("nr_cpus_onln").as_u64()? > 0, snapshot.map("scx_per_task")?.find(|e| e.get("tid").as_i64().map_or(false, |t| t == pid)).
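The three lifecycle steps can be sketched with a self-contained mock. Everything below (MockReport, MockBridge, MockCapture, install, capture, drain) is a hypothetical stand-in written for illustration; the real SnapshotBridge, CaptureCallback, and FailureDumpReport signatures are not shown in this page and may differ.

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;

/// Hypothetical stand-in for `FailureDumpReport`: named scalar variables only.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct MockReport {
    pub vars: BTreeMap<String, u64>,
}

/// Hypothetical capture callable; the real `CaptureCallback` may differ.
pub type MockCapture = Box<dyn Fn(&str) -> Result<MockReport, String>>;

/// Hypothetical stand-in for `SnapshotBridge`: storage map plus capture callable.
pub struct MockBridge {
    capture: MockCapture,
    stored: BTreeMap<String, MockReport>,
}

thread_local! {
    // One bridge slot per thread, mirroring the thread-local installation step.
    static BRIDGE: RefCell<Option<MockBridge>> = RefCell::new(None);
}

/// Wire-up: install a bridge on the current thread.
pub fn install(capture: MockCapture) {
    BRIDGE.with(|slot| {
        *slot.borrow_mut() = Some(MockBridge { capture, stored: BTreeMap::new() });
    });
}

/// Capture: run the callable and store its report under `name`.
/// Returns an error (rather than skipping) when no bridge is installed.
pub fn capture(name: &str) -> Result<(), String> {
    BRIDGE.with(|slot| -> Result<(), String> {
        let mut slot = slot.borrow_mut();
        let bridge = slot
            .as_mut()
            .ok_or_else(|| "no bridge installed".to_string())?;
        let report = (bridge.capture)(name)?;
        bridge.stored.insert(name.to_string(), report);
        Ok(())
    })
}

/// Inspection: take every stored report out of the bridge.
pub fn drain() -> BTreeMap<String, MockReport> {
    BRIDGE.with(|slot| {
        slot.borrow_mut()
            .as_mut()
            .map(|bridge| std::mem::take(&mut bridge.stored))
            .unwrap_or_default()
    })
}

/// Runs wire-up, capture, and inspection in order and returns the captured value.
pub fn lifecycle_demo() -> u64 {
    install(Box::new(|_name: &str| {
        let mut vars = BTreeMap::new();
        vars.insert("nr_cpus_onln".to_string(), 4);
        Ok(MockReport { vars })
    }));
    capture("after_spawn").expect("bridge was installed above");
    let reports = drain();
    reports["after_spawn"].vars["nr_cpus_onln"]
}
```

Keeping the capture logic behind a boxed callable is what lets a test fixture substitute a canned report for the real freeze rendezvous in this sketch.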
§On-demand vs error-trigger captures
Op::CaptureSnapshot requests are orthogonal to the error-class freeze
path. The freeze coordinator’s existing state machine for
SCX_EXIT_ERROR triggers (Idle → TookEarly → Done) governs the
unsolicited capture pipeline; on-demand captures funnel
through a separate request/reply channel and never touch the
error-trigger state. The coordinator services on-demand requests
even after Done so post-failure scenarios can still snapshot
state for context. The serialisation rule: at most one capture in
flight at a time — the on-demand path waits for the previous
capture’s vCPUs to fully return to parked == false before
issuing the next freeze request, mirroring the rendezvous
invariants the error-trigger path already obeys.
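A minimal sketch of the two rules above, assuming a hypothetical Coordinator type whose field and method names are invented for illustration: on-demand requests leave the error-trigger state alone, and a new freeze is refused while any vCPU is still parked. The real coordinator waits for the vCPUs to return; this mock returns an error instead so the caller can retry.

```rust
/// Error-trigger states named in the text; on-demand requests never modify them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorTrigger {
    Idle,
    TookEarly,
    Done,
}

/// Hypothetical coordinator; field names are illustrative only.
pub struct Coordinator {
    pub error_state: ErrorTrigger,
    /// One flag per vCPU; `true` while that vCPU is still parked by a freeze.
    pub parked: Vec<bool>,
    pub captures_served: u32,
}

impl Coordinator {
    pub fn new(vcpus: usize) -> Self {
        Coordinator {
            error_state: ErrorTrigger::Idle,
            parked: vec![false; vcpus],
            captures_served: 0,
        }
    }

    /// At most one capture in flight: refuse while any vCPU is still parked.
    /// Note that `error_state` is neither read nor written here.
    pub fn try_on_demand(&mut self) -> Result<(), &'static str> {
        if self.parked.iter().any(|&p| p) {
            return Err("previous capture still in flight");
        }
        // Freeze: park every vCPU for the duration of the capture.
        self.parked.iter_mut().for_each(|p| *p = true);
        self.captures_served += 1;
        Ok(())
    }

    /// vCPUs resume after the capture; the next request may then proceed.
    pub fn release(&mut self) {
        self.parked.iter_mut().for_each(|p| *p = false);
    }
}

/// Exercises the rule after the error path has already reached `Done`.
pub fn on_demand_demo() -> (bool, bool, bool, ErrorTrigger) {
    let mut c = Coordinator::new(2);
    c.error_state = ErrorTrigger::Done; // post-failure: requests are still serviced
    let first = c.try_on_demand().is_ok();
    let while_parked = c.try_on_demand().is_ok();
    c.release();
    let after_release = c.try_on_demand().is_ok();
    (first, while_parked, after_release, c.error_state)
}
```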
§Guest → host wire: virtio-console port-1 TLV request/reply
The guest-driven capture trigger rides the virtio-console bulk
port (/dev/vport0p1), not an ioeventfd/MMIO doorbell.
- The guest Op::CaptureSnapshot handler calls crate::vmm::guest_comms::request_snapshot with crate::vmm::wire::SNAPSHOT_KIND_CAPTURE, the capture name as the tag, and a timeout. request_snapshot allocates a per-request request_id, builds a SnapshotRequestPayload { request_id, kind, tag }, and sends it as a TLV frame over the port-1 TX writer.
- The host freeze coordinator services the request, builds the crate::monitor::dump::FailureDumpReport, and stores it on its SnapshotBridge keyed by the tag.
- request_snapshot blocks reading TLV reply frames from the same O_RDWR fd until it observes one whose payload request_id matches, then returns a crate::vmm::wire::SnapshotRequestResult.
The guest Op::WatchSnapshot registration uses the same port-1 stream with crate::vmm::wire::SNAPSHOT_KIND_WATCH.
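The request/reply exchange can be illustrated with a toy TLV codec. The frame layout here (2-byte type, 4-byte length, little-endian), the message-type constants, the numeric value of SNAPSHOT_KIND_CAPTURE, and the payload encoding are all assumptions made for this sketch; the actual crate::vmm::wire format is not documented on this page.

```rust
use std::io::{self, Read};

// Hypothetical constants; the real wire values are not given in the source.
pub const MSG_SNAPSHOT_REQUEST: u16 = 1;
pub const MSG_SNAPSHOT_REPLY: u16 = 2;
pub const SNAPSHOT_KIND_CAPTURE: u8 = 1;

/// Encode one TLV frame: type (u16 LE), length (u32 LE), then the value bytes.
pub fn encode_frame(ty: u16, value: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(6 + value.len());
    frame.extend_from_slice(&ty.to_le_bytes());
    frame.extend_from_slice(&(value.len() as u32).to_le_bytes());
    frame.extend_from_slice(value);
    frame
}

/// Assumed payload layout: request_id (u64 LE), kind (u8), tag (UTF-8 bytes).
pub fn encode_request(request_id: u64, kind: u8, tag: &str) -> Vec<u8> {
    let mut payload = request_id.to_le_bytes().to_vec();
    payload.push(kind);
    payload.extend_from_slice(tag.as_bytes());
    encode_frame(MSG_SNAPSHOT_REQUEST, &payload)
}

/// Read one frame; returns `None` when the stream ends at a frame boundary.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Option<(u16, Vec<u8>)>> {
    let mut header = [0u8; 6];
    match r.read_exact(&mut header) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => return Ok(None),
        Err(e) => return Err(e),
    }
    let ty = u16::from_le_bytes([header[0], header[1]]);
    let len = u32::from_le_bytes([header[2], header[3], header[4], header[5]]) as usize;
    let mut value = vec![0u8; len];
    r.read_exact(&mut value)?;
    Ok(Some((ty, value)))
}

/// Keep reading reply frames until one carries the wanted request_id,
/// skipping frames that belong to other requests.
pub fn wait_for_reply<R: Read>(r: &mut R, want: u64) -> io::Result<Option<Vec<u8>>> {
    while let Some((ty, value)) = read_frame(r)? {
        if ty != MSG_SNAPSHOT_REPLY || value.len() < 8 {
            continue;
        }
        let mut id = [0u8; 8];
        id.copy_from_slice(&value[..8]);
        if u64::from_le_bytes(id) == want {
            return Ok(Some(value[8..].to_vec()));
        }
    }
    Ok(None)
}

/// Feeds a stale reply (id 7) followed by the wanted one (id 9).
pub fn tlv_demo() -> Option<Vec<u8>> {
    let mut stream = Vec::new();
    for (id, body) in [(7u64, b"stale".as_slice()), (9u64, b"ok".as_slice())] {
        let mut value = id.to_le_bytes().to_vec();
        value.extend_from_slice(body);
        stream.extend(encode_frame(MSG_SNAPSHOT_REPLY, &value));
    }
    wait_for_reply(&mut io::Cursor::new(stream), 9).expect("in-memory read")
}
```

Matching on request_id rather than on arrival order is what allows capture and watch traffic to share one stream without a reply being attributed to the wrong caller.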
§No-bridge path
When Op::CaptureSnapshot runs with no installed bridge, the op
fails loudly rather than skipping (per the no-silent-drops
policy): in-guest it routes through the port-1 transport and
bails on a transport failure (including a latched-dead
transport); in host_only mode with no test-fixture bridge it
bails with a “not supported in host_only mode” error.
§Field accessor traversal
SnapshotMap, SnapshotEntry, and SnapshotField form a
lazy borrow chain over the report. Dotted-path lookups (e.g.
entry.get("ctx.weight.value")) walk
RenderedValue::Struct members by name and follow
RenderedValue::Ptr dereferences transparently — the test
author writes the dotted path the BTF source would suggest;
pointer chasing is invisible.
Missing fields land in SnapshotField::Missing with an
actionable error string identifying the path component that
could not be resolved AND the available alternatives at that
level. Terminal accessors (as_u64, as_i64, as_bool,
as_str) return Result<T, SnapshotError> so an absent /
type-mismatched field bubbles up as a recoverable error rather
than panicking.
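The traversal rules can be sketched over a reduced value tree. The Value enum below is a hypothetical simplification of RenderedValue (three variants only), and get / as_u64 are free functions invented for this example rather than the crate's accessors; the error wording is likewise illustrative.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for `RenderedValue`.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Uint(u64),
    Struct(BTreeMap<String, Value>),
    Ptr(Box<Value>),
}

/// Walk a dotted path, following pointers transparently. On failure the error
/// names the component that could not be resolved and lists the alternatives.
pub fn get<'a>(root: &'a Value, path: &str) -> Result<&'a Value, String> {
    let mut cur = root;
    for part in path.split('.') {
        // Dereference any pointers before looking up the member.
        while let Value::Ptr(inner) = cur {
            cur = &**inner;
        }
        match cur {
            Value::Struct(members) => match members.get(part) {
                Some(next) => cur = next,
                None => {
                    let names: Vec<&str> = members.keys().map(String::as_str).collect();
                    return Err(format!(
                        "no member `{part}` in `{path}`; available: {}",
                        names.join(", ")
                    ));
                }
            },
            _ => return Err(format!("`{part}` in `{path}`: parent is not a struct")),
        }
    }
    // A path may also end on a pointer; resolve it so terminal reads see the target.
    while let Value::Ptr(inner) = cur {
        cur = &**inner;
    }
    Ok(cur)
}

/// Terminal accessor: a type mismatch is a recoverable error, not a panic.
pub fn as_u64(v: &Value) -> Result<u64, String> {
    match v {
        Value::Uint(n) => Ok(*n),
        other => Err(format!("expected unsigned scalar, found {other:?}")),
    }
}

fn strukt(members: Vec<(&str, Value)>) -> Value {
    Value::Struct(members.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Entry with `tid` and a `ctx` pointer to a nested struct holding `weight.value`.
pub fn sample_entry() -> Value {
    let weight = strukt(vec![("value", Value::Uint(100))]);
    let ctx = strukt(vec![("weight", weight)]);
    strukt(vec![("tid", Value::Uint(42)), ("ctx", Value::Ptr(Box::new(ctx)))])
}
```

With this mock, get(&sample_entry(), "ctx.weight.value") resolves through the pointer without the caller naming it, while a misspelled component such as "ctx.wieght" yields an error that lists weight as the available member.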
§Cross-surface accessor vocabulary
SnapshotField, JsonField, and
crate::monitor::btf_render::RenderedValue share a uniform
method vocabulary so a test author moves between the
BTF-rendered (BPF maps + globals), JSON-rendered (scheduler
stats), and raw-tree surfaces without re-learning syntax:
| Method | What it does |
|---|---|
| .as_u64() / .as_i64() / .as_f64() / .as_bool() | Typed scalar extract. |
| .as_str() | UTF-8 string extract (SnapshotField / JsonField only; Enum variant / JSON string). |
| .as_u64_array() / .as_u32_array() / .as_i64_array() / .as_f64_array() / .as_bool_array() | Element-typed array extract. |
| .get(path) | Dotted-path walk ("a.b.c"); returns a typed sub-view. |
| .member(name) | Single-step struct-member walk (RenderedValue only; no dots). |
| .index(i) | Array element by 0-indexed position (RenderedValue only). |
| .raw() | Drop into the wrapper’s underlying value for raw Option-returning navigation (RenderedValue for SnapshotField, serde_json::Value for JsonField). |
The wrapper types (SnapshotField, JsonField) return
Result with rich SnapshotError context; the raw
RenderedValue layer returns Option (the caller has already
pattern-matched into a known variant, so absence is a
programming-error class handled locally). Convert between
layers with SnapshotField::raw().
For multi-scheduler scenarios (after
crate::scenario::ops::Op::ReplaceScheduler or two
crate::scenario::ops::Op::AttachScheduler calls), use
Snapshot::active to project the view to the currently-
attached scheduler’s maps and chain the standard accessors
against it. Snapshot::live_var is the shorthand for
self.active()?.var(name); Snapshot::vars iterates every
captured copy when the framework cannot determine “active”
automatically.
Re-exports§
pub use bridge::BridgeGuard;
pub use bridge::CaptureCallback;
pub use bridge::CgroupProcsSnapshot;
pub use bridge::KernelOpCallback;
pub use bridge::MAX_STORED_EVENTS;
pub use bridge::MAX_STORED_SNAPSHOTS;
pub use bridge::MAX_WATCH_SNAPSHOTS;
pub use bridge::SnapshotBridge;
pub use bridge::SnapshotBridgeEvent;
pub use bridge::WatchRegisterCallback;
pub use bridge::with_active_bridge;
Modules§
- bridge - SnapshotBridge is the request/reply channel between the scenario executor and the host capture pipeline. Implements callbacks (CaptureCallback, WatchRegisterCallback), the per-thread bridge installation guard (BridgeGuard), the diagnostic event log (SnapshotBridgeEvent), and the storage caps (MAX_STORED_SNAPSHOTS, MAX_STORED_EVENTS, MAX_WATCH_SNAPSHOTS).
- pickers - Predefined disambiguator closures for Snapshot::live_var_via.
Structs§
- DrainedSnapshotEntry - Typed shape of one entry drained from the snapshot bridge’s ordered per-tag store. Fields:
- ExcludedMap - One captured map that the KVA-whitelist filter rejected. Payload for SnapshotError::ActiveFilterExcludedMaps::excluded_maps. The map_kva field name matches crate::monitor::dump::FailureDumpMap::map_kva (the source-of-truth field), and a map_kva == 0 here flags a capture where the per-map KVA was not recorded (synthetic fixture or capture-path bug; production captures filter zero KVAs out at the walker level).
- Snapshot - Borrowed view over a captured FailureDumpReport for typed traversal of BTF-rendered map values, per-CPU entries, and scalar variables.
- SnapshotMap - One map’s view, possibly narrowed to a specific per-CPU slot via Self::cpu. Returned by Snapshot::map.
Enums§
- JsonField - One value’s view at the leaf of a dotted-path walk over a serde_json::Value. Returned by stats_path / StatsValue::get.
- MissingStatsReason - Why a sample’s stats slot is unavailable. Carried on SnapshotError::MissingStats so operator diagnostics name the specific failure mode rather than the generic “stats absent”. Built by From<&crate::vmm::sched_stats::SchedStatsError> for the relay-failure path, plus dedicated variants for the pre-client gates that the crate::vmm::SchedStatsError enum doesn’t cover (no scheduler binary configured).
- SnapshotEntry - One entry’s view: either a HASH (key, value) pair, a per-CPU array entry, a per-CPU hash entry, a single rendered value, or a missing-entry marker.
- SnapshotError - Reason a snapshot accessor or terminal read could not resolve.
- SnapshotField - One field’s view at the leaf of a dotted-path walk.
Functions§
- stats_path - Build a JsonField view rooted at value and walk along the dotted path. An empty path returns the root unchanged so a caller writing stats_path(v, "").as_f64() (e.g. for a scalar-rooted stats response) hits the typed scalar accessor directly.
Type Aliases§
- SnapshotResult - Result alias for snapshot accessors.