Snapshot

Struct Snapshot 

#[non_exhaustive]
pub struct Snapshot<'a> { /* private fields */ }

Borrowed view over a captured FailureDumpReport for typed traversal of BTF-rendered map values, per-CPU entries, and scalar variables.

Constructed from a FailureDumpReport reference (typically obtained via super::SnapshotBridge::drain); the view is cheap to build — it does not copy the underlying report. Accessor methods all return further borrowed views that walk the report in place.
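
A minimal sketch of the borrowed-view pattern described above. It uses hypothetical stand-in types (Report, View, first_name), not the crate's own FailureDumpReport and Snapshot, and assumes only what the description states: the view holds a reference and hands out borrows tied to the report's lifetime.

```rust
/// Hypothetical stand-in for a captured report (not the real FailureDumpReport).
#[derive(Debug)]
pub struct Report {
    pub map_names: Vec<String>,
}

/// Borrowed view: it stores one reference, so building or copying it
/// never copies report data.
#[derive(Clone, Copy, Debug)]
pub struct View<'a> {
    report: &'a Report,
}

impl<'a> View<'a> {
    pub fn new(report: &'a Report) -> Self {
        View { report }
    }

    /// The result borrows from the report ('a), not from the view itself.
    pub fn first_map_name(&self) -> Option<&'a str> {
        let report: &'a Report = self.report;
        report.map_names.first().map(|name| name.as_str())
    }
}

/// The temporary view is gone when this returns; the `&str` stays valid
/// because it points into `report`.
pub fn first_name(report: &Report) -> Option<&str> {
    View::new(report).first_map_name()
}
```

Because the view is a single reference wide, rebuilding it on demand costs about as much as copying a pointer.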

Implementations§

impl<'a> Snapshot<'a>

pub fn new(report: &'a FailureDumpReport) -> Self

Build a borrowed view over report with no active-scheduler filter. Every map-walking accessor sees every captured map.

pub fn report(&self) -> &'a FailureDumpReport

Underlying FailureDumpReport borrowed back to the caller.

Escape hatch. Most consumers should reach for the typed accessors on Snapshot / SnapshotMap / SnapshotEntry / SnapshotField, which route through SnapshotError and compose with the crate::assert::temporal patterns via SeriesField. Use report() only when a FailureDumpReport field has no typed accessor yet:

  • vcpu_regs — per-vCPU register snapshot captured at the freeze instant.
  • vcpu_perf_at_freeze — per-vCPU hardware perf counter snapshot captured at the freeze instant.
  • dump_truncated_at_us — microseconds-into-the-dump at which the soft deadline tripped.
  • sdt_allocations, scx_static_ranges — SDT allocator and scx static memory layout snapshots used by the arena / pointer-renderer pipelines.
  • schema — wire-format metadata (Self::is_placeholder already wraps the boolean form).

All other fields documented as escape-only on FailureDumpReport above now have first-class accessors on Snapshot (event_counter_timeline, rq_scx_states, dsq_states, scx_sched_state, per_cpu_time, per_node_numa, task_enrichments, prog_runtime_stats, probe_counters) and on SnapshotMap (ringbuf, arena, fd_array, stack_trace, map_error).

Five *_unavailable diagnostic accessors cover the subset of walker-backed fields the dump pipeline writes a reason string for: Self::scx_walker_unavailable (shared by rq_scx_states / dsq_states / scx_sched_state — the scx walker writes one reason for the whole group), Self::task_enrichments_unavailable, Self::prog_runtime_stats_unavailable, Self::per_node_numa_unavailable, and Self::sdt_alloc_unavailable (for the still-escape-only sdt_allocations field above). The remaining accessors (event_counter_timeline, per_cpu_time, probe_counters) have no companion diagnostic — empty / None is their only “no capture” signal.

Caveats of the bypass:

  • No SnapshotError routing: the call site must handle missing fields, type mismatches, and per-CPU narrowing on its own.
  • No SeriesField integration: temporal patterns (nondecreasing, rate_within, etc.) cannot consume raw FailureDumpReport field values.
  • No placeholder-sample short-circuit: checking Self::is_placeholder is the caller’s responsibility.

pub fn map(&self, name: &str) -> SnapshotResult<SnapshotMap<'a>>

Look up a BPF map by exact name. Respects the Self::active filter when set — only maps the filter admits are considered. Returns SnapshotError::MapNotFound (with the captured map names in available) when no match is found among the admitted maps, or SnapshotError::PlaceholderSnapshot when the snapshot’s underlying FailureDumpReport is a placeholder (freeze rendezvous failed; no maps to walk).
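
The lookup rules can be modelled in a few lines. This is a sketch under stated assumptions, not the crate's implementation: LookupError and find_map are hypothetical names, maps are reduced to their names, the placeholder check is assumed to run first, and available is assumed to list every captured name.

```rust
/// Hypothetical error type mirroring the two failure modes described above.
#[derive(Debug, PartialEq)]
pub enum LookupError {
    PlaceholderSnapshot,
    MapNotFound { requested: String, available: Vec<String> },
}

/// Exact-name lookup over captured map names.
/// `active_obj` models the optional active filter: when set, only maps
/// named `<obj>.<rest>` are admitted.
pub fn find_map<'a>(
    captured: &[&'a str],
    is_placeholder: bool,
    active_obj: Option<&str>,
    name: &str,
) -> Result<&'a str, LookupError> {
    // Assumption: a placeholder has nothing to walk, so it is rejected first.
    if is_placeholder {
        return Err(LookupError::PlaceholderSnapshot);
    }
    let admitted = |map: &str| match active_obj {
        Some(obj) => map
            .strip_prefix(obj)
            .map_or(false, |rest| rest.starts_with('.')),
        None => true,
    };
    captured
        .iter()
        .copied()
        .find(|map| *map == name && admitted(*map))
        .ok_or_else(|| LookupError::MapNotFound {
            requested: name.to_string(),
            // Assumption: report every captured name to aid debugging.
            available: captured.iter().map(|m| m.to_string()).collect(),
        })
}
```

Carrying the known names inside the error lets a failing test print what was captured without a second lookup.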

pub fn var(&self, name: &str) -> SnapshotField<'a>

Walk the BTF-rendered fields of every *.bss / *.data / *.rodata global-section map for a top-level variable named name. Convenience for .var("nr_cpus_onln") style scalar reads without naming the section explicitly.

Returns SnapshotField::Value on a unique match. When no map exposes the name, returns SnapshotField::Missing with SnapshotError::VarNotFound (available holds the union of every global-section map’s top-level member names). When more than one global-section map exposes the name, var() first falls back to Self::live_var semantics (it delegates to Self::active and re-projects) and yields SnapshotError::AmbiguousVar only if that fallback does not resolve the ambiguity.

§Auto-fallback contract

When the raw scan finds 2+ hits AND the snapshot is not already narrowed by Self::active (i.e. self.active_obj is None), var() calls Self::active. On Ok it returns active.var(name) directly, whatever the result: SnapshotField::Value, SnapshotError::VarNotFound, or a SnapshotError::AmbiguousVar that persists after the live filter narrowed the view. On Err it falls through to the pre-filter SnapshotError::AmbiguousVar (see next section). The fallback exists so that callers who name a global by string after a crate::scenario::ops::Op::ReplaceScheduler swap don’t have to know about Self::live_var explicitly: the principled active-scheduler walker is consulted automatically when the raw lookup is ambiguous. Self::live_var remains the explicit opt-in form for callers who want the live filter unconditionally (skipping the raw-scan path).

§When AmbiguousVar STILL fires

AmbiguousVar still fires after the auto-fallback when the raw scan found 2+ hits AND active() failed (no scheduler attached, multi-obj without principled walker resolution, etc.). The found_in list names every map the raw scan saw, because the operator needs all of them to decide which obj to address via Self::map.
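
The decision flow of the two sections above can be sketched as a small standalone model. The names VarOutcome, classify, in_obj and resolve_var are hypothetical, each copy is reduced to a (map name, u64) pair, and active_obj stands in for the result of the active-scheduler projection (Some on Ok, None on Err). It illustrates the documented contract and is not the crate's code.

```rust
/// Hypothetical outcome type; the real API wraps these in SnapshotField.
#[derive(Debug, PartialEq)]
pub enum VarOutcome {
    Value { map: String, value: u64 },
    VarNotFound,
    AmbiguousVar { found_in: Vec<String> },
}

/// True when `map` is named `<obj>.<section>`.
fn in_obj(map: &str, obj: &str) -> bool {
    map.strip_prefix(obj)
        .map_or(false, |rest| rest.starts_with('.'))
}

/// Zero hits, one hit, or several hits over a set of copies.
fn classify(copies: &[(&str, u64)]) -> VarOutcome {
    match copies.len() {
        0 => VarOutcome::VarNotFound,
        1 => VarOutcome::Value {
            map: copies[0].0.to_string(),
            value: copies[0].1,
        },
        _ => VarOutcome::AmbiguousVar {
            found_in: copies.iter().map(|(map, _)| map.to_string()).collect(),
        },
    }
}

/// `copies`: every global-section copy of the variable, as (map name, value).
/// `active_obj`: Some(obj) when the active-scheduler projection succeeded.
pub fn resolve_var(copies: &[(&str, u64)], active_obj: Option<&str>) -> VarOutcome {
    if copies.len() < 2 {
        return classify(copies); // unique match or not found: no fallback needed
    }
    match active_obj {
        // Fallback succeeded: re-run the lookup on the narrowed view and
        // return whatever it yields, even if that is still ambiguous.
        Some(obj) => {
            let live: Vec<(&str, u64)> = copies
                .iter()
                .copied()
                .filter(|(map, _)| in_obj(map, obj))
                .collect();
            classify(&live)
        }
        // Fallback failed: report every map the raw scan saw.
        None => classify(copies),
    }
}
```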

pub fn vars( &self, name: &str, ) -> impl Iterator<Item = (&'a str, SnapshotField<'a>)> + '_

Iterate every global-section copy that carries a top-level member named name. Yields (owning_map_name, field) pairs in capture order. Use when Self::var returns SnapshotError::AmbiguousVar and the caller needs to reason across every observed copy explicitly (e.g. summing counter deltas across two scheduler instances loaded back-to-back in the same scenario).

Respects the Self::active filter when set, so chained snapshot.active()?.vars(name) is well-defined — it iterates only the active scheduler’s copies (typically exactly one, since active() filters to one obj_name).

Yields nothing on placeholder snapshots (the underlying report.maps is empty by construction so nothing matches anyway — callers needing “is this a placeholder?” use the Snapshot::is_placeholder accessor explicitly).
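
To illustrate the counter-delta use case, the sketch below sums per-copy deltas between two snapshots. It is a standalone model: total_delta is a hypothetical helper, each copy is reduced to an (owning_map_name, u64) pair, and copies are assumed to be matched by map name.

```rust
/// Sum of per-copy counter increases between two snapshots.
/// Copies are matched by owning map name (an assumption of this sketch);
/// a copy absent from `before` counts from zero.
pub fn total_delta(before: &[(&str, u64)], after: &[(&str, u64)]) -> u64 {
    after
        .iter()
        .map(|(map, now)| {
            let earlier = before
                .iter()
                .find(|(name, _)| name == map)
                .map_or(0, |(_, value)| *value);
            // saturating_sub guards against a counter that was reset.
            now.saturating_sub(earlier)
        })
        .sum()
}
```

Matching by name breaks down when two copies share one name, which is the case the picker-based accessors address.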

pub fn active(&self) -> SnapshotResult<Snapshot<'a>>

Project the snapshot to the currently-active scheduler’s maps. Returns a filtered Snapshot whose Self::map / Self::var / Self::vars see only the maps whose name shares the <obj>. prefix of the active scheduler’s BPF object. Composable: snapshot.active()?.var(name).

§When to use

Tests that swap schedulers mid-scenario (via crate::scenario::ops::Op::ReplaceScheduler) reach for .active() after the swap so the per-phase post-swap snapshots resolve the live scheduler’s bss without hitting SnapshotError::AmbiguousVar across both schedulers’ captured copies. Single-scheduler tests never need .active() — there is no ambiguity to resolve.

§Signal source

“Active” comes from two fields the freeze coordinator populates at capture time (the failure cases below refer to them as active_obj_name and active_map_kvas).

When the walker resolved both fields, active() uses them directly and the obj-prefix scan below is a sanity cross-check against the captured map set. When the walker was unavailable (placeholder dump, transient swap window before the accessor-init worker republished, or kernel built without struct_ops support), the obj-prefix scan with per-section count fallback decides.

§Failure cases
  • SnapshotError::PlaceholderSnapshot: the snapshot is a freeze-rendezvous-failure placeholder.
  • SnapshotError::NoActiveScheduler (no global-section maps): the snapshot has no <obj>.bss/.data/.rodata — either no scheduler is attached, or the capture missed the global sections entirely.
  • SnapshotError::NoActiveScheduler (multiple distinct obj prefixes, walker unavailable): two scheduler instances with DIFFERENT obj names coexist (back-to-back load of distinct binaries, or one scheduler composed of multiple BPF objects) AND the walker did not publish active_obj_name. Use Self::vars to enumerate every copy or Self::map to address a specific scheduler’s bss directly.
  • SnapshotError::NoActiveScheduler (multi-copy same-prefix, walker unavailable): a crate::scenario::ops::Op::ReplaceScheduler swap between two builds of the SAME binary left two <obj>.bss (or .data / .rodata) copies with identical names AND the walker did not publish active_map_kvas to disambiguate. The obj-prefix filter alone cannot pick the live copy without admitting both. Use Self::live_var_via / Self::live_vars_via with crate::scenario::snapshot::pickers::max_by_sum_u64 to pick by counter activity.
§Lifetime

Pure projection over the frozen FailureDumpReport; multiple calls return equivalent views. Caching the result in a let active = snapshot.active()?; binding is fine but not required.
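
The walker-unavailable path can be approximated as follows. This sketch assumes the obj-prefix scan works purely on map names; ActiveError, split_global and active_obj_by_prefix are hypothetical, and the three error variants split the single SnapshotError::NoActiveScheduler into the causes listed under the failure cases.

```rust
/// Hypothetical split of the single NoActiveScheduler error into its causes.
#[derive(Debug, PartialEq)]
pub enum ActiveError {
    NoGlobalSections,
    MultipleObjPrefixes(Vec<String>),
    DuplicateSection { obj: String, section: String },
}

/// Splits `<obj>.bss`, `<obj>.data`, or `<obj>.rodata` into (obj, section).
fn split_global(name: &str) -> Option<(&str, &str)> {
    let (obj, section) = name.rsplit_once('.')?;
    matches!(section, "bss" | "data" | "rodata").then_some((obj, section))
}

/// Obj-prefix scan over captured map names, used here as the only signal
/// (that is, as if the walker had published nothing).
pub fn active_obj_by_prefix(map_names: &[&str]) -> Result<String, ActiveError> {
    let globals: Vec<(&str, &str)> = map_names
        .iter()
        .copied()
        .filter_map(split_global)
        .collect();
    if globals.is_empty() {
        return Err(ActiveError::NoGlobalSections);
    }

    let mut objs: Vec<&str> = globals.iter().map(|(obj, _)| *obj).collect();
    objs.sort_unstable();
    objs.dedup();
    if objs.len() > 1 {
        let names = objs.iter().map(|obj| obj.to_string()).collect();
        return Err(ActiveError::MultipleObjPrefixes(names));
    }

    // One prefix: each section may appear at most once, otherwise two
    // same-named copies are present and the prefix cannot pick between them.
    let mut sections: Vec<&str> = globals.iter().map(|(_, section)| *section).collect();
    sections.sort_unstable();
    if let Some(dup) = sections.windows(2).find(|pair| pair[0] == pair[1]) {
        return Err(ActiveError::DuplicateSection {
            obj: objs[0].to_string(),
            section: dup[0].to_string(),
        });
    }
    Ok(objs[0].to_string())
}
```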

pub fn live_var(&self, name: &str) -> SnapshotField<'a>

Read a single live counter from the active scheduler — the default for single-variable reads. Convenience for self.active()?.var(name).

For arithmetic across multiple counters (fractions, ratios, deltas computed from more than one named field), use Self::live_vars_via instead. live_vars_via resolves the picker ONCE across a name set so independent per-name picks cannot corrupt the cross-variable computation by selecting different bss copies for different names. Repeatedly calling live_var for two counters from the same scheduler is correct in the walker-resolved case (both reads land in the same scheduler’s bss) but loses that guarantee on the picker-fallback path, where ratios can be silently corrupted.

On failure, the returned SnapshotField carries either SnapshotError::NoActiveScheduler (no scheduler identifiable) or one of the standard Self::var error variants (SnapshotError::VarNotFound / SnapshotError::TypeMismatch from the inner var lookup).

pub fn live_var_via( &self, name: &str, picker: impl FnOnce(&[(&'a str, SnapshotField<'a>)]) -> Option<usize>, ) -> SnapshotField<'a>

Caller-supplied disambiguator for the multi-bss case where Self::live_var cannot resolve a single live copy by itself.

Self::live_var delegates to Self::active to filter the snapshot to one scheduler’s maps. When Self::active cannot pick a single scheduler — multiple BPF objects with global-section maps are present AND the principled prog_idr → prog aux->used_maps → global-section map → obj prefix walker did not identify the live one — it errors with SnapshotError::NoActiveScheduler (the exact reason field is the long-form message constructed at the bail site listing the observed obj_names + the walker’s failure cause), and Self::live_var propagates that as SnapshotField::Missing.

live_var_via is the escape hatch: it skips the Self::active filter entirely, enumerates every observed copy of name via Self::vars, and hands the slice to the caller-supplied picker to pick one by index. Common case: an Op::ReplaceScheduler swap between two builds of the same scheduler that leaves two <obj>.bss maps in the snapshot sharing one obj_name prefix.

For multi-variable arithmetic (ratios, fractions, deltas computed across more than one named field), use Self::live_vars_via instead — it resolves the picker once across a name set so independent per-name picks cannot corrupt the cross-variable computation by selecting different bss copies for different names.

picker receives every observed copy of the named variable (one entry per <obj>.bss/.data/.rodata map carrying it, per Self::vars) and returns the index the caller wants (typically chosen by inspecting each candidate’s value via SnapshotField::as_u64 / as_str and applying a liveness or activity fingerprint — see crate::scenario::snapshot::pickers for predefined pickers such as max_by_counter_value).
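
A picker is an ordinary function or closure from the candidate slice to an index. The sketch below models that shape with values already decoded to u64; max_by_value and pick_via are hypothetical names rather than the predefined pickers, and returning None for a declined or out-of-range pick is an assumption of the sketch.

```rust
/// Picks the candidate with the largest value, on the assumption that the
/// live copy is the one whose counter advanced furthest.
pub fn max_by_value(candidates: &[(&str, u64)]) -> Option<usize> {
    candidates
        .iter()
        .enumerate()
        .max_by_key(|(_, (_, value))| *value)
        .map(|(index, _)| index)
}

/// Applies a caller-supplied picker to the observed copies.
/// Returns None when the picker declines or names an out-of-range index.
pub fn pick_via<'a>(
    candidates: &[(&'a str, u64)],
    picker: impl FnOnce(&[(&'a str, u64)]) -> Option<usize>,
) -> Option<(&'a str, u64)> {
    let index = picker(candidates)?;
    candidates.get(index).copied()
}
```

Taking the picker as FnOnce lets callers pass either a named function or a one-off closure that captures test-local state.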

Returns SnapshotField::Missing when:

pub fn live_vars_via<P>( &self, names: &[&str], picker: P, ) -> SnapshotResult<Vec<SnapshotField<'a>>>
where P: FnOnce(&[(&'a str, Vec<SnapshotField<'a>>)]) -> Option<usize>,

Caller-supplied disambiguator for the multi-bss case where multiple variables from the same scheduler instance must be read consistently — e.g. computing nr_mig_cross_dispatch / (nr_mig_same_dispatch + nr_mig_cross_dispatch) as a cross-LLC dispatch fraction from one scheduler’s BPF counters.

§Why a separate primitive

Calling Self::live_var_via N times independently risks picking a DIFFERENT bss copy per call: the picker resolves each name’s candidate set independently, so two consecutive live_var_via("a", picker) + live_var_via("b", picker) calls can land on bss copy A for a and bss copy B for b, corrupting any cross-variable arithmetic (ratio, fraction, delta). live_vars_via resolves the picker ONCE across the candidate set for all N names jointly so every returned SnapshotField reads from the same source map.

§Mechanism

Per global-section map, look up each name in input order; keep the map as a candidate row iff it has ALL the names (intersection semantics — partial-coverage maps are absent from the picker’s input). The picker receives &[(map_name, fields_in_input_order)] and returns the chosen row’s index. The returned Vec<SnapshotField> is positional, keyed by the input names order — result[0] is names[0]’s field from the picked map, result[1] is names[1]’s field, etc.

Single-section constraint. All names must reside in the SAME global-section map — typically the scheduler’s <obj>.bss. A bss counter co-picked with a data constant from the same scheduler obj lands in DIFFERENT candidate rows (the obj’s .bss map carries the first name, its .data map carries the second, neither row has both), the intersection collapses to empty, and the helper returns SnapshotError::VarNotFound. If the test reads from multiple sections, issue separate live_vars_via calls (one per section’s name group) and compose the per-call results caller-side.
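
The intersection semantics and the single-section constraint can be seen in a small model. GlobalMap, candidate_rows and read_jointly are hypothetical stand-ins with values decoded to u64. Where the real helper returns SnapshotError::VarNotFound for an empty intersection, this sketch simply hands the picker an empty slice.

```rust
/// Hypothetical stand-in for one global-section map and its top-level members.
pub struct GlobalMap {
    pub name: String,
    pub members: Vec<(String, u64)>,
}

/// One row per map that carries ALL of `names`; values follow input order.
pub fn candidate_rows<'a>(maps: &'a [GlobalMap], names: &[&str]) -> Vec<(&'a str, Vec<u64>)> {
    maps.iter()
        .filter_map(|map| {
            // Collecting into Option<Vec<_>> yields None as soon as one
            // name is missing, which drops partial-coverage maps.
            let row: Option<Vec<u64>> = names
                .iter()
                .map(|wanted| {
                    map.members
                        .iter()
                        .find(|(member, _)| member.as_str() == *wanted)
                        .map(|(_, value)| *value)
                })
                .collect();
            row.map(|fields| (map.name.as_str(), fields))
        })
        .collect()
}

/// Resolves the picker once, then returns the whole row from the picked map.
pub fn read_jointly<'a>(
    maps: &'a [GlobalMap],
    names: &[&str],
    picker: impl FnOnce(&[(&'a str, Vec<u64>)]) -> Option<usize>,
) -> Option<Vec<u64>> {
    let rows = candidate_rows(maps, names);
    let index = picker(rows.as_slice())?;
    rows.into_iter().nth(index).map(|(_, fields)| fields)
}
```

Because the picker runs once per call, a ratio computed from the returned row always uses values from a single map.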

§See also
§Errors

pub fn map_count(&self) -> usize

Number of maps the current view exposes — every captured map when unfiltered; only maps the Self::active filter admits when set.

pub fn is_placeholder(&self) -> bool

True when the underlying FailureDumpReport is a placeholder produced by FailureDumpReport::placeholder — i.e. the freeze-rendezvous capture pipeline could not produce real data. Periodic-sample temporal patterns use this to skip the BPF axis on a placeholder sample (the stats axis, when present, may still be valid). Bypassing the projection-error path keeps the sample’s diagnostic distinct from “field missing on a real capture”.
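
As an illustration of the skip behaviour, the sketch below checks a nondecreasing counter across periodic samples while ignoring placeholders. Sample and nondecreasing_skipping_placeholders are hypothetical; the real temporal patterns live in crate::assert::temporal and are not reproduced here.

```rust
/// Hypothetical periodic sample: a counter value plus the placeholder flag.
pub struct Sample {
    pub is_placeholder: bool,
    pub counter: u64,
}

/// Checks that the counter never decreases, ignoring placeholder samples
/// instead of treating their missing data as a violation.
pub fn nondecreasing_skipping_placeholders(samples: &[Sample]) -> bool {
    let real: Vec<u64> = samples
        .iter()
        .filter(|sample| !sample.is_placeholder)
        .map(|sample| sample.counter)
        .collect();
    real.windows(2).all(|pair| pair[0] <= pair[1])
}
```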

pub fn event_counter_timeline(&self) -> &'a [EventCounterSample]

Per-monitor-tick SCX_EV_* event counter samples. Each entry is the cross-CPU sum of the 13 SCX event counters at one monitor tick. Empty when no EventCounterCapture ran, or every sample was suppressed (event-stat offsets unresolved, scx_root unset).

Unlike the walker-backed accessors below, this field carries no *_unavailable companion: an empty timeline is the only signal for “no capture / no events”.

pub fn rq_scx_states(&self) -> &'a [RqScxState]

Per-CPU rq->scx snapshots — one per CPU walked by crate::monitor::scx_walker. Empty when the ScxWalkerCapture was absent or every CPU’s translate failed (see FailureDumpReport::scx_walker_unavailable).

pub fn dsq_states(&self) -> &'a [DsqState]

Per-DSQ snapshots — local, bypass, global, and user DSQs reachable from *scx_root. Each entry carries nr (depth), seq (BPF-iter counter), and the queued task KVAs. Empty when the ScxWalkerCapture was absent (see FailureDumpReport::scx_walker_unavailable).

pub fn scx_sched_state(&self) -> Option<&'a ScxSchedState>

Top-level scx_sched state captured from *scx_root: aborting flag, bypass_depth, exit_kind. None when no scheduler is attached or *scx_root was unreadable (see FailureDumpReport::scx_walker_unavailable).

pub fn per_cpu_time(&self) -> &'a [PerCpuTimeStats]

Per-CPU CPU-time / softirq / IRQ counter rows. One row per CPU enumerated by crate::monitor::dump::CpuTimeCapture. Empty when the capture was not wired or symbol/BTF resolution failed.

pub fn per_cpu_time_at(&self, cpu: u32) -> Option<&'a PerCpuTimeStats>

Per-CPU CPU-time row for CPU cpu, looked up by the cpu field on each PerCpuTimeStats (not by vec position). Returns None when no row matches — typical when the walker skipped that CPU, the capture didn’t run, or cpu exceeded the topology. Returns the first match in walker enumeration order if cpu appears more than once.
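
Looking up by the cpu field rather than by position matters when the walker skips CPUs. A minimal model, with a hypothetical CpuRow type and busy_ns field standing in for PerCpuTimeStats:

```rust
/// Hypothetical row type; only the `cpu` key mirrors the description above.
#[derive(Debug, PartialEq)]
pub struct CpuRow {
    pub cpu: u32,
    pub busy_ns: u64,
}

/// First row whose `cpu` field matches, in enumeration order.
/// Indexing `rows[cpu]` would be wrong as soon as a CPU is skipped.
pub fn row_for_cpu(rows: &[CpuRow], cpu: u32) -> Option<&CpuRow> {
    rows.iter().find(|row| row.cpu == cpu)
}
```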

pub fn cgroup_psi(&self) -> &'a [CgroupPsiStat]

Per-cgroup PSI-irq rows for the test’s workload cgroups, host-walked from the cgroup hierarchy at this freeze (Phase A). One row per workload-root leaf cgroup with per-cgroup PSI accounting enabled. Empty when the capture was not wired, the workload root isn’t present yet, or psi_cgroups_enabled is off (loud-absent). Values are raw; they are decoded and folded at the metric layer (see crate::monitor::cgroup_walk::CgroupPsiStat).

pub fn per_node_numa(&self) -> &'a [PerNodeNumaStats]

Per-NUMA-node event counter rows captured from pglist_data->node_zones[]->vm_numa_event[]. Empty until the host-side NUMA walker lands (see FailureDumpReport::per_node_numa_unavailable).

pub fn per_node_numa_at(&self, node: u32) -> Option<&'a PerNodeNumaStats>

Per-NUMA-node event-counter row for node, looked up by the node field on each PerNodeNumaStats. Returns None when no row matches. Returns the first match in walker enumeration order if node appears more than once.

pub fn task_enrichments(&self) -> &'a [TaskEnrichment]

Per-task failure-dump enrichments — identity (pid, tgid, comm), process tree, scheduling priority, sched_class name, context-switch counters, watchdog disambiguation, lock slowpath stack matches. Empty when no task walker ran (see FailureDumpReport::task_enrichments_unavailable).

pub fn task_enrichment_by_pid(&self, pid: i32) -> Option<&'a TaskEnrichment>

Look up the enrichment for pid. The returned reference matches the first task whose task_struct.pid equals pid in walker enumeration order. Returns None when no task with that pid was captured. Production captures dedupe by task_kva before push, so duplicate-pid rows do not occur in real dumps.

pub fn prog_runtime_stats(&self) -> &'a [ProgRuntimeStats]

Per-program BPF runtime stats — invocation count, total ns, recursion misses. One entry per struct_ops program reached by the prog walker. Empty when no struct_ops programs are loaded or the prog accessor was unavailable (see FailureDumpReport::prog_runtime_stats_unavailable).

pub fn prog_runtime_stats_by_name( &self, name: &str, ) -> Option<&'a ProgRuntimeStats>

Look up the runtime stats for the program registered with name (kernel-side bpf_prog->aux->name). Returns None when no program with that name was captured. Returns the first match in walker enumeration order if name appears more than once — struct_ops programs in real captures use distinct callback names (select_cpu, enqueue, etc.) so duplicates do not occur in production.

pub fn probe_counters(&self) -> Option<&'a ProbeBssCounters>

Probe BPF program’s per-CPU diagnostic counter snapshot. None when the probe’s .bss map isn’t enumerated (probe not loaded), the program BTF can’t be parsed, or the array’s offset doesn’t resolve. A populated trigger_count > 0 is the structural signal that the tp_btf/sched_ext_exit handler fired during the run.

pub fn scx_walker_unavailable(&self) -> Option<&'a str>

Diagnostic reason recorded when Self::rq_scx_states / Self::dsq_states / Self::scx_sched_state could not be populated. None when the walker fully succeeded; otherwise Some(reason) (e.g. "scx_root null", "no scx walker", or a partial-degradation string from the dump pipeline).
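
A test can combine a walker-backed accessor with its diagnostic companion to tell "empty" apart from "unavailable" and from "partially degraded". The helper below is a hypothetical sketch that takes the row count and the reason as plain values, so it does not depend on the crate's types.

```rust
/// Summarises a walker-backed accessor from its row count and the
/// companion diagnostic reason.
pub fn describe_capture(row_count: usize, unavailable: Option<&str>) -> String {
    match (row_count, unavailable) {
        (0, Some(reason)) => format!("no rows: {reason}"),
        (0, None) => "no rows, no reason recorded".to_string(),
        // Rows plus a reason models the partial-degradation case.
        (n, Some(reason)) => format!("{n} rows, degraded: {reason}"),
        (n, None) => format!("{n} rows"),
    }
}
```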

pub fn task_enrichments_unavailable(&self) -> Option<&'a str>

Diagnostic reason recorded when Self::task_enrichments could not be populated. None when the walker yielded at least one enrichment; otherwise Some(reason) (e.g. "no task walker available", "task walker yielded zero tasks").

pub fn prog_runtime_stats_unavailable(&self) -> Option<&'a str>

Diagnostic reason recorded when Self::prog_runtime_stats could not be populated. None when the walker yielded at least one program; otherwise Some(reason) (e.g. "prog accessor unavailable", "no struct_ops programs loaded").

pub fn per_node_numa_unavailable(&self) -> Option<&'a str>

Diagnostic reason recorded when Self::per_node_numa could not be populated — typically "no NUMA walker" until the host-side walker lands.

pub fn sdt_alloc_unavailable(&self) -> Option<&'a str>

Diagnostic reason recorded when the SDT allocator snapshot (still escape-only via Self::report) could not be populated.

Trait Implementations§

impl<'a> Clone for Snapshot<'a>

fn clone(&self) -> Snapshot<'a>

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl<'a> Debug for Snapshot<'a>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

impl<'a> Freeze for Snapshot<'a>

impl<'a> RefUnwindSafe for Snapshot<'a>

impl<'a> Send for Snapshot<'a>

impl<'a> Sync for Snapshot<'a>

impl<'a> Unpin for Snapshot<'a>

impl<'a> UnwindSafe for Snapshot<'a>

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> PolicyExt for T
where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.

impl<T> Same for T

type Output = T

Should always be Self

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<T> MaybeSend for T
where T: Send,