Struct FailureDumpReport

#[non_exhaustive]
pub struct FailureDumpReport {
    pub schema: String,
    pub maps: Vec<FailureDumpMap>,
    pub vcpu_regs: Vec<Option<VcpuRegSnapshot>>,
    pub active_obj_name: Option<String>,
    pub active_map_kvas: Vec<u64>,
    pub sdt_allocations: Vec<SdtAllocatorSnapshot>,
    pub scx_static_ranges: ScxStaticSnapshot,
    pub sdt_alloc_unavailable: Option<String>,
    pub prog_runtime_stats: Vec<ProgRuntimeStats>,
    pub prog_runtime_stats_unavailable: Option<String>,
    pub per_cpu_time: Vec<PerCpuTimeStats>,
    pub cgroup_psi: Vec<CgroupPsiStat>,
    pub per_node_numa: Vec<PerNodeNumaStats>,
    pub per_node_numa_unavailable: Option<String>,
    pub task_enrichments: Vec<TaskEnrichment>,
    pub task_enrichments_unavailable: Option<String>,
    pub event_counter_timeline: Vec<EventCounterSample>,
    pub rq_scx_states: Vec<RqScxState>,
    pub dsq_states: Vec<DsqState>,
    pub scx_sched_state: Option<ScxSchedState>,
    pub scx_walker_unavailable: Option<String>,
    pub vcpu_perf_at_freeze: Vec<Option<VcpuPerfSample>>,
    pub dump_truncated_at_us: Option<u64>,
    pub maps_truncated: u32,
    pub probe_counters: Option<ProbeBssCounters>,
    pub is_placeholder: bool,
}

Top-level failure-dump report. One per freeze trigger.

Fields (Non-exhaustive)§

This struct is marked as non-exhaustive

Non-exhaustive structs could have additional fields added in future. Therefore, non-exhaustive structs cannot be constructed in external crates using the traditional Struct { .. } syntax; cannot be matched against without a wildcard ..; and struct update syntax will not work.
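As a rough illustration of what this means for downstream code, the sketch below uses a hypothetical stand-in struct named Report (not the real FailureDumpReport) with three of the fields listed above. It builds a value from Default instead of a struct literal, and destructures with a trailing `..` so that fields added later do not break the pattern. Within a single crate the attribute is not enforced, so the block only demonstrates the forms that remain valid from an external crate.

```rust
/// Hypothetical stand-in for a struct that a library exports as non-exhaustive.
#[non_exhaustive]
#[derive(Debug, Default, Clone)]
pub struct Report {
    pub schema: String,
    pub maps_truncated: u32,
    pub is_placeholder: bool,
}

/// Build a value without a struct literal: start from Default, then assign
/// the public fields one by one.
pub fn build_report(truncated: u32) -> Report {
    let mut r = Report::default();
    r.schema = "single".to_string();
    r.maps_truncated = truncated;
    r
}

/// Destructure with a trailing `..` so that newly added fields do not break
/// the pattern.
pub fn summarize(r: &Report) -> String {
    let Report { schema, maps_truncated, .. } = r;
    format!("{schema}: {maps_truncated} map(s) truncated")
}
```

For the real type, the Default impl and the placeholder constructor listed under Implementations are the entry points that play the role of build_report here.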

§schema: String

Wire-format discriminant. Always "single" for this variant, pinning SCHEMA_SINGLE. Consumers branch on this to choose between FailureDumpReport, DualFailureDumpReport, and DegradedFailureDumpReport before deserializing. Single and Dual share top-level field names that would collide without an explicit tag; Degraded carries a distinct field set (reason, watchpoint_hit, bss_latch_state, exit_kind, elapsed_ms) but still gets the tag so FailureDumpReportAny::from_json can dispatch uniformly.
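A minimal sketch of branching on the tag before choosing a deserialization target. The enum ReportKind and the function classify_schema are hypothetical names introduced for illustration; only the value "single" is taken from the description above, and the tag strings of the other variants are deliberately not guessed. In the real crate, FailureDumpReportAny::from_json is described as performing this dispatch.

```rust
/// Hypothetical classification of a report by its `schema` tag.
#[derive(Debug, PartialEq, Eq)]
pub enum ReportKind {
    /// The tag documented for FailureDumpReport.
    Single,
    /// Any other tag; the caller decides which type to try next.
    Other(String),
}

/// Inspect the tag first, then pick a concrete type to deserialize into.
pub fn classify_schema(schema: &str) -> ReportKind {
    match schema {
        "single" => ReportKind::Single,
        other => ReportKind::Other(other.to_string()),
    }
}
```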

§maps: Vec<FailureDumpMap>

One entry per BPF map enumerated. Order matches the IDR walk (i.e. allocation order); the report is otherwise unsorted so callers that want a stable view should sort by name.
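A caller that wants the stable view mentioned above could sort a copy of the entries by name. The sketch uses a hypothetical MapEntry stand-in with assumed fields name and id; the real FailureDumpMap layout is not shown on this page. A stable sort keeps the original (IDR walk) order among entries that share a name.

```rust
/// Hypothetical stand-in for one map entry; field names are assumptions.
#[derive(Debug, Clone)]
pub struct MapEntry {
    pub name: String,
    pub id: u32,
}

/// Sort entries by name. `sort_by` is stable, so entries with equal names
/// keep their original relative order.
pub fn sort_by_name(maps: &mut [MapEntry]) {
    maps.sort_by(|a, b| a.name.cmp(&b.name));
}
```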

§vcpu_regs: Vec<Option<VcpuRegSnapshot>>

Per-vCPU register snapshots captured on each vCPU thread at freeze time. Index matches vCPU id (BSP at 0, APs at 1..N). None when a vCPU never parked (rendezvous timeout) or its KVM_GET_REGS failed mid-shutdown. Attached to the report by the freeze coordinator after dump_state returns.
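Because the index doubles as the vCPU id, a consumer can recover which vCPUs have no snapshot by enumerating the vector. The helper below is a sketch with a hypothetical name (missing_vcpus) and is generic over the snapshot type, so it does not depend on the layout of VcpuRegSnapshot.

```rust
/// Return the ids of vCPUs whose slot is `None`. The slice index is the
/// vCPU id (0 is the BSP).
pub fn missing_vcpus<T>(regs: &[Option<T>]) -> Vec<usize> {
    regs.iter()
        .enumerate()
        .filter(|(_, slot)| slot.is_none())
        .map(|(id, _)| id)
        .collect()
}
```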

§active_obj_name: Option<String>

Obj name of the currently-attached scheduler, identified by matching each BPF_MAP_TYPE_STRUCT_OPS map’s value_kva (the guest-KVA of its kvalue.data payload) against the dereferenced *scx_root value (the guest-KVA of the active struct scx_sched, which is also the KVA of scx_sched.ops since ops sits at offset 0). When the match succeeds, the struct_ops map’s name carries the obj prefix (libbpf convention: <obj>.<struct_ops_var>); the prefix is split at the first . and stored here.

None when:

scx_sched_state is unavailable (no scheduler attached, BTF missing the scx_sched type, or *scx_root could not be resolved at capture time).
No BPF_MAP_TYPE_STRUCT_OPS map had value_kva matching the active sched_kva (capture race during a mid-attach window, or the struct_ops map’s value_kva was not yet populated).
The matched map’s name lacks a <obj>. prefix.

crate::scenario::snapshot::Snapshot::active uses this as the principled tiebreaker when the projection sees multiple obj prefixes in global-section maps. On None the consumer falls back to the prefix-grouping heuristic (single obj → that one; multiple obj → NoActiveScheduler with a diagnostic naming the observed obj_names and the walker’s failure cause).
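The prefix split described above can be sketched with `str::split_once`. The function name obj_prefix is hypothetical, and treating an empty prefix (a name that starts with a dot) as "no prefix" is an assumption of this sketch, not something the description specifies.

```rust
/// Split a struct_ops map name of the form `<obj>.<struct_ops_var>` at the
/// first '.' and return the obj part. Returns None when there is no '.'.
/// Assumption: an empty prefix is also treated as missing.
pub fn obj_prefix(map_name: &str) -> Option<&str> {
    match map_name.split_once('.') {
        Some((prefix, _rest)) if !prefix.is_empty() => Some(prefix),
        _ => None,
    }
}
```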

§active_map_kvas: Vec<u64>

Guest-KVAs of every struct bpf_map belonging to the currently-attached scheduler’s loaded BPF object, captured alongside Self::active_obj_name from the same *scx_root → struct_ops map → owning bpf_prog → used_maps walk. The walker enumerates the matched struct_ops prog’s used_maps array and records each entry’s KVA so a downstream filter can identify the active scheduler’s maps uniquely — even when two scheduler instances loaded from the SAME binary coexist post-crate::scenario::ops::Op::ReplaceScheduler (where their bss / data / rodata maps share the <obj_name>. prefix and cannot be distinguished by name alone).

Empty when:

Self::active_obj_name is None (walker did not resolve the active obj — same reasons; see that field’s doc).
The matched prog’s used_maps could not be safely read (torn race per the kernel’s used_map_cnt / used_maps pointer publication TOCTOU described at monitor/bpf_prog.rs::find_active_struct_ops_obj_no_target).
The walker ran but the kernel published an empty used_maps for the active prog (no maps registered — an unusual but legal sched_ext shape).

KVA aliasing caveat: kernel BPF map allocations are vmalloc/slab-backed; a freed map’s KVA can be reassigned to a new allocation across captures. crate::scenario::snapshot::Snapshot::active combines this set with Self::active_obj_name (both must match) to reject the aliasing case — a KVA hit whose owning map name does not share the active obj prefix is treated as stale and falls through to the obj-prefix heuristic.

Within-run identity ONLY. KVAs reflect kernel address space allocation at capture time (subject to KASLR slide). Stable for the life of the map within a single VM run; NOT comparable across runs. Never persist or compare against checked-in baselines.
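The "both must match" rule from the aliasing caveat can be sketched as a predicate over a map's KVA and name. MapEntry and is_active_map are hypothetical names, and the sketch covers only the acceptance check; the fall-through to the obj-prefix heuristic is not modeled.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for one dumped map; field names are assumptions.
pub struct MapEntry {
    pub name: String,
    pub kva: u64,
}

/// Accept a map only when its KVA is in the active set AND its name carries
/// the active obj prefix. A KVA hit under a different prefix is treated as a
/// stale (aliased) address and rejected.
pub fn is_active_map(map: &MapEntry, active_kvas: &HashSet<u64>, active_obj: &str) -> bool {
    let prefix_matches = map
        .name
        .split_once('.')
        .map_or(false, |(prefix, _)| prefix == active_obj);
    active_kvas.contains(&map.kva) && prefix_matches
}
```

As the paragraph above notes, the KVA set is only meaningful within a single VM run, so a filter like this should be rebuilt per report and never persisted.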

§sdt_allocations: Vec<SdtAllocatorSnapshot>

Structured per-allocation views from sdt_alloc-backed allocators. One entry per discovered allocator; each carries every live leaf slot (capped at super::sdt_alloc::MAX_SDT_ALLOC_ENTRIES) BTF-rendered to named field views. Empty when no scheduler-side allocator could be located, when arena offsets / sdt_alloc offsets are absent, or when the program BTF lacks the scx_allocator type (scheduler doesn’t link lib/sdt_alloc.bpf.c).

Populated alongside the page-granular ArenaSnapshot in each map: a consumer can read either representation depending on whether they want raw bytes or named-field allocations.

§scx_static_ranges: ScxStaticSnapshot

Live scx_static bump-allocator regions discovered in .bss. One entry per struct scx_static instance with memory != 0 and an in-range (off, max_alloc_bytes) pair. Distinct from Self::sdt_allocations: scx_static is a program-lifetime bump allocator (lib/sdt_alloc.bpf.c:577) with no per-allocation header, so the surfaced view is region-granular ranges rather than per-slot named allocations. Empty when the scheduler doesn’t link lib/sdt_alloc.bpf.c, when the program BTF lacks struct scx_static, or when no scx_static instance has been initialised at freeze time.

The dump pipeline uses the same ranges to populate the renderer’s ScxStaticRangeIndex so a deferred-resolve arena chase whose target lives inside scx_static memory can fail-closed cleanly (no per-slot type recovery is possible without a per-call-site type hook from cast analysis).

§sdt_alloc_unavailable: Option<String>

Diagnostic reason for sdt_allocations being empty.

None → either the pre-pass ran and produced records (the vec is non-empty), or the pre-pass ran cleanly but the scheduler simply has no live allocations (the vec is empty for legitimate reasons that aren’t worth a diagnostic). Default.
Some(REASON_SDT_ALLOC_*) → the pre-pass skipped before it could surface any allocator state. The string identifies which prerequisite was missing: deadline exhaustion, unaligned user_vm_start, missing scheduler .bss, missing arena snapshot, scheduler BTF without struct scx_allocator, or no .bss scx_allocator instance.

Distinct from dump_truncated_at_us (which records deadline truncation across the whole dump) and from Self::scx_static_ranges (which has its own walker independent of the typed-allocator pre-pass). Mirrors the prog_runtime_stats_unavailable / task_enrichments_unavailable pattern.
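Since several fields follow this vec-plus-reason pattern, a consumer might fold the pair into one value before reporting. The sketch below is generic and uses hypothetical names (Availability, availability). Treating a non-empty vec as populated regardless of the reason field is an assumption; the description implies that combination does not occur.

```rust
/// Hypothetical three-way reading of a (vec, *_unavailable) pair.
#[derive(Debug, PartialEq, Eq)]
pub enum Availability<'a> {
    /// The vec holds this many entries.
    Populated(usize),
    /// Empty with no diagnostic: nothing to report, or the pass did not run.
    EmptyNoReason,
    /// Empty, and the producer recorded why.
    Unavailable(&'a str),
}

/// Combine the data vec with its optional diagnostic string.
pub fn availability<'a, T>(items: &[T], reason: Option<&'a str>) -> Availability<'a> {
    match (items.is_empty(), reason) {
        (false, _) => Availability::Populated(items.len()),
        (true, Some(r)) => Availability::Unavailable(r),
        (true, None) => Availability::EmptyNoReason,
    }
}
```

With the real report, the second argument would come from something like `report.sdt_alloc_unavailable.as_deref()`.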

§prog_runtime_stats: Vec<ProgRuntimeStats>

Per-program BPF runtime stats summed across CPUs at freeze time (cnt, nsecs, misses). One entry per discovered struct_ops BPF program. Empty when no struct_ops programs are loaded OR when the prog accessor was unavailable to dump_state — see Self::prog_runtime_stats_unavailable for the reason.

Per-CPU offset resolution failure does NOT empty the vec — each program still contributes one entry, but with cnt/nsecs/misses summed only over CPUs whose per-CPU bpf_prog_stats slot translated successfully (out-of-range CPUs return None per super::bpf_map::read_percpu_array_value semantics).

See super::bpf_prog::ProgRuntimeStats for field semantics and the kernel-source-grounded provenance of each counter.
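The partial-sum behavior described above (unreadable per-CPU slots are skipped rather than emptying the result) can be sketched as follows. Stats and sum_readable are hypothetical names; the three counters mirror cnt, nsecs and misses, and their u64 width is an assumption.

```rust
/// Hypothetical per-CPU counter triple; u64 widths are an assumption.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Stats {
    pub cnt: u64,
    pub nsecs: u64,
    pub misses: u64,
}

/// Sum the counters over CPUs whose slot could be read (`Some`), and report
/// how many slots were skipped (`None`).
pub fn sum_readable(per_cpu: &[Option<Stats>]) -> (Stats, usize) {
    let mut total = Stats::default();
    let mut skipped = 0;
    for slot in per_cpu {
        match slot {
            Some(s) => {
                total.cnt = total.cnt.saturating_add(s.cnt);
                total.nsecs = total.nsecs.saturating_add(s.nsecs);
                total.misses = total.misses.saturating_add(s.misses);
            }
            None => skipped += 1,
        }
    }
    (total, skipped)
}
```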

§prog_runtime_stats_unavailable: Option<String>

Diagnostic reason for prog_runtime_stats being empty.

Distinguishes the three causes a consumer can’t otherwise tell apart from an empty vec:

None (field absent on wire) → vec was populated normally (or the dump path didn’t run). Default.
Some("no struct_ops programs loaded") → walker ran, no struct_ops programs were in prog_idr at freeze time.
Some("prog accessor unavailable") → caller passed prog_capture: None. Typical causes: prog_idr symbol missing, BpfProgOffsets BTF parse failed, or __per_cpu_offset resolution didn’t yield non-zero offsets yet (still-booting guest).

Set by dump_state only when prog_runtime_stats ends up empty AND a definite cause is identifiable; left None otherwise so the field stays absent in the JSON for already-populated dumps.

§per_cpu_time: Vec<PerCpuTimeStats>

Per-CPU CPU-time / softirq / IRQ counters captured from kernel_cpustat, kernel_stat, and (under NO_HZ) tick_sched. One entry per CPU enumerated by the walker. Empty when the dump caller passed no CpuTimeCapture or when symbol/BTF resolution failed.

See PerCpuTimeStats for field semantics. Surfaces the per-CPU interrupt and idle-time data the failure dump otherwise leaves implicit (the existing scx walker reads rq->nr_iowait but not the cumulative time accounting).

§cgroup_psi: Vec<CgroupPsiStat>

Per-cgroup PSI-irq samples for the test’s workload cgroups, host-walked from the cgroup hierarchy at this freeze (Phase A). One entry per workload-root leaf cgroup with per-cgroup PSI accounting enabled. Empty when the dump caller passed no CgroupPsiCapture, the workload root isn’t present yet, or psi_cgroups_enabled is off — loud-absent. RAW values; decoded + folded at the metric layer. See super::cgroup_walk::CgroupPsiStat.

§per_node_numa: Vec<PerNodeNumaStats>

Per-node NUMA event counters captured from pglist_data->node_zones[]->vm_numa_event[]. One row per NUMA node enumerated by the walker. Empty when the live walker has not landed yet (the BTF offsets and wire shape are wired; the reader is a follow-up).

See PerNodeNumaStats for field semantics; see Self::per_node_numa_unavailable for the “why empty” diagnostic.

§per_node_numa_unavailable: Option<String>

Diagnostic reason for per_node_numa being empty. None when the vec was populated normally (or the dump path didn’t run); Some(REASON_NO_NUMA_WALKER) until the host-side walker lands.

§task_enrichments: Vec<TaskEnrichment>

Per-task failure-dump enrichments — identity (pid, tgid, comm), process tree (group_leader, real_parent, pgid, sid, nr_threads), scheduling (prio family, sched_class name, scx.weight, core_cookie), context-switch counters, watchdog disambiguation flag, and lock-slowpath stack matches.

One entry per task the dump path’s task walker reaches — today’s task walkers are the rq->scx walker and the DSQ walker; both produce task KVAs that get enriched here. Empty when no task walker ran (typical until walker dispatch lands) or when the TaskEnrichmentCapture was absent.

See super::task_enrichment::TaskEnrichment for field semantics; see Self::task_enrichments_unavailable for the “why empty” diagnostic.

§task_enrichments_unavailable: Option<String>

Diagnostic reason for task_enrichments being empty.

None → vec was populated normally (or the dump path didn’t run).
Some("no task walker available") → the TaskEnrichmentCapture was missing from DumpContext. Until DSQ + rq->scx walker dispatch lands, this is the expected steady state for the dump pipeline; the offsets + walker library is wired and ready to populate as soon as a task-list producer hooks in.
Some("task walker yielded zero tasks") → walker produced no task KVAs (frozen guest with no runnable / queued scx tasks at the dump instant — possible on a completely-idle stall trigger).

§event_counter_timeline: Vec<EventCounterSample>

Per-monitor-tick SCX_EV_* event counter timeline. Each entry is the cross-CPU sum of the 13 SCX_EV_* counters at one monitor sample. Empty when the dump caller passed no EventCounterCapture or no sample reported event counters (event-stat offsets unresolved, scx_root unset). Renderers build sparklines / per-counter delta plots from this vec.

See EventCounterSample for field semantics; the kernel-source provenance lives in the super::ScxEventCounters field docs.
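A renderer building a per-counter delta plot would typically difference consecutive samples. The sketch below works on one counter extracted from the timeline as a plain slice and assumes the values are cumulative; the function name deltas is hypothetical. `saturating_sub` yields 0 rather than panicking if a sample ever reads lower than its predecessor.

```rust
/// Per-tick deltas for a single cumulative counter. The result has one fewer
/// element than the input (empty for zero or one sample).
pub fn deltas(samples: &[u64]) -> Vec<u64> {
    samples
        .windows(2)
        .map(|w| w[1].saturating_sub(w[0]))
        .collect()
}
```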

§rq_scx_states: Vec<RqScxState>

Per-CPU rq->scx snapshots — scalar fields the kernel’s own scx_dump_state reads plus the runnable_list per-task KVAs that fed into the per-task enrichment capture. One entry per CPU walked. Empty when the ScxWalkerCapture was absent or every CPU’s translate failed.

See super::scx_walker::RqScxState for field semantics.

§dsq_states: Vec<DsqState>

Per-DSQ snapshots — local, bypass, global, and user DSQs reachable from *scx_root. Each entry carries nr (depth), seq (BPF-iter counter), and the queued task KVAs. Surfaces data the kernel’s own scx_dump_state does not emit (per-DSQ depth enumeration), so this vec adds value even on a kernel that prints its own dump.

Empty when the ScxWalkerCapture was absent.

§scx_sched_state: Option<ScxSchedState>

Top-level scx_sched state captured from *scx_root: aborting flag, bypass_depth, exit_kind. None when no scheduler is attached or *scx_root was unreadable.

§scx_walker_unavailable: Option<String>

Diagnostic reason for rq_scx_states / dsq_states / scx_sched_state being absent. Mirrors the prog_runtime_stats_unavailable / task_enrichments_unavailable pattern.

§vcpu_perf_at_freeze: Vec<Option<VcpuPerfSample>>

Per-vCPU hardware perf counter snapshot captured at the instant the failure dump fired. One entry per vCPU; index matches vCPU id (0 = BSP, 1..N = APs). None per-entry when the freeze-time read(2) failed for that vCPU. Empty vec when DumpContext::perf_capture was None (perf unavailable on this host) or the read errored wholesale.

exclude_host=1 means each counter ticks only during guest execution; the values here record the cumulative count from the start of the run. Diff against any super::CpuSnapshot::vcpu_perf in the monitor timeline to recover the count over a freeze-aligned window. See super::perf_counters::VcpuPerfSample for field semantics and the multiplexing math.
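The diff described above can be sketched per vCPU for a single counter. The function name window_counts is hypothetical, and the sketch works on raw u64 values; it does not apply the multiplexing scaling that the VcpuPerfSample documentation covers. A vCPU with a missing reading on either side, or with a freeze value lower than the earlier one, yields None.

```rust
/// Count accumulated between an earlier monitor sample and the freeze-time
/// sample, per vCPU. The slice index is the vCPU id. If the slices differ in
/// length, the result covers only the common prefix.
pub fn window_counts(earlier: &[Option<u64>], at_freeze: &[Option<u64>]) -> Vec<Option<u64>> {
    earlier
        .iter()
        .zip(at_freeze.iter())
        .map(|(a, b)| match (a, b) {
            (Some(a), Some(b)) => b.checked_sub(*a),
            _ => None,
        })
        .collect()
}
```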

§dump_truncated_at_us: Option<u64>

Microseconds from dump_state entry to the phase that exceeded the soft deadline supplied via DumpContext::deadline. None when no deadline was supplied, when every phase finished within the deadline, or when the deadline check happened before the dump started any heavy phase. A Some(us) value means the dump truncated remaining work (skipped further maps / tasks / walkers) at that elapsed offset to keep the freeze window bounded — the freeze coordinator’s parked vCPUs cannot service guest IRQs or MMIO traps while the dump is running, so unbounded dump latency stretches every guest’s KVM_RUN pause and risks the freeze rendezvous timeout firing on the next iteration.

§maps_truncated: u32

Count of scheduler-under-test maps the per-map render loop skipped because the soft deadline had already been crossed (dump_truncated_at_us records WHEN, this records HOW MANY). 0 on a complete dump. A skipped map is absent from maps entirely — without this count a consumer reading maps cannot tell “the scheduler has N maps” from “the scheduler has N+k maps but k were dropped by truncation”, so a degraded dump would silently under-report map state. Excludes ktstr’s own framework maps, which are filtered before the deadline check and never counted here.
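A consumer can combine the length of maps with this count to state how many scheduler maps existed at freeze time. MapCoverage and map_coverage are hypothetical names used for illustration only.

```rust
/// Hypothetical summary of how much of the scheduler's map set was rendered.
#[derive(Debug, PartialEq, Eq)]
pub enum MapCoverage {
    /// No map was skipped by the deadline.
    Complete { rendered: usize },
    /// `skipped` maps were dropped; `at_us` is when truncation began, if known.
    Truncated { rendered: usize, skipped: u32, at_us: Option<u64> },
}

impl MapCoverage {
    /// Rendered plus skipped maps.
    pub fn total(&self) -> usize {
        match self {
            MapCoverage::Complete { rendered } => *rendered,
            MapCoverage::Truncated { rendered, skipped, .. } => rendered + *skipped as usize,
        }
    }
}

/// Fold `maps.len()`, `maps_truncated` and `dump_truncated_at_us` together.
pub fn map_coverage(rendered: usize, maps_truncated: u32, at_us: Option<u64>) -> MapCoverage {
    if maps_truncated == 0 {
        MapCoverage::Complete { rendered }
    } else {
        MapCoverage::Truncated { rendered, skipped: maps_truncated, at_us }
    }
}
```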

§probe_counters: Option<ProbeBssCounters>

Probe BPF program’s per-CPU diagnostic counter snapshot (see ProbeBssCounters). Populated by the host-side reader in decode_probe_counters_snapshot which sums each KTSTR_PCPU_* slot across CPUs. None when the probe .bss map isn’t enumerated (probe not loaded), the program BTF can’t be parsed, or the array’s offset doesn’t resolve.

A populated trigger_count > 0 is the structural signal that the BPF tp_btf/sched_ext_exit handler fired during the run — distinct from the boolean trigger_fired flag in super::probe::process::ProbeDiagnostics (which also records host-side observations like a watchdog teardown). The cross-product is the failure-dump E2E test’s structural assertion: a stall scenario must show both flag=true AND trigger_count > 0, otherwise the probe attached without firing or fired without the host observing.
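The cross-product of the two signals can be written out as a four-way classification. ProbeVerdict and probe_verdict are hypothetical names, and the u64 type for trigger_count is an assumption; the two mismatch cases follow the wording above.

```rust
/// Hypothetical outcome of comparing the host-side flag with the BPF-side count.
#[derive(Debug, PartialEq, Eq)]
pub enum ProbeVerdict {
    /// flag == true and count > 0: what a stall scenario is expected to show.
    FiredAndObserved,
    /// flag == true but count == 0: the probe attached without firing.
    AttachedWithoutFiring,
    /// flag == false but count > 0: the probe fired without the host observing.
    FiredUnobserved,
    /// Neither signal is set.
    Quiet,
}

/// Classify a (trigger_fired, trigger_count) pair.
pub fn probe_verdict(trigger_fired: bool, trigger_count: u64) -> ProbeVerdict {
    match (trigger_fired, trigger_count > 0) {
        (true, true) => ProbeVerdict::FiredAndObserved,
        (true, false) => ProbeVerdict::AttachedWithoutFiring,
        (false, true) => ProbeVerdict::FiredUnobserved,
        (false, false) => ProbeVerdict::Quiet,
    }
}
```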

§is_placeholder: bool

true when this report was produced by Self::placeholder — i.e. the capture pipeline could not produce real data (typical cause: freeze rendezvous timed out). Periodic-sample temporal assertions skip placeholder reports rather than treating their empty vectors as “no progress” signals; the *_unavailable fields carry the reason string for human consumers, but the boolean flag is the machine-checkable discriminant a pattern can branch on without re-deriving placeholder status from the absence of every field.
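A temporal assertion that skips placeholders might filter on the flag before comparing samples. The sketch uses a hypothetical Sample stand-in with an assumed progress value; only the is_placeholder field corresponds to the real report.

```rust
/// Hypothetical stand-in for one periodic sample.
pub struct Sample {
    pub is_placeholder: bool,
    /// Assumed per-sample progress metric, for illustration only.
    pub progress: u64,
}

/// Keep only samples backed by real capture data, so that a placeholder's
/// empty or zero values are never read as "no progress".
pub fn real_progress(samples: &[Sample]) -> Vec<u64> {
    samples
        .iter()
        .filter(|s| !s.is_placeholder)
        .map(|s| s.progress)
        .collect()
}
```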

Implementations§

impl FailureDumpReport

pub fn placeholder(reason: impl Into<String>) -> Self

Trait Implementations§

impl Clone for FailureDumpReport

fn clone(&self) -> FailureDumpReport

fn clone_from(&mut self, source: &Self)

impl Debug for FailureDumpReport

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Default for FailureDumpReport

fn default() -> Self

impl<'de> Deserialize<'de> for FailureDumpReport

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where __D: Deserializer<'de>,

impl Display for FailureDumpReport

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Serialize for FailureDumpReport

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error> where __S: Serializer,

Auto Trait Implementations§

impl Freeze for FailureDumpReport

impl RefUnwindSafe for FailureDumpReport

impl Send for FailureDumpReport

impl Sync for FailureDumpReport

impl Unpin for FailureDumpReport

impl UnwindSafe for FailureDumpReport

Blanket Implementations§

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

impl<T> From<T> for T

fn from(t: T) -> T

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

fn in_current_span(self) -> Instrumented<Self>

impl<T, U> Into<U> for T where U: From<T>,

fn into(self) -> U

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self> where F: FnOnce(&Self) -> bool,

impl<T> Pointable for T

const ALIGN: usize

type Init = T

unsafe fn init(init: <T as Pointable>::Init) -> usize

unsafe fn deref<'a>(ptr: usize) -> &'a T

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

unsafe fn drop(ptr: usize)

impl<T> PolicyExt for T where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P> where T: Policy<B, E>, P: Policy<B, E>,

fn or<P, B, E>(self, other: P) -> Or<T, P> where T: Policy<B, E>, P: Policy<B, E>,

impl<T> Same for T

type Output = T

impl<T> ToOwned for T where T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T> ToString for T where T: Display + ?Sized,

fn to_string(&self) -> String

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

impl<V, T> VZip<V> for T where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self> where S: Into<Dispatch>,

fn with_current_subscriber(self) -> WithDispatch<Self>

impl<T> DeserializeOwned for T where T: for<'de> Deserialize<'de>,

impl<T> MaybeSend for T where T: Send,
