Advisory flock(2) primitives shared across every ktstr lock file.
ktstr uses advisory flock(2) in four places:
- LLC reservation locks at {lock_dir}/ktstr-llc-{N}.lock and per-CPU locks at {lock_dir}/ktstr-cpu-{C}.lock, where lock_dir is resolved by crate::cache::resolve_lock_dir (KTSTR_LOCK_DIR env var, fallback /tmp). See crate::vmm::host_topology::acquire_resource_locks and friends.
- Per-cache-entry coordination locks at {cache_root}/.locks/{cache_key}.lock (see crate::cache::CacheDir::acquire_shared_lock and friends).
- Per-source-tree build locks at {cache_root}/.locks/source-{path_hash}.lock (see crate::cli::kernel_build::build::acquire_source_tree_lock), which serialize concurrent make invocations against the same kernel source checkout.
- Observational enumeration from ktstr locks --json: a read-only scan that does NOT acquire flocks; it reads /proc/locks through read_holders to attribute holders without contending with active acquirers.
All four share:
- Non-blocking LOCK_NB attempt (the cache-entry path wraps this in a poll loop for timed-wait semantics).
- O_CLOEXEC on every open so the kernel's "release flock when the last fd referring to the OFD closes" invariant matches what OwnedFd::drop does: a leaked fd across exec(2) would keep the lock alive in the child and fool the next acquirer's /proc/locks scan into naming the wrong pid.
- /proc/locks parsing keyed on the mount-point-derived {major:02x}:{minor:02x}:{inode} triple, resolved via /proc/self/mountinfo (not stat().st_dev; see below).
- HolderInfo with pid + truncated /proc/{pid}/cmdline for actionable error messages.
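As an illustration of the /proc/locks keying described above, here is a minimal sketch of a scanner that takes the file contents and a needle and returns the holder PIDs. The function name holders_for_needle is hypothetical and is not the crate's read_holders. It assumes the usual /proc/locks column layout (id, lock type, ADVISORY/MANDATORY, access, pid, major:minor:inode, start, end).

```rust
/// Hypothetical helper (not the crate's API): return the PIDs of FLOCK
/// entries in `/proc/locks`-formatted text whose device:inode field
/// equals `needle`, e.g. "00:2d:1234".
fn holders_for_needle(proc_locks: &str, needle: &str) -> Vec<u32> {
    proc_locks
        .lines()
        .filter_map(|line| {
            let f: Vec<&str> = line.split_whitespace().collect();
            // Assumed layout:
            //   "1:" FLOCK ADVISORY WRITE <pid> <maj:min:ino> <start> <end>
            // Blocked-waiter lines carry an extra "->" token after the id,
            // so f[1] is not "FLOCK" there and they are skipped.
            if f.len() >= 6 && f[1] == "FLOCK" && f[5] == needle {
                // A pid that does not parse as u32 (e.g. -1) is dropped.
                f[4].parse::<u32>().ok()
            } else {
                None
            }
        })
        .collect()
}
```

On a live system the first argument would come from something like std::fs::read_to_string("/proc/locks"); taking the text as a parameter keeps the scan a single pass and easy to test.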
§Module layout
Each submodule owns a single, cohesive subsystem:
- fs_filter: refuses to operate on filesystems where flock(2) is unreliable (NFS, CIFS/SMB, CephFS, AFS, FUSE).
- primitives: the kernel-syscall wrappers (try_flock / block_flock / materialize) that open a lockfile and request a flock operation.
- mountinfo: /proc/self/mountinfo parser and the {major:02x}:{minor:02x}:{inode} needle derivation that proc_locks keys off.
- proc_locks: /proc/locks scanner that enumerates the PIDs holding a given lockfile's flock.
- holder: converts a PID into a HolderInfo (reads /proc/{pid}/cmdline) and renders a &[HolderInfo] into a multi-line operator-facing string.
- acquire: high-level poll-with-timeout helper that wraps primitives::try_flock in a deadline loop and decorates timeout errors with the holder list from proc_locks and holder.
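The deadline loop that the acquire module is described as wrapping around a non-blocking attempt can be sketched generically. This is an assumption-laden illustration, not the crate's code: poll_until and its parameters are hypothetical names, and the closure stands in for a call such as try_flock that yields None while a peer holds the lock.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: retry a non-blocking `attempt` until it yields
/// `Some(_)` or `timeout` elapses. Returns `None` on timeout.
fn poll_until<T>(
    timeout: Duration,
    interval: Duration,
    mut attempt: impl FnMut() -> Option<T>,
) -> Option<T> {
    let deadline = Instant::now() + timeout;
    loop {
        // Always try at least once, even with a zero timeout.
        if let Some(value) = attempt() {
            return Some(value);
        }
        let now = Instant::now();
        if now >= deadline {
            return None;
        }
        // Sleep for the poll interval, but never past the deadline.
        sleep(interval.min(deadline - now));
    }
}
```

A caller would typically turn the None case into an error and attach the current holder list to it, which is the decoration step the acquire description mentions.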
§Why mountinfo, not stat().st_dev
/proc/locks emits i_sb->s_dev for each held flock — the
filesystem’s superblock device id. For most filesystems that
matches stat().st_dev, but on btrfs, overlayfs, and bind-mounts
the kernel installs a custom getattr implementation that returns
an anonymous device id (anon_dev) distinct from s_dev. That
divergence means the stat-derived needle would never match the
/proc/locks line — a naive read_holders would silently return
empty on every btrfs-backed /tmp, every overlay-rootfs
container, and every bind-mounted /tmp, which is a silent
correctness failure for --cpu-cap contention diagnostics and
the ktstr locks observational command.
Needle production (see mountinfo::needle_from_path):
mountinfo::needle_from_path resolves path to the mount-point
covering it via /proc/self/mountinfo (longest-prefix match on
the mount_point field), then reads the {major:minor} field of
that mount entry and combines it with stat().st_ino to form the
full triple. The mountinfo {major:minor} is the kernel's
i_sb->s_dev (printed in decimal there, in hex in /proc/locks), so
the resulting needle matches /proc/locks by construction. The needle feeds
proc_locks::read_holders_for_needle, which scans
/proc/locks exactly once and byte-compares.
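The needle derivation above can be sketched as a pure function over mountinfo text. This is a simplified illustration rather than the crate's needle_from_path: the name needle_for is hypothetical, the inode is passed in instead of being read via stat(), and octal escapes in mount points (such as \040 for a space) are not decoded.

```rust
use std::path::Path;

/// Hypothetical sketch: build a "{major:02x}:{minor:02x}:{inode}" needle
/// from `/proc/self/mountinfo`-formatted text, the path of the lockfile,
/// and its inode number.
fn needle_for(mountinfo: &str, path: &Path, inode: u64) -> Option<String> {
    let (_, dev) = mountinfo
        .lines()
        .filter_map(|line| {
            let mut fields = line.split_whitespace();
            let dev = fields.nth(2)?; // 3rd field: decimal "major:minor"
            let mount_point = fields.nth(1)?; // skip root, take 5th field
            Some((mount_point, dev))
        })
        // Component-wise prefix test: "/tmp" covers "/tmp/x" but not "/tmpx".
        .filter(|(mount_point, _)| path.starts_with(mount_point))
        // Longest mount point wins; on ties the later line is kept.
        .max_by_key(|(mount_point, _)| mount_point.len())?;
    let (major, minor) = dev.split_once(':')?;
    let major: u32 = major.parse().ok()?;
    let minor: u32 = minor.parse().ok()?;
    // /proc/locks prints the device numbers as two-digit hex.
    Some(format!("{major:02x}:{minor:02x}:{inode}"))
}
```

For example, a mount entry with device 0:45 covering /tmp and an inode of 1234 gives the needle 00:2d:1234, which can then be compared byte-for-byte against the sixth column of each /proc/locks line.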
§Remote-filesystem rejection
try_flock refuses to operate on NFS / CIFS / SMB2 / CEPH /
AFS / FUSE (see fs_filter::reject_remote_fs). flock(2) on
those filesystems is either advisory-only under some server
configurations (NFSv3 without NLM coordination) or silently
returns success without serializing peers (FUSE when the
userspace server doesn’t implement the flock op). ktstr’s
resource-budget contract is not robust to that silent
degradation, so the safe call is to reject at lockfile-open
time with an actionable message.
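One plausible way to implement such a filter is to compare the f_type value reported by statfs(2) against the filesystem magic numbers from the Linux headers. The sketch below assumes that approach; the source does not state how fs_filter::reject_remote_fs actually detects these filesystems, and the function name unreliable_flock_fs is hypothetical. The caller is expected to obtain f_type from a statfs call on the lockfile's directory.

```rust
// Filesystem magic numbers as defined in the Linux kernel headers.
const NFS_SUPER_MAGIC: i64 = 0x6969;
const CIFS_SUPER_MAGIC: i64 = 0xFF53_4D42;
const SMB2_SUPER_MAGIC: i64 = 0xFE53_4D42;
const CEPH_SUPER_MAGIC: i64 = 0x00C3_6400;
const AFS_SUPER_MAGIC: i64 = 0x5346_414F;
const FUSE_SUPER_MAGIC: i64 = 0x6573_5546;

/// Hypothetical classifier: given `statfs.f_type`, return the name of the
/// filesystem family if flock(2) should not be trusted on it.
fn unreliable_flock_fs(f_type: i64) -> Option<&'static str> {
    match f_type {
        NFS_SUPER_MAGIC => Some("NFS"),
        CIFS_SUPER_MAGIC => Some("CIFS"),
        SMB2_SUPER_MAGIC => Some("SMB2"),
        CEPH_SUPER_MAGIC => Some("CephFS"),
        AFS_SUPER_MAGIC => Some("AFS"),
        FUSE_SUPER_MAGIC => Some("FUSE"),
        _ => None, // e.g. ext4, xfs, btrfs, tmpfs
    }
}
```

Returning the family name lets the caller build the actionable message at lockfile-open time, for instance by naming the filesystem and suggesting a different KTSTR_LOCK_DIR.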
Structs§
- HolderInfo - Identity of a process holding an advisory flock. Used by error messages in both LLC-coordination and cache-entry paths, plus the ktstr locks observational subcommand.
Enums§
- FlockMode - Requested sharing mode for try_flock. Translated to the corresponding non-blocking rustix::fs::FlockOperation internally; callers never see the libc-specific constants.
Functions§
- block_flock - Blocking variant of try_flock. Opens the lockfile (creating it if absent), then issues a blocking flock(2) that parks the caller in the kernel until the lock is available. Use after try_flock returns None to wait for a live peer to finish.
- format_holder_list - Format a HolderInfo slice for inclusion in user-facing error strings. Empty slice yields the NO_HOLDERS_RECORDED sentinel so the diagnostic is unambiguous: a stale lockfile whose holder has exited presents as empty, and the error should say so rather than print a misleading blank. Non-empty renders one pid={pid} cmd={cmdline} line per holder, newline-separated and indented two spaces, so a multi-holder error stays readable when embedded in a wrapping anyhow chain; the prior comma-joined form ran every holder into a single wide line that terminals wrapped arbitrarily mid-cmdline.
- try_flock - Open a lock file and attempt flock with LOCK_NB.
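The rendering rule described for format_holder_list can be illustrated with a small stand-in. Everything named here is hypothetical: Holder mirrors only the pid and cmdline fields the documentation mentions, render_holders is not the crate's function, and the sentinel text is a placeholder because the real NO_HOLDERS_RECORDED value is not shown on this page.

```rust
/// Stand-in for HolderInfo with only the two documented fields.
struct Holder {
    pid: u32,
    cmdline: String,
}

/// Placeholder text; the crate's actual sentinel string is not shown here.
const NO_HOLDERS: &str = "<no holders recorded>";

/// Render one "  pid={pid} cmd={cmdline}" line per holder, joined by
/// newlines, or the sentinel when the slice is empty.
fn render_holders(holders: &[Holder]) -> String {
    if holders.is_empty() {
        return NO_HOLDERS.to_string();
    }
    holders
        .iter()
        .map(|h| format!("  pid={} cmd={}", h.pid, h.cmdline))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Keeping one holder per line means a long cmdline wraps within its own entry when the string is embedded in a larger error message.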