linux/Documentation/driver-api/index.rst
Breno Leitao 3fa805c37d vmcoreinfo: track and log recoverable hardware errors
Introduce a generic infrastructure for tracking recoverable hardware
errors (HW errors that are visible to the OS but do not cause a panic)
and record them for vmcore consumption.  This aids post-mortem crash
analysis tools by preserving a count and the timestamp of the last
occurrence of such errors.  Correctable errors, by contrast, are handled
transparently by the underlying hardware, so the OS typically remains
unaware of them.  They are less relevant for crash dumps and are therefore
NOT tracked by this infrastructure.

Add centralized logging of recoverable hardware errors, keyed by the
subsystem through which each error was reported.

hwerror_data is write-only at kernel runtime and is meant to be read
from the vmcore using tools like crash/drgn.  For example, this is what
it looks like when opening the crash dump in drgn:

	>>> prog['hwerror_data']
	(struct hwerror_info[1]){
		{
			.count = (int)844,
			.timestamp = (time64_t)1752852018,
		},
		...

This helps fleet operators quickly triage whether a crash may have been
influenced by recoverable hardware errors (which exercise an uncommon code
path in the kernel), especially when such errors occurred shortly before
a panic, as with the bug fixed by commit ee62ce7a1d ("page_pool:
Track DMA-mapped pages and unmap them when destroying the pool").
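
The triage described above can be sketched in plain Python.  This is only
an illustration, not part of the patch: the count and timestamp are copied
from the drgn output shown earlier, while the panic time, the 300-second
window, and the helper names (error_near_panic, describe) are assumptions
made up for the example.  In a real session the values would be read from
prog['hwerror_data'] rather than hard-coded.

```python
from datetime import datetime, timezone

# Values copied from the drgn output in the commit message.
LAST_ERROR_COUNT = 844
LAST_ERROR_TIMESTAMP = 1752852018  # time64_t, seconds since the Unix epoch

# Hypothetical values (not from the commit): when the panic happened and
# how close an error must be to count as "shortly before" it.
PANIC_TIMESTAMP = 1752852078
WINDOW_SECONDS = 300


def error_near_panic(count, last_error_ts, panic_ts, window):
    """Return True if an error was recorded within `window` seconds before the panic."""
    if count == 0:
        # No recoverable error was ever recorded for this source.
        return False
    delta = panic_ts - last_error_ts
    # A negative delta would mean the error is stamped after the panic.
    return 0 <= delta <= window


def describe(ts):
    """Render an epoch timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()


if __name__ == "__main__":
    suspicious = error_near_panic(
        LAST_ERROR_COUNT, LAST_ERROR_TIMESTAMP, PANIC_TIMESTAMP, WINDOW_SECONDS
    )
    print("last recoverable error:", describe(LAST_ERROR_TIMESTAMP))
    print("panic:                 ", describe(PANIC_TIMESTAMP))
    print("error shortly before panic:", suspicious)
```

A positive result only flags the crash for closer inspection.  It does not
show that the hardware error caused the panic.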

This is not intended to replace full hardware diagnostics, but it provides
a fast way to correlate hardware events with kernel panics.

Rare machine check exceptions, such as those indicated by mce_flags.p5 or
mce_flags.winchip, are not accounted for by this method, as they fall
outside the intended usage scope for this feature's user base.

[leitao@debian.org: add hw-recoverable-errors to toctree]
  Link: https://lkml.kernel.org/r/20251127-vmcoreinfo_fix-v1-1-26f5b1c43da9@debian.org
Link: https://lkml.kernel.org/r/20251010-vmcore_hw_error-v5-1-636ede3efe44@debian.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Suggested-by: Tony Luck <tony.luck@intel.com>
Suggested-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>	[APEI]
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Bob Moore <robert.moore@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Omar Sandoval <osandov@osandov.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-27 14:24:44 -08:00

158 lines
2.4 KiB
ReStructuredText

.. SPDX-License-Identifier: GPL-2.0

==============================
Driver implementer's API guide
==============================

The kernel offers a wide variety of interfaces to support the development
of device drivers. This document is an only somewhat organized collection
of some of those interfaces — it will hopefully get better over time! The
available subsections can be seen below.

General information for driver authors
======================================

This section contains documentation that should, at some point or other, be
of interest to most developers working on device drivers.

.. toctree::
   :maxdepth: 1

   basics
   driver-model/index
   device_link
   infrastructure
   ioctl
   pm/index

Useful support libraries
========================

This section contains documentation that should, at some point or other, be
of interest to most developers working on device drivers.

.. toctree::
   :maxdepth: 1

   early-userspace/index
   connector
   device-io
   devfreq
   dma-buf
   component
   io-mapping
   io_ordering
   uio-howto
   vfio-mediated-device
   vfio
   vfio-pci-device-specific-driver-acceptance

Bus-level documentation
=======================

.. toctree::
   :maxdepth: 1

   auxiliary_bus
   cxl/index
   eisa
   firewire
   i3c/index
   isa
   men-chameleon-bus
   pci/index
   rapidio/index
   slimbus
   usb/index
   virtio/index
   vme
   w1
   xillybus

Subsystem-specific APIs
=======================

.. toctree::
   :maxdepth: 1

   80211/index
   acpi/index
   backlight/lp855x-driver.rst
   clk
   coco/index
   console
   crypto/index
   dmaengine/index
   dpll
   edac
   extcon
   firmware/index
   fpga/index
   frame-buffer
   aperture
   generic-counter
   gpio/index
   hsi
   hte/index
   hw-recoverable-errors
   i2c
   iio/index
   infiniband
   input
   interconnect
   ipmb
   ipmi
   libata
   mailbox
   md/index
   media/index
   mei/index
   memory-devices/index
   message-based
   misc_devices
   miscellaneous
   mmc/index
   mtd/index
   mtdnand
   nfc/index
   ntb
   nvdimm/index
   nvmem
   parport-lowlevel
   phy/index
   pin-control
   pldmfw/index
   pps
   ptp
   pwm
   pwrseq
   regulator
   reset
   rfkill
   s390-drivers
   scsi
   serial/index
   sm501
   soundwire/index
   spi
   surface_aggregator/index
   switchtec
   sync_file
   target
   tee
   thermal/index
   tty/index
   wbrf
   wmi
   xilinx/index
   zorro

.. only:: subproject and html

   Indices
   =======

   * :ref:`genindex`