The Linux Kernel/Debugging

Performance

There are many factors that can affect the performance of the Linux kernel, including hardware configurations, software configurations, and workload characteristics.

Optimizing kernel performance means finding and removing bottlenecks in the system: tuning kernel parameters, making better use of system resources, and fixing bugs that degrade performance.

Because so many factors interact, performance optimization is rarely straightforward. The profiling and tracing tools listed on this page help to measure where time is actually spent, which makes it possible to improve both the performance and the reliability of Linux-based systems.

Perf_events

Perf_events, short for performance events, is the kernel interface for performance monitoring. It instruments CPU performance counters, tracepoints, kprobes, and uprobes, and it is driven from user space by the perf tool or directly through the perf_event_open system call. By analyzing the collected data, developers can locate performance bottlenecks and reduce resource utilization. Perf_events is designed as a lightweight monitoring solution with minimal impact on the system being measured.


🔧 TODO


⚲ Interfaces

man 1 perf performance analysis tools
Basic commands:
man 1 perf-help display help information about perf
man 1 perf-top System profiling tool.
man 1 perf-record Run a command and record its profile into perf.data
man 1 perf-report Read perf.data (created by perf record) and display the profile
Other commands:
man 1 perf-annotate Read perf.data (created by perf record) and display annotated code
man 1 perf-archive Create archive with object files with build-ids found ...
man 1 perf-arm-spe Support for Arm Statistical Profiling Extension within...
man 1 perf-bench General framework for benchmark suites
man 1 perf-buildid-cache Manage build-id cache.
man 1 perf-buildid-list List the buildids in a perf.data file
man 1 perf-c2c Shared Data C2C/HITM Analyzer.
man 1 perf-config Get and set variables in a configuration file.
man 1 perf-daemon Run record sessions on background
man 1 perf-data Data file related processing
man 1 perf-diff Read perf.data files and display the differential profile
man 1 perf-dlfilter Filter sample events using a dynamically loaded shared...
man 1 perf-evlist List the event names in a perf.data file
man 1 perf-ftrace simple wrapper for kernel's ftrace functionality
man 1 perf-inject Filter to augment the events stream with additional in...
man 1 perf-intel-pt Support for Intel Processor Trace within perf tools
man 1 perf-iostat Show I/O performance metrics
man 1 perf-kallsyms Searches running kernel for symbols
man 1 perf-kmem Tool to trace/measure kernel memory properties
man 1 perf-kvm Tool to trace/measure kvm guest os
man 1 perf-kwork Tool to trace/measure kernel work properties (latencies)
man 1 perf-list List all symbolic event types
man 1 perf-lock Analyze lock events
man 1 perf-mem Profile memory accesses
man 1 perf-probe Define new dynamic tracepoints
man 1 perf-sched Tool to trace/measure scheduler properties (latencies)
man 1 perf-script Read perf.data (created by perf record) and display tr...
man 1 perf-script-perl Process trace data with a Perl script
man 1 perf-script-python Process trace data with a Python script
man 1 perf-stat Run a command and gather performance counter statistics
man 1 perf-test Runs sanity tests.
man 1 perf-timechart Tool to visualize total system behavior during a workload
man 1 perf-trace strace inspired tool
man 1 perf-version display the version of perf binary
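
The basic commands above are commonly combined into a count, record, report workflow. The following is a minimal sketch of that workflow, not a definitive recipe. The function name perf_profile is hypothetical, and the sketch assumes that the perf tool is installed and that the user is permitted to open performance events (see /proc/sys/kernel/perf_event_paranoid).

```shell
# Sketch: count, record and report a profile for one command.
# perf_profile is a hypothetical helper name, not part of perf itself.
perf_profile() {
    if [ "$#" -eq 0 ]; then
        echo "usage: perf_profile command [args...]" >&2
        return 2
    fi
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found; install your distribution's perf package" >&2
        return 127
    fi
    # 1. Counter statistics for the whole run (perf-stat).
    perf stat -- "$@" || return
    # 2. Sample the command with call graphs into perf.data (perf-record).
    perf record -g -o perf.data -- "$@" || return
    # 3. Print the recorded profile as plain text (perf-report).
    perf report -i perf.data --stdio
}

# Usage (the workload runs twice, once under stat and once under record):
#   perf_profile sleep 1
```

Running the workload separately under perf stat and perf record keeps the counter totals free of sampling overhead, at the cost of executing the command twice.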


⚙️ Internals

man 2 perf_event_open sets up performance monitoring
uapi/linux/perf_event.h inc
tools/perf src
linux/perf_event.h inc
kernel/events/core.c src
kernel/profile.c src simple profiling


📖 References

perf instruments CPU performance counters, tracepoints, kprobes, and uprobes
https://perf.wiki.kernel.org/


📚 Further reading

perf Examples
The Unofficial Linux Perf Events Web-Page


🛠️ Utilities

Performance Co-Pilot, https://pcp.io/
Prometheus, https://prometheus.io/
https://github.com/redhat-nfvpe/container-perf-tools
https://github.com/brendangregg/perf-tools performance analysis tools based on Linux perf_events (aka perf) and ftrace
readprofile a tool to read kernel profiling information


📚 Further reading

stress-ng exercises various kernel interfaces
http://trac.gateworks.com/wiki/linux/profiling
Analyzing application performance in RHEL 9
Monitoring and managing system status and performance in RHEL 9
Real-time Linux

User space debug interfaces

⚲ Interfaces

man 1 dmesg print or control the kernel ring buffer
man 2 syslog system call, which is used to control the kernel printk() buffer
man 1 strace system calls and signals tracing tool
man 2 ptrace process trace system call
man 3 klogctl
man 5 core
/sys/kernel/debug/ debugfs
dmesg --console-level <level>
gdb /usr/src/linux/vmlinux /proc/kcore
/proc/self/stack
dynamic debug doc
⌨️ hands-on:
echo "module atkbd +pfl" | sudo tee /sys/kernel/debug/dynamic_debug/control
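
The hands-on command above writes a query of the form "module NAME FLAGS" to the dynamic debug control file. In the flags, p enables the message, while f and l prefix it with the function name and line number. The sketch below composes such queries. The helper name dyndbg_query is hypothetical, and the control file path in the usage comments assumes that debugfs is mounted at /sys/kernel/debug.

```shell
# Sketch: build a dynamic debug query for one module.
# dyndbg_query is a hypothetical helper name.
#   $1 = module name, for example atkbd
#   $2 = flag change: an operator (+, - or =) followed by flag letters
dyndbg_query() {
    case "$2" in
        [+=-]*) ;;
        *) echo "flags must start with +, - or =" >&2; return 1 ;;
    esac
    printf 'module %s %s\n' "$1" "$2"
}

# Enable messages with function name and line number:
#   dyndbg_query atkbd +pfl | sudo tee /sys/kernel/debug/dynamic_debug/control
# Disable them again:
#   dyndbg_query atkbd -pfl | sudo tee /sys/kernel/debug/dynamic_debug/control
# Inspect the current state of the module's call sites:
#   sudo grep atkbd /sys/kernel/debug/dynamic_debug/control
```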


⚙️ Internals

handle_sysrq id


📚 References

Development tools for the kernel doc
DebugFS doc, samples/qmi/qmi_sample_client.c src
Kprobe-based Event Tracing doc
Dynamic debug doc
Linux Magic System Request Key Hacks doc
Magic SysRq key

Tracing and logging

⚲ API:

User-space interface:

man 1 dmesg print or control the kernel ring buffer
man 2 syslog system call, which is used to control the kernel printk() buffer
/proc/kmsg
https://kernelshark.org/ front end reader of trace-cmd
https://trace-cmd.org/, man 1 trace-cmd CLI for Ftrace doc, the Linux kernel internal tracer at /sys/kernel/debug/tracing/
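
Ftrace can also be driven without trace-cmd by writing to the control files in the tracing directory. The following sketch captures a short function trace that way. It rests on several assumptions: the helper name ftrace_capture is hypothetical, the tracing directory is /sys/kernel/debug/tracing/ unless another path is passed as the first argument, the caller is root, and the kernel was built with the function tracer.

```shell
# Sketch: capture a short function trace through the tracing control files.
# ftrace_capture is a hypothetical helper name.
#   $1 = tracing directory (default /sys/kernel/debug/tracing)
#   $2 = seconds to record (default 1)
ftrace_capture() {
    tracefs="${1:-/sys/kernel/debug/tracing}"
    seconds="${2:-1}"
    if [ ! -w "$tracefs/current_tracer" ]; then
        echo "cannot write $tracefs/current_tracer (not root, or not mounted?)" >&2
        return 1
    fi
    echo 0 > "$tracefs/tracing_on"             # pause recording while reconfiguring
    echo function > "$tracefs/current_tracer"  # select the function tracer
    echo 1 > "$tracefs/tracing_on"             # start recording
    sleep "$seconds"
    echo 0 > "$tracefs/tracing_on"             # stop recording
    head -n 20 "$tracefs/trace"                # show the first lines of the buffer
    echo nop > "$tracefs/current_tracer"       # switch back to the no-op tracer
}

# Usage, as root:
#   ftrace_capture /sys/kernel/debug/tracing 1
```

The file available_tracers in the same directory lists which tracers the running kernel supports.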

Most common functions

linux/printk.h inc
pr_devel id - debug-level message, compiled in only when DEBUG is defined
pr_debug id - debug-level message, enabled by DEBUG or at run time through dynamic debug doc
⌨️ hands-on:
echo "module atkbd +pfl" | sudo tee /sys/kernel/debug/dynamic_debug/control
Log messages with other levels:
pr_info id, pr_notice id, pr_warn id, pr_err id, pr_crit id, pr_alert id, pr_emerg id
asm-generic/bug.h inc
WARN_ON id
WARN id
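
Each pr_* macro above logs at a fixed level, numbered from 0 (most severe) to 7 (debug). The sketch below spells out that mapping. The helper name printk_level is hypothetical; the numbers are the kernel log levels.

```shell
# Sketch: map a pr_* macro name to its numeric kernel log level.
# printk_level is a hypothetical helper name.
printk_level() {
    case "$1" in
        pr_emerg)          echo 0 ;;  # system is unusable
        pr_alert)          echo 1 ;;  # action must be taken immediately
        pr_crit)           echo 2 ;;  # critical conditions
        pr_err)            echo 3 ;;  # error conditions
        pr_warn)           echo 4 ;;  # warning conditions
        pr_notice)         echo 5 ;;  # normal but significant condition
        pr_info)           echo 6 ;;  # informational
        pr_debug|pr_devel) echo 7 ;;  # debug-level messages
        *) echo "unknown macro: $1" >&2; return 1 ;;
    esac
}

# Usage:
#   printk_level pr_warn
```

The same levels can be used to filter the log from user space, for example with dmesg --level=err,warn.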


⚙️ Internals

printk id
kernel/printk/printk.c src
arch/x86/kernel/traps.c src
lib/dump_stack.c src
kernel/trace src
scripts/tracing/draw_functrace.py src
logging ltp, tracing ltp
samples/ftrace src
samples/trace_events src
samples/trace_printk src
linux/instrumentation.h inc


📚 References:

Debugging by printing
Message logging with printk doc
SystemTap
man 1 stap systemtap script translator/driver
strace
man 1 strace trace system calls and signals
LTTng
ftrace
Linux Tracing Technologies doc
Tracepoint Analysis doc
Function Tracer doc function, latency and event tracing
Event Tracing doc
Using ftrace to hook to functions doc
Fprobe - Function entry/exit probe doc
Kprobes doc
Kprobe-based Event Tracing doc
Uprobe-tracer: Uprobe-based Event Tracing doc
Using the Linux Kernel Tracepoints doc
Subsystem Trace Points: kmem doc
Subsystem Trace Points: power doc
NMI Trace Events doc
In-kernel memory-mapped I/O tracing doc
Event Histograms doc
Histogram Design Notes doc
Boot-time tracing doc
Hardware Latency Detector doc
Intel(R) Trace Hub (TH) doc
Lockless Ring Buffer Design doc
System Trace Module doc
CoreSight - ARM Hardware Trace doc

🔧 TODO. 🚀 advanced features

linux/kmemleak.h inc memory leak detector
pr_cont id - continues a previous log message in the same line
print_hex_dump_bytes id
print_hex_dump_debug id
dump_stack id
CONFIG_PRINTK_CALLER id
CONFIG_DEBUG_KERNEL id
CONFIG_DEBUG_INFO id
https://git.kernel.org/pub/scm/libs/libtrace/

kgdb and kdb

⚲ Interfaces

linux/kgdb.h inc
linux/kdb.h inc


⚙️ Internals

kernel/debug src


📚 References

Using kgdb, kdb and the kernel debugger internals doc
kdump
kdump doc
man 8 crash Analyze Linux crash dump data or a live system


eBPF

⚲ API:

man 2 bpf
kernel/bpf/syscall.c src


📖 References

eBPF and BPF doc


📚 Further reading

man 7 bpf-helpers
Linux Extended BPF (eBPF) Tracing Tools
bpftrace High-level tracing language for Linux eBPF
BCC Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Example of trace.py
man 8 stapbpf
eBPF Programming for Linux Kernel Tracing
lockdep - Runtime locking correctness validator doc


Watchdogs

The Linux Kernel/Softdog Driver

dev_watchdog id network device watchdog

The NMI watchdog lockup detectors:

⚲ API

/proc/sys/kernel/nmi_watchdog
/proc/sys/kernel/soft_watchdog
/proc/sys/kernel/watchdog
/proc/sys/kernel/watchdog_cpumask
/proc/sys/kernel/watchdog_thresh
/proc/sys/kernel/hardlockup_all_cpu_backtrace
/proc/sys/kernel/hardlockup_panic
/proc/sys/kernel/softlockup_all_cpu_backtrace
/proc/sys/kernel/softlockup_panic
linux/nmi.h inc
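
The current lockup detector configuration can be read directly from the files listed above. The following is a minimal sketch that prints them. The helper name show_watchdog_settings and its optional directory argument are hypothetical. Files that the running kernel does not provide are reported as not available.

```shell
# Sketch: print the lockup detector settings from /proc/sys/kernel.
# show_watchdog_settings is a hypothetical helper name.
#   $1 = directory to read from (default /proc/sys/kernel)
show_watchdog_settings() {
    dir="${1:-/proc/sys/kernel}"
    for name in watchdog nmi_watchdog soft_watchdog \
                watchdog_thresh watchdog_cpumask \
                hardlockup_panic softlockup_panic \
                hardlockup_all_cpu_backtrace softlockup_all_cpu_backtrace; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s = (not available)\n' "$name"
        fi
    done
}

# Usage:
#   show_watchdog_settings
```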


👁️ Example

lib/test_lockup.c src test module to generate lockups

Provoke NMI watchdog without panic:

echo 0 > /proc/sys/kernel/hardlockup_panic
insmod test_lockup.ko disable_irq=1 time_secs=13

⚙️ Internals

kernel/watchdog.c src detects hard and soft lockups on a system
kernel/watchdog_perf.c src detects hard lockups on a system using perf
kernel/watchdog_buddy.c src

📚 References

Documentation for /proc/sys/kernel/ doc
Softlockup detector and hardlockup detector (aka nmi_watchdog) doc
kernel parameters:
nmi_watchdog param
nowatchdog param
nosoftlockup param
softlockup_panic param

...

⚙️ Internals

arch/x86/kernel/traps.c src


📖 References for debugging

Ramoops oops/panic logger doc
pstore block oops/panic logger doc
Fault injection doc
Bisecting a bug doc
Development tools for the kernel doc
Kernel Testing Guide doc
Checkpatch doc, scripts/checkpatch.pl src
Selftests doc, tools/testing/selftests src
linux/tracepoint.h inc


📚 Further reading

https://drgn.readthedocs.io/ programmable debugger
https://crash-utility.github.io/
https://wiki.ubuntu.com/Kernel/Debugging
Intel VTune Profiler
Linux Applications Debugging Techniques