Linux Crisis Tools

25.03.2024 - 21:45

Brendan Gregg posted the following list of 'crisis tools', which he recommends installing on Linux servers by default so that they are already available when an incident happens. The package names are the Debian/Ubuntu ones and may differ on other distributions.

Package	Provides	Notes
procps	ps(1), vmstat(8), uptime(1), top(1)	basic stats
util-linux	dmesg(1), lsblk(1), lscpu(1)	system log, device info
sysstat	iostat(1), mpstat(1), pidstat(1), sar(1)	device stats
iproute2	ip(8), ss(8), nstat(8), tc(8)	preferred net tools
numactl	numastat(8)	NUMA stats
tcpdump	tcpdump(8)	network sniffer
linux-tools-common linux-tools-$(uname -r)	perf(1), turbostat(8)	profiler and PMU stats
bpfcc-tools (bcc)	opensnoop(8), execsnoop(8), runqlat(8), softirqs(8), hardirqs(8), ext4slower(8), ext4dist(8), biotop(8), biosnoop(8), biolatency(8), tcptop(8), tcplife(8), trace(8), argdist(8), funccount(8), profile(8), etc.	canned eBPF tools[1]
bpftrace	bpftrace, basic versions of opensnoop(8), execsnoop(8), runqlat(8), biosnoop(8), etc.	eBPF scripting[1]
trace-cmd	trace-cmd(1)	Ftrace CLI
nicstat	nicstat(1)	net device stats
ethtool	ethtool(8)	net device info
tiptop	tiptop(1)	PMU/PMC top
cpuid	cpuid(1)	CPU details
msr-tools	rdmsr(8), wrmsr(8)	CPU digging
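
To see which of these tools a given server still lacks, a small pre-flight check can help. The following is only a minimal sketch, not part of Brendan Gregg's post: the `CRISIS_TOOLS` mapping and the `missing_tools` helper are hypothetical names introduced here, and the command names are copied from the table above. The bcc tools are left out on purpose, because their installed command names vary between distributions (Ubuntu, for example, adds a `-bpfcc` suffix).

```python
import shutil

# Package -> commands it is expected to provide, copied from the table above.
# The bcc tools are omitted because their installed names differ by distribution.
CRISIS_TOOLS = {
    "procps": ["ps", "vmstat", "uptime", "top"],
    "util-linux": ["dmesg", "lsblk", "lscpu"],
    "sysstat": ["iostat", "mpstat", "pidstat", "sar"],
    "iproute2": ["ip", "ss", "nstat", "tc"],
    "numactl": ["numastat"],
    "tcpdump": ["tcpdump"],
    "linux-tools-common linux-tools-$(uname -r)": ["perf", "turbostat"],
    "bpftrace": ["bpftrace"],
    "trace-cmd": ["trace-cmd"],
    "nicstat": ["nicstat"],
    "ethtool": ["ethtool"],
    "tiptop": ["tiptop"],
    "cpuid": ["cpuid"],
    "msr-tools": ["rdmsr", "wrmsr"],
}


def missing_tools(tools=CRISIS_TOOLS, which=shutil.which):
    """Return {package: [commands not found on PATH]} for packages with gaps.

    `which` is injectable so the lookup can be replaced in tests.
    """
    missing = {}
    for package, commands in tools.items():
        absent = [cmd for cmd in commands if which(cmd) is None]
        if absent:
            missing[package] = absent
    return missing


if __name__ == "__main__":
    # Print one line per package that has at least one command missing.
    for package, commands in missing_tools().items():
        print(f"{package}: missing {', '.join(commands)}")
```

Note that the lookup only searches the current `PATH`. Several of these tools live in `/usr/sbin`, so running the check as an unprivileged user may report them as missing even though the package is installed.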