Linux Crisis Tools

Brendan Gregg posted the following list of 'crisis tools' that you should install on your Linux servers by default, so that they are already available when an incident happens.

| Package | Provides | Notes |
|---|---|---|
| procps | ps(1), vmstat(8), uptime(1), top(1) | basic stats |
| util-linux | dmesg(1), lsblk(1), lscpu(1) | system log, device info |
| sysstat | iostat(1), mpstat(1), pidstat(1), sar(1) | device stats |
| iproute2 | ip(8), ss(8), nstat(8), tc(8) | preferred net tools |
| numactl | numastat(8) | NUMA stats |
| tcpdump | tcpdump(8) | network sniffer |
| linux-tools-$(uname -r) | perf(1), turbostat(8) | profiler and PMU stats |
| bpfcc-tools (bcc) | opensnoop(8), execsnoop(8), runqlat(8), softirqs(8), hardirqs(8), ext4slower(8), ext4dist(8), biotop(8), biosnoop(8), biolatency(8), tcptop(8), tcplife(8), trace(8), argdist(8), funccount(8), profile(8), etc. | canned eBPF tools[1] |
| bpftrace | bpftrace, basic versions of opensnoop(8), execsnoop(8), runqlat(8), biosnoop(8), etc. | eBPF scripting[1] |
| trace-cmd | trace-cmd(1) | Ftrace CLI |
| nicstat | nicstat(1) | net device stats |
| ethtool | ethtool(8) | net device info |
| tiptop | tiptop(1) | PMU/PMC top |
| cpuid | cpuid(1) | CPU details |
| msr-tools | rdmsr(8), wrmsr(8) | CPU digging |
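
To find out before an incident whether a server already has these tools, something like the following sketch could be run on each host. It is not part of the original post. The `CRISIS_TOOLS` mapping and the `missing_tools` helper are hypothetical names, and the mapping covers only a subset of the list above. The bcc tools are left out on purpose because their installed command names vary by distribution (on Debian and Ubuntu they carry a `-bpfcc` suffix).

```python
import shutil

# Hypothetical mapping of package name -> commands it should provide.
# Only a subset of the list above; extend it to match your distribution.
CRISIS_TOOLS = {
    "procps": ["ps", "vmstat", "uptime", "top"],
    "util-linux": ["dmesg", "lsblk", "lscpu"],
    "sysstat": ["iostat", "mpstat", "pidstat", "sar"],
    "iproute2": ["ip", "ss", "nstat", "tc"],
    "numactl": ["numastat"],
    "tcpdump": ["tcpdump"],
    "linux-tools": ["perf", "turbostat"],
    "bpftrace": ["bpftrace"],
    "trace-cmd": ["trace-cmd"],
    "nicstat": ["nicstat"],
    "ethtool": ["ethtool"],
    "tiptop": ["tiptop"],
    "cpuid": ["cpuid"],
    "msr-tools": ["rdmsr", "wrmsr"],
}


def missing_tools(tools, which=shutil.which):
    """Return {package: [commands not found on PATH]} for the given mapping.

    `which` is injectable so the lookup can be replaced in tests.
    """
    missing = {}
    for package, commands in tools.items():
        absent = [cmd for cmd in commands if which(cmd) is None]
        if absent:
            missing[package] = absent
    return missing


if __name__ == "__main__":
    # Print one line per package that has at least one command missing.
    for package, commands in missing_tools(CRISIS_TOOLS).items():
        print(f"{package}: missing {', '.join(commands)}")
```

The script only checks that each command is on `PATH` for the current user. Some tools live in `/usr/sbin` or need root to do anything useful, so it is best run as the user who would be debugging during an incident.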
