Linux Crisis Tools
Brendan Gregg posted the following list of "crisis tools" that you should install on your Linux servers by default, so that they are already available when an incident happens.
| Package | Provides | Notes |
|---|---|---|
| procps | ps(1), vmstat(8), uptime(1), top(1) | basic stats |
| util-linux | dmesg(1), lsblk(1), lscpu(1) | system log, device info |
| sysstat | iostat(1), mpstat(1), pidstat(1), sar(1) | device stats |
| iproute2 | ip(8), ss(8), nstat(8), tc(8) | preferred net tools |
| numactl | numastat(8) | NUMA stats |
| tcpdump | tcpdump(8) | network sniffer |
| linux-tools-common linux-tools-$(uname -r) | perf(1), turbostat(8) | profiler and PMU stats |
| bpfcc-tools (bcc) | opensnoop(8), execsnoop(8), runqlat(8), softirqs(8), hardirqs(8), ext4slower(8), ext4dist(8), biotop(8), biosnoop(8), biolatency(8), tcptop(8), tcplife(8), trace(8), argdist(8), funccount(8), profile(8), etc. | canned eBPF tools[1] |
| bpftrace | bpftrace, basic versions of opensnoop(8), execsnoop(8), runqlat(8), biosnoop(8), etc. | eBPF scripting[1] |
| trace-cmd | trace-cmd(1) | Ftrace CLI |
| nicstat | nicstat(1) | net device stats |
| ethtool | ethtool(8) | net device info |
| tiptop | tiptop(1) | PMU/PMC top |
| cpuid | cpuid(1) | CPU details |
| msr-tools | rdmsr(8), wrmsr(8) | CPU digging |
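
As a rough sketch of how the table could be turned into a pre-incident check, the script below lists the packages and most of the commands from the table and defines two helpers. It assumes a Debian/Ubuntu host with `apt-get`, which the package names suggest but the list does not state. The helper names `missing_tools` and `install_crisis_tools` are hypothetical, and the bcc tools are left out of the command list because their installed names can differ between distributions.

```shell
#!/bin/sh
# Sketch only: assumes a Debian/Ubuntu host with apt-get.
# Package and command names are copied from the table above.

CRISIS_PACKAGES="procps util-linux sysstat iproute2 numactl tcpdump \
linux-tools-common linux-tools-$(uname -r) bpfcc-tools bpftrace trace-cmd \
nicstat ethtool tiptop cpuid msr-tools"

# Commands from the "Provides" column (bcc tools omitted, see note above).
CRISIS_TOOLS="ps vmstat uptime top dmesg lsblk lscpu iostat mpstat pidstat sar \
ip ss nstat tc numastat tcpdump perf turbostat bpftrace trace-cmd nicstat \
ethtool tiptop cpuid rdmsr wrmsr"

# Hypothetical helper: print each given command name that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical helper: install every package from the table (needs root).
# $CRISIS_PACKAGES is deliberately unquoted so that it splits into words.
install_crisis_tools() {
    apt-get update && apt-get install -y $CRISIS_PACKAGES
}
```

After sourcing the script, `missing_tools $CRISIS_TOOLS` prints one line per command that is not yet available, and `install_crisis_tools`, run as root, installs the full package set. A command that is installed but sits in a directory outside the current `PATH` (for example `/usr/sbin` for a non-root user) is also reported as missing.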