Kernel Crash Dumps
This article explains how to capture kernel crash dumps (kdump). A crash dump is produced when the kernel panics or locks up. For simplicity, a single kernel is used both as the ordinary system kernel and as the recovery kernel. The method described is almost distro-independent. This article is based on KDump on Gentoo by rich0, and the first version was posted by the author.
You need to activate the following kernel options:
Processor type and features --->
    [*] kexec system call
    [*] kernel crash dumps
    [*] Build a relocatable kernel
Kernel hacking --->
    [*] Kernel debugging
    Compile-time checks and compiler options --->
        [*] Compile the kernel with debug info
File systems --->
    Pseudo filesystems --->
        -*- /proc file system support
        [*]   /proc/vmcore support
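To double-check an existing kernel configuration against the list above, a small helper along the following lines may be useful. This is only a sketch: the CONFIG_ symbol names are the usual counterparts of the menu entries, but they are an assumption here and can differ between kernel versions, and the function name missing_kdump_options is made up for illustration.

```shell
#!/bin/bash
# Sketch: list kdump-related options that are NOT built in (=y) in a kernel config.
# Assumption: these symbols correspond to the menu entries listed above.
KDUMP_OPTIONS="CONFIG_KEXEC CONFIG_CRASH_DUMP CONFIG_RELOCATABLE CONFIG_DEBUG_KERNEL CONFIG_DEBUG_INFO CONFIG_PROC_FS CONFIG_PROC_VMCORE"

# Print every symbol from KDUMP_OPTIONS that is not set to "y" in the given file.
# (hypothetical helper name)
missing_kdump_options() {
    local config="$1" opt
    for opt in $KDUMP_OPTIONS; do
        grep -q "^${opt}=y\$" "$config" || echo "$opt"
    done
}

# Example: missing_kdump_options /usr/src/linux/.config
```

An empty result means every listed option is enabled. The path /usr/src/linux/.config in the example comment is the conventional Gentoo location and may need adjusting.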
Install sys-apps/kexec-tools, which loads another kernel from the currently executing Linux kernel:
emerge --ask kexec-tools
Create /etc/local.d/kdump.start containing:
#!/bin/bash
kexec -p /[path-to-kernel] --append="root=[root-device] single irqpoll maxcpus=1 reset_devices"
If you are using an initramfs you also have to pass that as a parameter. For example:
#!/bin/bash
kexec -p /boot/kernel-genkernel-x86_64-3.16.1-gentoo \
    --initrd=/boot/initramfs-genkernel-x86_64-3.16.1-gentoo \
    --append="root=/dev/mapper/lvm-slash single irqpoll maxcpus=1 reset_devices dolvm softlevel=kdump"
Now make this file executable:
chmod u+x /etc/local.d/kdump.start
Note that the kernel image has to be readable when the script runs. (A typical Gentoo configuration leaves /boot unmounted, so you will need either to remove noauto from the /boot entry in your fstab or to place a copy of your kernel elsewhere.)
Add crashkernel=64M to the kernel boot options; 64M is enough for up to around 12GB of system RAM.
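After rebooting with the new boot options, it can be worth confirming that the parameter actually reached the kernel. The following is a minimal sketch that reads the kernel command line from /proc/cmdline; the function name crashkernel_value is hypothetical, and the optional file argument exists only so the helper can be tried on a sample file.

```shell
#!/bin/bash
# Sketch: print the value of crashkernel= from a kernel command line file
# (default: /proc/cmdline). Prints nothing and returns non-zero if it is absent.
# (hypothetical helper name)
crashkernel_value() {
    local cmdline_file="${1:-/proc/cmdline}" value
    # Split the command line into one parameter per line and keep the first match.
    value=$(tr ' ' '\n' < "$cmdline_file" | sed -n 's/^crashkernel=//p' | head -n 1)
    [ -n "$value" ] && echo "$value"
}

# Example: crashkernel_value || echo "crashkernel= is not on the kernel command line"
```

On a running system, /sys/kernel/kexec_crash_size additionally reports how many bytes were reserved.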
First, run the above script.
It loads the rescue kernel image, which is started after a kernel crash.
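Whether the rescue kernel was loaded successfully can be checked through sysfs: /sys/kernel/kexec_crash_loaded reads 1 when a crash kernel is loaded and 0 otherwise. The sketch below wraps that check; the function name crash_kernel_loaded is hypothetical, and the optional file argument is there only for trying it out.

```shell
#!/bin/bash
# Sketch: succeed if a crash kernel is loaded, judging by the flag file
# (default: /sys/kernel/kexec_crash_loaded, which reads 1 when loaded).
# (hypothetical helper name)
crash_kernel_loaded() {
    local flag_file="${1:-/sys/kernel/kexec_crash_loaded}"
    [ "$(cat "$flag_file" 2>/dev/null)" = "1" ]
}

# Example:
#   crash_kernel_loaded && echo "rescue kernel loaded" || echo "rescue kernel NOT loaded"
```

If the check fails, the usual suspects are a missing crashkernel= reservation or an unreadable kernel image.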
Whenever you get a kernel panic or lockup (hard or soft, if the kernel is set to detect them), kexec runs the kernel in crash mode, relocated to a reserved area of memory. The rest of RAM is left untouched. When the system has booted, log in and copy /proc/vmcore to a file; this is your crash dump. Then reboot your system to get back to a normal configuration; you should not continue to operate in this state.
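The copy step can be as simple as a single cp, but a small wrapper avoids overwriting an earlier dump. This is a sketch under assumptions: the destination directory /var/crash and the function name save_vmcore are illustrative choices, not part of the original method, and the target filesystem needs enough free space for a file roughly the size of the crashed system's memory.

```shell
#!/bin/bash
# Sketch: copy a crash dump (default source: /proc/vmcore) into a directory
# under a timestamped name and print the resulting path.
# Assumptions: /var/crash as default destination; helper name is hypothetical.
save_vmcore() {
    local src="${1:-/proc/vmcore}" dest_dir="${2:-/var/crash}" target
    [ -r "$src" ] || { echo "no readable dump at $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    target="$dest_dir/vmcore-$(date +%Y%m%d-%H%M%S)"
    # Flush to disk before reporting success, since a reboot follows.
    cp "$src" "$target" && sync && echo "$target"
}

# Example (in the rescue system, as root): save_vmcore
```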
You can force a kernel panic by executing the following command. Before doing so, save all data, log out other users, and run sync to leave the filesystems in a clean state:
echo c | tee /proc/sysrq-trigger