Btrfs/Native System Root Guide

Converting to a "Native" btrfs Based System
This exercise is an update to the original example for re-basing a Gentoo installation's root filesystem to use btrfs found here. In this case, the existing system is a mirror set using two 2TB drives at and. Two fresh 2TB drives have been added at and. In the original exercise, the mdadm array was kept to mirror the partitions on the two drives, while the rest of the partitions were converted to btrfs subvolumes.

This second exercise explores the use of GRUB2 to fully convert the mirror set to use the multi-volume functionality of btrfs, implementing as a mirrored btrfs filesystem and forgoing the need to use an initramfs file on  to provide early userspace mounting of the root subvolume. We will also use the tools from the gptfdisk ebuild to make a GPT-based partition table on the new mirror set in place of a legacy MBR table.

One of the lessons learned here is that GRUB2 (2.00_p5107-r2 here) and the current kernel (3.10.25 here) are not quite up to the job of directly booting a filesystem in a btrfs subvolume, at least not one in a multi-disk set. However, the combination is able to find a simple filesystem on the default volume of a multi-disk set.

Partitioning
Emerge gptfdisk if you don't already have it. It provides the gdisk, sgdisk and cgdisk utilities for manipulating GPT partitions. These correspond to the legacy fdisk, sfdisk and cfdisk utilities for MBR tables.

We followed the writeup on GRUB2 here to put together the following GPT scheme that GRUB2 can use and got it working after a false start or two.

The grub2biosboot partition was what got missed on the first round and thus got added later.


 * 1007.0KiB free space - Will eventually get the boot record and leave enough of a gap for GRUB2 to park its BIOS.
 * biosboot - The partition type 0xEF02 (BIOS boot partition) must be set in order for GRUB2 to find it and use it. Some web pages suggest using EF00 (EFI System), but this will not work with the current version of GRUB2. The bare minimum size is 1 MB, but some pages suggest using at least 2 MB. We err on the side of caution and future bloat.
 * boot - GRUB2 will take about 32 MB more space on this partition than you may be used to, so the usual suggestion of 200 MB in days past is now more like 300 MB to 500 MB, depending on how many kernels and initramfs filesystems you like to keep around. Set the partition type to the default, which is 0x8300 for Linux.
 * root - We took the default for size to allocate the rest of the drive to the root partition. Once again the type is set to 0x8300.
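For readers who would rather script the layout than step through cgdisk, the scheme in the list could be sketched with sgdisk roughly as follows. The disk path /dev/sdX, the 2 MiB and 500 MiB sizes, and the partition names are placeholder assumptions picked to match the list. The run helper is hypothetical and only prints each command unless DRY_RUN=0 is exported.

```shell
#!/bin/sh
# Sketch only: /dev/sdX is a placeholder, not a device named in this guide.
DISK="${DISK:-/dev/sdX}"

# Hypothetical helper: print the command unless DRY_RUN=0 is exported.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

plan_partitions() {
    # 1: BIOS boot partition, type EF02 (2 MiB size is an assumption)
    run sgdisk -n 1:0:+2M -t 1:ef02 -c 1:biosboot "$DISK"
    # 2: boot, default Linux type 8300 (500 MiB size is an assumption)
    run sgdisk -n 2:0:+500M -t 2:8300 -c 2:boot "$DISK"
    # 3: root, an end value of 0 takes the rest of the drive
    run sgdisk -n 3:0:0 -t 3:8300 -c 3:root "$DISK"
    # Print the resulting table for review
    run sgdisk -p "$DISK"
}

plan_partitions
```

Review the printed commands before rerunning with DRY_RUN=0, since sgdisk rewrites the partition table on the named disk.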

The resulting table looks like the following in gdisk:

Repeat the same partitioning on. Similar to sfdisk, the sgdisk utility has the ability to dump the GPT table to a file and then reload that. However the output is binary, so we just went ahead and used cgdisk to create the same layout on.
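As an alternative to repeating the layout by hand, the binary dump that sgdisk produces can be loaded straight onto the second drive. This is a minimal sketch, not what was done in this exercise. The device paths /dev/sdX and /dev/sdY and the backup file name are placeholder assumptions, and the run helper is hypothetical and prints commands unless DRY_RUN=0 is exported.

```shell
#!/bin/sh
# Sketch only: both device paths and the file name are placeholders.
SRC_DISK="${SRC_DISK:-/dev/sdX}"
DST_DISK="${DST_DISK:-/dev/sdY}"
TABLE_FILE="${TABLE_FILE:-/tmp/gpt-table.bin}"

# Hypothetical helper: print the command unless DRY_RUN=0 is exported.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

clone_table() {
    # Save the GPT of the first drive to a (binary) backup file
    run sgdisk --backup="$TABLE_FILE" "$SRC_DISK"
    # Write that table onto the second drive
    run sgdisk --load-backup="$TABLE_FILE" "$DST_DISK"
    # Give the copy its own disk and partition GUIDs
    run sgdisk -G "$DST_DISK"
}

clone_table
```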

Filesystem Creation
Since this is a simple two-disk mirror, we specify raid1 for both metadata and data when making the two filesystems. If you have more than two drives, the current stable versions of the kernel (3.10.25) and btrfs-progs (3.12-r1) now also make raid5 and raid6 available as options alongside raid1 and raid10.
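A sketch of what creating the two mirrored filesystems might look like is shown here. The device paths are placeholders, the partition numbers 2 and 3 follow the boot and root positions in the partitioning list, and the BTROOT label is an assumption (only the BTBOOT label appears elsewhere in this guide). The run helper is hypothetical and prints commands unless DRY_RUN=0 is exported.

```shell
#!/bin/sh
# Sketch only: /dev/sdX and /dev/sdY are placeholders for the two new drives.
DISK_A="${DISK_A:-/dev/sdX}"
DISK_B="${DISK_B:-/dev/sdY}"

# Hypothetical helper: print the command unless DRY_RUN=0 is exported.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

make_mirrors() {
    # boot: raid1 for both metadata (-m) and data (-d)
    run mkfs.btrfs -L BTBOOT -m raid1 -d raid1 "${DISK_A}2" "${DISK_B}2"
    # root: same profile; the BTROOT label is an assumption
    run mkfs.btrfs -L BTROOT -m raid1 -d raid1 "${DISK_A}3" "${DISK_B}3"
}

make_mirrors
```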

Root Volume
We mount the default volume for the root partition on but will be putting the actual contents into subvolumes with different btrfs features enabled or disabled.

The new root filesystem will go onto a subvolume (activeroot) which is created on the mirror and then mounted to
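The sequence described here (mount the default volume, create the activeroot subvolume, then mount that subvolume) could be sketched as follows. The device path and the /mnt/btrfs-top mountpoint are placeholder assumptions, and /mnt/newroot is taken from the chroot section later in this guide. The run helper is hypothetical and prints commands unless DRY_RUN=0 is exported.

```shell
#!/bin/sh
# Sketch only: the device and the top-level mountpoint are placeholders.
ROOT_DEV="${ROOT_DEV:-/dev/sdX3}"
TOP_MNT="${TOP_MNT:-/mnt/btrfs-top}"
NEWROOT="${NEWROOT:-/mnt/newroot}"

# Hypothetical helper: print the command unless DRY_RUN=0 is exported.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

plan_root_subvolume() {
    run mkdir -p "$TOP_MNT" "$NEWROOT"
    # Mount the default (top-level) volume of the mirror
    run mount "$ROOT_DEV" "$TOP_MNT"
    # Create the subvolume that will hold the new root
    run btrfs subvolume create "$TOP_MNT/activeroot"
    # Mount the subvolume itself where the new root will be populated
    run mount -o subvol=activeroot "$ROOT_DEV" "$NEWROOT"
}

plan_root_subvolume
```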

We mount the existing root filesystem to and use tar to transfer things over while avoiding the dynamic stuff.
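A minimal sketch of a tar pipe for this kind of transfer follows. The copy_tree function name and the example paths are assumptions. Because the old root is mounted at its own mountpoint rather than copied from the live /, directories such as /proc and /sys should show up there as empty mountpoints, which is one way to avoid the dynamic content.

```shell
#!/bin/sh
# copy_tree SRC DST: stream the contents of SRC into DST through a tar pipe.
copy_tree() {
    src=$1
    dst=$2
    # -p keeps permissions; run as root so ownership is preserved as well.
    tar -C "$src" -cpf - . | tar -C "$dst" -xpf -
}

# Example (both paths are assumptions):
#   copy_tree /mnt/oldroot /mnt/newroot
```

With GNU tar, adding --one-file-system on the creating side is an extra guard against descending into other mounted filesystems.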

Boot Volume
At this point and the mountpoints for the other subvolumes we will be creating are already in place from the tar. We keep the mount options simple for.

Other Volumes
is an obvious candidate for a subvolume, but they are easy to create and manage so you will probably have others. In our example we have the following:


 * /home
 * /distfiles - It doesn't do any good to enable compression strategies for a directory which just has compressed tarballs for the most part.
 * /vm - Keeping your virtual machine store in a separate volume eases snapshotting and migrations. We will enable compression here. At various points in btrfs history, the use of autodefrag has caused VM performance issues.
 * /vmcrypt - If the VM uses drive encryption, the whole compression strategy gets blown out of the water.

This process ran overnight and through a good bit of the next day, so we will gloss over it.

Chrooting into /mnt/newroot
We will chroot into the new root filesystem for the next set of steps:


 * Edit the fstab
 * Update the kernel to use an embedded initram filesystem
 * Install grub2 in place of grub0.97

Pre-chroot Preparation
We do the usual prelims to allow GRUB to find things when installing.
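The usual preliminaries are typically the proc, sys and dev mounts from the Gentoo handbook, sketched here against /mnt/newroot. This is an assumption about what was run, not a record of it. The run helper is hypothetical and prints commands unless DRY_RUN=0 is exported.

```shell
#!/bin/sh
NEWROOT="${NEWROOT:-/mnt/newroot}"

# Hypothetical helper: print the command unless DRY_RUN=0 is exported.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

prepare_chroot() {
    # Give the chroot its own view of /proc
    run mount -t proc proc "$NEWROOT/proc"
    # Make the live /sys and /dev trees visible so GRUB can probe devices
    run mount --rbind /sys "$NEWROOT/sys"
    run mount --rbind /dev "$NEWROOT/dev"
    # Enter the new root
    run chroot "$NEWROOT" /bin/bash
}

prepare_chroot
```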

Edit your mtab to look something like this

Updating fstab
We edit our new fstab to make it look something like this.

Embedding an initram filesystem
As we noted in the introduction, the current kernel and GRUB2 combination appears to work fine, at least when searching for and mounting the simple /boot mirror btrfs filesystem in the default volume. However, we put our new root in a subvolume in order to take advantage of snapshotting and rollback of root as necessary. To do that, we will follow the guide for Early Userspace Mounting again, but will also draw upon one of its references in order to have the kernel build process make the archive and then embed it in the resulting bzImage file. This lets us make grub2-mkconfig do all of the heavy lifting without having to write a custom stanza in.

Run menuconfig to use devtmpfs and to specify an initramfs_list that will be used to put an initramfs into the bzImage
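After leaving menuconfig, a quick check of the resulting .config can confirm the relevant options were saved. The sketch below assumes the standard option names CONFIG_DEVTMPFS, CONFIG_BLK_DEV_INITRD and CONFIG_INITRAMFS_SOURCE. The function name and the example path are hypothetical.

```shell
#!/bin/sh
# check_initramfs_config FILE: report which of the options needed for an
# embedded initramfs are missing from a kernel .config.
# Returns non-zero if any of them is absent.
check_initramfs_config() {
    config=$1
    missing=0
    for opt in CONFIG_DEVTMPFS=y CONFIG_BLK_DEV_INITRD=y; do
        grep -qx "$opt" "$config" || { echo "missing: $opt"; missing=1; }
    done
    # CONFIG_INITRAMFS_SOURCE must be set to a non-empty quoted path
    grep -q '^CONFIG_INITRAMFS_SOURCE=".\{1,\}"' "$config" ||
        { echo "missing: CONFIG_INITRAMFS_SOURCE"; missing=1; }
    return "$missing"
}

# Example (path is an assumption):
#   check_initramfs_config /usr/src/linux/.config
```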

initramfs_list
We don't want to bloat the bzImage too much so only add the essential busybox, filesystem tools and the nano editor. Use ldd to figure out any dependencies of additional commands that you might add to yours.
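The ldd step can be turned into list entries with a small filter. This sketch assumes the gen_init_cpio "file name location mode uid gid" line format used by initramfs list files. The ldd_to_list function name and the 0755 mode are assumptions.

```shell
#!/bin/sh
# ldd_to_list: read ldd output on stdin and print one "file" line for each
# library that resolves to an absolute path.
ldd_to_list() {
    awk '
        # lines such as: libc.so.6 => /lib64/libc.so.6 (0x...)
        $2 == "=>" && $3 ~ /^\// { print $3 }
        # lines such as: /lib64/ld-linux-x86-64.so.2 (0x...)
        $1 ~ /^\// { print $1 }
    ' | sort -u | while read -r lib; do
        printf 'file %s %s 0755 0 0\n' "$lib" "$lib"
    done
}

# Example (binary path is an assumption):
#   ldd /sbin/btrfs | ldd_to_list
```

Entries without a path, such as the vdso, are skipped because there is no file to copy.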

init script Alternatives
We started with the init script from the previous exercise but have modified it to create two alternatives. The first should parse the /proc/cmdline to pull the information needed to mount root. It thus doesn't need to use the embedded fstab, but you can place a copy of your normal one in the initramfs directory in case you are dumped to the rescue shell.
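The parsing idea behind the first alternative can be illustrated with a few lines of POSIX shell that busybox should also accept. This is not the script from this exercise. The parse_cmdline name, the root_dev and root_flags variables, and the sample values in the comment are all hypothetical.

```shell
#!/bin/sh
# parse_cmdline STRING: pull root= and rootflags= out of a kernel command
# line and leave them in root_dev and root_flags.
parse_cmdline() {
    root_dev=""
    root_flags=""
    # Unquoted on purpose so the string splits into words
    for arg in $1; do
        case "$arg" in
            root=*)      root_dev="${arg#root=}" ;;
            rootflags=*) root_flags="${arg#rootflags=}" ;;
        esac
    done
}

# Example inside an init script:
#   parse_cmdline "$(cat /proc/cmdline)"
#   echo "root=$root_dev flags=$root_flags"
```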

The second is a basic mount script that uses an embedded fstab to explicitly spell out the mount entries for root and boot. We resorted to using this when we ran into problems with the init stage and then found that grub2 was not letting busybox open the console device.

/proc/cmdline parsing init script
This has not yet been shown to work during an actual boot, but the logic was tested against busybox on a running system.

Basic mount init script
This basic fstab-based mount version and the accompanying fstab appear to work.

Debug boot environment files
It was interesting to see what the environment inside the kernel looks like.

Rebuilding kernel
Our kernel just got about 3-4 MB bigger as a result.

Running grub2-mkconfig
The mkconfig will have a problem probing the multi-device set as follows:

This will also cause grub2-install to error out unless we uncomment and specify the GRUB_DEVICE=/dev/sdc3 in. When we googled around for this, we found a number of pages such as Fedora bug reports that suggested a patch with bash arrays and jiggered quotes in a grub2 script. However it doesn't appear that this patch has gotten far enough upstream or resolved to everyone's liking for Gentoo stable to pick it up yet.

We also uncomment and force GRUB_TERMINAL=console, since other pages suggest that there may be issues with grub2 using a framebuffer and then running the proprietary nvidia-driver for X11 later on. This may also be true for recent xf86-video-ati driver versions, where we are activating a framebuffer before the GPU firmware gets loaded later on from.

After the edit the mkconfig now runs without any issues.
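The edit itself is a matter of uncommenting a line and changing its value, which can also be scripted. The set_grub_default helper below is a hypothetical sketch that works on any KEY=value defaults file. The file path in the example comment is an assumption.

```shell
#!/bin/sh
# set_grub_default FILE KEY VALUE: uncomment and set KEY in FILE,
# or append the setting if no such line exists yet.
set_grub_default() {
    file=$1
    key=$2
    value=$3
    if grep -q "^#\{0,1\}[[:space:]]*${key}=" "$file"; then
        # Rewrite the existing (possibly commented) line in place
        sed "s|^#\{0,1\}[[:space:]]*${key}=.*|${key}=${value}|" "$file" \
            > "$file.new" && mv "$file.new" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example (file path is an assumption):
#   set_grub_default /etc/default/grub GRUB_DEVICE /dev/sdc3
```

The same helper can later put the value back once the new drives have moved to their final device names.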

The resulting file will specify the filesystems by uuid since we let that setting default in. We can do a sanity check as follows and compare the results.
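One way to run such a check is to pull the distinct UUIDs out of the generated file and compare them by eye with what blkid reports for the new partitions. The list_cfg_uuids helper and the example path are assumptions, not the commands used in this exercise.

```shell
#!/bin/sh
# list_cfg_uuids FILE: print each distinct filesystem UUID mentioned in FILE.
list_cfg_uuids() {
    grep -oE '[0-9a-fA-F]{8}-([0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}' "$1" | sort -u
}

# Example (path is an assumption):
#   list_cfg_uuids /boot/grub2/grub.cfg
#   blkid
```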

The set root='hd2,gpt2' statements in the grub.cfg will be wrong once we remove the original drive set and make the new set and. However the if-then-else statements right after do a search for the correct uuid for the BTBOOT filesystem and thus will correct the root.

Installing the MBR
We need to run grub2-install on both drives.
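Installing to both drives means either one can boot if the other fails. A minimal sketch of the loop follows. The device paths are placeholders, and the run helper is hypothetical and prints commands unless DRY_RUN=0 is exported.

```shell
#!/bin/sh
# Hypothetical helper: print the command unless DRY_RUN=0 is exported.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# install_boot_code DISK...: run grub2-install once per whole-disk device.
install_boot_code() {
    for disk in "$@"; do
        run grub2-install "$disk"
    done
}

# /dev/sdX and /dev/sdY are placeholders for the two new drives.
install_boot_code /dev/sdX /dev/sdY
```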

With the MBRs installed, we also want to go back and re-edit the file to reset GRUB_DEVICE=/dev/sda3 before we forget. It may be a while before we run grub2-mkconfig again to update for a new kernel.