User:Sakaki/Sakaki's EFI Install Guide/Preparing the LUKS-LVM Filesystem and Boot USB Key

In this section, we'll be shadowing Chapter 4 of the Gentoo handbook (and, although we're going to start to diverge considerably, you may want to read that chapter before proceeding, as it has some useful background information).

The process we'll be following here is:
 * 1) First, we'll format the smaller USB key (discussed earlier) so that it can support booting under UEFI (although there'll be no bootable kernel on it, yet). We'll then mount it.
 * 2) Next, we'll create a pseudo-random binary blob of key data that will be used to secure the main computer drive, encrypt this with a passphrase using GPG, and store the result on the USB key.
 * 3) Then, we will create a new GPT (GUID partition table) partition on the target machine's main drive, using the space we freed up from Windows earlier in the tutorial.
 * 4) We'll then (optionally) overwrite that partition with pseudo-random data.
 * 5) Next, we'll format the partition using LUKS (secured with the key data created in step 2).
 * 6) We'll then (optionally) add a fallback passphrase to the LUKS container.
 * 7) Then, we'll create an LVM physical volume (PV) on the LUKS partition, create an LVM volume group (VG) with just that one physical volume in it, and then create three logical volumes (LVs) (for the Gentoo root, swap and home partitions) utilizing that physical volume.
 * 8) Finally, we'll format the logical volumes appropriately, and mount them so that they can be used in the rest of the installation.

Let's go!

Formatting and Mounting the UEFI-Bootable USB Key
We are going to use our smaller (>= 128 MB) USB key as the boot device for Gentoo Linux. Since we want it to work under UEFI, we must format it using GPT with a single EFI system partition.

Issue (using of course the /  terminal we have just established):

And note the output. Then, insert the smaller capacity USB key into one of the remaining free USB slots on the target machine, and determine its device path. We will refer to its path in these instructions as, but in reality on your system it will be something like , etc. You can find what it is, by issuing  again, and noting what has changed:

(note that the initial prefix is not shown in the  output)

The minimal-install image shouldn't auto-mount the USB key, even if it has any existing partitions, but double-check to make sure (no mountpoints for the device should be shown in the output of the above command).
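The before-and-after comparison described above can be sketched in shell as follows. This is only an illustration, not the guide's own command: `new_devices` is a hypothetical helper, and `lsblk` is assumed to be the listing tool available in the install environment.

```shell
# new_devices BEFORE AFTER
# Print every name that appears in AFTER but not in BEFORE (one per line).
# (hypothetical helper, introduced only for this sketch)
new_devices() {
    before_file=$(mktemp)
    after_file=$(mktemp)
    printf '%s\n' "$1" | sort > "$before_file"
    printf '%s\n' "$2" | sort > "$after_file"
    comm -13 "$before_file" "$after_file"   # lines unique to the second list
    rm -f "$before_file" "$after_file"
}

# On a live system (assuming lsblk is present) this might be used as:
#   before=$(lsblk -dn -o NAME)      # whole disks only, no header line
#   ... insert the USB key and wait a few seconds ...
#   after=$(lsblk -dn -o NAME)
#   new_devices "$before" "$after"   # prints the kernel name of the new device
```

Comparing two snapshots mechanically avoids misreading a long device list by eye, which matters here because the next steps are destructive.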

Now, using parted, we will create a single primary partition, sized so as to fill the USB key completely (you can of course use a more modest extent if your drive is much larger than the minimum required size), and set its somewhat confusingly named 'boot' flag (i.e., mark the partition as a GPT system partition). Issue:

Next, we need to format the partition as FAT32:

Now we create a temporary mountpoint and mount the partition:

Creating a Password-Protected Keyfile for LUKS
We will next create a (pseudo) random keyfile (for use with LUKS). This keyfile will be encrypted with GPG (using a typed-in passphrase) and then stored on the USB key.

The point of this is to establish dual-factor security - both the (encrypted) keyfile, and your passphrase (to decrypt it) will be required to access the LUKS data stored on the target machine's hard drive. This means that even if a keylogger is present, should the machine be stolen - powered down but without the USB key - the LUKS data will still be safe (as the thief will not have your encrypted keyfile). Similarly, (assuming no keylogger!) if your machine were to be stolen powered down but with the USB key still in it, it will also not be possible to access your LUKS data (as in this case the thief will not know your passphrase).

Note that we are going to create a (one byte short of) 8192KiB underlying (i.e., binary plaintext) keyfile, even though, for the symmetric LUKS cipher we'll be using (Serpent), the maximum supported key size is 256 bits (32 bytes) (or two 256 bit keys = 512 bits = 64 bytes in XTS mode, as explained later). This works because LUKS / cryptsetup uses the PBKDF2 key derivation function to map the keyfile into the actual (user) key material internally (which in turn is used to unlock the master key actually used for sector encryption / decryption), so we are free, within limits, to choose whatever size keyfile we want. As such, we elect to use the largest legal size, so as to make it (very slightly) harder for any data capture malware (in low-level drivers, for example) to intercept the file and squirrel it away, or transmit it over the network surreptitiously. In theory, the system can support keyfiles up to and including 8192KiB (execute  to verify this); in practice, due to an off-by-one bug, it supports only keyfiles strictly less than 8MiB. We therefore create a keyfile of length (1024 * 8192) - 1 = 8388607 bytes.
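The size arithmetic above can be double-checked with shell integer arithmetic. The variable names here are assumptions chosen for this sketch.

```shell
limit_bytes=$(( 8192 * 1024 ))          # nominal 8192 KiB (8 MiB) ceiling
keyfile_bytes=$(( limit_bytes - 1 ))    # one byte short, to stay strictly below it

echo "nominal limit: $limit_bytes bytes"
echo "keyfile size:  $keyfile_bytes bytes"
```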

Note that we'll use the /dev/urandom source to create the underlying (binary plaintext) pseudo-random keyfile, and then pipe it to gpg to encrypt (using a passphrase of your choosing). The resulting binary ciphertext is saved to the USB key. This avoids ever having the binary plaintext keyfile stored on disk anywhere (and indeed not even you need ever see the unencrypted contents). Enter:
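As a rough sketch of the pipeline just described (not necessarily the exact command the guide uses), the generation half can be written as a small function whose output is piped straight to gpg. The helper name `gen_key`, the choice of `head -c`, the AES256 cipher option and the output path are all assumptions made for illustration.

```shell
keyfile_bytes=8388607   # (1024 * 8192) - 1, as derived in the text

# Write exactly $keyfile_bytes pseudo-random bytes to stdout; nothing is
# stored on disk. (gen_key is a hypothetical helper name.)
gen_key() {
    head -c "$keyfile_bytes" /dev/urandom
}

# Sanity check: confirm that the stream has the intended length.
gen_key | wc -c

# Interactive use (gpg prompts for the passphrase; the output path is an
# assumed placeholder and should point at the mounted USB key):
#   gen_key | gpg --symmetric --cipher-algo AES256 --output /path/on/usbkey/luks-key.gpg
```

Because the plaintext only ever exists inside the pipe, there is no temporary file to shred afterwards.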

What passphrase you choose to protect your LUKS keyfile is, of course, entirely up to you, but do consider the approach of using a longer list of everyday words, rather than the more traditional cryptic str1ng5 @f characters. Advantages include:
 * it's easier to hit a reasonable level of entropy;
 * you are less likely to forget the resulting passphrase; and
 * your passphrase will be more robust in the face of keymapping snafus at boot time.

Creating a New GPT Partition on the PC's Main Drive
Our next task is to create a new GPT partition on the target PC's hard drive (which we freed up space for earlier).

We will use the tool, instruct it to use sectors for units, and then display the free space on the current drive. We'll then create a new primary partition on that drive using all the available space indicated.

We must first find the device path of the main hard drive on the target machine. We will refer to this as in the following text, but it will be something like,  etc. on your machine. Check the actual path with:

If you are dual booting with Windows, you'll probably see that the desired drive has four existing partitions (note that the initial  prefix is not shown in the  output). None of these should be mounted (all should have blank mountpoints in the output of ).

Now we will create the partition:

Now check that the partition has been created correctly, by issuing the same listing command again:

Take note of the new partition's device path (note that the initial  prefix is not shown in the  output). We will refer to this as in the below, but it will actually be something like,  etc. If you have a non-standard Windows setup, the number of the new partition may also be something other than 5, so do please double check.

Overwriting the New Partition with Pseudo-Random Data (Optional Step)
You can skip this step if you like. The main reasons to perform an overwrite are:
 * to purge any old, unencrypted data that may still be present in the partition (from prior use); and
 * to make it somewhat harder for an attacker to determine how much data is on your drive if the machine is compromised.

However, it may make things slower on a solid-state drive, by forcing any new writes to first delete a sector (once any overcapacity has been exceeded), rather than simply writing to a fresh, unused one (and furthermore, it cannot completely be guaranteed that old data has been wiped, when using such devices).

This command may take a number of hours to complete.

By itself, will not show any progress, however, if you are using, you can send it USR1 signals to make it print I/O statistics. Hit followed by  to start a new virtual console within  (do this from the console where you started the ). Then issue:

Now switch back to the original virtual console with followed by  and you will be able to see  slowly progressing.

Once the overwrite completes, switch to the second console again with followed by, and then use  to kill , and  to close the (second) console, which leaves you back with a single virtual console again.
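The USR1 reporting mechanism can be tried out harmlessly before relying on it during the real overwrite. The sketch below assumes GNU dd (whose USR1 behaviour is described above) and uses a stand-in job that copies zeros to /dev/null, so no disk is touched; the variable names are illustrative.

```shell
log=$(mktemp)

# Harmless stand-in for a long-running copy: zeros into /dev/null.
LC_ALL=C dd if=/dev/zero of=/dev/null bs=1M count=1000000 2>"$log" &
dd_pid=$!

sleep 1                          # give dd time to install its signal handler
kill -USR1 "$dd_pid"             # ask for an I/O statistics report
sleep 1
kill "$dd_pid"                   # stop the stand-in job
wait "$dd_pid" 2>/dev/null || true

cat "$log"                       # the report dd printed in response to USR1
```

The initial pause matters: a USR1 that arrives before dd has set up its handler terminates the process instead of producing a report.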

Formatting the New Partition with LUKS
The next step is to format the partition using LUKS. LUKS, which stands for Linux Unified Key Setup, is as the name suggests primarily a way to manage the encryption keys for whole-partition (or drive) encryption. It does this by first generating a high-entropy, secret, master key, which is then further encrypted under between one and eight user keys (themselves first pre-processed by PBKDF2).

The target partition itself begins with a LUKS metadata header, followed by the key material corresponding to each of the 8 possible user 'slots', and finally the bulk, encrypted (payload) data itself (the encrypted sector data for the partition).

The LUKS master key itself is never stored in unencrypted form on the partition, nor (unless you explicitly request it) even made visible to you, the user.

LUKS uses a cryptographic splitting and chaining technique to artificially inflate the size of the key material for each slot into a number of interdependent 'stripes'. This is done to increase the likelihood that, when a slot is modified (a user key is revoked, or changed, for example), the old key material is, indeed, irrecoverable (necessary, since under LUKS the partition master key is never changed once created). Be warned though, that with solid-state drives no guarantees can be given, if you change a user key, that the old key material is not retained on the drive somewhere (due to wear-levelling etc.).

LUKS functions are accessed via the cryptsetup program, and use dm-crypt for the back-end processing. Note that LUKS is agnostic as to the actual symmetric encryption method used, provided it is supported by dm-crypt. You can get a list of the (currently loaded) encryption and hash algorithms by issuing:

(You may have others available as kernel modules, which will be loaded when required).

What we need to do is tell cryptsetup:
 * the underlying block cipher we want to use (block ciphers work on fixed-size units, or blocks, of data to encrypt or decrypt at a time),
 * the key length to use with this cipher,
 * the way we'll tweak it to en/decrypt amounts of data larger than one cipher block (many ciphers use a 16-byte block, and sectors, the indexing unit, are larger than this),
 * what processing, if any, should be applied to the sector index number during IV computation, and
 * the hash algorithm used for key derivation (under the PBKDF2 algorithm within LUKS)

This isn't a cryptography primer (see this article for further reading), but here's a thumbnail justification for the choices made:
 * we will use Serpent as the block cipher; this came second in the AES competition mainly for reasons of speed, but has a more conservative design (32 rounds as opposed to 14) and scored a higher safety factor when compared to the Rijndael algorithm that won the competition (and which, accordingly, is now commonly referred to as 'AES');
 * for security, we'll use the longest supported key length for Serpent, which is 256 bits (see the following point, however);
 * we will use XTS mode to both extend the cipher over multiple blocks within a sector, and perform the by-sector-index 'tweaking'; this approach overcomes the security weakness in the more conventional CBC / ESSIV methodology, whereby an attacker, although unable to read the encrypted material, can yet, if they know the cleartext for that sector (possible for some system files), arbitrarily modify alternating blocks to inject shellcode; this is a non-trivial concern for a dual-boot machine where the Windows side of things is untrusted (and has access to the encrypted contents of the LUKS partition when running). Note that since XTS mode actually requires two keys, we must pass an effective key length of 512 (= 2 x 256) bits to cryptsetup;
 * because XTS applies its own per-sector tweak, we will simply pass the untransformed ("plain") 64-bit sector index to it (using a 64-bit index will allow for disks > 2TiB);
 * we will use SHA-512 as the user key hashing function for LUKS' PBKDF2 processing; it is a robust 512 bit hash.

We decrypt our keyfile from the USB key (using ) and pipe it to, to avoid the unencrypted keyfile having to be saved to disk. The and  strings instruct  to use the settings just discussed.
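One way the choices listed above might map onto cryptsetup options is sketched below. The option strings are my reading of the text (Serpent, XTS, plain 64-bit sector index, 512-bit effective key, SHA-512) and are not confirmed as the guide's exact invocation; `luks_format` and its two arguments are hypothetical.

```shell
cipher_bits=256                       # longest key length Serpent supports
key_bits=$(( 2 * cipher_bits ))       # XTS needs two keys, hence 512
cipher_spec="serpent-xts-plain64"     # cipher, chaining mode, IV generator
hash_spec="sha512"                    # hash used by PBKDF2

# Hypothetical wrapper: $1 = GPG-encrypted keyfile, $2 = target partition.
# DESTRUCTIVE when called: luksFormat overwrites the start of the partition.
# It is defined here but deliberately not invoked.
luks_format() {
    gpg --decrypt "$1" |
        cryptsetup --cipher "$cipher_spec" --key-size "$key_bits" \
                   --hash "$hash_spec" --key-file - luksFormat "$2"
}

echo "cryptsetup --cipher $cipher_spec --key-size $key_bits --hash $hash_spec"
```

Passing `--key-file -` makes cryptsetup read the key from standard input, which is what allows the decrypted keyfile to stay off disk.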

Check that the formatting worked, with:

This should print out information about the LUKS setup on the partition, and show that one of the 8 keyslots (slot 0) is now in use. Incidentally, this readily readable header illustrates that LUKS does not provide any plausible deniability about the use of encryption. You can detach the header and store it on a separate device, but we won't do that here, as it isn't supported in the standard init scripts that we'll rely on later.

If the LUKS header gets damaged, your encrypted data will be lost forever, even if you have a backup of the GPG key and passphrase. Therefore, you may wish to consider backing up the header to a separate device, and storing it securely. See the LUKS FAQ for more details on how to do this.

Adding a Fallback Passphrase (Optional Step)
Since LUKS supports up to 8 user key 'slots', you can, if you wish, add an additional (traditional) passphrase to your LUKS container now. This is not intended for use day-to-day, but simply as a last-resort fallback, should you lose the USB key with the GPG keyfile on it, for example.

Unfortunately, the necessary command requires that we provide an existing valid user key in addition to the new one we want to add. If we pipe this in directly from (as we did earlier), then cryptsetup will not prompt correctly for a new passphrase. To get around this issue, without writing the existing GPG key out in binary plaintext form to a disk file, we'll use a named pipe.
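The named-pipe idea can be seen in isolation with a harmless stand-in payload. In this sketch the writer and reader run in one script rather than two consoles, and the pipe location and variable names are assumptions; on the real system the writer would be the GPG decryption and the reader would be cryptsetup.

```shell
pipe_dir=$(mktemp -d)
pipe="$pipe_dir/keypipe"
mkfifo -m 600 "$pipe"                    # FIFO readable by its owner only

# Writer: blocks until something opens the other end of the pipe.
printf 'stand-in key data' > "$pipe" &

# Reader: receives the bytes directly; they never land in a regular file.
received=$(cat "$pipe")
wait

rm -r "$pipe_dir"
echo "reader received: $received"

# Real-world shape (placeholders in angle brackets):
#   console 2:  gpg --decrypt <keyfile.gpg> | cat - > <pipe>
#   console 1:  cryptsetup --key-file <pipe> luksAddKey <partition>
```

A FIFO holds no data of its own: bytes pass from writer to reader through the kernel, so removing it afterwards leaves nothing to recover.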

Assuming you're using, hit followed by  to start a new virtual console. Then type:

(The slightly odd approach of piping via is intentional.) This will block once you type in your passphrase, as nothing is connected to the other end of the named pipe (yet). Now switch back to the original virtual console with followed by, and enter:

Verify that this worked by issuing:

You should now see slot 1 is enabled, as well as slot 0. Now, remove the named pipe, since we no longer need it:

Lastly, switch back to the second virtual console with followed by, and then hit  to close it out and return to the original console again.

Creating the LVM Structure (PV->VG<-LVs) on Top of LUKS
Our next step is to set up an LVM structure within the LUKS container we just created. LVM stands for Logical Volume Manager: a useful overview may be found here, and a handy command cheatsheet here. It is a highly flexible virtual partition system. Some important LVM terminology is as follows:
 * A physical volume (PV) is an underlying storage device (for example, an actual disk partition or loopback file), which is managed by LVM. PVs have a special header, and are divided into physical extents.
 * A physical extent (PE) is the smallest allocatable unit of a PV. We will use the default PE size of 4MiB in this tutorial.
 * A logical volume (LV) is LVM's equivalent of a partition. It contains logical extents, which are mapped one-to-one onto the PEs of contributing physical volumes. Note - unlike a conventional partition, because of this architecture an LV can span multiple underlying physical volumes, and a physical volume can host multiple logical volumes, if desired. The LV appears as a standard block device, and so can be formatted with any normal Linux filesystem (e.g. ext4). We will create LVs for the root directory, the user home directory and swap in this tutorial.
 * A volume group (VG) is an administrative unit gathering together a collection of LVs and PVs. We will create a single VG containing a single PV, and (as just mentioned) three LVs.

The main reason we're using LVM here is to provide a simple way to get three 'logical' partitions on top of a single underlying LUKS container (partition). LVM also provides a number of additional advantages when resizing, backing up, or moving partitions, in exchange for a little initial configuration overhead.

To proceed with LVM, the first thing we need to do is open the LUKS volume we just created, as it will host our single PV. Issue:

Check that this worked:

You should see the device 'gentoo' in the device mapper list, as above. This is our unlocked LUKS partition.

Next, we'll create an LVM physical volume (PV) on this partition:

Then, we create a volume group (VG) hosting this PV. We'll call the new VG "vg1". Note that since we're using lvm2 format here, there's no need to set a larger physical extent size - the default of 4MiB per PE will be fine:

Now, we'll create three logical volumes (LVs) in this volume group. The first is for swap. To allow the use of suspend to disk (which we'll set up later) we'll want a swap slightly larger than the size of our RAM. So first, find the size of RAM on your system with:

In the case of the CF-AX3, this shows just under 8GiB, hence we'll allocate 10GiB. Adjust this for your system and preferences. If you don't want to use suspend to disk, a much smaller swap would work just as well.
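The sizing rule of thumb can be expressed as a small calculation. The 2 GiB margin below is an assumption picked so that "just under 8GiB" of RAM yields the 10GiB used in the text; `swap_gib` is a hypothetical helper, and reading `/proc/meminfo` assumes a Linux system.

```shell
# swap_gib MEMTOTAL_KIB
# RAM rounded up to a whole number of GiB, plus an assumed 2 GiB margin.
swap_gib() {
    echo $(( ( $1 + 1048575 ) / 1048576 + 2 ))
}

# Use the running system's RAM size if /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    mem_kib=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    echo "suggested swap: $(swap_gib "$mem_kib") GiB"
fi
```

For example, a made-up MemTotal of 8069288 KiB (about 7.7 GiB) rounds up to 8 GiB and gives a 10 GiB suggestion.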

Next, we'll create a relatively large LV to hold our root partition. This will eventually hold everything apart from the user home directories, and, since this is Gentoo, we'll need a fair amount of room for files and so on. We'll allow 50GiB here - if you wish you can make this smaller or larger of course:

Finally, let's create a third LV to hold the user home directories. We'll instruct LVM to use almost all the remaining space on the LUKS container for this, leaving 5% of the (so far unused) space free (this additional room will come in useful if you want to take a snapshot later, for example).
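To make the "leave 5% free" rule concrete, here is the extent arithmetic for a made-up example. The 100 GiB of remaining space is purely illustrative, and the `lvcreate` line in the comment is an assumed form of the command rather than a quotation from the guide.

```shell
pe_mib=4                                        # default PE size used in the text
free_gib=100                                    # made-up remaining space in the VG
free_extents=$(( free_gib * 1024 / pe_mib ))    # free space counted in PEs
home_extents=$(( free_extents * 95 / 100 ))     # 95% goes to the home LV
spare_extents=$(( free_extents - home_extents ))

echo "free: $free_extents PEs, home LV: $home_extents PEs, left over: $spare_extents PEs"

# On a real system LVM performs this calculation itself, for instance:
#   lvcreate --extents 95%FREE --name home vg1
```

In this example the spare extents amount to 5 GiB, which is the kind of headroom a later snapshot would draw on.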

You should now be able to look at the status of the physical volume (PV), volume group (VG) and logical volumes (LVs), as follows:

The final task in this step is to 'activate' the new volume group (vg1) so that its logical volumes become available as block devices via the device mapper. Issue:

This should inform you that three LVs in the vg1 volume group have been activated. Check that they are visible via the device mapper:

If your output looks similar to the above, then all is well. The new logical volumes (, and ) can be treated exactly like physical disk partitions (i.e., just like  etc.).

Formatting and Mounting the LVM Logical Volumes (LVs)
Now that we have our virtual partitions, we need to set up their filesystems and then mount them.

First, create the swap:

Next, the root filesystem. We'll create this as ext4 (you can of course modify this if you wish):

Finally, the user home filesystem, also ext4. Note that we use the option here, since ext4 will, by default, reserve 5% of the filesystem for the superuser, and we don't need that in this location, only on the root partition:

Now, we activate the swap:

And, per the handbook, mount the root directory at the pre-existing mountpoint:

Next, we create the mountpoint, a  directory, and a  mountpoint. The purpose of these is as follows:
 * will be the mountpoint for our home directory LV.
 * will be the equivalent of the /boot directory in the Gentoo handbook. We will build our kernel and initramfs targeting this directory as usual, although, since we are booting from a UEFI USB key, this directory will not be used when booting the system itself. Instead, the utility, supplied as part of this tutorial, will be used to copy the final, signed and bootable kernel image onto the USB key (at ) as part of the kernel build process. For that reason, we've converted  from a mountpoint to a regular directory in this tutorial.
 * will be the mountpoint for our USB boot key when inserted in the machine (when installing a new kernel, etc.). We currently have the key mounted at and will need to unmount it.

Create the directories:

Now mount the "home" LVM logical volume from the "vg1" volume group on the mountpoint:

Next, we need to unmount the USB boot key's EFI partition from its current temporary mountpoint (we'll remount it later, when we build the kernel):

Finally, issue :

Take note of the PARTUUIDs (unique partition identifiers) for these two partitions; we'll make use of them later (in the and the kernel build script's configuration file), rather than relying on the  paths (which can change depending on which devices are plugged in, and the order in which they are recognized).
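If you would rather capture a PARTUUID programmatically than copy it by hand, something along these lines may help. The helper `partuuid_of` is hypothetical, and the sample line (device name and identifiers included) is made-up data in the general `KEY="value"` style that blkid prints.

```shell
# partuuid_of LINE
# Extract the PARTUUID value from one line of blkid-style output.
partuuid_of() {
    printf '%s\n' "$1" | sed -n 's/.*PARTUUID="\([^"]*\)".*/\1/p'
}

# Made-up sample line for illustration only.
sample='/dev/sdX1: UUID="ABCD-1234" TYPE="vfat" PARTUUID="01234567-89ab-cdef-0123-456789abcdef"'
partuuid_of "$sample"

# On a live system, blkid can print the value directly:
#   blkid -s PARTUUID -o value <partition device>
```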

Next Steps
We're now ready to fetch the additional installation files and set up the build options. Click here to go to the next chapter, "Installing the Gentoo Stage 3 Files".