SSD

From Gentoo Wiki

This article provides guidelines for basic maintenance, such as enabling discard/trim support, for SSDs (Solid State Drives) on Linux. It presumes the user has a basic understanding of partitioning and formatting disk drives.

Introduction

The term Solid State Drive is commonly used for flash-based block devices. Compared to conventional HDDs, flash-based technology offers much faster access times, lower latency, silent operation and power savings (no moving parts), among other benefits. However, flash-based technology also brings a few issues that require special attention and care from the system.

Dealing with empty blocks

Generally, traditional filesystems do not erase deleted data blocks but only flag them as deleted. Due to the nature of flash memory cells, any write operation can only be done to empty cells. Thus writing to physically non-empty cells that a filesystem has flagged as deleted first requires erasing them, which makes the operation slower than writing to empty cells. This problem is further amplified by hardware limitations.

With modern kernels it is possible to tell the SSD which data blocks are deleted (no longer used). This mechanism is called discard. The names of the implementations differ: TRIM for ATA, UNMAP for SCSI and Deallocate for NVMe. MMC and SD cards, which are not technically covered by the term "SSD" but share the same technology (non-volatile flash memory), distinguish between TRIM and ERASE. Filesystem support is required in order to use discard. The majority of modern filesystems (like Ext4[1], XFS[2] or Btrfs[3]) support discard. There are also filesystems developed primarily for flash-based devices, such as F2FS.

There are two basic approaches to issuing the discard command: using the discard mount option (-o discard) for continuous discard[4], or periodically calling the fstrim utility[5].

Slowing wear out

Each write operation performed on a NAND flash cell wears it out, which limits the lifespan of the SSD. Cell endurance varies with the technology used[6]. Read operations, on the other hand, are straightforward and do not cause cell wear.

A basic method of increasing SSD lifespan is to distribute writes uniformly across all blocks. This method is called wear leveling and is implemented in the SSD firmware.

From the system's point of view, it is generally appropriate to reduce the amount of writes.

Considerations

Discard (trim) support

A device's support for discard (sometimes referred to as trim) should be verified before performing any form of discarding on the drive.

This can be done with the lsblk utility from sys-apps/util-linux:

user $lsblk --discard
NAME   DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sda           0      512B       2G         0
├─sda1        0      512B       2G         0
├─sda2        0      512B       2G         0
└─sda3        0      512B       2G         0
sdb           0        0B       0B         0
└─sdb1        0        0B       0B         0

A device supporting discard has non-zero values in the columns of DISC-GRAN (discard granularity) and DISC-MAX (discard max bytes). In the example listing above, the /dev/sda device supports discard while /dev/sdb does not.
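
As a rough sketch of how this check could be scripted, the helper below reads rows of NAME, DISC-GRAN and DISC-MAX with values in bytes and prints the devices where both values are non-zero. The name discard_capable is hypothetical and not part of util-linux, and the lsblk invocation in the comment is an assumption about how the rows would be produced.

```shell
# Hypothetical helper: print devices whose discard granularity and
# maximum discard size are both non-zero.
# Input: one "NAME DISC-GRAN DISC-MAX" row per line, values in plain bytes
# (suffixed values such as "512B" would not compare numerically).
discard_capable() {
    awk '$2 > 0 && $3 > 0 { print $1 }'
}

# On a live system the rows could come from lsblk, for example:
#   lsblk --raw --noheadings --bytes -o NAME,DISC-GRAN,DISC-MAX | discard_capable
```

For the listing above, this would print sda and its partitions but not sdb.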

Warning
Performing discard on a device that does not support it is potentially unsafe.

Initial setup

Partitioning

The sizes of SSD internal data structures (blocks and pages) vary across devices, and filesystems in turn operate on data structures of different sizes. For optimal performance, filesystem data structures should not cross the boundaries of the underlying SSD internal data structures, which minimizes the number of internal SSD operations required. This can be achieved by aligning the start of each partition. The common alignment is to 1 MiB.

Both the parted and fdisk partitioning utilities support partition alignment. For parted, there is the -a optimal option. Recent versions of fdisk should use optimal alignment by default[7].

The alignment of a given partition can easily be checked using parted:

root #parted /dev/sda
(parted) align-check optimal 1
1 aligned
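
The common 1 MiB alignment can also be checked with simple arithmetic. The sketch below assumes the partition start is given in 512-byte sectors, as reported in /sys/class/block/<partition>/start. The name is_mib_aligned is a hypothetical helper. parted's optimal check relies on the I/O hints reported by the device, so the two results may differ on some hardware.

```shell
# Hypothetical helper: succeed if a partition that starts at the given
# 512-byte sector lies on a 1 MiB (1048576 byte) boundary.
is_mib_aligned() {
    start_sector=$1
    [ $(( start_sector * 512 % 1048576 )) -eq 0 ]
}

# Possible use on a live system (sda1 is only an example name):
#   is_mib_aligned "$(cat /sys/class/block/sda1/start)" && echo "aligned"
```

A start sector of 2048 (the usual default) passes the check, while the legacy start sector of 63 does not.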

For further details about partitioning, see the dedicated Handbook chapter.

blkdiscard

The blkdiscard utility from sys-apps/util-linux-2.23 (or later) discards all data blocks on a given device.

Warning
All data on the discarded device will be lost!

LVM

LVM aligns to MiB boundaries and passes discards to underlying devices by default. No additional configuration is required.

In order to discard all unused space in a Volume Group (VG) use the blkdiscard utility:

root #lvcreate -l100%FREE -n trim yourvg
root #blkdiscard /dev/yourvg/trim
root #lvremove yourvg/trim

Alternatively, the issue_discards option in lvm.conf makes LVM discard the entire Logical Volume (LV) on lvremove, lvreduce, pvmove and other actions that free Physical Extents (PE) in a VG.

Warning
Enabling this option will immediately render the system unable to undo any changes to the LV layout.

FILE /etc/lvm/lvm.conf
devices {
  issue_discards = 1
}

dm-crypt/LUKS

For discards to pass through encrypted LUKS devices, they have to be opened with the --allow-discards option.

root #cryptsetup luksOpen --allow-discards /dev/thing luks

When the root device resides on LUKS, enabling discards depends on the initramfs implementation. When using genkernel to create the initramfs, pass the following kernel option:

FILE /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="root_trim=yes"

When using dracut for creating the initramfs, pass the following kernel option:

FILE /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="rd.luks.allow-discards"

To check whether discard is enabled on a LUKS device, see if the output of the following command contains the string allow_discards:

root #dmsetup table /dev/mapper/crypt_dev --showkeys
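
This check can be scripted, for instance as part of a post-boot sanity check. In the sketch below, luks_discards_enabled is a hypothetical helper that reads a dmsetup table on standard input and succeeds when the allow_discards flag is present.

```shell
# Hypothetical helper: succeed if the device-mapper table read from
# standard input contains the allow_discards flag.
luks_discards_enabled() {
    grep -q 'allow_discards'
}

# Possible use on a live system (crypt_dev is the example name used above):
#   dmsetup table /dev/mapper/crypt_dev | luks_discards_enabled \
#       && echo "discard is enabled"
```

The helper only inspects the text it is given, so it does not need --showkeys.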

Formatting

As with partitions, performance can be improved if the filesystem is configured to align its data structures with the sizes of the device's internal structures, namely its erase block size.

This configuration becomes important for software RAID, where the erase block size really should be known[8]. Consider this information when making a purchase.

Configuring for erase block size

When the device's erase block size is known, it can be used when creating a filesystem.

For example, mkfs.ext4 applies 4KiB blocks on an average-sized partition[9]. The -E stride and -E stripe-width options make it possible to set the alignment to the erase block size. Both options should be set to erase block size / block size.

For a drive with a 512KiB erase block size, this gives 512KiB / 4KiB = 128:

root #mkfs.ext4 -E stride=128,stripe-width=128 /dev/sda3
List of devices with known erase block sizes
  • OCZ drives; stride and stripe-width are 128
Note
Erase block size is 512KiB[10]
  • Crucial M500 240GB; stride and stripe-width are 2048
Note
Page size is 16KiB and there are 512 pages per block[11]. 16KiB * 512 = 8192KiB for the erase block size. 8192KiB / 4KiB = 2048 for stride and stripe-width.
  • SanDisk z400s; stride and stripe-width are 4096
Note
According to the Dutch customer care service of SanDisk, the erase block size is 16MiB. 16MiB / 4KiB = 4096 for stride and stripe-width.
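
The stride calculation described above (erase block size / block size) can be sketched as a small shell helper. The name ext4_stride is hypothetical, and both sizes are assumed to be given in KiB.

```shell
# Hypothetical helper: stride / stripe-width = erase block size / block size.
# $1: erase block size in KiB, $2: filesystem block size in KiB.
ext4_stride() {
    erase_block_kib=$1
    fs_block_kib=$2
    echo $(( erase_block_kib / fs_block_kib ))
}

ext4_stride 512 4     # 512KiB erase block, 4KiB blocks: prints 128
ext4_stride 8192 4    # Crucial M500 figures from the list: prints 2048
```

The printed value is what would be passed to both -E stride and -E stripe-width.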

Mounting

For the rootfs it is usually recommended to run the fstrim utility periodically. Using the discard mount option results in continuous discard, which could cause degradation of older or poor-quality SSDs[5].

The following command can be run manually or set up as a periodic job that runs once a week[12]:

root #fstrim -v /
Note
On a btrfs system, running the fstrim command on any mounted subvolume will perform the discard command on the device.

For mount points on an SSD that see a low amount of disk writes, it should be safe to use the discard mount option in /etc/fstab. The mount option is also recommended when maintaining performance is required[13].

Given the considerations above, a discard-enabled /etc/fstab could look like this:

FILE /etc/fstab (fstab with discard enabled)
/dev/sda3          /mnt/archive          ext4          defaults,relatime,discard          0 2

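Changes to /etc/fstab do not affect filesystems that are already mounted; those keep their current options until they are remounted (for example with mount -o remount and the mount point) or the system is rebooted. To mount all filesystems listed in /etc/fstab that are not mounted yet, run: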

root #mount -a
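
To confirm which filesystems are actually mounted with continuous discard, the active mount options can be inspected. The following is a minimal sketch that assumes input in the format of /proc/mounts. The name mounted_with_discard is a hypothetical helper.

```shell
# Hypothetical helper: print the mount points whose option list contains
# the plain "discard" option. Input uses the /proc/mounts layout:
#   device mountpoint fstype options dump pass
# Variants such as "discard=async" are deliberately not matched.
mounted_with_discard() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "discard") { print $2; break }
    }'
}

# Possible use on a live system:
#   mounted_with_discard < /proc/mounts
```

If /mnt/archive from the example above is missing from the output, the filesystem is probably still mounted with its old options.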

Additional configuration

Periodic fstrim jobs

There are multiple ways to set up a periodic block discarding process. As of 2018, the default recommended frequency is once a week[12].

cron

Run fstrim on all mounted devices that support discard on a weekly basis:

FILE /etc/crontab (run fstrim once per week)
# Mins  Hours  Days   Months  Day of the week   Command
  15    13     *      *       1                 /sbin/fstrim --all

Similarly, it is possible to run fstrim only for a selected mount point:

FILE /etc/crontab (run fstrim once per week on rootfs)
# Mins  Hours  Days   Months  Day of the week   Command
  15    13     *      *       1                 /sbin/fstrim -v /

SSDcronTRIM

There is also a semi-automatic cron job available on GitHub called SSDcronTRIM which has the following features:

  • Distribution independent script (developed on a Gentoo system).
  • The script decides every time depending on the disk usage how often (monthly, weekly, daily, hourly) each partition has to be trimmed.
  • Recognizes if it should install itself into /etc/cron.{monthly,weekly,daily,hourly}, /etc/cron.d or any other defined directory and if it should make an entry into crontab.
  • Checks whether the kernel meets the requirements, whether the filesystem supports trimming and whether the SSD supports it.

SSDcronTRIM-LUKS

There is also a semi-automatic cron job available on GitHub called SSDcronTRIM-LUKS with dm-crypt/LUKS support.

systemd timer

sys-apps/util-linux on systemd-enabled systems comes with a timer unit executing a weekly fstrim. Enable it with:

root #systemctl enable fstrim.timer

Without cron, on system shutdown (with OpenRC)

A /etc/local.d script may be used to trim on poweroff on Fridays:

FILE /etc/local.d/date.stop
#!/bin/sh
# From
# https://fitzcarraldoblog.wordpress.com/2018/01/13/running-a-shell-script-at-shutdown-only-not-at-reboot-a-comparison-between-openrc-and-systemd/
# Trim only on a real shutdown (runlevel 0, not a reboot) and only on Fridays.
if [ "$(who -r | awk '{print $2}')" = "0" ] && [ "$(date +%a)" = "Fri" ]; then
    echo "/etc/local.d/date.stop: run SSD trim"
    fstrim --verbose /
    sleep 5
fi
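
The condition in the script above compares the output of date +%a with "Fri", which only matches when the abbreviated weekday names are printed in English. As a possible variation (not the approach of the referenced blog post), the sketch below uses the numeric weekday from date +%u, which does not depend on the locale. The name should_trim_on_shutdown is a hypothetical helper.

```shell
# Hypothetical helper: succeed only for runlevel 0 (poweroff) on a Friday.
# $1: runlevel as printed in the second field of "who -r"
# $2: weekday number from "date +%u" (1 = Monday ... 7 = Sunday)
should_trim_on_shutdown() {
    [ "$1" = "0" ] && [ "$2" = "5" ]
}

# Possible use inside the local.d script:
#   if should_trim_on_shutdown "$(who -r | awk '{print $2}')" "$(date +%u)"; then
#       fstrim --verbose /
#   fi
```

Keeping the decision in a separate function also makes it possible to test it without shutting the system down.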

Reducing amount of writes

Flash-based SSDs have a limited write lifetime, that is, a limited number of writes that can be performed[6]. Thus when using an SSD, administrators generally want to reduce the amount of writes.

Portage TMPDIR on tmpfs

When building packages via Portage it is possible to perform the operations on tmpfs and benefit from it. See the Portage TMPDIR on tmpfs guide.

Temporary files on tmpfs

Warning
Remember that all data in tmpfs resides in volatile memory, so data on tmpfs will be lost after a system reboot, shutdown or crash!

It is possible to mount desired mount points as tmpfs. Since tmpfs stores files in volatile memory all the I/O operations directed to the given mount points are not performed on the solid state disk. This reduces the amount of writes and also improves performance.

This is an example of /etc/fstab entries that mount both /tmp and /var/tmp as tmpfs:

# temporary mount points on tmpfs
tmpfs           /tmp            tmpfs           size=16G,noatime        0 0
tmpfs           /var/tmp        tmpfs           size=1G,noatime         0 0
Warning
/var/tmp will typically also be used by Portage (under the /var/tmp/portage directory). If it is too small, it will lead to build errors. Refer to Portage TMPDIR on tmpfs for appropriately sizing /var/tmp.
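
Whether a mount point really ended up on tmpfs can be verified from the list of active mounts. The following is a minimal sketch that assumes input in the format of /proc/mounts. The name is_tmpfs_mount is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the given mount point is backed by tmpfs.
# $1: mount point to look up; standard input uses the /proc/mounts layout.
# The last matching line wins, since later mounts shadow earlier ones.
is_tmpfs_mount() {
    awk -v mp="$1" '
        $2 == mp { type = $3 }
        END { exit(type == "tmpfs" ? 0 : 1) }
    '
}

# Possible use on a live system:
#   is_tmpfs_mount /var/tmp < /proc/mounts && echo "/var/tmp is on tmpfs"
```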

XDG cache on tmpfs

On a Gentoo desktop, many programs that use the X Window System (Chromium, Firefox, Skype, etc.) write to their cache on disk every few seconds[14].

The cache directory location usually complies with the XDG Base Directory Specification[15], namely with the XDG_CACHE_HOME environment variable. The default cache location is ~/.cache, which usually resides on a physical drive and can be moved to tmpfs.

To remap the cache directory location, create a script that exports XDG_CACHE_HOME pointing to a directory under /tmp:

FILE /etc/profile.d/xdg_cache_home.sh
if [ -n "${LOGNAME}" ]; then
  export XDG_CACHE_HOME="/tmp/${LOGNAME}/.cache"
fi
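
The script above only sets the variable and does not create the directory. If the directory should exist up front with restrictive permissions, a helper along the following lines could be used. This is a sketch under assumptions: setup_xdg_cache is a hypothetical name, and the base directory is passed as a parameter so that the function can be tried outside of /tmp.

```shell
# Hypothetical helper: create <base>/<user>/.cache, restrict the per-user
# directory to its owner and print the resulting cache path.
# $1: base directory (for example /tmp), $2: user name.
setup_xdg_cache() {
    base=$1
    user=$2
    cache_dir="${base}/${user}/.cache"
    mkdir -p "$cache_dir" && chmod 700 "${base}/${user}" || return 1
    printf '%s\n' "$cache_dir"
}

# Possible use in the profile script:
#   XDG_CACHE_HOME=$(setup_xdg_cache /tmp "${LOGNAME}") && export XDG_CACHE_HOME
```

The chmod call fails if the per-user directory already exists and belongs to someone else, in which case the variable is left unset rather than pointing at a directory controlled by another user.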

Web browser profile(s) and cache on tmpfs

The web browser profile(s), cache, etc. can be relocated to tmpfs. The I/O associated with using the browser is then redirected from the SSD to the volatile memory of tmpfs, which reduces wear on the physical drive and also improves browser speed and responsiveness.

It is possible to relocate the browser components mentioned above with the utility www-misc/profile-sync-daemon:

root #emerge --ask www-misc/profile-sync-daemon
Note
www-misc/profile-sync-daemon version 6 or greater requires systemd.

systemd

Close all the browsers, start and enable the daemon:

user $systemctl --user enable --now psd

Now it is possible to view all symlinks by printing the status of the started daemon:

user $psd p

OpenRC

Add the users whose browser profile(s) will be symlinked to tmpfs or another mount point to the USERS variable:

FILE /etc/psd.conf
USERS="user user2 root"

Finally, close all the browsers, start and enable the daemon:

root #rc-update add psd default
root #rc-service psd start

Now it is possible to view all symlinks by printing the status of the started daemon:

user $psd p

See also

  • HDD — describes the setup of an internal SATA or PATA (IDE) rotational hard disk drive.
  • NVMe — flash memory chips connected to a system via the PCI-E bus.

References

  1. Performance of TRIM command on ext4 filesystem, people.redhat.com. Retrieved on October 29, 2018
  2. FITRIM/discard, XFS.org. Retrieved on October 29, 2018
  3. FAQ - btrfs Wiki, btrfs.wiki.kernel.org. Retrieved on October 29, 2018
  4. mount(8) - Linux manual page, man7.org. Retrieved on October 29, 2018
  5. fstrim(8) - Linux manual page, man7.org. Retrieved on October 29, 2018
  6. Hard Drive - Why Do Solid State Devices (SSD) Wear Out, Dell. Retrieved on October 29, 2018
  7. fdisk(8) - Linux manual page, man7.org. Retrieved on October 31, 2018
  8. RAID setup - Linux Raid Wiki, wiki.kernel.org. Retrieved on November 1, 2018
  9. mke2fs(8) - Linux manual page, man7.org. Retrieved on November 1, 2018
  10. Partition Alignment Spreadsheet, techpowerup.com. Retrieved on November 1, 2018
  11. The Crucial/Micron M500 Review (960GB, 480GB, 240GB, 120GB)
  12. fstrim.timer\sys-utils - util-linux/util-linux.git - The util-linux code repository, kernel.org. Retrieved on October 30, 2018
  13. 2.4. Discard unused blocks, Red Hat. Retrieved on October 30, 2018
  14. Firefox is eating your SSD - here is how to fix it, Loyolan Ventures. Retrieved on October 28, 2018
  15. XDG Base Directory Specification, freedesktop.org. Retrieved on October 28, 2018