ZFS
ZFS is an advanced file system originally developed by Sun Microsystems.
Features
ZFS includes many features like:
- Manage storage hardware as vdev in zpools
- Manage volumes in zpools (like LVM)
- Redundancy with support for RAIDZ1 (RAID5), RAIDZ2 (RAID6) and MIRROR (RAID1)
- Resilvering file system
- Data Deduplication
- Data Compression with zle (fast) or gzip (higher compression)
- Snapshots (like differential backups)
- NFS export of volumes
Installation
Modules
There are out-of-tree Linux kernel modules available from the ZFSOnLinux Project. The current release is version 0.6.1 (zpool version 28). This is the first release that the ZFSOnLinux Project considers "ready for wide scale deployment on everything from desktops to super computers".
Installing ZFS on Gentoo Linux requires the ~amd64 keyword for sys-fs/zfs and its dependencies sys-fs/zfs-kmod and sys-kernel/spl:
root # echo "sys-kernel/spl ~amd64" >> /etc/portage/package.accept_keywords
root # echo "sys-fs/zfs-kmod ~amd64" >> /etc/portage/package.accept_keywords
root # echo "sys-fs/zfs ~amd64" >> /etc/portage/package.accept_keywords
root # emerge -av zfs
The latest upstream versions require keywording the live ebuilds (optional):
root # echo "=sys-kernel/spl-9999 **" >> /etc/portage/package.accept_keywords
root # echo "=sys-fs/zfs-kmod-9999 **" >> /etc/portage/package.accept_keywords
root # echo "=sys-fs/zfs-9999 **" >> /etc/portage/package.accept_keywords
Add zfs to the boot runlevel to mount all zpools on boot:
root # rc-update add zfs boot
USE flags
| USE flag | Default | Recommended | Description |
|---|---|---|---|
| custom-cflags | No | No | Build with user-specified CFLAGS (unsupported) |
| rootfs | Yes | Yes | Install zfs-shutdown script to support exporting a pool containing rootfs |
| static-libs | No | No | Build static libraries |
| test-suite | No | No | Install regression test suite |
Tweak
By default, ZFS uses as much memory as is available for its ARC cache. The limit should not be lower than 512 MB; a good value is 1/4 of the available memory. The limit is given in bytes through the zfs_arc_max module parameter and can only be set when the module is loaded. The command below restricts the ARC to 512 MB (536870912 bytes).
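The value for zfs_arc_max has to be worked out in bytes. A minimal sketch of that arithmetic, assuming a Linux system where /proc/meminfo reports MemTotal in kB; the function name arc_quarter_bytes is made up for this illustration, and the 512 MB floor and the 1/4 guideline are taken from the text above:

```shell
#!/bin/sh
# Hypothetical helper: suggest a zfs_arc_max value in bytes.
# Argument: total memory in kB. Result: a quarter of it, but never below 512 MB.
arc_quarter_bytes() {
    total_kb=$1
    min_bytes=$((512 * 1024 * 1024))     # 536870912, the value used in this section
    quarter=$((total_kb * 1024 / 4))
    if [ "$quarter" -lt "$min_bytes" ]; then
        quarter=$min_bytes
    fi
    echo "$quarter"
}

# Print (do not write) the option line that would suit this machine.
if [ -r /proc/meminfo ]; then
    mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    if [ -n "$mem_kb" ]; then
        echo "options zfs zfs_arc_max=$(arc_quarter_bytes "$mem_kb")"
    fi
fi
```

The command that follows uses the fixed 512 MB value rather than a computed one.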
root # echo "options zfs zfs_arc_max=536870912" >> /etc/modprobe.d/zfs.conf
Installing into the kernel directory (for static installs)
This example uses version 9999, but it works the same way with the latest ~amd64 version or, once one exists, the stable version. The only likely problem is zfs and zfs-kmod getting out of sync with each other, so keep both at the same version.
This will generate the needed files, and copy them into the kernel sources directory.
root # (cd /var/tmp/portage/sys-kernel/spl-9999/work/spl-9999 && ./copy-builtin /usr/src/linux)
root # (cd /var/tmp/portage/sys-fs/zfs-kmod-9999/work/zfs-kmod-9999/ && ./copy-builtin /usr/src/linux)
After this, edit the kernel config to enable CONFIG_SPL and CONFIG_ZFS, then emerge the zfs binaries:
root # mkdir -p /etc/portage/profile
root # echo 'sys-fs/zfs -kernel-builtin' >> /etc/portage/profile/package.use.mask
root # echo 'sys-fs/zfs kernel-builtin' >> /etc/portage/package.use
root # emerge -1v sys-fs/zfs
The echo commands only need to be run once, but the emerge needs to be run every time a new version of zfs is installed.
Usage
ZFS already includes all the programs needed to manage the hardware and the file systems; no additional tools are needed.
Preparation
To go through the different commands and scenarios we can create virtual hard drives using loopback devices.
First we need to make sure the loopback module is loaded. If you want to play around with partitions, use the following option:
root # modprobe -r loop
root # modprobe loop max_part=63
The following commands create 2GB image files in /var/lib/zfs_img/ that we use as our hard drives (uses ~8GB disk space):
root # mkdir /var/lib/zfs_img
root # dd if=/dev/zero of=/var/lib/zfs_img/zfs0.img bs=1024 count=2097152
root # dd if=/dev/zero of=/var/lib/zfs_img/zfs1.img bs=1024 count=2097152
root # dd if=/dev/zero of=/var/lib/zfs_img/zfs2.img bs=1024 count=2097152
root # dd if=/dev/zero of=/var/lib/zfs_img/zfs3.img bs=1024 count=2097152
Now we check which loopback devices are in use:
root # losetup -a
We assume that all loopback devices are available and create our hard drives:
root # losetup /dev/loop0 /var/lib/zfs_img/zfs0.img
root # losetup /dev/loop1 /var/lib/zfs_img/zfs1.img
root # losetup /dev/loop2 /var/lib/zfs_img/zfs2.img
root # losetup /dev/loop3 /var/lib/zfs_img/zfs3.img
We now have four hard drives available as /dev/loop[0-3].
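The four dd invocations differ only in the file name, so they can be generated in a loop. A minimal sketch, assuming a POSIX shell; the function name make_images and its arguments are made up for this illustration. The demonstration call writes small 1 MB files into a temporary directory, so it can be tried without root privileges or 8GB of free space:

```shell
#!/bin/sh
# Hypothetical helper: create <count> zero-filled image files of <size_kib> KiB
# each, named zfs0.img, zfs1.img, ... inside <dir>.
make_images() {
    dir=$1
    count=$2
    size_kib=$3
    mkdir -p "$dir" || return 1
    i=0
    while [ "$i" -lt "$count" ]; do
        dd if=/dev/zero of="$dir/zfs$i.img" bs=1024 count="$size_kib" 2>/dev/null || return 1
        i=$((i + 1))
    done
}

# Demonstration with small files in a scratch directory.
# The layout used in this article would be: make_images /var/lib/zfs_img 4 2097152
demo_dir=$(mktemp -d)
make_images "$demo_dir" 4 1024
ls -l "$demo_dir"
```

Attaching the resulting files with losetup still requires root, as in the commands above.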
Zpools
The program /usr/sbin/zpool is used for any operation regarding zpools.
import/export Zpool
To export (unmount) an existing zpool named zfs_test, use the following command:
root # zpool export zfs_test
root # zpool status
To import (mount) the zpool named zfs_test use this command:
root # zpool import zfs_test
root # zpool status
The root mountpoint of zfs_test is a property and can be changed the same way as for volumes. To import (mount) the zpool named zfs_test with its root on /mnt/gentoo, use this command:
root # zpool import -R /mnt/gentoo zfs_test
root # zpool status
One Hard Drive
Create a new zpool named zfs_test with one hard drive:
root # zpool create zfs_test /dev/loop0
The zpool is mounted automatically; by default the mountpoint is in the root file system, i.e. /zfs_test.
root # zpool status
To delete a zpool use this command:
root # zpool destroy zfs_test
MIRROR Two Hard Drives
In ZFS you can combine several hard drives in a MIRROR, where an identical copy of the data exists on each drive. This increases read performance and redundancy. To create a new zpool named zfs_test with two hard drives as MIRROR:
root # zpool create zfs_test mirror /dev/loop0 /dev/loop1
root # zpool status
To delete the zpool:
root # zpool destroy zfs_test
RAIDZ1 Three Hard Drives
RAIDZ1 is the equivalent of RAID5: data and one drive's worth of parity are distributed across the drives. You need at least three hard drives; one can fail and the zpool remains usable in a DEGRADED state, but the faulty drive should be replaced as soon as possible.
To create a pool with RAIDZ1 and three hard drives:
root # zpool create zfs_test raidz1 /dev/loop0 /dev/loop1 /dev/loop2
root # zpool status
To delete the zpool:
root # zpool destroy zfs_test
RAIDZ2 Four Hard Drives
RAIDZ2 is the equivalent of RAID6: data and two drives' worth of parity are distributed across the drives. You need at least four hard drives; two can fail and the zpool remains usable in a DEGRADED state, but the faulty drives should be replaced as soon as possible.
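As a rough rule of thumb, a MIRROR offers the capacity of one drive, while RAIDZ1 and RAIDZ2 offer the capacity of all drives minus one or two drives' worth of parity. A minimal sketch of that estimate, assuming equally sized drives and ignoring metadata and padding overhead, so a real zpool will report somewhat less; the function name usable_gb is made up for this illustration:

```shell
#!/bin/sh
# Hypothetical helper: estimate usable capacity in GB.
# Arguments: layout (mirror|raidz1|raidz2), number of drives, size of one drive in GB.
usable_gb() {
    layout=$1
    drives=$2
    size_gb=$3
    case "$layout" in
        mirror) echo "$size_gb" ;;                          # every drive holds a full copy
        raidz1) echo $(( (drives - 1) * size_gb )) ;;       # one drive's worth of parity
        raidz2) echo $(( (drives - 2) * size_gb )) ;;       # two drives' worth of parity
        *) echo "unknown layout: $layout" >&2; return 1 ;;
    esac
}

# The layouts built in this article from 2GB loop devices:
usable_gb mirror 2 2    # prints 2
usable_gb raidz1 3 2    # prints 4
usable_gb raidz2 4 2    # prints 4
```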
To create a pool with RAIDZ2 and four hard drives:
root # zpool create zfs_test raidz2 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
root # zpool status
To delete the zpool:
root # zpool destroy zfs_test
Spares/Replace vdev
You can add hot spares to your zpool. In case of a failure, they are already installed and available to replace faulty vdevs. In this example, we use RAIDZ1 with three hard drives and a zpool named zfs_test:
root # zpool add zfs_test spare /dev/loop3
root # zpool status
The status of /dev/loop3 will stay AVAIL until it is brought into use. Now we let /dev/loop0 fail:
root # zpool offline zfs_test /dev/loop0
root # zpool status
        NAME          STATE     READ WRITE CKSUM
        zfs_test      DEGRADED     0     0     0
          raidz1-0    DEGRADED     0     0     0
            loop0     OFFLINE      0     0     0
            loop1     ONLINE       0     0     0
            loop2     ONLINE       0     0     0
        spares
          loop3       AVAIL
We replace /dev/loop0 with our spare /dev/loop3:
root # zpool replace zfs_test /dev/loop0 /dev/loop3
root # zpool status
  pool: zfs_test
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
 scrub: resilver completed after 0h0m with 0 errors on Sun Aug 21 22:29:22 2011
config:
        NAME          STATE     READ WRITE CKSUM
        zfs_test      DEGRADED     0     0     0
          raidz1-0    DEGRADED     0     0     0
            spare-0   DEGRADED     0     0     0
              loop0   OFFLINE      0     0     0
              loop3   ONLINE       0     0     0  46.5K resilvered
            loop1     ONLINE       0     0     0
            loop2     ONLINE       0     0     0
        spares
          loop3       INUSE     currently in use
errors: No known data errors
Now we remove the failed vdev /dev/loop0 and start a manual scrub:
root # zpool detach zfs_test /dev/loop0 && zpool scrub zfs_test
root # zpool status
  pool: zfs_test
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Sun Aug 21 22:37:52 2011
config:
        NAME          STATE     READ WRITE CKSUM
        zfs_test      ONLINE       0     0     0
          raidz1-0    ONLINE       0     0     0
            loop3     ONLINE       0     0     0
            loop1     ONLINE       0     0     0
            loop2     ONLINE       0     0     0
errors: No known data errors
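Listings like these can also be checked by a script. A minimal sketch, assuming the zpool status layout shown in this section (a NAME/STATE header followed by one device per line, ending at the errors: line); the function name unhealthy_devices is made up for this illustration, and the sample text is adapted from the first DEGRADED listing above rather than read from a live zpool:

```shell
#!/bin/sh
# Hypothetical helper: read `zpool status` text on stdin and print the name of
# every pool, vdev or drive in the config table whose state is not healthy.
unhealthy_devices() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }   # config table starts
        $1 == "errors:"               { in_config = 0 }         # config table ends
        in_config && $2 ~ /^(DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ { print $1 }
    '
}

sample='state: DEGRADED
config:
NAME STATE READ WRITE CKSUM
zfs_test DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
loop0 OFFLINE 0 0 0
loop1 ONLINE 0 0 0
loop2 ONLINE 0 0 0
spares
loop3 AVAIL
errors: No known data errors'

# Prints zfs_test, raidz1-0 and loop0, one per line.
printf '%s\n' "$sample" | unhealthy_devices
```

On a live system the same function could be fed from zpool status zfs_test; an empty result would mean that nothing in the config table is reported as unhealthy.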
Zpool Version Update
Updates of sys-fs/zfs are likely to bring support for a more recent zpool version. The status of your zpools will then show a warning that a newer version is available and that the zpools can be upgraded.
To display the zpool version supported by the installed ZFS and the features of each version:
root # zpool upgrade -v
This system is currently running ZFS pool version 28.
The following versions are supported:
VER DESCRIPTION
--- --------------------------------------------------------
1 Initial ZFS version
2 Ditto blocks (replicated metadata)
3 Hot spares and double parity RAID-Z
4 zpool history
5 Compression using the gzip algorithm
6 bootfs pool property
7 Separate intent log devices
8 Delegated administration
9 refquota and refreservation properties
10 Cache devices
11 Improved scrub performance
12 Snapshot properties
13 snapused property
14 passthrough-x aclinherit
15 user/group space accounting
16 stmf property support
17 Triple-parity RAID-Z
18 Snapshot user holds
19 Log device removal
20 Compression using zle (zero-length encoding)
21 Deduplication
22 Received properties
23 Slim ZIL
24 System attributes
25 Improved scrub stats
26 Improved snapshot deletion performance
27 Improved snapshot creation performance
28 Multiple vdev replacements
For more information on a particular version, including supported releases,
see the ZFS Administration Guide.
To upgrade the version of zpool zfs_test:
root # zpool upgrade zfs_test
To upgrade the version of all zpools in the system:
root # zpool upgrade -a
Zpool Tips/Tricks
- You cannot shrink a zpool or remove vdevs after its initial creation.
- It is possible to add more vdevs to a MIRROR after its initial creation. Use the following command (/dev/loop0 is the first drive in the MIRROR):
root # zpool attach zfs_test /dev/loop0 /dev/loop2
- More than 9 vdevs in one RAIDZ could cause a performance regression. For example, it is better to use 2xRAIDZ with five vdevs each rather than 1xRAIDZ with 10 vdevs in a zpool.
- RAIDZ1 and RAIDZ2 cannot be resized after initial creation (you can only add additional hot spares). You can, however, replace the hard drives with bigger ones (one at a time), e.g. replace 1T drives with 2T drives to double the available space in the zpool.
- It is possible to mix MIRROR, RAIDZ1 and RAIDZ2 in a zpool. For example, to add two more vdevs as a MIRROR to a RAIDZ1 zpool named zfs_test, use:
root # zpool add -f zfs_test mirror /dev/loop4 /dev/loop5
- It is possible to restore a destroyed zpool by reimporting it straight after the accident happened:
root # zpool import -D
pool: zfs_test
id: 12744221975042547640
state: ONLINE (DESTROYED)
action: The pool can be imported using its name or numeric identifier.
Volumes
The program /usr/sbin/zfs is used for any operation regarding volumes. To control the size of a volume you can set a quota, and you can reserve a certain amount of storage within a zpool. By default, a volume can use the complete storage of the zpool.
Create Volumes
We use our zpool zfs_test to create a new volume called volume1:
root # zfs create zfs_test/volume1
The volume will be mounted automatically as /zfs_test/volume1/
root # zfs list
Mount/Umount Volumes
Volumes can be mounted with the following command. The mountpoint is defined by the mountpoint property of the volume:
root # zfs mount zfs_test/volume1
To unmount the volume:
root # zfs unmount zfs_test/volume1
The folder /zfs_test/volume1 remains, but without the volume behind it. If you write data to it and then try to mount the volume again, you will see the following error message:
cannot mount '/zfs_test/volume1': directory is not empty
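This error can be anticipated by checking the mountpoint directory before mounting. A minimal sketch, assuming a POSIX shell; the function name dir_is_empty is made up for this illustration, and the demonstration uses a temporary directory in place of /zfs_test/volume1:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if <dir> exists and contains no entries.
# Hidden files count as entries, because `ls -A` lists them.
dir_is_empty() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

demo_dir=$(mktemp -d)
dir_is_empty "$demo_dir" && echo "empty: mounting should work"
touch "$demo_dir/stray_file"
dir_is_empty "$demo_dir" || echo "not empty: expect the error shown above"
```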
Remove Volumes
To remove the volume volume1 from zpool zfs_test:
root # zfs destroy zfs_test/volume1
root # zfs list
Properties
Properties for volumes are inherited from the zpool. You can therefore change a property on the zpool for all volumes, set it individually on each volume, or mix both approaches.
To set a property for a volume:
root # zfs set <property>=<value> zfs_test/volume1
To show the setting for a particular property on a volume:
root # zfs get <property> zfs_test/volume1
You can get a list of all properties set on all zpools and volumes with the following command:
root # zfs get all
This is a partial list of properties that can be set on either zpools or volumes; for a full list see man zfs:
| Property | Value | Function |
|---|---|---|
| quota= | 20m,none | sets a quota of 20MB for the volume |
| reservation= | 20m,none | reserves 20MB for the volume within its zpool |
| compression= | zle,gzip,on,off | uses the given compression method; on selects the default method, which is lzjb rather than gzip on zpool version 28 |
| sharenfs= | on,off,ro,nfsoptions | shares the volume via NFS |
| exec= | on,off | controls whether programs can be executed on the volume |
| setuid= | on,off | controls whether the SUID and SGID bits are honored on the volume |
| readonly= | on,off | sets the read-only attribute to on/off |
| atime= | on,off | updates access times for files in the volume |
| dedup= | on,off | sets deduplication on or off |
| mountpoint= | none,path | sets the mountpoint for the volume below the zpool or elsewhere in the file system; a mountpoint set to none prevents the volume from being mounted |
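Several of these properties are often set together on one volume. A minimal sketch that only prints the corresponding zfs set commands instead of running them, so they can be reviewed first; the function name print_zfs_set is made up for this illustration, and the property values are examples taken from the table above:

```shell
#!/bin/sh
# Hypothetical helper: print one `zfs set` command per property=value argument.
# Nothing is executed here; the commands are only written to standard output.
print_zfs_set() {
    dataset=$1
    shift
    for setting in "$@"; do
        case "$setting" in
            *=*) echo "zfs set $setting $dataset" ;;
            *)   echo "skipping '$setting': expected property=value" >&2 ;;
        esac
    done
}

print_zfs_set zfs_test/volume1 quota=20m compression=gzip atime=off
```

For the call shown, the output is three lines: zfs set quota=20m zfs_test/volume1, zfs set compression=gzip zfs_test/volume1 and zfs set atime=off zfs_test/volume1.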
Set Mountpoint
To set the mountpoint for a volume, use the following command:
root # zfs set mountpoint=/mnt/data zfs_test/volume1
The volume will automatically be remounted at /mnt/data.
NFS Volume
Create a volume as NFS share:
root # zfs create -o sharenfs=on zfs_test/volume2
Check which file systems are shared via NFS:
root # exportfs
By default the volume is shared to all networks. To specify share options:
root # zfs set sharenfs="-maproot=root -alldir -network 192.168.1.254 -mask 255.255.255.0" zfs_test/volume2
root # exportfs
To stop sharing the volume:
root # zfs set sharenfs=off zfs_test/volume2
root # exportfs
Snapshots
Snapshots are volumes that initially take up no space and preserve the state of another volume at the time they were taken. As the original volume changes, the snapshot grows in size.
Create Snapshots
To create a snapshot of a volume, use the following command:
root # zfs snapshot zfs_test/volume1@22082011
Every time a file in volume1 changes, the old data of the file remains referenced by the snapshot.
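The snapshot names used in this article are dates in ddmmyyyy form. A minimal sketch that builds such a name for the current date, assuming the standard date utility; the function name snapshot_name is made up for this illustration, and the zfs command is only printed, not executed:

```shell
#!/bin/sh
# Hypothetical helper: build "<volume>@<ddmmyyyy>".
# The optional second argument overrides the date; it defaults to today.
snapshot_name() {
    volume=$1
    stamp=${2:-$(date +%d%m%Y)}
    echo "$volume@$stamp"
}

# Print the command that would create a snapshot named after today's date.
echo "zfs snapshot $(snapshot_name zfs_test/volume1)"

snapshot_name zfs_test/volume1 22082011    # prints zfs_test/volume1@22082011
```

A year-first format such as %Y%m%d would make the names sort chronologically, at the cost of differing from the names used here.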
List Snapshots
List all available snapshots:
root # zfs list -t snapshot -o name,creation
Rollback Snapshots
To roll back a full volume to a previous state:
root # zfs rollback zfs_test/volume1@21082011
Clone Snapshots
ZFS can clone snapshots to new volumes, so you can access the files from previous states individually:
root # zfs clone zfs_test/volume1@21082011 zfs_test/volume1_restore
The files in /zfs_test/volume1_restore can now be worked on in the version of that previous state.
Remove Snapshots
Remove snapshots of a volume with the following command:
root # zfs destroy zfs_test/volume1@21082011
Maintenance
Scrubbing
Start a scrub of zpool zfs_test:
root # zpool scrub zfs_test
Log Files
To check the history of commands that were executed:
root # zpool history
Monitor I/O
Monitor I/O activity on all zpools (refreshes every 6 seconds):
root # zpool iostat 6