ZFS

From Gentoo Wiki
Revision as of 09:01, 24 April 2012 by Disi

ZFS is an advanced file system and volume manager that was originally developed by Sun Microsystems.

Features

ZFS includes many features, such as:

  • Manage storage hardware as vdevs in zpools
  • Manage volumes in zpools (like LVM)
  • Redundancy with support for RAIDZ1 (RAID5), RAIDZ2 (RAID6) and MIRROR (RAID1)
  • Resilvering of the file system after a drive is replaced
  • Data deduplication
  • Data compression with zle (fast) or gzip (higher compression)
  • Snapshots (like differential backups)
  • NFS export of volumes

Installation

There are out-of-tree Linux kernel modules available from the ZFSOnLinux Project. The current release is version 0.6.0_rc8 (zpool version 28).

Note
The testing tree will not have keyworded ebuilds until version 0.6.0_rc9 is released. The stable tree will not have ebuilds until version 0.6.0 is released. All changes to the Git repository are subject to regression tests by LLNL.

Installing the modules requires keywording the live ebuilds:

root # echo "=sys-kernel/spl-9999 **" >> /etc/portage/package.accept_keywords
root # echo "=sys-fs/zfs-9999 **" >> /etc/portage/package.accept_keywords
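
Running the echo commands above twice appends duplicate lines. As a minimal sketch, the hypothetical helper add_keyword_line (not part of Portage) appends a line only when that exact line is missing. It assumes package.accept_keywords is a regular file, as in the commands above.

```shell
# Hypothetical helper: append LINE to FILE only if that exact line
# is not already present.
# Usage: add_keyword_line FILE LINE
add_keyword_line() {
    file=$1
    line=$2
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

For example, add_keyword_line /etc/portage/package.accept_keywords "=sys-fs/zfs-9999 **" can be repeated safely.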

Then install sys-fs/zfs:

root # emerge --ask sys-fs/zfs

Add zfs to the boot runlevel to mount all zpools on boot:

root # rc-update add zfs boot

Usage

ZFS already includes all the programs needed to manage the hardware and the file systems; no additional tools are needed.

Preparation

To go through the different commands and scenarios we can create virtual hard drives using loopback devices.
First we need to make sure the loopback module is loaded. If you want to play around with partitions, use the following option:

root # modprobe -r loop && modprobe loop max_part=63
Note
You cannot reload the module if it is built into the kernel.

The following commands create 2GB image files in /var/lib/zfs_img/ that we use as our hard drives (uses ~8GB disk space):

root # mkdir /var/lib/zfs_img
root # dd if=/dev/zero of=/var/lib/zfs_img/zfs0.img bs=1024 count=2097152
root # dd if=/dev/zero of=/var/lib/zfs_img/zfs1.img bs=1024 count=2097152
root # dd if=/dev/zero of=/var/lib/zfs_img/zfs2.img bs=1024 count=2097152
root # dd if=/dev/zero of=/var/lib/zfs_img/zfs3.img bs=1024 count=2097152
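
The four dd commands differ only in the file number, so they can be written as a loop. The sketch below is one possible way to do it; the function names make_images and blocks_for_gib are made up for this example. The count of 2097152 is 2 GiB expressed in 1 KiB blocks (2 * 1024 * 1024).

```shell
# Create COUNT zero-filled image files of SIZE_KIB KiB each in DIR,
# named zfs0.img, zfs1.img, ... as on this page.
# Usage: make_images DIR COUNT SIZE_KIB
make_images() {
    dir=$1
    count=$2
    size_kib=$3
    mkdir -p "$dir" || return 1
    i=0
    while [ "$i" -lt "$count" ]; do
        dd if=/dev/zero of="$dir/zfs$i.img" bs=1024 count="$size_kib" 2>/dev/null || return 1
        i=$((i + 1))
    done
}

# Number of 1 KiB blocks in N GiB, i.e. the dd count for bs=1024.
blocks_for_gib() {
    echo $(( $1 * 1024 * 1024 ))
}
```

Running make_images /var/lib/zfs_img 4 "$(blocks_for_gib 2)" as root should produce the same four 2GB files as the commands above.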

Now we check which loopback devices are in use:

root # losetup -a

We assume that all loopback devices are available and create our hard drives:

root # losetup /dev/loop0 /var/lib/zfs_img/zfs0.img
root # losetup /dev/loop1 /var/lib/zfs_img/zfs1.img
root # losetup /dev/loop2 /var/lib/zfs_img/zfs2.img
root # losetup /dev/loop3 /var/lib/zfs_img/zfs3.img
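
The mapping from image file to loop device follows a simple pattern. As a sketch, the hypothetical function print_losetup_cmds only prints the commands, so they can be reviewed before anything is attached. Like the text above, it assumes that /dev/loop0 and the following devices are free.

```shell
# Print (do not run) the losetup commands that attach
# DIR/zfs0.img .. DIR/zfs(N-1).img to /dev/loop0 .. /dev/loop(N-1).
# Usage: print_losetup_cmds DIR N
print_losetup_cmds() {
    dir=$1
    n=$2
    i=0
    while [ "$i" -lt "$n" ]; do
        printf 'losetup /dev/loop%d %s/zfs%d.img\n' "$i" "$dir" "$i"
        i=$((i + 1))
    done
}
```

After checking the output of print_losetup_cmds /var/lib/zfs_img 4, it can be piped to sh as root to run the commands.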

We now have /dev/loop[0-3] available as four hard drives.

Note
On the next reboot, all the loopback devices will be released and the folder /var/lib/zfs_img can be deleted

Zpools

The program /usr/sbin/zpool is used for any operation regarding zpools.

Import/Export Zpool

To export (unmount) an existing zpool named zfs_test, use the following command:

root # zpool export zfs_test
root # zpool status

To import (mount) the zpool named zfs_test into the file system, use this command:

root # zpool import zfs_test
root # zpool status
Note
ZFS automatically searches the hard drives for the zpool named zfs_test.

One Hard Drive

Create a new zpool named zfs_test with one hard drive:

root # zpool create zfs_test /dev/loop0

The zpool is mounted automatically. By default it is mounted in the root file system, here as /zfs_test.

root # zpool status

To delete a zpool use this command:

root # zpool destroy zfs_test
Important
ZFS will not ask for confirmation before destroying the zpool.

MIRROR Two Hard Drives

In ZFS you can have several hard drives in a MIRROR, where an identical copy of the data exists on each drive. This increases read performance and redundancy. To create a new zpool named zfs_test with two hard drives as MIRROR:

root # zpool create zfs_test mirror /dev/loop0 /dev/loop1
Note
Of the two hard drives, only 2GB is effectively usable: total_space * 1/n.
root # zpool status

To delete the zpool:

root # zpool destroy zfs_test

RAIDZ1 Three Hard Drives

RAIDZ1 is the equivalent of RAID5: data and one parity block per stripe are distributed across the drives. Use at least three hard drives; one can fail and the zpool remains available (in a DEGRADED state), but the faulty drive should be replaced as soon as possible.
To create a pool with RAIDZ1 and three hard drives:

root # zpool create zfs_test raidz1 /dev/loop0 /dev/loop1 /dev/loop2
Note
Of the three hard drives, only 4GB is effectively usable: total_space * (1-1/n).
root # zpool status

To delete the zpool:

root # zpool destroy zfs_test

RAIDZ2 Four Hard Drives

RAIDZ2 is the equivalent of RAID6: data and two parity blocks per stripe are distributed across the drives. Use at least four hard drives; two can fail and the zpool remains available (in a DEGRADED state), but the faulty drives should be replaced as soon as possible.
To create a pool with RAIDZ2 and four hard drives:

root # zpool create zfs_test raidz2 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3
Note
Of the four hard drives, only 4GB is effectively usable: total_space * (1-2/n).
root # zpool status

To delete the zpool:

root # zpool destroy zfs_test
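
The capacity formulas from the notes on MIRROR, RAIDZ1 and RAIDZ2 can be checked with a little shell arithmetic. This is only a rough sketch: the function name usable_gb is made up for this example, it assumes n equally sized drives in a single vdev, and it ignores metadata overhead.

```shell
# Usable capacity in GB for N equally sized drives of SIZE_GB each.
#   mirror: total_space * 1/n       = one drive's worth
#   raidz1: total_space * (1 - 1/n) = (n - 1) drives' worth
#   raidz2: total_space * (1 - 2/n) = (n - 2) drives' worth
# Usage: usable_gb LAYOUT N SIZE_GB
usable_gb() {
    layout=$1
    n=$2
    size_gb=$3
    case $layout in
        mirror) echo "$size_gb" ;;
        raidz1) echo $(( (n - 1) * size_gb )) ;;
        raidz2) echo $(( (n - 2) * size_gb )) ;;
        *) echo "unknown layout: $layout" >&2; return 1 ;;
    esac
}

usable_gb mirror 2 2   # prints 2
usable_gb raidz1 3 2   # prints 4
usable_gb raidz2 4 2   # prints 4
```

The three calls reproduce the 2GB, 4GB and 4GB figures from the examples with 2GB loopback drives.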

Spares/Replace vdev

You can add hot spares to your zpool. In case of a failure, they are already installed and available to replace faulty vdevs. In this example, we use RAIDZ1 with three hard drives and a zpool named zfs_test:

root # zpool add zfs_test spare /dev/loop3
root # zpool status

The status of /dev/loop3 stays AVAIL until it is brought into use. Now we let /dev/loop0 fail:

root # zpool offline zfs_test /dev/loop0
root # zpool status
NAME        STATE     READ WRITE CKSUM
zfs_test    DEGRADED     0     0     0
  raidz1-0  DEGRADED     0     0     0
    loop0   OFFLINE      0     0     0
    loop1   ONLINE       0     0     0
    loop2   ONLINE       0     0     0
spares
  loop3     AVAIL

We replace /dev/loop0 with our spare /dev/loop3:

root # zpool replace zfs_test /dev/loop0 /dev/loop3
root # zpool status
pool: zfs_test
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
 scrub: resilver completed after 0h0m with 0 errors on Sun Aug 21 22:29:22 2011
config:

        NAME         STATE     READ WRITE CKSUM
        zfs_test     DEGRADED     0     0     0
          raidz1-0   DEGRADED     0     0     0
            spare-0  DEGRADED     0     0     0
              loop0  OFFLINE      0     0     0
              loop3  ONLINE       0     0     0  46.5K resilvered
            loop1    ONLINE       0     0     0
            loop2    ONLINE       0     0     0
        spares
          loop3      INUSE     currently in use

errors: No known data errors
Note
The file system was automatically resilvered onto /dev/loop3 and the zpool stayed available the whole time.

Now we remove the failed vdev /dev/loop0 and start a manual scrub:

root # zpool detach zfs_test /dev/loop0 && zpool scrub zfs_test
root # zpool status
pool: zfs_test
 state: ONLINE
 scrub: scrub completed after 0h0m with 0 errors on Sun Aug 21 22:37:52 2011
config:

        NAME        STATE     READ WRITE CKSUM
        zfs_test    ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            loop3   ONLINE       0     0     0
            loop1   ONLINE       0     0     0
            loop2   ONLINE       0     0     0

errors: No known data errors
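
To spot a failed device such as loop0 without reading the whole status listing, the STATE column can be filtered. The sketch below assumes the column layout shown in the zpool status output above. The function name failed_devices is made up for this example.

```shell
# Read "zpool status" output on stdin and print the names of all
# devices whose STATE column is OFFLINE, FAULTED, UNAVAIL or REMOVED.
failed_devices() {
    awk '$2 ~ /^(OFFLINE|FAULTED|UNAVAIL|REMOVED)$/ { print $1 }'
}
```

A typical call would be zpool status zfs_test | failed_devices, which prints nothing when no device is in one of these states.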

Zpool Version Update

With every update of sys-fs/zfs, you are likely to also get a more recent ZFS version. The status of your zpools will then show a warning that a new version is available and that the zpools can be upgraded.
To display the ZFS pool version the system is running and the versions it supports:

root # zpool upgrade -v
This system is currently running ZFS pool version 23.

The following versions are supported:

VER  DESCRIPTION
---  --------------------------------------------------------
 1   Initial ZFS version
 2   Ditto blocks (replicated metadata)
 3   Hot spares and double parity RAID-Z
 4   zpool history
 5   Compression using the gzip algorithm
 6   bootfs pool property
 7   Separate intent log devices
 8   Delegated administration
 9   refquota and refreservation properties
 10  Cache devices
 11  Improved scrub performance
 12  Snapshot properties
 13  snapused property
 14  passthrough-x aclinherit
 15  user/group space accounting
 16  stmf property support
 17  Triple-parity RAID-Z
 18  Snapshot user holds
 19  Log device removal
 20  Compression using zle (zero-length encoding)
 21  Deduplication
 22  Received properties
 23  Slim ZIL
Warning
Systems with a lower version installed will not be able to import a zpool of a higher version.
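
The rule in this warning is a simple numeric comparison. As a sketch, the two hypothetical helpers below extract the version number from the first line of the zpool upgrade -v output and compare it with the version of a pool. The sed pattern assumes the exact message wording shown in the sample output above.

```shell
# Read "zpool upgrade -v" output on stdin and print the pool version
# the system is running (assumes the message format shown on this page).
running_pool_version() {
    sed -n 's/^This system is currently running ZFS pool version \([0-9][0-9]*\)\.*$/\1/p' | head -n 1
}

# Succeed if a system running SYSTEM_VERSION can import a pool
# that is at POOL_VERSION.
# Usage: can_import SYSTEM_VERSION POOL_VERSION
can_import() {
    system_version=$1
    pool_version=$2
    [ "$pool_version" -le "$system_version" ]
}
```

For example, can_import "$(zpool upgrade -v | running_pool_version)" 28 fails on a system that only supports version 23.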

To upgrade the version of zpool zfs_test:

root # zpool upgrade zfs_test

To upgrade the version of all zpools in the system:

root # zpool upgrade -a

Zpool Tips/Tricks

  • You cannot shrink a zpool or remove vdevs after its initial creation.
  • It is possible to add more vdevs to a MIRROR after its initial creation. Use the following command (/dev/loop0 is the first drive in the MIRROR):
root # zpool attach zfs_test /dev/loop0 /dev/loop2
  • More than 9 vdevs in one RAIDZ can degrade performance. For example, it is better to use two RAIDZ groups with five vdevs each rather than one RAIDZ with 10 vdevs in a zpool.
  • RAIDZ1 and RAIDZ2 cannot be resized after initial creation (you can only add additional hot spares). You can, however, replace the hard drives with bigger ones (one at a time), e.g. replace 1TB drives with 2TB drives to double the available space in the zpool.
  • It is possible to mix MIRROR, RAIDZ1 and RAIDZ2 in a zpool. For example, given a zpool with RAIDZ1 named zfs_test, to add two more vdevs in a MIRROR use:
root # zpool add -f zfs_test mirror /dev/loop4 /dev/loop5
Note
This needs the -f option, because the redundancy level of the new MIRROR differs from that of the existing RAIDZ1.
  • It is possible to restore a destroyed zpool by reimporting it straight after the accident:
root # zpool import -D
  pool: zfs_test
    id: 12744221975042547640
 state: ONLINE (DESTROYED)
action: The pool can be imported using its name or numeric identifier.
Note
The option -D lists destroyed zpools found on the hard drives. To actually restore the zpool, run zpool import -D zfs_test.

Volumes

The program /usr/sbin/zfs is used for any operation regarding volumes. To control the size of a volume you can set a quota, and you can reserve a certain amount of storage within a zpool. By default, the complete storage size of the zpool is available to a volume.

Create Volumes

We use our zpool zfs_test to create a new volume called volume1:

root # zfs create zfs_test/volume1

The volume will be mounted automatically as /zfs_test/volume1/

root # zfs list

Mount/Umount Volumes

Volumes can be mounted with the following command. The mountpoint is defined by the mountpoint property of the volume:

root # zfs mount zfs_test/volume1

To unmount the volume:

root # zfs unmount zfs_test/volume1

The folder /zfs_test/volume1 remains, but without the volume behind it. If you write data to it and then try to mount the volume again, you will see the following error message:

Code

cannot mount '/zfs_test/volume1': directory is not empty
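
This error can be avoided by checking that the mountpoint directory is empty before mounting. A minimal sketch, using a made-up helper name dir_is_empty:

```shell
# Succeed if DIR exists and contains no entries, hidden files included.
# Usage: dir_is_empty DIR
dir_is_empty() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}
```

With this helper, dir_is_empty /zfs_test/volume1 && zfs mount zfs_test/volume1 only attempts the mount when the folder is empty.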

Remove Volumes

To remove the volume volume1 from zpool zfs_test:

root # zfs destroy zfs_test/volume1
root # zfs list
Note
You cannot destroy a volume if any snapshots of it exist.

Properties

Properties for volumes are inherited from the zpool. So you can change a property on the zpool for all volumes, on each volume individually, or use a mix of both.
To set a property for a volume:

root # zfs set <property>=<value> zfs_test/volume1

To show the setting for a particular property on a volume:

root # zfs get <property> zfs_test/volume1
Note
The more recent the features used on a volume (e.g. compression), the higher the version this volume requires.

You can get a list of all properties set on any zpool with the following command:

root # zfs get all

This is a partial list of properties that can be set on either zpools or volumes. For a full list see man zfs:

Property      Value                    Function
quota=        20m, none                sets a quota of 20MB for the volume
reservation=  20m, none                reserves 20MB for the volume within its zpool
compression=  zle, gzip, on, off       uses the given compression method; on selects the default method (lzjb in this release)
sharenfs=     on, off, ro, nfsoptions  shares the volume via NFS
exec=         on, off                  controls whether programs can be executed on the volume
setuid=       on, off                  controls whether SUID or SGID can be used on the volume
readonly=     on, off                  sets the read-only attribute on or off
atime=        on, off                  controls whether access times are updated for files in the volume
dedup=        on, off                  sets deduplication on or off
mountpoint=   none, path               sets the mountpoint for the volume below the zpool or elsewhere in the file system; a mountpoint set to none prevents the volume from being mounted
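
Size values such as 20m use a unit suffix. As a rough sketch, the hypothetical helper size_to_bytes converts such a value to bytes, assuming binary multiples (1k = 1024 bytes) and handling only the k, m and g suffixes.

```shell
# Convert a size such as 512k, 20m or 2g to bytes, assuming binary
# multiples (1k = 1024 bytes). Values without a suffix are printed unchanged.
# Usage: size_to_bytes SIZE
size_to_bytes() {
    value=$1
    # Strip one trailing unit letter, if present.
    num=${value%[kKmMgG]}
    case $value in
        *[kK]) echo $(( num * 1024 )) ;;
        *[mM]) echo $(( num * 1024 * 1024 )) ;;
        *[gG]) echo $(( num * 1024 * 1024 * 1024 )) ;;
        *)     echo "$value" ;;
    esac
}
```

Under this assumption, the 20m quota from the table corresponds to 20 * 1024 * 1024 = 20971520 bytes.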

Set Mountpoint

To set the mountpoint for a volume, use the following command:

root # zfs set mountpoint=/mnt/data zfs_test/volume1

The volume will automatically be remounted at /mnt/data.

NFS Volume

Create a volume as NFS share:

root # zfs create -o sharenfs=on zfs_test/volume2

Check what file systems are shared via NFS:

root # exportfs

By default the volume is shared to all networks. To specify share options:

root # zfs set sharenfs="-maproot=root -alldir -network 192.168.1.254 -mask 255.255.255.0" zfs_test/volume2
root # exportfs

To stop sharing the volume:

root # zfs set sharenfs=off zfs_test/volume2
root # exportfs

Snapshots

Snapshots are read-only copies of a volume at a point in time. A snapshot initially takes up no space; it grows in size as the original volume changes, because it keeps the old data.

Create Snapshots

To create a snapshot of a volume, use the following command:

root # zfs snapshot zfs_test/volume1@22082011
Note
volume1@22082011 is the full name of the snapshot. Everything after the @ symbol can be any alphanumeric combination.
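
The name 22082011 in the example looks like a date written as DDMMYYYY. As a sketch under that assumption, the hypothetical helper snapshot_name builds such a name from today's date, or from a given tag, and rejects tags that are not purely alphanumeric.

```shell
# Print a snapshot name of the form DATASET@TAG.
# TAG defaults to today's date as DDMMYYYY; only alphanumeric tags
# are accepted by this helper.
# Usage: snapshot_name DATASET [TAG]
snapshot_name() {
    dataset=$1
    tag=${2:-$(date +%d%m%Y)}
    case $tag in
        ''|*[!A-Za-z0-9]*) echo "invalid snapshot tag: $tag" >&2; return 1 ;;
    esac
    printf '%s@%s\n' "$dataset" "$tag"
}
```

It can be combined with the command above as zfs snapshot "$(snapshot_name zfs_test/volume1)".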

Every time a file in volume1 changes, the old data of the file is kept and referenced by the snapshot.

List Snapshots

List all available snapshots:

root # zfs list -t snapshot -o name,creation

Rollback Snapshots

To roll back a full volume to a previous state:

root # zfs rollback zfs_test/volume1@21082011
Note
If there are more recent snapshots in between, you have to use the -r option. This destroys all snapshots between the one you roll back to and the current state of the volume.
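
Before using -r, it helps to see which snapshots would be lost. The sketch below takes a list of snapshot names, oldest first, and prints every snapshot that comes after the rollback target. The function name snapshots_after is made up for this example.

```shell
# Read snapshot names on stdin, one per line and oldest first, and print
# every snapshot that is listed after TARGET, i.e. is newer than it.
# Usage: snapshots_after TARGET
snapshots_after() {
    awk -v target="$1" 'found { print } $0 == target { found = 1 }'
}
```

Assuming the installed zfs supports the -H, -r and -s options of zfs list, a possible call is zfs list -H -r -t snapshot -o name -s creation zfs_test/volume1 | snapshots_after zfs_test/volume1@21082011.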

Clone Snapshots

ZFS can clone snapshots to new volumes, so you can access the files from previous states individually:

root # zfs clone zfs_test/volume1@21082011 zfs_test/volume1_restore

The folder /zfs_test/volume1_restore now contains the files in the state of the snapshot and can be worked on.

Remove Snapshots

Remove snapshots of a volume with the following command:

root # zfs destroy zfs_test/volume1@21082011

Maintenance

Scrubbing

Start a scrub of zpool zfs_test:

root # zpool scrub zfs_test
Note
This might take some time and is quite I/O intensive.

Log Files

To check the history of commands that were executed:

root # zpool history

Monitor I/O

Monitor I/O activity on all zpools (refreshes every 6 seconds):

root # zpool iostat 6
