1 Introduction
Gather around boys and girls and let me share with you my experiences with ZFS on Linux. Please note, these are just my experiences, maybe I’m doing something wrong, maybe I’m doing it differently. Whatever the case may be please comment and let me know how you’ve done it or if you have any questions please feel free to ask.
2 What is ZFS?
A quick background on ZFS for those that are not familiar with it. ZFS is a file system (and much more) developed by Sun (Oracle). Its current licensing makes it so that it can’t be included natively in the Linux source code (bummer) but there are other ways to get it working with Linux. One common way is through FUSE which works but in my experience running through FUSE in general has bad performance. Another method is compiling modules and adding userland utilities as the fine folks over at zfsonlinux.org have done. I’ll be focusing on the latter.
3 Why ZFS?
There are a plethora of file systems available to Linux so why ZFS? Here’s some bullet points for the reasons I use it:
- Physical volume management. Instead of buying an expensive RAID controller or juggling multiple userland tools such as LVM (for volume management) and software RAID (for redundancy), ZFS can handle it all. It has RAID-like support at levels 0 (default), 1 (mirror), 5 and 6 (RAID-Z).
- Snapshots. Unlike LVM, you don’t have to specify the amount of space to be used for the snapshot data. Instead ZFS will use available space until the pool runs out. Also, you can have a bunch, and I mean a bunch, with very little space used. Snapshots can also be cloned into writable filesystems, and you can even take snapshots of those clones.
- Filesystem streaming. If you have a primary/secondary server and you want to replicate the filesystem, traditionally you could use rsync, which can take time and CPU cycles depending on the number of files, size of files, number of changes, etc. With ZFS you can take the difference between snapshot X and snapshot Y and send it over an ssh pipe from the primary to the secondary server, making the filesystem there exactly the same as of snapshot Y and generally using a lot fewer resources to do it.
- Other comparable filesystems are not mature enough or do not exist yet for Linux. The best alternative is BTRFS and although I’ve used it successfully in development environments it’s still very much in development. Also, there are some key features missing such as the file system streaming.
- The other reasons such as data integrity, copy-on-write architecture, deduplication support, compression support, encryption, etc.
I should point out that currently zfsonlinux.org doesn’t have a POSIX layer; this means you can’t mount it natively. Wait, why does the title say native? What they’ve managed to do is make it export the ZFS filesystems as block devices, more or less making it a block device manager (akin to LVM) but with all the benefits that ZFS offers (raiding, snapshots, FS streaming, etc). Once you have those block devices you can format them with another filesystem (such as EXT4), then snapshot the entire volume, clone it, mount it again, make a snapshot again, etc. For the purpose of this post, even though they are technically block devices, I’ll still refer to them as “filesystems”.
4 Right then, let’s get to it
As previously mentioned we can’t (yet) mount ZFS directly in Linux (due to the lack of a POSIX layer, but zfsonlinux.org says one is coming). So we’ll be creating some block devices with it, formatting them, mounting them and making them the root device. I’ll also be showing some examples of how this is cool and useful (such as upgrading the OS and reverting to a snapshot if the upgrade failed, and exporting a stream to a backup server).
5 What you’ll need
- A machine you’re willing to potentially destroy. Something that can install Ubuntu on it fresh. For that purpose I will be using VirtualBox but feel free to use VMWare, QEMU, a physical machine or etc.
- 1 or more hard drives. I’ll give examples of mirroring, RAID0 and RAID5 equivalents.
- Ubuntu Alternate Install 10.10 (preferably 64-bit)
- Plenty of time, this is a long process.
6 Overview
I basically got this idea from this post Native ZFS On Ubuntu. The thing I found lacking is after the author described how to install ZFS they didn’t cover usage nor using it as a root device. The steps are going to be very similar to that post but I hope to go into more detail about ZFS usage especially concerning a root device. Here are the basic steps that I’ll be going into detail about:
- Installing Ubuntu minimally on a “staging partition” which can also be used to recover a broken ZFS root. The reason we have to do this is because ZFS is far from being native to Linux. I, or someone, could make an Ubuntu package to handle things like building the modules automatically on kernel upgrade, searching for ZFS partitions on boot, mounting and etc but I’m way too lazy to do that. Instead I’m writing this post to do it manually!
- Using staging install of Ubuntu, download and build modules and userland tools, similar to the blog post from above.
- Set up initrd to automatically load modules, scan for ZFS and mount on boot.
- Set up ZFS/ZPOOL filesystem and install Ubuntu on it.
- Set up grub to use new ZFS filesystem as root partition and show some fun usage examples.
7 Let’s begin - Installing Staging Ubuntu
Boot up the Ubuntu 10.10 alternate installer and on the boot screen make sure to select to install only a command line.
Follow the regular install (alas, this guide does not cover how to install Ubuntu). When you get to ‘Partition disks’ make sure to select ‘Manual’. Whether you’re using a single disk or several, select the primary boot disk and create 2 volumes: a boot volume at 1GB and a root volume at 5GB, mapped accordingly. The root will be our staging environment.
We’ll deal with the extra space once we’ve successfully installed the staging environment. Continue the installation and boot into Ubuntu as you usually would. Also, feel free to ignore the warning about swap space; you can add that after ZFS is set up.
8 Installing ZFS
First thing I like to do is run any updates that need to be run:
~# apt-get update; apt-get dist-upgrade
Reboot as required. Then we’ll need to install the tools to be able to compile programs in Linux and some dependencies for ZFS.
~# apt-get install build-essential gawk zlib1g-dev uuid-dev vim-nox
At this point we’ll be basically following the same thing from the blog post mentioned. This includes downloading and installing any userland utils (zpool/zfs) and modules as required.
Pick a suitable directory and download the source for ZFS from zfsonlinux.org.
(PS, not sure what GitHub’s deal is with the SSL cert errors but adding --no-check-certificate will get you past it).
For each one of the tars, extract, configure, compile and install them.
~# tar xzvf spl-0.5.2.tar.gz; cd spl-0.5.2/; ./configure; make; make install
~# tar xzvf zfs-0.5.2.tar.gz; cd zfs-0.5.2/; ./configure; make; make install
After this is run, for some reason, the compile script doesn’t include updating the library links so you have to do that manually:
~# ldconfig
9 Setting up the pool
ZFS works with “pools” of hard drive(s). I think of a pool as a single hard drive or a collection of hard drives. The collection can be organized in ZFS RAID equivalents at levels 0 (default), 1 (mirror), and 5 and beyond (raidz with single, double or triple parity). If you’re coming from the LVM world then it’s sort of like creating a volume group without any logical volumes in it. The program for managing the pool (creating it, deleting it, replacing broken hard drives, etc.) is called ’zpool’.
First thing we’ll need to do is setup the hard drives. I use parted but feel free to use whatever. Take the space that was left over from the install of Ubuntu (plus any other hard drives you may want to include in the pool) and create a partition filling the rest of the disk. Here is what mine ended up looking like on /dev/sda and /dev/sdb.
Notice how the zpool partitions aren’t quite the same size. Even though I’m going to be setting these up as mirrors the sizes don’t have to be exactly the same. ZFS is smart enough to handle it.
Next we create the pool. I’ll be naming mine ’zpool0’ and using the ’mirror’ method from below. Here are a few ways to create the pool.
Create it as a RAID0, basically one big pool with no redundancy built in. This can be used with one or more hard drive/partitions. Here is an example for one hard drive (replace the partition number/hard drive path to your own).
~# zpool create zpool0 /dev/sda3
An example with two hard drives.
~# zpool create zpool0 /dev/sda3 /dev/sdb1
I should point out that if you type ’zpool’ by itself you’ll get a list of all the possible commands (simplified) and of course ’man zpool’ is your friend. To explain the commands above we start the command with ’zpool’. Followed by the command ’create’ which tells ’zpool’ to create a new pool. Followed by what we want to name the new pool or in this case ’zpool0’. After which we have the hard drive path(s) (to be used in the pool).
If we want to mirror the drives the command is very similar (but has to be used with at least 2 drives).
~# zpool create zpool0 mirror /dev/sda3 /dev/sdb1
Notice the keyword ’mirror’ after the pool name.
To create a raidz (RAID5):
~# zpool create zpool0 raidz /dev/sda3 /dev/sdb1 /dev/sdc1
Note, because of the nature of a raidz (similar to RAID5) it requires a minimum of 3 drives.
Regardless of how you created the pool you shouldn’t get any output from the command. If you did, it was probably an error.
After you have a pool created you can look at the structure of it by issuing the following:
~# zpool status
Here is my output:
A quick explanation of each field:
- pool: the name of the pool.
- state: the state of the pool; ’ONLINE’ is usually a good thing. There are other states for when the pool is degraded or in need of repair, but you’ll have to google the full list. I’ll include an example of a degraded pool later.
- scan: I think of it like a filesystem check. One can be started by running ’zpool scrub <pool name>’, which can be done online while the filesystem is in use.
- config: the layout created by running ’zpool create’ (or by subsequent additions/removals of physical disks using ’zpool’). It also shows the status of each physical volume, mirror/raidz status, etc.
- errors: usually you want this to say ’No known data errors’. If you lose a hard drive and you’re using raidz or mirroring, this may say something about a device not working and needing to be replaced, and whether or not the integrity of the pool has been affected.
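If you want to check the ’state:’ field from a script rather than by eye, something along these lines might do. This is only a rough sketch: the function name pool_state is my own invention (it is not part of ZFS), and it assumes nothing more than the ’state:’ line described above.

```shell
# pool_state: hypothetical helper, not part of the ZFS tools.
# Reads 'zpool status' output on stdin and prints the value of the
# first "state:" line (for example ONLINE or DEGRADED).
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# Usage (needs a real pool, so it is left commented out here):
#   zpool status zpool0 | pool_state
```

Keeping the parsing separate from the ’zpool’ call means the helper can be tried against saved output without touching a live pool.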
10 Setting up ZFS
Now that there is a pool to work with we can begin using ZFS. To start run the following:
~# zfs list
Here is my output:
This will display all the ZFS filesystems available. If this were ZFS native (to say Sun) each MOUNTPOINT would already be mounted and we’d be done. However, since this is a port and the POSIX layer is missing, as we start adding ZFS filesystems they’ll appear as logical volumes that we can format with another filesystem and mount.
To start create a root partition.
~# zfs create -V 20G zpool0/root0
Alright, let’s break that command down. ’zfs’ is the command to manipulate ZFS. Like ’zpool’, if you run ’zfs’ by itself you’ll see a list of all the commands it can run (also a good read: ’man zfs’). In this case we used ’create’ to create a ZFS filesystem. Since this is technically creating a device (and not a filesystem) we have to specify how big we want it to be with ’-V 20G’; in this case I’m making my root partition 20GB. The first part of the path is the name of the pool we want the filesystem in (zpool0 in this case). Note: you can have more than one pool. The part after the ’/’ is the name of the new filesystem to create (in this case I’m naming it root0).
To see what was created/changed run:
~# zfs list
Notice the addition of the new filesystem root0.
So what did that do? It created a logical volume at /dev/zpool0/root0 . Notice the path includes the ’zpool0/root0’ from the ’zfs list’ output. If we would have created a new ZFS filesystem named ’fuzzywuzzy0’ then the path would be ’/dev/zpool0/fuzzywuzzy0’ and so on. This volume will work like a regular volume. The next step is to format it. I’ll be using ext4 but feel free to use whatever.
~# mkfs.ext4 /dev/zpool0/root0
After it’s formatted you can mount it and use it as a normal formatted volume. I’ll be describing how to install Ubuntu on it later.
11 Setting up initrd/initramfs to detect ZFS on startup
Now that we have /dev/zpool0/root0, why can’t we just copy Ubuntu on it, setup grub to boot to it and reboot? Well, the kernel modules for ZFS/ZPOOL won’t load and so on reboot Ubuntu wouldn’t be able to find what it’s supposed to be mounting. That’s why we have to setup the initrd to have the modules, userland tools (zfs/zpool) and startup scripts in place so that by the time the kernel gets ready to mount the root partition it’s there. Luckily, you should now have most of the tools/modules on your staging environment and it’ll just be a matter of setting up a custom initrd/initramfs.
To start download the following care package: initramfs-tools. I must point out that, by using the scripts in the package, you acknowledge that they are to be used at your own risk.
Inside of the tar you’ll find a directory that overlays the directory at /etc/initramfs-tools. It’ll overwrite 1 file and create two others. Here is an overview of the files.
- /etc/initramfs-tools/modules will have a list of the zfs/spl modules that need to be loaded.
- /etc/initramfs-tools/hooks/zfs will include the hooks to copy the userland tools (and some other helper tools) from the source file system.
- /etc/initramfs-tools/scripts/init-premount/zfs is a bash script that will run on boot that will check for the devices and then import the pool (via ’zpool’). This is the only file that should be edited by you. There are two lines near the top. The variable ’POOL’ needs to be set to the pool name you created with ’zpool create’ and ’PATHS’ needs to be set to the device path(s) that you used to create the pool with, separated by a space.
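To give a feel for what a premount script like this has to do, here is a minimal sketch of the ’wait for the devices, then import’ idea. It is NOT the script from the care package: the helper name wait_for_devices, the 30 second timeout and the example values are all my own assumptions for illustration.

```shell
# Sketch only: not the actual init-premount/zfs script.
POOL="zpool0"                  # pool name given to 'zpool create'
PATHS="/dev/sda3 /dev/sdb1"    # device path(s) the pool was created from

# wait_for_devices TIMEOUT PATH...: hypothetical helper.
# Polls once a second until every PATH exists.
# Returns 0 when all are present, 1 if TIMEOUT seconds pass first.
wait_for_devices() {
    timeout=$1
    shift
    while [ "$timeout" -ge 0 ]; do
        missing=0
        for dev in "$@"; do
            [ -e "$dev" ] || missing=1
        done
        [ "$missing" -eq 0 ] && return 0
        [ "$timeout" -eq 0 ] && break
        sleep 1
        timeout=$((timeout - 1))
    done
    return 1
}

# At boot the helper would be used roughly like this (commented out so
# that nothing touches a pool when the sketch is merely sourced):
#   echo "waiting for devices to appear: $PATHS"
#   wait_for_devices 30 $PATHS && zpool import "$POOL"
```

The wait matters because the kernel may not have finished detecting the disks by the time the premount scripts run.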
After these files are in place issue the following command to rebuild your initrd:
~# dpkg-reconfigure initramfs-tools
If all goes well you should now have a new initrd in place and ready to be booted. To test it, reboot your box. Hopefully you should see some output like: ”waiting for devices to appear: $PATHS”. After your machine is booted back up, verify that the pool was imported on reboot, for example by running ’zpool status’.
12 Setting up Ubuntu on ZFS partition
Now that we can detect the ZFS partition on boot we can install Ubuntu on it. We’re going to rsync from our staging filesystem to the ZFS filesystem. To begin we’ll boot up into the recovery environment. Reboot the machine and hold down the shift key to get GRUB to display the boot menu.
Once at the boot menu hit ’e’ on the first item to edit it. Cursor down to the line that begins with ’linux’. Arrow over to the argument that says ’root’ and change its value to something made up such as ’root=/dev/sdz’. This will make it so that when the system boots the init won’t find the root device and thus force it to go into recovery mode (thanks to Ubuntu for setting that all up for us)! After changing the root variable hit ’Control+X’ to boot.
It will take longer to boot because the init is setup to wait a good while before giving up on trying to mount root but eventually you’ll be dropped to a shell.
Once there, verify that you can see your zpool, for example with ’zpool status’.
If it looks good create a directory to mount the ’target’ such as /target and then mount the ZFS filesystem we created earlier:
~# mkdir /target; mount /dev/zpool0/root0 /target
After you have the target mounted do the same thing for the staging environment and mount it to something like /source:
~# mkdir /source; mount /dev/sda2 /source
After that it’s simply a matter of copying the /source to the /target (and if you’ll note with the initramfs-tools modification I included rsync).
~# rsync -a --progress --inplace /source/ /target/
Depending on the speed of your setup this could take a few minutes. However, since this is a very base install of Ubuntu it shouldn’t take too long.
After it’s done copying we need to configure the target to boot as root. That means things like changing the fstab and updating grub.
To start we’ll need to bind some pseudo filesystems to the target such as dev and proc by issuing the following commands:
~# mount --bind /dev /target/dev ; mount --bind /proc /target/proc
Then mount the boot partition (your specific path will vary):
~# mount /dev/sda1 /target/boot
At this point we can chroot to it:
~# chroot /target /bin/bash
Ignore any bash errors, we’re not at a full run level so errors are to be expected.
Edit the fstab at /etc/fstab
Find the line that is mounting ’/’, usually the first. Take out the UUID device reference and replace it with the path of your ZFS root partition. Mine ended up looking like:
You can also run ’blkid’ and use the device’s UUID. It’s really a preference, and I generally prefer using UUIDs, but for simplicity’s sake I’ll just stick with device paths.
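If you’d rather not hand-edit the file, the same change can be sketched as a small filter. The function name set_root_device is hypothetical, and the sketch assumes a standard whitespace-separated fstab where the second field is the mount point. Note that awk rewrites the changed line with single spaces, so the column alignment of that one line is lost (which fstab doesn’t care about).

```shell
# set_root_device DEVICE: hypothetical helper, for illustration only.
# Reads an fstab on stdin and writes it to stdout with the device field
# of the entry mounted at "/" replaced by DEVICE.
set_root_device() {
    awk -v dev="$1" '
        /^[ \t]*#/ { print; next }   # pass comment lines through untouched
        $2 == "/"  { $1 = dev }      # swap the device of the root entry
        { print }
    '
}

# Usage inside the chroot, keeping a copy of the old file
# (commented out so the sketch has no side effects):
#   cp /etc/fstab /etc/fstab.orig
#   set_root_device /dev/zpool0/root0 < /etc/fstab.orig > /etc/fstab
```

The comment rule comes first on purpose: the Ubuntu installer leaves comment lines in fstab that mention ’/’, and those should not be mistaken for the root entry.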
Now reconfigure grub by issuing the following:
~# dpkg-reconfigure grub-pc
Make sure NOT to select the ZFS partition as the boot device, but rather the /dev/sda, sdb or etc. Usually the same hard drive that the boot partition is on.
After it’s done type ’exit’ to get out of the chroot. Then the following to unmount the target:
~# umount /target/*; umount /target
At this point running ’reboot’ should boot you into the new system running on ZFS. You can check this by typing ’mount’ and you should see the ZFS filesystem mounted as root. Good job and high-five.
13 I have Ubuntu running on a ZFS root filesystem, now what?
This is where the fun begins, however please note the following:
- If you upgrade the kernel, make sure that you recompile spl and zfs as in section 8. However, please note, I haven’t found an easy way to set the target kernel source when compiling the modules. I find it easier to reboot, hold down the shift key to get into GRUB, and set the root partition to boot to the old staging partition. Once in the staging environment, upgrade the kernel there, rebuild the modules, mount the ZFS filesystem and copy the modules over, then run ’dpkg-reconfigure initramfs-tools’ as described in section 11. There’s no need to copy the initramfs-tools files again (unless you lost them or something silly).
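Before rebooting into a new kernel it’s worth checking that the modules actually exist for it. Here’s a small sketch of such a check. The helper name has_module is made up, and it assumes the modules are installed as <name>.ko (possibly with a compression suffix) somewhere under /lib/modules/<kernel version>.

```shell
# has_module NAME [MODULES_DIR]: hypothetical helper.
# Succeeds if a file called NAME.ko (or NAME.ko.<suffix>) exists anywhere
# under MODULES_DIR, which defaults to the running kernel's module tree.
has_module() {
    dir=${2:-/lib/modules/$(uname -r)}
    [ -n "$(find "$dir" -name "$1.ko*" 2>/dev/null | head -n 1)" ]
}

# Usage: report which of the two modules still need building for a
# given kernel tree (commented out; pass the new kernel's directory):
#   for m in spl zfs; do
#       has_module "$m" /lib/modules/NEW-KERNEL-VERSION || echo "rebuild needed: $m"
#   done
```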
14 ZFS - Creating a snapshot
Let’s look at the ZFS snapshot feature. To create a snapshot the syntax is: ’zfs snapshot <pool name>/<file system name>@<snapshot increment>’. In my case I like to start the increment at ’0’ and the resulting command looks like this:
~# zfs snapshot zpool0/root0@0
If you get no output then you now have a snapshot of the filesystem at that exact point in time. To list the snapshots run the following:
~# zfs list -r -t snapshot
If you create another snapshot like:
~# zfs snapshot zpool0/root0@1
Then, when you list your snapshots, you should clearly see the two snapshots created. With these snapshots we can clone them and mount them somewhere in the filesystem to see what the filesystem looked like at that time, or even ’rollback’ to them if we mess something up when upgrading the OS or such.
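Since I number my snapshots 0, 1, 2 and so on, working out the next name can be scripted. The following is a sketch under that naming assumption only; the helper next_snapshot is my own and not a ZFS command. It reads snapshot names, one per line, which is what ’zfs list -H -o name -t snapshot’ prints.

```shell
# next_snapshot FS: hypothetical helper.
# Reads snapshot names (one per line) on stdin and prints the next name
# in the numeric sequence for FS, e.g. zpool0/root0@2 after @0 and @1.
# Snapshots of other filesystems and non-numeric names are ignored.
next_snapshot() {
    awk -v fs="$1" '
        BEGIN { max = -1; prefix = fs "@" }
        index($0, prefix) == 1 {
            n = substr($0, length(prefix) + 1)
            if (n ~ /^[0-9]+$/ && n + 0 > max) max = n + 0
        }
        END { printf("%s%d\n", prefix, max + 1) }
    '
}

# Usage (needs ZFS, so it is left commented out):
#   zfs snapshot "$(zfs list -H -o name -t snapshot | next_snapshot zpool0/root0)"
```

With no existing snapshots the helper starts the sequence at @0, matching how I start mine.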
15 ZFS - Mounting a snapshot
First we have to clone a snapshot and then we can mount it. Pick a snapshot name to use. I’ll be using ’zpool0/root0@0’ as an example from ’zfs list -r -t snapshot’. To clone it run the following:
~# zfs clone zpool0/root0@0 zpool0/root0-0-clone
The first argument (after ’clone’) is the snapshot to clone and the second argument is the new filesystem to create. Now run ’zfs list’ and you should see at least two filesystems: the root filesystem and the clone filesystem. Using the example above, the corresponding device file in /dev would be /dev/zpool0/root0-0-clone, which we can now use as an independent device with the understanding that any changes we make to it won’t affect the snapshot or the mounted root partition. We can mount it like so:
~# mount /dev/zpool0/root0-0-clone /mnt
Changing directory to /mnt and running ’ls’ you can see that we have a writable clone at the time the snapshot was created.
To remove the clone (but not the snapshot), unmount the device:
~# umount /mnt
And run the following to destroy the clone:
~# zfs destroy zpool0/root0-0-clone
Now ’zfs list’ will show just the single root filesystem and ’zfs list -r -t snapshot’ will still show the snapshot that was cloned as available.
16 ZFS - Rollback (not for the faint of heart)
To start with, the ’rollback’ command doesn’t work as it does with ZFS natively and the following is more of a trick to get the filesystem to rollback.
Rolling back to a previous state is handy if something in the filesystem/OS becomes unstable. Let us simulate that. First, make sure you have a snapshot that you can roll back to. Then install something in Ubuntu using apt-get. For demonstration purposes I’ll install a command line version of php by running the following:
~# apt-get install php5-cli
Let’s pretend that this install (or upgrade or whatever changes to the OS we’re making) breaks the system for some reason. Oh no! Traditionally it’s a matter of pulling a backup, reinstalling the OS, etc. Sometimes there isn’t time for that, so a quick fix in this case would be to ’roll back’ to a previous state that was known to work and then work around the now known ’upgrade issue’ later.
If we run ’php -v’ we can see that php is now installed and it outputs the appropriate version. Let’s pretend this is an issue. To start the rollback we follow steps similar to section 12 concerning getting into a recovery mode in the initrd. So, reboot and hold down shift to get into the GRUB menu.
Once there edit the first entry (’e’), change ’root=’ on the ’linux’ line to something made up (I like ’root=/dev/sdz’) and continue to boot. Once you’re at the initramfs prompt issue a ’zfs list -r -t snapshot’ and it should show the snapshot that you want to revert back to. In my case I’ll be reverting back to ’zpool0/root0@0’.
First we’ll need to clone the snapshot and then promote it. Promoting a clone makes it independent of the original filesystem/snapshots it was created from, and there must be enough space in the pool to accommodate this. Clone the snapshot as in section 15. Here I’m taking the snapshot @0 (which didn’t have php) and cloning it to the new filesystem ’zpool0/root1’.
~# zfs clone zpool0/root0@0 zpool0/root1
After which a ’zfs list’ should now show the cloned file system. To promote it run the following:
~# zfs promote zpool0/root1
At this point the original filesystem needs to be destroyed (as well as snapshots) because they share the same UUID (and there can be only one). Mine is ’zpool0/root0’ so that’s what I’m going to destroy by running the following:
~# zfs destroy -r zpool0/root0
Also, when the cloned filesystem was promoted any snapshots (up to the point of the clone) were also cloned and they too must be destroyed (because of the same UUID). Mine only had the one so I’ll be destroying ’zpool0/root1@0’.
~# zfs destroy zpool0/root1@0
Make sure to not get that confused with the actual cloned filesystem and only destroy the snapshots of the cloned filesystem.
We shouldn’t have to configure grub again, because the devices shared the same UUID. However, if you run into boot issues, do the following. To configure grub to use the new filesystem follow steps similar to section 12, except that we don’t have to worry about rsyncing the filesystem. To start create the target directory and mount the filesystem.
~# mkdir /target; mount /dev/zpool0/root1 /target
Then bind the /dev and /proc directory to the appropriate /target directories.
~# mount --bind /dev /target/dev; mount --bind /proc /target/proc
Mount the boot partition:
~# mount /dev/sda1 /target/boot
Then chroot to the target directory.
~# chroot /target /bin/bash
Edit /etc/fstab and change the root device path to the new one, such as ’/dev/zpool0/root1’. Run ’dpkg-reconfigure grub-pc’, exit from the chroot, unmount /target and reboot.
At this point, if all went according to plan, when running ’mount’ you should see the new device mounted to / as well as when running ’php -v’ you should see an error.
You’ve now successfully rolled back. Like I said, not for the faint of heart, but it is possible to rollback. It may not be exactly as it should be (compared to working with ZFS in its native environment) but at least it’s something. In a pinch it can be a savior if one needs to roll back to an operating point.
17 ZFS send/recv (streaming, incremental backups, etc)
ZFS is great for backing up or streaming filesystem states from one server to another or to a stand-alone file that can be restored/replayed to a ZFS filesystem. Let’s suppose that I have two snapshots 0 and 1 (zpool0/root0@0 and zpool0/root0@1) on a source server. To start streaming a backup first I need to send a full backup to a remote server. Let’s say I have two servers: 192.168.1.1 and 192.168.1.2 and .1 is my source and .2 is my target. Let’s also assume I have ZFS on both except that .2 doesn’t do anything except be a backup (meaning the filesystem doesn’t change on its own, it only changes because of ’zfs recv’). To send my first full snapshot I’d run the following:
~# zfs send zpool0/root0@0 | ssh root@192.168.1.2 "zfs recv zpool0/root-backup0"
On both sides of the pipe I’m running ’zfs’. On the left (source) I’m running the ’send’ command which is basically saying ’start streaming the snapshot to standard-out’. Then we pipe the stream to ssh into the target server, at which point we run ’zfs recv’. This is saying ’take standard-in and recv a zfs stream to the given filesystem’. Once this completes we have an exact replica of the filesystem from the source snapshot on the target server, including the snapshot itself: if you were to run ’zfs list -r -t snapshot’ on the target you’d see it.
Once you have the initial snapshot sent you can start sending over the incremental snapshots. To send from 0 to 1 from the source to the target I’d run the following command:
~# zfs send -i zpool0/root0@0 zpool0/root0@1 | ssh root@192.168.1.2 "zfs recv zpool0/root-backup0"
You can see that only the left-hand side of the pipe changed. We added the argument -i (which means send an incremental between the two snapshots) and we added the source/destination snapshots to use (0 and 1). The target must already be up-to-date on its snapshots (meaning 0 should already have been sent). If all goes well a ’zfs list -r -t snapshot’ on the target will now show the two snapshots. Also, the actual filesystem will be as of the second snapshot (1). However, an important note: the filesystem can’t change on the target or streaming of snapshots will break. If something does change, adding -F to ’zfs recv’ on the right-hand side will force ZFS to roll the target back to its most recent snapshot and accept the stream.
Now, if you were to create a third snapshot (2) you could follow the above except the incremental would be from zpool0/root0@1 to zpool0/root0@2 and so on.
Since ’zfs send/recv’ works with standard-in/out you can redirect the output to something other than another ZFS filesystem. For instance, if I wanted to backup the stream to a file I could do the following:
~# zfs send zpool0/root0@0 | gzip - > /path/to/backups/root0.0-0.zfs.gz
Instead of piping it to an ssh stream it pipes it through gzip (to compress) and then into a file. This file can now be put onto a USB backup drive, tape drive, Amazon S3 or etc for safe storage. If you want to save an incremental it’s a similar command:
~# zfs send -i zpool0/root0@0 zpool0/root0@1 | gzip - > /path/to/backups/root0.0-1.zfs.gz
Note, it’s a wise idea to name the incrementals something logical. That’s why I put 0-0 in the first file name, to indicate that it’s a full original, whereas the incremental is 0-1 (meaning from snapshot 0 to 1).
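To keep that naming consistent it can be generated rather than typed. Below is a sketch that only builds the file name and prints the matching command line without running anything. Both helper names (stream_file and send_cmd) are hypothetical, and the convention that ’from equal to to’ means a full stream is simply how I named my files above.

```shell
# stream_file FS FROM TO: hypothetical helper.
# Prints the backup file name using the <fs>.<from>-<to>.zfs.gz scheme.
stream_file() {
    printf '%s.%s-%s.zfs.gz\n' "$1" "$2" "$3"
}

# send_cmd POOL FS FROM TO DIR: hypothetical helper.
# Prints (does not run) the command that would write the stream into DIR.
# FROM equal to TO means a full stream of that snapshot.
send_cmd() {
    pool=$1 fs=$2 from=$3 to=$4 dir=$5
    if [ "$from" = "$to" ]; then
        src="zfs send $pool/$fs@$to"
    else
        src="zfs send -i $pool/$fs@$from $pool/$fs@$to"
    fi
    printf '%s | gzip - > %s/%s\n' "$src" "$dir" "$(stream_file "$fs" "$from" "$to")"
}

# Example: print the command for the incremental from snapshot 0 to 1.
send_cmd zpool0 root0 0 1 /path/to/backups
```

Printing the command first gives you a chance to eyeball it before piping anything to a backup drive.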
In the event of failure of the source all you’d have to do is get ZFS to the point where a pool is available and then start applying the snapshots. Start with the first like:
~# zcat /path/to/backups/root0.0-0.zfs.gz | zfs recv zpool0/root0
This is taking the compressed zfs stream and sending it through zcat (thus decompressing it) and then piping it to a ’zfs recv’. This will restore the original snapshot.
To apply the next increment:
~# zcat /path/to/backups/root0.0-1.zfs.gz | zfs recv zpool0/root0
And so on.
18 ZFS - Disk failure
The last ZFS goodie I’ll cover is disk failure. If you’re in a raidz or a mirror you’ll still be able to boot and access your filesystem (assuming it wasn’t your boot/staging device that broke, in which case you have to almost start all over but eventually you’ll be able to import the old filesystem). With a mirror you need to have at least 1 good working drive. With a raidz it depends on if you did single, double or triple parity. In this example I’ve removed my secondary virtual device (in VirtualBox) from my mirror and ’zpool status’ shows the following:
It’s clear to see that /dev/sdb1 has failed and is no longer available. A nice thing about ZFS is that it attempts to communicate in a clear form what is wrong. In this case there is a device that’s missing but there are enough replicas to continue.
At this point, before adding new hard drives and etc, I usually like to create a snapshot and stream an increment to my backup (whether it be a separate server or the file method as described in section 17).
Next, I’ll create a new /dev/sdb and add it to my VirtualBox machine (effectively replacing the hard drive). After partitioning it, I’ll add it running the following command:
~# zpool replace zpool0 8672231509236420021 /dev/sdb1
Wait, what? The syntax is: ’zpool replace <pool name> <old device> <new device>’. Since the old device is gone zpool is referencing the ID (8672231509236420021) and that’s why we have to use the ID instead of an ’old device’.
After this when I show a ’zpool status’ I get the following:
As you can see in the zpool0 tree it’s ’replacing’ and ’resilvering’ the device. Also note the status has changed to show that it’s currently ’resilvering’. This is sort of a file system check, similar to an fsck, but it can be run while the filesystem is being used. This resilvering was triggered by the replacement of the drive. However, a similar check can be started manually by running ’zpool scrub <pool name>’ and similar output will appear in ’zpool status’.
This is the output after the device has been replaced and the scrub/resilver has been completed:
Note the handy output about how long the resilver took and how much data it had to deal with. Obviously if you’re dealing with several hundred gigs or terabytes it could take several hours to complete. The cool part is you can run this while the system is being used with minimal impact.
Anyway, my system is resilvered and I’m good until my next hard drive failure.
19 That’s all folks
I do hope you’ve found something in this post useful. ZFS natively is a powerful tool, on Linux it’s getting there and to-date this is the best case scenario/usage that I’ve found for it. I’m looking forward to future releases from zfsonlinux.org (or from other sources). As soon as a POSIX layer is added (and not on top of FUSE) it’ll change the way we use it but it’ll be even more powerful.
Please excuse any typos, confusion, long windedness or inaccuracies. Feel free to comment and correct me being crazy and I’ll update straight away.
I think that’s enough for now, until next time!