This post describes each step of the move from RAID6 to ZFS in detail. For an introduction to the approach, refer to the first part.
Implementation
I started by creating a new pool zstorage using proper alignment for big hard disks (ashift=12 matches their 4K sectors). LZ4 compression is pretty lightweight, which is why I turned it on. As I don't care about recording access times, I turned atime off. autoexpand=on makes sure the pool gets expanded automatically if bigger devices are attached.
# Create Pool
zpool create -o ashift=12 zstorage /dev/sdi1 /dev/sdi2 /dev/sdf1
zpool set autoexpand=on zstorage
zfs set compression=lz4 zstorage
zfs set atime=off zstorage
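The settings can be verified at any time: zfs get reads the properties back, and zdb -C dumps the cached pool configuration, which includes the vdev ashift.
# Optionally verify the settings
zfs get compression,atime zstorage
zdb -C zstorage | grep ashift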
The ZFS pool has been created and zpool status zstorage shows the following:
pool: zstorage
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
sdi1 ONLINE 0 0 0
sdi2 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
errors: No known data errors
Then I copied all files from my existing storage at /mnt to the newly created ZFS pool zstorage. My first idea was to do this with cp -ax, but I changed my mind and used rsync instead, as an interrupted rsync run can simply be restarted.
# Copy all files to the new ZFS pool
rsync -avHP /srv/data/ /zstorage/
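To verify the copy afterwards, the same rsync can be re-run as a dry run with checksum comparison; it re-reads both sides and reports any file that differs.
# Optionally verify the copy: dry run, compare file contents by checksum
rsync -avHn --checksum /srv/data/ /zstorage/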
After that, I manually failed two devices of the existing RAID pool so I could remove them the regular way. This is possible because RAID6 tolerates up to two failed disks. If another hard disk failed now, the original RAID pool would be damaged. That wouldn't be too bad though, as all files have been copied to the ZFS pool already.
It is important to also erase the superblock. Otherwise Linux's MD startup scripts might detect the manually failed devices and attempt to add them back to the RAID pool. If there is already ZFS data on them, this might corrupt the ZFS pool.
# Fail the devices, remove them from the array, then wipe the MD superblock
mdadm /dev/md0 -f /dev/sdg1
mdadm /dev/md0 -r /dev/sdg1
mdadm --zero-superblock /dev/sdg1
mdadm /dev/md0 -f /dev/sdh1
mdadm /dev/md0 -r /dev/sdh1
mdadm --zero-superblock /dev/sdh1
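At this point /proc/mdstat should show the array still running, but degraded to the remaining members.
# Check that the RAID6 is still running (degraded)
cat /proc/mdstat
mdadm --detail /dev/md0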
These freed devices are attached to the ZFS pool to add redundancy to the existing devices. Data from sdi1 will be mirrored to sdg1, and data from sdi2 will be mirrored to sdh1.
# Attach the freed devices to the zpool as mirrors
zpool attach zstorage /dev/sdi1 /dev/sdg1
zpool attach zstorage /dev/sdi2 /dev/sdh1
The resilvering will take a while.
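Progress can be watched periodically:
# Poll the resilver status every 60 seconds
watch -n 60 zpool status zstorage
At this point the status looks like this: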
pool: zstorage
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Sun Aug 21 12:02:54 2016
14,0G scanned out of 10,0T at 184M/s, 15h52m to go
6,96G resilvered, 0,14% done
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdi1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0 (resilvering)
mirror-1 ONLINE 0 0 0
sdi2 ONLINE 0 0 0
sdh1 ONLINE 0 0 0 (resilvering)
sdf1 ONLINE 0 0 0
errors: No known data errors
In the meantime, the RAID can be unmounted and stopped.
# Stop RAID
umount /mnt
vgchange -a n silverstone-data   # deactivate the LVM volume group on top of md0
mdadm --stop /dev/md0
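If the array is referenced in mdadm.conf, it is also worth removing that entry and regenerating the initramfs, so nothing tries to assemble md0 at boot. The paths below assume a Debian-style system:
# Prevent assembly at boot (assumes Debian-style paths)
sed -i '/md0/d' /etc/mdadm/mdadm.conf
update-initramfs -u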
Once the resilvering has completed, it gets a bit more dangerous: sdf1 is the only device without a mirror, so if sdf dies before the next steps have been completed successfully, the data on that disk is lost.
After the resilvering, the pool looks like this.
pool: zstorage
state: ONLINE
scan: resilvered 5,01T in 18h31m with 0 errors on Mon Aug 22 06:34:11 2016
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdi1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
sdi2 ONLINE 0 0 0
sdh1 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
errors: No known data errors
Previously, sdg1 was attached as a mirror to sdi1, and sdh1 as a mirror to sdi2. To replace sdi1 and sdi2, sdc1 and sda1 are added. To go easy on the hard disks, I performed the changes one at a time.
# Replace the split devices with dedicated disks (1/2)
zpool replace zstorage /dev/sdi1 /dev/sdc1
Again, resilvering takes time.
pool: zstorage
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Mon Aug 22 08:21:29 2016
43,2G scanned out of 10,0T at 391M/s, 7h25m to go
10,5G resilvered, 0,42% done
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
replacing-0 ONLINE 0 0 0
sdi1 ONLINE 0 0 0
sdc1 ONLINE 0 0 0 (resilvering)
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
sdi2 ONLINE 0 0 0
sdh1 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
errors: No known data errors
Before proceeding with the next disk, I waited for the resilvering to complete.
pool: zstorage
state: ONLINE
scan: resilvered 2,50T in 8h13m with 0 errors on Mon Aug 22 16:35:08 2016
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdc1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
sdi2 ONLINE 0 0 0
sdh1 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
errors: No known data errors
Now replace the next one.
# Replace the split devices with dedicated disks (2/2)
zpool replace zstorage /dev/sdi2 /dev/sda1
… and wait…
pool: zstorage
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Mon Aug 22 19:57:44 2016
343G scanned out of 10,0T at 407M/s, 6h56m to go
87,5G resilvered, 3,35% done
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdc1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
replacing-0 ONLINE 0 0 0
sdi2 ONLINE 0 0 0
sda1 ONLINE 0 0 0 (resilvering)
sdh1 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
errors: No known data errors
… for the resilvering to finish.
pool: zstorage
state: ONLINE
scan: resilvered 2,51T in 8h16m with 0 errors on Tue Aug 23 04:13:49 2016
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdc1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
sda1 ONLINE 0 0 0
sdh1 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
errors: No known data errors
After the resilvering, sdi can be repartitioned to use one partition instead of two.
parted /dev/sdi -a opt
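Inside parted, the old table is replaced with a new GPT label and a single partition spanning the whole disk. A non-interactive equivalent would be:
# Sketch: new GPT label, one partition across the whole disk
parted -s -a opt /dev/sdi mklabel gpt mkpart primary 1MiB 100%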
The partitioning then looks like this.
Model: ATA WDC WD60EFRX-68L (scsi)
Disk /dev/sdi: 6001GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Number Start End Size File system Name Flags
1 1049kB 6001GB 6001GB primary
The repartitioned sdi1 is attached as a mirror for sdf1.
# Re-add the repartitioned device to the pool
zpool attach zstorage /dev/sdf1 /dev/sdi1
One last resilvering is required. After that, all data is stored redundantly, and if a hard disk fails, the pool remains functional.
pool: zstorage
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Tue Aug 23 08:32:22 2016
9,98T scanned out of 10,0T at 183M/s, 0h2m to go
4,99T resilvered, 99,69% done
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdc1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
sda1 ONLINE 0 0 0
sdh1 ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
sdi1 ONLINE 0 0 0 (resilvering)
errors: No known data errors
Then finally everything is resilvered for good.
pool: zstorage
state: ONLINE
scan: resilvered 5,00T in 15h55m with 0 errors on Wed Aug 24 00:27:30 2016
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdc1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
sda1 ONLINE 0 0 0
sdh1 ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
sdi1 ONLINE 0 0 0
errors: No known data errors
The last step on Linux is to export the zstorage ZFS pool so that it can later be imported without issues on FreeBSD.
# prepare for import on FreeBSD
zpool export zstorage
After booting into FreeBSD, the ZFS pool can be imported.
# import on FreeBSD
zpool import zstorage
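In case the pool does not show up under its name, zpool import without arguments lists all pools that are available for import.
# List pools available for import
zpool import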
That was the last step.
This is how the pool looks on FreeBSD:
pool: zstorage
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
diskid/DISK-WD-WCC4N3763073p1 ONLINE 0 0 0
diskid/DISK-WD-WCC4N2568313p1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
diskid/DISK-WD-WMC1G2637960p1 ONLINE 0 0 0
diskid/DISK-WD-WCC4P4964291p1 ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
diskid/DISK-WD-WX21X53245KAp1 ONLINE 0 0 0
diskid/DISK-WD-WXBR8375257Xp1 ONLINE 0 0 0
errors: No known data errors
That’s it. I am feeling a lot better now with ZFS and FreeBSD.