This post describes each step of the move from RAID6 to ZFS in detail. For an introduction to the approach, refer to the first part.
Implementation
I started by creating a new pool zstorage using proper alignment for big hard disks (ashift=12 matches their 4K sectors). LZ4 compression is pretty lightweight, which is why I turned it on. As I don't care about recording access times, I turned atime off. autoexpand=on makes sure the pool gets expanded automatically if bigger devices are attached.
# Create Pool
zpool create -o ashift=12 zstorage /dev/sdi1 /dev/sdi2 /dev/sdf1
zpool set autoexpand=on zstorage
zfs set compression=lz4 zstorage
zfs set atime=off zstorage
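The settings can be verified at any time: zfs get reads the properties back, and zdb -C dumps the cached pool configuration, which includes the vdev ashift.
# Optionally verify the settings
zfs get compression,atime zstorage
zdb -C zstorage | grep ashift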
The ZFS pool has been created and zpool status zstorage shows the following:
pool: zstorage
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
sdi1 ONLINE 0 0 0
sdi2 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
errors: No known data errors
Then I copied all files from my existing storage at /mnt to the newly created ZFS pool zstorage. My first idea was to do this with cp -ax, but I changed my mind and used rsync instead, as an interrupted rsync run can simply be restarted.
# Copy all files to the new ZFS pool
rsync -avHP /srv/data/ /zstorage/
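To verify the copy afterwards, the same rsync can be re-run as a dry run with checksum comparison; it re-reads both sides and reports any file that differs.
# Optionally verify the copy: dry run, compare file contents by checksum
rsync -avHn --checksum /srv/data/ /zstorage/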
After that, I manually failed two devices of the existing RAID pool so I could remove them the regular way. This is possible because RAID6 tolerates up to two failed disks. If another hard disk failed now, the original RAID pool would be damaged. That wouldn't be too bad though, as all files have been copied to the ZFS pool already.
It is important to also erase the superblock. Otherwise Linux's MD startup scripts might detect the manually failed devices and attempt to add them back to the RAID pool. If there is already ZFS data on them, this might corrupt the ZFS pool.
# Fail the devices, remove them from the array, then wipe the MD superblock
mdadm /dev/md0 -f /dev/sdg1
mdadm /dev/md0 -r /dev/sdg1
mdadm --zero-superblock /dev/sdg1
mdadm /dev/md0 -f /dev/sdh1
mdadm /dev/md0 -r /dev/sdh1
mdadm --zero-superblock /dev/sdh1
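At this point /proc/mdstat should show the array still running, but degraded to the remaining members.
# Check that the RAID6 is still running (degraded)
cat /proc/mdstat
mdadm --detail /dev/md0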
These freed devices are attached to the ZFS pool to add redundancy to the existing devices. Data from sdi1 will be mirrored to sdg1, and data from sdi2 will be mirrored to sdh1.
# Attach the freed devices to the zpool as mirrors
zpool attach zstorage /dev/sdi1 /dev/sdg1
zpool attach zstorage /dev/sdi2 /dev/sdh1
The resilvering will take a while.
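Progress can be watched periodically:
# Poll the resilver status every 60 seconds
watch -n 60 zpool status zstorage
At this point the status looks like this: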
pool: zstorage
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Sun Aug 21 12:02:54 2016
14,0G scanned out of 10,0T at 184M/s, 15h52m to go
6,96G resilvered, 0,14% done
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdi1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0 (resilvering)
mirror-1 ONLINE 0 0 0
sdi2 ONLINE 0 0 0
sdh1 ONLINE 0 0 0 (resilvering)
sdf1 ONLINE 0 0 0
errors: No known data errors
In the meantime, the RAID can be unmounted and stopped.
# Stop RAID
umount /mnt
vgchange -a n silverstone-data   # deactivate the LVM volume group on top of md0
mdadm --stop /dev/md0
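If the array is referenced in mdadm.conf, it is also worth removing that entry and regenerating the initramfs, so nothing tries to assemble md0 at boot. The paths below assume a Debian-style system:
# Prevent assembly at boot (assumes Debian-style paths)
sed -i '/md0/d' /etc/mdadm/mdadm.conf
update-initramfs -u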
Once the resilvering has completed, it gets a bit more dangerous: sdf1 is the only device without a mirror, so if sdf dies before the next steps have been completed successfully, the data on that disk is lost.
After the resilvering, the pool looks like this.
pool: zstorage
state: ONLINE
scan: resilvered 5,01T in 18h31m with 0 errors on Mon Aug 22 06:34:11 2016
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdi1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
sdi2 ONLINE 0 0 0
sdh1 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
errors: No known data errors
Previously, sdg1 was attached as a mirror to sdi1, and sdh1 as a mirror to sdi2. To replace sdi1 and sdi2, sdc1 and sda1 are added. To go easy on the hard disks, I performed the changes one at a time.
# Replace the split devices with dedicated disks (1/2)
zpool replace zstorage /dev/sdi1 /dev/sdc1
Again, resilvering takes time.
pool: zstorage
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Mon Aug 22 08:21:29 2016
43,2G scanned out of 10,0T at 391M/s, 7h25m to go
10,5G resilvered, 0,42% done
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
replacing-0 ONLINE 0 0 0
sdi1 ONLINE 0 0 0
sdc1 ONLINE 0 0 0 (resilvering)
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
sdi2 ONLINE 0 0 0
sdh1 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
errors: No known data errors
Before proceeding with the next disk, I waited for the resilvering to complete.
pool: zstorage
state: ONLINE
scan: resilvered 2,50T in 8h13m with 0 errors on Mon Aug 22 16:35:08 2016
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdc1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
sdi2 ONLINE 0 0 0
sdh1 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
errors: No known data errors
Now replace the next one.
# Replace the split devices with dedicated disks (2/2)
zpool replace zstorage /dev/sdi2 /dev/sda1
… and wait…
pool: zstorage
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Mon Aug 22 19:57:44 2016
343G scanned out of 10,0T at 407M/s, 6h56m to go
87,5G resilvered, 3,35% done
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdc1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
replacing-0 ONLINE 0 0 0
sdi2 ONLINE 0 0 0
sda1 ONLINE 0 0 0 (resilvering)
sdh1 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
errors: No known data errors
… for the resilvering to finish.
pool: zstorage
state: ONLINE
scan: resilvered 2,51T in 8h16m with 0 errors on Tue Aug 23 04:13:49 2016
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdc1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
sda1 ONLINE 0 0 0
sdh1 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
errors: No known data errors
After the resilvering, sdi can be repartitioned to use one partition instead of two.
parted /dev/sdi -a opt
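Inside parted, the old table is replaced with a new GPT label and a single partition spanning the whole disk. A non-interactive equivalent would be:
# Sketch: new GPT label, one partition across the whole disk
parted -s -a opt /dev/sdi mklabel gpt mkpart primary 1MiB 100%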
The partitioning then looks like this.
Model: ATA WDC WD60EFRX-68L (scsi)
Disk /dev/sdi: 6001GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Number Start End Size File system Name Flags
1 1049kB 6001GB 6001GB primary
The repartitioned sdi1 is attached as a mirror for sdf1.
# Re-add the repartitioned device to the pool
zpool attach zstorage /dev/sdf1 /dev/sdi1
One last resilvering is required. After that, all data is stored redundantly, and if a hard disk fails, the pool remains functional.
pool: zstorage
state: ONLINE
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Tue Aug 23 08:32:22 2016
9,98T scanned out of 10,0T at 183M/s, 0h2m to go
4,99T resilvered, 99,69% done
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdc1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
sda1 ONLINE 0 0 0
sdh1 ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
sdi1 ONLINE 0 0 0 (resilvering)
errors: No known data errors
Then finally everything is resilvered for good.
pool: zstorage
state: ONLINE
scan: resilvered 5,00T in 15h55m with 0 errors on Wed Aug 24 00:27:30 2016
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
sdc1 ONLINE 0 0 0
sdg1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
sda1 ONLINE 0 0 0
sdh1 ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
sdf1 ONLINE 0 0 0
sdi1 ONLINE 0 0 0
errors: No known data errors
The last step on Linux is to export the zstorage ZFS pool so that it can later be imported without issues on FreeBSD.
# prepare for import on FreeBSD
zpool export zstorage
After booting into FreeBSD, the ZFS pool can be imported.
# import on FreeBSD
zpool import zstorage
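In case the pool does not show up under its name, zpool import without arguments lists all pools that are available for import.
# List pools available for import
zpool import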
That was the last step.
This is how the pool looks on FreeBSD:
pool: zstorage
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
zstorage ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
diskid/DISK-WD-WCC4N3763073p1 ONLINE 0 0 0
diskid/DISK-WD-WCC4N2568313p1 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
diskid/DISK-WD-WMC1G2637960p1 ONLINE 0 0 0
diskid/DISK-WD-WCC4P4964291p1 ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
diskid/DISK-WD-WX21X53245KAp1 ONLINE 0 0 0
diskid/DISK-WD-WXBR8375257Xp1 ONLINE 0 0 0
errors: No known data errors
That’s it. I am feeling a lot better now with ZFS and FreeBSD.