ZFS Send & Receive – Part 2

Receiving data from another host

After successfully importing a dataset from a USB disk, we now want to import a dataset from another host over the network. Let’s assume you’re on the source server and there is a dataset that you would like to send to a remote server. There is a specific snapshot that you would like to send, and after a while you might even want to update the dataset on the remote server with a later (fresher) snapshot. Assuming that we don’t control the network and don’t want to spill the beans on what we’re sending, we will use SSH as the transport channel.

ZFS dataset on the receiving host (remote)

On the receiving end there is a ZFS pool that we want to send our dataset into. We should make sure that there is enough free space on the receiving pool.

root@nas[~]# zfs list tank       
NAME   USED  AVAIL     REFER  MOUNTPOINT
tank  11.1T  9.70T      170K  /mnt/tank

At this point we need to be aware of the fact that zfs receive will refuse to receive into an existing dataset of the same name unless it is forced with -F, in which case that dataset gets overwritten. So let’s be really sure and check that no dataset by the name of media exists already.

root@nas[~]# zfs list -rt all tank/media
cannot open 'tank/media': dataset does not exist

That is more or less everything we need to check from a ZFS point of view. Of course we need to make sure that the firewall will let us through, but given that we will send the data over SSH and are probably already logged in via SSH, we should be good to go.

O.K. – “just one more thing…”: we need to be able to access our remote host via SSH without a password. The authorized_keys file on the remote host should therefore contain the sending host’s public key.

root@nas[~]# ls -lah /root/.ssh 
total 5
drwxr-xr-x  2 root  wheel     4B Jul 18 14:33 .
drwxr-xr-x  5 root  wheel    16B Nov 12 22:26 ..
-rw-------  1 root  wheel   805B Jul 13 20:00 authorized_keys
-rw-r--r--  1 root  wheel   179B Jul 18 14:33 known_hosts
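
If the key is not in place yet, a minimal sketch for setting this up from the sending host could look like this (assuming OpenSSH on both sides; the key type and file names are only examples, and the one-time copy will still ask for the remote root password):

root@jeojang[~]# ssh-keygen -t ed25519
root@jeojang[~]# cat /root/.ssh/id_ed25519.pub | ssh root@nas 'cat >> /root/.ssh/authorized_keys'
root@jeojang[~]# ssh root@nas hostname

The last command should print the remote hostname without prompting for a password.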

ZFS dataset on the sending host (local)

On the sender side there is a ZFS dataset that we would like to send. To be more precise, there is a snapshot of that dataset that we want to send.

root@jeojang[~]# zfs list -rt all tank/incoming/media
NAME                                             USED  AVAIL     REFER  MOUNTPOINT
tank/incoming/media                             1.31T  6.21T     1.31T  /mnt/tank/incoming/media
tank/incoming/media@manual_2020-07-04_10-45-00  3.01M      -     1.31T  -
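
Before starting the transfer, we can ask zfs send for a dry run that only prints an estimate of the stream size, so we can confirm it fits into the free space we checked on the receiving pool (a sketch; with -n nothing is actually sent, -v prints the estimate):

root@jeojang[~]# zfs send -nv tank/incoming/media@manual_2020-07-04_10-45-00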

Very similar to sending/receiving a dataset between the local host and an attached USB disk, we use the same command, but add SSH to the command pipeline.

As the command will run for a while, it makes sense to run it inside a screen or tmux session, so that it is not interrupted when your SSH session to the sending host closes.
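
A minimal tmux workflow could look like this (assuming tmux is installed; the session name is arbitrary):

root@jeojang[~]# tmux new -s zfs-send
root@jeojang[~]# tmux attach -t zfs-send

The first command starts a named session in which the transfer can be run; Ctrl-b d detaches from it without stopping the command, and the second command re-attaches later to check on the progress.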

root@jeojang[~]# zfs send tank/incoming/media@manual_2020-07-04_10-45-00 | pv | ssh root@nas zfs receive tank/media
1.32TiB 4:13:09 [91.3MiB/s] [                                            <=>                                                  ]

While the above command runs, let’s take some time to dissect the command. Left of the pipe we have:

zfs send tank/incoming/media@manual_2020-07-04_10-45-00

This means we are sending the snapshot manual_2020-07-04_10-45-00 of the dataset media, which lives inside the incoming dataset, which in turn sits underneath the pool called tank.

Between the pipes we have the pv command which gives us some progress indication.

Right of the pipe we have:

ssh root@nas zfs receive tank/media

What happens here is that we log in to the host nas as the root user. Because the ssh command accepts a command as its remaining arguments and executes that command on the remote host, we append zfs receive tank/media. Basically, whatever ZFS sends through the pipe on our local host will be received by ZFS on the other (remote) side. The received dataset is placed under the tank pool on the remote host and stored as a new dataset by the name of media. Again, if the receiving host already had a media dataset under the tank pool, the receive would refuse to run unless forced with -F, in which case that existing dataset would be overwritten.
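
As mentioned at the beginning, we might later want to update the remote copy with a fresher snapshot. For that, only the difference between the old and the new snapshot needs to be sent. A sketch of such an incremental send could look like this (the second snapshot name is hypothetical, and the base snapshot must still exist on both hosts):

root@jeojang[~]# zfs snapshot tank/incoming/media@manual_2020-08-01_10-00-00
root@jeojang[~]# zfs send -i tank/incoming/media@manual_2020-07-04_10-45-00 tank/incoming/media@manual_2020-08-01_10-00-00 | pv | ssh root@nas zfs receive tank/media

If the dataset on the receiving side has been modified since the first snapshot, zfs receive will refuse the incremental stream unless it is forced with -F, which first rolls the dataset back to that snapshot.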

Checking the result and cleanup on the receiving host (remote)

After the command has finished, we should see both the dataset and its snapshot in the receiving pool.

root@nas[~]# zfs list -rt all tank/media
NAME                                    USED  AVAIL     REFER  MOUNTPOINT
tank/media                             1.31T  8.60T     1.31T  /mnt/tank/media
tank/media@manual_2020-07-04_10-45-00     0B      -     1.31T  -

If we don’t have any further use for the snapshot, we can clean it up via the zfs destroy command. Deleting the one and only snapshot of a dataset will not lead to any data loss, since the live data remains in the dataset itself. If anything depended on the snapshot (e.g. a clone), ZFS would refuse to delete it and point out the dependency with an appropriate message. Keep in mind, though, that a later incremental update from the sending host needs this snapshot to still exist on both sides, so only destroy it if you don’t plan to send further snapshots on top of it.
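
If in doubt, zfs destroy can be asked for a dry run first, which only reports what would be removed (a sketch; -n performs the dry run, -v makes it verbose):

root@nas[~]# zfs destroy -nv tank/media@manual_2020-07-04_10-45-00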

root@nas[~]# zfs destroy tank/media@manual_2020-07-04_10-45-00

If desired, we can recursively check the dataset and its contents again…

root@nas[~]# zfs list -rt all tank/media                      
NAME         USED  AVAIL     REFER  MOUNTPOINT
tank/media  1.31T  8.60T     1.31T  /mnt/tank/media

All done.

ZFS Send & Receive – Part 1

Receiving data from a USB disk

Think of the scenario where you have stored a ZFS dataset on a USB disk for safekeeping and you want to reimport the dataset back to your server. Let’s further assume that you don’t remember many details from back when you exported the dataset, and all you know is that the dataset had previously been exported to that USB disk you found in your desk drawer.

Determining USB device and ZFS pool details

The first thing you should do is have a look at your USB devices before you connect the disk. We can use the usbconfig, camcontrol, and zpool commands for that. Let’s start with the USB configuration.

root@jeojang[~]# usbconfig         
ugen0.1: <Intel EHCI root HUB> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen1.1: <Intel EHCI root HUB> at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen0.2: <vendor 0x8087 product 0x0024> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen1.2: <vendor 0x8087 product 0x0024> at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen0.3: <vendor 0x05e3 USB Storage> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (500mA)

Now let’s have a look at the list of devices known to the FreeBSD CAM subsystem.

root@jeojang[~]# camcontrol devlist
<ST3000DM001-9YN166 CC4C>          at scbus0 target 0 lun 0 (pass0,ada0)
<ST3000DM001-1CH166 CC27>          at scbus1 target 0 lun 0 (pass1,ada1)
<ST3000DM001-1ER166 CC25>          at scbus2 target 0 lun 0 (pass2,ada2)
<ST3000DM001-1ER166 CC25>          at scbus3 target 0 lun 0 (pass3,ada3)
<AHCI SGPIO Enclosure 2.00 0001>   at scbus4 target 0 lun 0 (pass4,ses0)
<Generic STORAGE DEVICE 9744>      at scbus6 target 0 lun 0 (pass5,da0)
<Generic STORAGE DEVICE 9744>      at scbus6 target 0 lun 1 (pass6,da1)
<Generic STORAGE DEVICE 9744>      at scbus6 target 0 lun 2 (pass7,da2)
<Generic STORAGE DEVICE 9744>      at scbus6 target 0 lun 3 (pass8,da3)
<Generic STORAGE DEVICE 9744>      at scbus6 target 0 lun 4 (pass9,da4)

Last but not least let’s see which ZFS pools we already have.

root@jeojang[~]# zpool status
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:59 with 0 errors on Tue Jul 14 03:45:59 2020
config:

	NAME        STATE     READ WRITE CKSUM
	freenas-boot  ONLINE       0     0     0
	  da4p2     ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
  scan: resilvered 6.21M in 0 days 00:00:04 with 0 errors on Tue Nov 10 11:11:55 2020
config:

	NAME                                            STATE     READ WRITE CKSUM
	tank                                            ONLINE       0     0     0
	  raidz1-0                                      ONLINE       0     0     0
	    gptid/0130909f-ba99-11ea-8702-f46d04d37d65  ONLINE       0     0     0
	    gptid/017b7353-ba99-11ea-8702-f46d04d37d65  ONLINE       0     0     0
	    gptid/01a6574e-ba99-11ea-8702-f46d04d37d65  ONLINE       0     0     0
	    gptid/01b57eb4-ba99-11ea-8702-f46d04d37d65  ONLINE       0     0     0

errors: No known data errors

Plugging in the USB disk

Time to connect the USB disk and to see what happens.

SPOILER-ALERT: looking at the dmesg output already tells us a lot, but still – let’s go through usbconfig, camcontrol, and zpool step by step.
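
For reference, that check is nothing more than looking at the last few kernel messages right after plugging the disk in (output not repeated here, since the exact lines depend on the disk):

root@jeojang[~]# dmesg | tail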

root@jeojang[~]# usbconfig         
ugen0.1: <Intel EHCI root HUB> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen1.1: <Intel EHCI root HUB> at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen0.2: <vendor 0x8087 product 0x0024> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen1.2: <vendor 0x8087 product 0x0024> at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=SAVE (0mA)
ugen0.3: <vendor 0x05e3 USB Storage> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (500mA)
ugen0.4: <Western Digital My Passport 0748> at usbus0, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (500mA)

As can be seen above, the output of usbconfig has grown by one entry: ugen0.4 shows a Western Digital My Passport USB device that has been introduced to the kernel. Let’s look at the CAM subsystem to find out more about the device mapping.

root@jeojang[~]# camcontrol devlist
<ST3000DM001-9YN166 CC4C>          at scbus0 target 0 lun 0 (pass0,ada0)
<ST3000DM001-1CH166 CC27>          at scbus1 target 0 lun 0 (pass1,ada1)
<ST3000DM001-1ER166 CC25>          at scbus2 target 0 lun 0 (pass2,ada2)
<ST3000DM001-1ER166 CC25>          at scbus3 target 0 lun 0 (pass3,ada3)
<AHCI SGPIO Enclosure 2.00 0001>   at scbus4 target 0 lun 0 (pass4,ses0)
<Generic STORAGE DEVICE 9744>      at scbus6 target 0 lun 0 (pass5,da0)
<Generic STORAGE DEVICE 9744>      at scbus6 target 0 lun 1 (pass6,da1)
<Generic STORAGE DEVICE 9744>      at scbus6 target 0 lun 2 (pass7,da2)
<Generic STORAGE DEVICE 9744>      at scbus6 target 0 lun 3 (pass8,da3)
<Generic STORAGE DEVICE 9744>      at scbus6 target 0 lun 4 (pass9,da4)
<WD My Passport 0748 1019>         at scbus7 target 0 lun 0 (da5,pass10)
<WD SES Device 1019>               at scbus7 target 0 lun 1 (ses1,pass11)

The USB disk has been attached to the kernel as device node da5, together with a corresponding SCSI Enclosure Services device (ses1).

I am not showing the output of the zpool status command because nothing has changed. This is expected, because the kernel does not trigger ZFS to import pools from newly connected USB mass storage devices on its own. We need to do that ourselves.

ZFS pool discovery and import

Actually, ZFS pool discovery is fairly easy: the zpool import command allows for both discovery and import of ZFS pools.

root@jeojang[~]# zpool import
   pool: wdpool
     id: 6303543710831443128
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

	wdpool      ONLINE
	  da5       ONLINE

As can be read in the action field above, we can go ahead and import the pool wdpool, which we do with the following command:

root@jeojang[~]# zpool import wdpool

No output is good news in this case and we can double-check the success by looking at the zpool status command again.

root@jeojang[~]# zpool status                     
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:59 with 0 errors on Tue Jul 14 03:45:59 2020
config:

	NAME        STATE     READ WRITE CKSUM
	freenas-boot  ONLINE       0     0     0
	  da4p2     ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: ONLINE
  scan: resilvered 6.21M in 0 days 00:00:04 with 0 errors on Tue Nov 10 11:11:55 2020
config:

	NAME                                            STATE     READ WRITE CKSUM
	tank                                            ONLINE       0     0     0
	  raidz1-0                                      ONLINE       0     0     0
	    gptid/0130909f-ba99-11ea-8702-f46d04d37d65  ONLINE       0     0     0
	    gptid/017b7353-ba99-11ea-8702-f46d04d37d65  ONLINE       0     0     0
	    gptid/01a6574e-ba99-11ea-8702-f46d04d37d65  ONLINE       0     0     0
	    gptid/01b57eb4-ba99-11ea-8702-f46d04d37d65  ONLINE       0     0     0

errors: No known data errors

  pool: wdpool
 state: ONLINE
  scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	wdpool      ONLINE       0     0     0
	  da5       ONLINE       0     0     0

errors: No known data errors

Sure enough, our pool is online and appears free of errors. Finally we should have a quick look at the datasets in the freshly imported pool.

root@jeojang[~]# zfs list -rt all wdpool
NAME                                                                                        USED  AVAIL  REFER  MOUNTPOINT
wdpool                                                                                     1.45T   317G    88K  /wdpool
wdpool/andreas                                                                              112G   317G   112G  /wdpool/andreas
wdpool/andreas@manual_2020-07-04_10-11-00                                                  63.6M      -   112G  -
wdpool/jails                                                                               16.9G   317G   288K  /wdpool/jails
wdpool/jails@manual_2020-07-04_12:58:00                                                        0      -   288K  -
wdpool/jails/.warden-template-stable-11                                                    3.02G   317G  3.00G  /bigpool/jailset/.warden-template-stable-11
wdpool/jails/.warden-template-stable-11@clean                                              13.5M      -  3.00G  -
wdpool/jails/.warden-template-stable-11@manual_2020-07-04_12:58:00                             0      -  3.00G  -
wdpool/jails/.warden-template-standard-11.0-x64                                            3.00G   317G  3.00G  /bigpool/jailset/.warden-template-standard-11.0-x64
wdpool/jails/.warden-template-standard-11.0-x64@clean                                       104K      -  3.00G  -
wdpool/jails/.warden-template-standard-11.0-x64@manual_2020-07-04_12:58:00                     0      -  3.00G  -
wdpool/jails/.warden-template-standard-11.0-x64-20180406194538                             3.00G   317G  3.00G  /bigpool/jailset/.warden-template-standard-11.0-x64-20180406194538
wdpool/jails/.warden-template-standard-11.0-x64-20180406194538@clean                        104K      -  3.00G  -
wdpool/jails/.warden-template-standard-11.0-x64-20180406194538@manual_2020-07-04_12:58:00      0      -  3.00G  -
wdpool/jails/.warden-template-standard-11.0-x64-20190107155553                             3.00G   317G  3.00G  /bigpool/jailset/.warden-template-standard-11.0-x64-20190107155553
wdpool/jails/.warden-template-standard-11.0-x64-20190107155553@clean                        104K      -  3.00G  -
wdpool/jails/.warden-template-standard-11.0-x64-20190107155553@manual_2020-07-04_12:58:00      0      -  3.00G  -
wdpool/jails/ca                                                                            1.30G   317G  4.19G  /wdpool/jails/ca
wdpool/jails/ca@manual_2020-07-04_12:58:00                                                     0      -  4.19G  -
wdpool/jails/ldap                                                                          1.55G   317G  4.22G  /wdpool/jails/ldap
wdpool/jails/ldap@manual_2020-07-04_12:58:00                                                176K      -  4.22G  -
wdpool/jails/wiki                                                                          2.03G   317G  4.66G  /wdpool/jails/wiki
wdpool/jails/wiki@manual_2020-07-04_12:58:00                                                200K      -  4.66G  -
wdpool/media                                                                               1.32T   317G  1.32T  /wdpool/media
wdpool/media@manual_2020-07-04_10-45-00                                                     104K      -  1.32T  -
wdpool/rsynch                                                                               260M   317G   260M  /wdpool/rsynch
wdpool/rsynch@manual_2020-07-04_12-52-00                                                       0      -   260M  -

At this point we could already access the data via the mount points displayed in the rightmost column (beware of line breaks in the text box above!). However, what we want is to receive the complete dataset, which allows us to transfer its snapshots entirely or even incrementally.
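
As a side note: had these mount points clashed with paths already in use on the server, the pool could have been imported under an alternate root instead, which prefixes every mountpoint with the given directory (a sketch with a hypothetical path):

root@jeojang[~]# zpool import -R /mnt/usb wdpool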

ZFS Receive

We use a piped command with zfs send on one side and zfs receive on the other. Because we want to see the progress, we pipe everything through the pv command in the middle.

ATTENTION: depending on the size of the dataset, the command will run for a long time (as in hours), so you should execute it from a screen or tmux session.

root@jeojang[~]# zfs send wdpool/media@manual_2020-07-04_10-45-00 | pv | zfs receive tank/incoming/media
 438MiB 0:00:15 [35.7MiB/s] [                                                            <=>            ]

For the next few hours you can keep an eye on the progress via zfs list or zpool iostat.

root@jeojang[~]# zpool iostat tank 10
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank         196G  10.7T     10     37  85.1K   606K
tank         196G  10.7T      0    332      0  35.9M
tank         197G  10.7T      0    331      0  35.3M
tank         197G  10.7T      0    333  5.20K  35.8M
tank         198G  10.7T     16    358   146K  36.3M
tank         198G  10.7T     23    359   143K  34.8M
tank         199G  10.7T     31    377   178K  35.5M

ZFS pool export and USB disk ejection

After the zfs receive command has finished and everything has been received without errors, you should export the ZFS pool using the zpool export command. This makes sure that any mounted file systems are unmounted before we continue.

root@jeojang[~]# zpool export wdpool

As far as the zpool export command is concerned, no news is good news: if there is no output from the command, you can assume that no errors have occurred. To double-check, you can issue a zpool status command to see for yourself that the pool is gone.
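
Asking for the exported pool by name should now simply report that it no longer exists (the exact wording may differ between ZFS versions):

root@jeojang[~]# zpool status wdpool
cannot open 'wdpool': no such pool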

Ejecting the USB disk can be done with the camcontrol eject command. Make sure you eject the correct device, as very bad things can happen if you pick the wrong one.

root@jeojang[~]# camcontrol eject /dev/da5
Unit stopped successfully, Media ejected