ZFS Send & Receive – Part 2

Receiving data from another host

After successfully receiving a dataset from a USB disk in the previous part, we now want to receive a dataset from another host over the network. Let’s assume you’re on the source server and there is a dataset that you would like to send to a remote server. There is a specific snapshot that you would like to send, and after a while you might even want to update the dataset on the remote server with a further (fresher) snapshot. Assuming that we don’t control the network and would rather not spill the beans on what we’re sending, we will use SSH as the transport channel.

ZFS dataset on the receiving host (remote)

On the receiving end there is a ZFS pool that we want to send our dataset into. We should make sure that there is enough free space on the receiving pool.

root@nas[~]# zfs list tank       
NAME   USED  AVAIL     REFER  MOUNTPOINT
tank  11.1T  9.70T      170K  /mnt/tank

At this point we need to be aware of how zfs receive deals with naming conflicts: a plain receive will refuse to overwrite an existing dataset of the same name, and forcing it with -F would replace that dataset’s contents. So let’s be really sure and check that no dataset by the name of media exists already.

root@nas[~]# zfs list -rt all tank/media
cannot open 'tank/media': dataset does not exist

That is more or less everything we need to check from a ZFS point of view. Of course we need to make sure that the firewall will let us through, but since we will send the data via SSH and have probably already logged in via SSH anyway, we should be good to go.
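
A quick way to confirm that the network path is open is to probe the SSH port from the sending host (introduced further below) before starting the transfer; a minimal check, assuming nc is available:

root@jeojang[~]# nc -vz nas 22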

O.K. – “just one more thing…”: we need to be able to access the remote host via SSH without a password. The authorized_keys file on the remote host should therefore contain the sending host’s public key.

root@nas[~]# ls -lah /root/.ssh 
total 5
drwxr-xr-x  2 root  wheel     4B Jul 18 14:33 .
drwxr-xr-x  5 root  wheel    16B Nov 12 22:26 ..
-rw-------  1 root  wheel   805B Jul 13 20:00 authorized_keys
-rw-r--r--  1 root  wheel   179B Jul 18 14:33 known_hosts
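
If the sending host’s key is not in place yet, it can be installed from the sending host (which we will meet in the next section); a sketch, assuming ssh-copy-id is available (appending the public key to authorized_keys by hand works just as well). A passwordless login that runs a harmless remote command then confirms that both the key setup and the firewall are fine:

root@jeojang[~]# ssh-copy-id root@nas
root@jeojang[~]# ssh root@nas zfs list tank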

ZFS dataset on the sending host (local)

On the sending side there is a ZFS dataset that we would like to send. To be more precise, there is a snapshot of that dataset that we want to send.

root@jeojang[~]# zfs list -rt all tank/incoming/media
NAME                                             USED  AVAIL     REFER  MOUNTPOINT
tank/incoming/media                             1.31T  6.21T     1.31T  /mnt/tank/incoming/media
tank/incoming/media@manual_2020-07-04_10-45-00  3.01M      -     1.31T  -

Very similar to sending/receiving a dataset between the local host and an attached USB disk, we use the same commands, but add SSH to the pipeline.

As the command will run for a while, it makes sense to use a screen or tmux session so the transfer does not break when you close your SSH session to the sending host.
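
A minimal sketch, assuming tmux is installed on the sending host: start a named session, run the transfer inside it (detach with Ctrl-b d), and re-attach to it later if your connection drops.

root@jeojang[~]# tmux new -s zfs-send
root@jeojang[~]# tmux attach -t zfs-send

With the session in place, the actual transfer looks like this: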

root@jeojang[~]# zfs send tank/incoming/media@manual_2020-07-04_10-45-00 | pv | ssh root@nas zfs receive tank/media
1.32TiB 4:13:09 [91.3MiB/s] [                                            <=>                                                  ]

While the above command runs, let’s take some time to dissect it. Left of the first pipe we have:

zfs send tank/incoming/media@manual_2020-07-04_10-45-00

What it means is that we are sending the snapshot manual_2020-07-04_10-45-00 of the media dataset, which lives inside the incoming dataset, which in turn sits underneath the pool called tank.

Between the pipes we have the pv command which gives us some progress indication.
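
Since pv cannot know the total size of the stream on its own, it only shows throughput. As a refinement, a dry run of zfs send with -n and -v prints an estimated stream size without sending anything, and that estimate can be passed to pv via its -s option to get a percentage and an ETA; a sketch, with the size value taken roughly from the dry-run output (about 1.31 TiB here):

root@jeojang[~]# zfs send -nv tank/incoming/media@manual_2020-07-04_10-45-00
root@jeojang[~]# zfs send tank/incoming/media@manual_2020-07-04_10-45-00 | pv -s 1342G | ssh root@nas zfs receive tank/media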

Right of the second pipe we have:

ssh root@nas zfs receive tank/media

What happens here is that we log in to the host nas as the root user. Because the ssh command accepts further arguments that are executed as a command on the remote host, we append zfs receive tank/media. Basically, whatever ZFS sends through the pipe on our local host will be received by ZFS on the other (remote) side. The received dataset is placed under the tank pool on the remote host and stored as a new dataset by the name of media. Again: if the receiving host already had a media dataset under the tank pool, the receive would fail unless we forced it with -F, in which case that dataset would be overwritten.
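
One refinement worth knowing about, assuming a reasonably recent OpenZFS on both ends: if the transfer gets interrupted, a plain receive has to start over from scratch. Receiving with -s makes the destination keep a resume token, which can later be fed to zfs send -t to continue where the stream broke off (replace <token> with the value printed by the zfs get command):

root@jeojang[~]# zfs send tank/incoming/media@manual_2020-07-04_10-45-00 | pv | ssh root@nas zfs receive -s tank/media
root@jeojang[~]# ssh root@nas zfs get -H -o value receive_resume_token tank/media
root@jeojang[~]# zfs send -t <token> | pv | ssh root@nas zfs receive -s tank/media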

Checking the result and cleanup on the receiving host (remote)

After the command has finished, we should see both the dataset and its snapshot in the receiving pool.

root@nas[~]# zfs list -rt all tank/media
NAME                                    USED  AVAIL     REFER  MOUNTPOINT
tank/media                             1.31T  8.60T     1.31T  /mnt/tank/media
tank/media@manual_2020-07-04_10-45-00     0B      -     1.31T  -
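
As mentioned at the beginning, you might later want to refresh the remote copy with a newer snapshot. That works with an incremental stream, as long as the snapshot both sides have in common is kept. A sketch, run on the sending host, with a hypothetical newer snapshot name:

root@jeojang[~]# zfs snapshot tank/incoming/media@manual_2020-08-01_10-45-00
root@jeojang[~]# zfs send -i tank/incoming/media@manual_2020-07-04_10-45-00 tank/incoming/media@manual_2020-08-01_10-45-00 | pv | ssh root@nas zfs receive tank/media

For this to work the receiving dataset must still have the common snapshot and must not have been changed since it was received (otherwise zfs receive -F is needed to roll those changes back), so the cleanup below only makes sense once no further updates are planned.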

If we don’t have any further use for the snapshot, we can clean it up via the zfs destroy command. Deleting the one and only snapshot of a dataset does not lead to any data loss in the live dataset. If anything depended on the snapshot (e.g. a clone), ZFS would not allow the snapshot to be deleted and would indicate the situation with an appropriate message.
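
To see up front what a destroy would touch, newer ZFS versions offer a dry run: assuming your zfs destroy supports the -n (no-op) and -v (verbose) flags, the following prints what would be removed without removing anything:

root@nas[~]# zfs destroy -nv tank/media@manual_2020-07-04_10-45-00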

root@nas[~]# zfs destroy tank/media@manual_2020-07-04_10-45-00

If desired, we can list the dataset and everything below it recursively again…

root@nas[~]# zfs list -rt all tank/media                      
NAME         USED  AVAIL     REFER  MOUNTPOINT
tank/media  1.31T  8.60T     1.31T  /mnt/tank/media

All done.