Filesystem metadata extraction/storage with early bup versions
I’ve invested in extended attributes for tagging, am revisiting backups (having
recently moved back to Linux), have found
rsync --link-dest just a bit too
slow and inefficient, and am determined to use bup. When used in low-level
tar -cf - / | bup split mode, bup treats its input as a blob and therefore
preserves whatever metadata your archiver does, but it’s inefficient and slow
to archive the whole tree every time. High-level metadata preservation is
more efficient, feeding bup only what changed, and apparently should make
it into bup 0.25, but today’s Ubuntu has 0.22a-1. That seems to leave these options:
- Use an extended attribute-preserving archiver and low-level bup, put up with the speed hit for now and switch to high-level bup once it supports metadata preservation. My understanding is that you could switch from low-level to high-level bup without changing repository and the first high-level backup would enjoy deduplication, because the efficiency advantage of the high-level stuff is just to feed bup only files known to have changed, rather than all files.
- Extract the extended attributes to one or more files, in some format that’s easy to apply to metadata-less files freshly extracted from bup, and use high-level bup today.
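The second option could be sketched with the attr tools, since the dump format written by getfattr is accepted by setfattr --restore. The paths and function names below are hypothetical examples, not my actual setup:

```shell
# Sketch of the second option using the attr package (getfattr/setfattr).
# The dump format written by getfattr is accepted by setfattr --restore.
dump_xattrs() {
  # record every (user-namespace) extended attribute under a tree
  # in a single restorable file
  getfattr --recursive --dump --absolute-names "$1" > "$2"
}

restore_xattrs() {
  # apply a dump over a tree freshly restored from bup
  setfattr --restore="$1"
}

# Hypothetical usage:
#   dump_xattrs /home /home/xattrs.dump
#   restore_xattrs /home/xattrs.dump
```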
I do my incremental backups manually, almost every night. (Yeah,
if it’s not automatic, it isn’t a backup — but if you’re away from home
more nights than you’re home, on flaky Internet connections where you’re
competing with iPlayer and YouTube downloads, fast manual backup to a caddied
USB 3 drive beats slow automatic backup over the network.) Duration is
important; I want to go to bed, now, knowing the backup is done. 20 minutes is
about as long as I can be bothered to wait. Just creating a full dar archive
and writing to
/dev/null took ~10m for
/root and ~1h34m for
Choosing an archive tool/format
So, which tool/format (criteria in descending order of importance):
- preserves extended attributes as well as other metadata?
- is widely used/tested/reliable and stable?
- is trivial to restore from, over the top of a bup-restored tree with incorrect/incomplete metadata?
- has minimal dependencies? For disaster recovery my bup backups will be the third in line (after two geographically and physically redundant full system images), but in that unlikely event I don’t want to need much more than I’d find on a typical rescue CD.
Ideas that didn’t survive scrutiny:
- rsync — I couldn’t see any combination of options that excluded the contents of files.
- I had hoped that a dar isolated catalogue would contain extended attributes, but the format docs explain that extended attributes are part of the archive proper. Dar is also not as popular as it deserves to be, which is of course something of a self-fulfilling prophecy.
- Extract all the extended attributes to a sqlite database. I have scripts lying around to do this, from when I migrated my extended attributes from OS X, but it felt hacky, would add a requirement for non-standard tools and libraries to both backup and restore, and by the time I’d added and debugged the other metadata I’d have written a clone of …
- metastore does exactly what I want, but again isn’t as widely used or tested as I’d like. I ran into a bug straight away which dented my confidence, and the stock version is wedded to git and needs patching to include .git directories.
It turns out GNU cp has a
--attributes-only option and includes extended
attributes, so I settled on the idea of a loopback mounted filesystem. My
underlying filesystem is ext3, journalling isn’t needed and SquashFS, while
compact, would require enough inodes for a separate copy of the tree, so ext2
it is.
Trying it out
My naïve first try didn’t work:
root@jin:~# dd if=/dev/null of=/home/meta.ext2 bs=1G seek=1
0+0 records in
0+0 records out
0 bytes (0 B) copied, 3.0289e-05 s, 0.0 kB/s
root@jin:~# mke2fs /home/meta.ext2
[...]
Block size=4096 (log=2)
[...]
65536 inodes, 262144 blocks
13107 blocks (5.00%) reserved for the super user
[...]
root@jin:~# mkdir test-backup
root@jin:~# mount -o loop /home/meta.ext2 test-backup/
root@jin:~# cp --archive --attributes-only --one-file-system /. test-backup/
[...]
cp: cannot create directory `test-backup/./var': No space left on device
root@jin:~# df --block-size=1K test-backup/
Filesystem     1K-blocks  Used Available Use% Mounted on
/dev/loop0       1032088 47144    932516   5% /root/test-backup
root@jin:~# df --inodes test-backup/
Filesystem     Inodes IUsed IFree IUse% Mounted on
/dev/loop0      65536 65536     0  100% /root/test-backup
Too much space for data blocks and too little for inodes. The number of inodes needed is simply however many the source filesystem is using right now:
root@jin:~# df --inodes /
Filesystem                      Inodes  IUsed IFree IUse% Mounted on
/dev/mapper/jin-root--enc_crypt 286720 206800 79920   73% /
Specifying this is easy:
root@jin:~# mke2fs -N 206800 /home/meta.ext2
The easy approach to allocating data blocks is to wildly overestimate. Provided
the underlying file is freshly, sparsely allocated for each backup then bup will
de-dup the virtual, zero blocks. (If bup didn’t, you could still use
resize2fs.) A quick test shows that bup does throw away zeros:
root@jin:~# umount test-backup/
root@jin:~# dd if=/dev/null of=/home/meta.ext2 bs=1G seek=1
0+0 records in
0+0 records out
0 bytes (0 B) copied, 3.2749e-05 s, 0.0 kB/s
root@jin:~# export BUP_DIR=bup-zeros-test
root@jin:~# bup init
Initialized empty Git repository in /root/bup-zeros-test/
root@jin:~# bup split -n zeros -v /home/meta.ext2
[...]
root@jin:~# du -hs $BUP_DIR
7.7M    bup-zeros-test
root@jin:~# rm -rf $BUP_DIR
Probably pointless fine-tuning
Just out of curiosity though, and perhaps to get an estimate within the right order of magnitude, could we easily get the number of data blocks a bit closer to the required number? Data blocks are required for:
- symlinks, which require blocks (typically just one) if they’re 60 or more characters (this doesn’t seem to depend on the inode size)
- extended attributes, which require blocks if they don’t fit in the inode, which depends on the inode/xattr sizes
Ignoring xattrs (like the GNU find maintainers do) and other filesystem overheads for a moment, the total filesystem size in bytes would be:
block_size * num_oversize_links + directory_bytes + inode_size * num_inodes
root@jin:~# find /mnt/root-enc-snap/ -type l -size +59c -printf "x" | wc -c
14500
root@jin:~# find /mnt/root-enc-snap/ -type d -printf "%b * 512\n" | paste -s -d + | bc
84413440
With 4096b blocks and 256b inodes, that’s (bytes/blocks):
root@jin:~# echo '4096 * 14500 + 84413440 + 256 * 206800' | bc
196746240
root@jin:~# echo '(4096 * 14500 + 84413440 + 256 * 206800) / 4096' | bc
48033
You need to pick values for the block and inode sizes to make an estimate, but is it worth playing with them? I haven’t; since the filesystem image is going to be submitted to bup deduplication then discarded, it probably doesn’t matter much. Extended attributes can’t exceed one block, and while most of mine are under 100 bytes I like to keep my options open. I’m sure I also read somewhere that some ext2 implementations only support 4096b blocks, so perhaps that size makes your backups just a bit more portable.
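The estimate can be wrapped in a small shell function. The argument values below are the measurements taken above on this machine; shell integer division truncates, matching bc:

```shell
# Estimate the number of data blocks needed for a metadata-only ext2 clone:
# block_size * num_oversize_links + directory_bytes + inode_size * num_inodes,
# converted to blocks (truncating division, as bc does).
estimate_blocks() {
  local block_size=$1 inode_size=$2 inodes=$3 long_links=$4 dir_bytes=$5
  local bytes=$((block_size * long_links + dir_bytes + inode_size * inodes))
  echo $((bytes / block_size))
}

estimate_blocks 4096 256 206800 14500 84413440   # prints 48033
```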
Let’s try it:
root@jin:~# mke2fs -N 206800 -b 4096 -I 256 -m 0 /home/meta.ext2 48033
[...]
206864 inodes, 48033 blocks
[...]
root@jin:~# dumpe2fs -h /home/meta.ext2
[...]
Inode count:              206864
Block count:              48033
Reserved block count:     0
Free blocks:              34880
Free inodes:              206853
[...]
root@jin:~# mount -o loop /home/meta.ext2 test-backup/
root@jin:~# cp --archive --attributes-only --one-file-system /. test-backup/
root@jin:~# df --inodes test-backup/
Filesystem     Inodes  IUsed IFree IUse% Mounted on
/dev/loop0     206864 206711   153  100% /root/test-backup
root@jin:~# df -k test-backup/
Filesystem     1K-blocks   Used Available Use% Mounted on
/dev/loop0        140328 138860      1468  99% /root/test-backup
Much better. It’s a little surprising that it’s big enough, given that I made
no allowance for overheads or xattrs, but the root filesystem probably doesn’t
have any xattrs (mine are in
/home on a separate mount) and the directories
seem to take up less space in the backup than on the original (perhaps deleted
entries don’t get reclaimed, so the original has holes that the clone doesn’t?)
which compensates for overheads and leaves a little room to spare.
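Putting the pieces together, the nightly routine might look something like the function below. This is only a sketch under my assumptions: the paths, the 1 GiB sparse size and the bup branch name are illustrative rather than my exact script, df --output needs reasonably recent GNU coreutils, and it must run as root.

```shell
# A sketch of the whole metadata snapshot, assembled from the steps above.
# Paths, image size and branch name are illustrative; run as root.
meta_backup() {
  local img=/home/meta.ext2 mnt=/root/test-backup inodes

  inodes=$(df --output=iused / | tail -n 1)   # inodes in use on the source
  rm -f "$img"
  dd if=/dev/null of="$img" bs=1G seek=1      # fresh sparse 1 GiB image
  mke2fs -F -N "$inodes" -m 0 "$img"          # -F: it's a file, not a device
  mkdir -p "$mnt"
  mount -o loop "$img" "$mnt"
  cp --archive --attributes-only --one-file-system /. "$mnt"
  umount "$mnt"
  bup split -n metadata -v "$img"             # the zero blocks dedup away
}
```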
I initially chose ext2 over SquashFS because SquashFS has no attributes-only mode and would therefore be inode-hungry. Of course, once there’s a populated image with the right number of inodes and data blocks, it’s a tiny extra step to put it in a SquashFS.
Space efficiency comparison
How big are the filesystem images and how well do they deduplicate?
| Format | sparse loopback ext2 with 10% more than estimated required blocks | metastore | squashfs (with default options) |
|---|---|---|---|
| Apparent size (MiB) | 573.74 | 91.05 | 13.56 |
| On-disk size (MiB) | 524.45 | 91.05 | 13.56 |
| Gzipped size (MiB) | 13.48 | 19.23 | 13.44 |
| Size when bup-saved, uncompressed, into an empty per-format repository (MiB) | 19.07 | 24.11 | 13.64 |
| After three days (three bup-saves) as above (MiB) | 58.73 (19.62/day) | 42.60 (9.09/day) | 41.10 (13.65/day) |
My choice: loopback ext2
I’m not too fussed about squeezing every last byte out of the metadata storage. I am very fussed about using only reliable, stable code when it comes to backups, and minimising the number of extra scripts, tools and dependencies. (I make an exception for bup because without it I couldn’t afford nearly as many snapshots for the same disk space/backup time.) A loopback ext2 filesystem therefore seems right to me (even before the space efficiency comparison results are in). Perhaps one day someone (me?) will look at porting Backup Bouncer to Linux so the accuracy of these solutions can be objectively compared.
Notes and afterthoughts
Perhaps SquashFS would deduplicate better with inode table compression disabled, in keeping with the tar-before-gzip approach. I haven’t tried this.
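For anyone who does want to try it, mksquashfs takes -noI to disable inode-table compression. The source directory and output path below are example values:

```shell
# Untried: build the SquashFS with inode-table compression disabled (-noI),
# in the tar-before-gzip spirit, so bup sees more repeated bytes.
# Source directory and output path are example values.
squash_meta() {
  mksquashfs "$1" "$2" -noI -noappend
}

# e.g. squash_meta /root/test-backup /home/meta.squashfs
```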
Perhaps it’s worth mentioning that
cp --attributes-only doesn’t delete
existing attributes, so it can only be used accurately on a tree freshly restored
from bup. I’ve not tested whether it copies ext2-only flags like the immutable
flag. It doesn’t detect hard links.
It’s also worth pointing out that — once stable and reliable — bup’s own metadata format will probably use optimisations tailored specifically for bup’s deduplication, so I’ll prefer it when it gets there.