Back up your Mac!
Sunday, September 24th, 2006
Last week the Seagate 120 GB drive in my MacBook failed. This is a drive upgrade I performed, so I have to send the disk back to Seagate for replacement. (Honestly, I think I was just unlucky here. The MacBook came with a 60 GB Seagate from the same disk series, so there is nothing magic about it. Poisson statistics just suck sometimes.)
For the time being, I have reverted to the original 60 GB and started reconstructing my files from my crappy backups. In the process, I have learned two obvious lessons about backups:
- Thou shalt back up your entire disk.
- Thou shalt back up daily.
Failure to obey Lesson #1 now means that I have to reinstall all of my software and get everything set up like I had before. This has taken a day, and I’ve probably missed some things. Failure to obey Lesson #2 has cost me a couple dozen photos, a week of work, a bunch of notes, and some music I had bought.
This traumatic experience has forced me to evaluate why I was not doing backups more frequently. The reasons were:
- I didn’t have a simple system.
- It took too long.
To fix this, I went looking again for a new backup utility. On my Linux systems, I usually use rsync, which is pretty easy to operate, works through the network, and is smart about how much data it sends. OS X is a different beast, however. On Linux, for the most part, you only need to worry about the file contents, the UID/GID, file permissions, and hard/soft links. OS X has to cope with all the resource forks, HFS+ pixie dust, Spotlight metadata, and ACLs, plus the standard UNIX metadata. Check out this article on cloning in OS X for a great review of the various issues. Frankly, it’s a mess. Apple’s got some feature creep going on with their filesystem, and nearly everyone gets the metadata at least partially wrong.
With that in the back of my mind, I’ve been worried about whether UNIX tools like cp and rsync were sufficient for a system backup. For my personal files, it doesn’t matter. I don’t make much use of the extra info. But, Lesson #1 says that your backup must be good enough to restore the entire OS, not just some of your files. So in my early Mac days, I went looking for a more Mac-specific backup tool.
I stumbled on Carbon Copy Cloner, which had lots of advantages. It was free, it backed up the entire disk, and it even made the target bootable. PowerPC Macs have had the ability to boot from an external FireWire hard drive for a while (hold the Option key when you power on), and the Intel Macs can also boot from USB 2.0 disks (though I have not tested that). Being able to boot the external disk provides a great disaster recovery system, especially if your laptop is vital to your work. If your internal disk dies, switch to the external disk while you replace the internal disk. Then just clone the external disk back over once the internal disk has been replaced. We have a simple and symmetric solution that a physicist can love. With the addition of psync, an rsync-like Perl program that claims to copy more of the Mac metadata, CCC can also be smart and only copy files which have changed. This is very important, especially for network backups where bandwidth and network overhead are the speed limitation.
CCC has some problems, though. First, the interface is pretty clunky. It becomes unresponsive while copying, and the progress indicator isn’t much help. I usually have to Force Quit the program if I need to abort a backup while it is running. Also, psync has the same annoying “feature” as rsync: it has to walk the ENTIRE directory tree on both source and target before copying anything. I have never liked this because it wastes time when the I/O link to the backup volume could be used for moving data, and the memory requirement scales with the number of files. For example, with really large rsync copies, it might spend 20-30 minutes just building up a list of files and not really using the network for anything. CCC mitigates this problem by synchronizing each directory in the root directory separately (first Applications, then Developer, and so on), but the long setup phase still annoys the user. And, if there is one thing I have learned about backups, it is that you cannot ignore user psychology even if it is irrational. You want backups to be easy and to start copying files as quickly as possible. Activity makes the process feel shorter, regardless of what the clock says.
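To make the "walk everything first" complaint concrete, here is a minimal Python sketch of the alternative: copy changed files directory by directory as the tree is walked, so data starts moving right away and no full file list is held in memory. This is not how rsync, psync, CCC, or SuperDuper! actually work internally. The names `needs_copy` and `streaming_sync` are made up for illustration, and the size/mtime comparison is an assumed stand-in for a real change check. It also ignores everything that makes OS X hard (resource forks, ACLs, other HFS+ metadata), does not delete files that vanished from the source, and follows symlinks rather than preserving them.

```python
import os
import shutil


def needs_copy(src, dst):
    """Return True if dst is missing, or differs from src in size or is older."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or int(s.st_mtime) > int(d.st_mtime)


def streaming_sync(src_root, dst_root):
    """Copy changed files one directory at a time, yielding each path copied.

    os.walk is a generator, so each directory is processed as soon as it is
    listed instead of after the whole tree has been scanned.
    """
    for dirpath, dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            if needs_copy(src, dst):
                # copy2 copies file data plus mtime and permission bits only
                shutil.copy2(src, dst)
                yield os.path.relpath(src, src_root)
```

Looping over `streaming_sync(source, target)` and printing each yielded path gives the kind of immediate feedback described above: the first file name shows up as soon as the first changed file has been copied.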
This, along with learning that psync does not in fact copy all the HFS file attributes, finally led me to look for another backup solution. After some random Google searching, I settled on SuperDuper! (yes, the name has an exclamation point in it). This tool has a stupid-easy interface, provides great feedback on the progress of the backup, and starts copying files almost immediately. Like rsync and psync, it can also copy only what has changed. I can back up 40 GB of data in 2 hrs from scratch, and update the backup in 8 minutes. According to this guy’s review of GUI backup utilities, SuperDuper! was the only tool that got everything right in his tests. You can also read the article comments for suggestions of tools that he did not try.
SuperDuper! handles network backups by creating a sparse DMG disk image file on the network folder. The DMG grows toward the size of your source volume as you fill it up. Thus, you can back up to any network file system OS X knows how to mount, like AFP, NFS, SMB, and maybe even WebDAV. The only requirement is that the network filesystem and the underlying filesystem on the server both support files as big as your hard disk. This rules out FAT32, which caps files at 4 GB, but is not a problem for ext3, reiser, NTFS, HFS+, etc.
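The file-size requirement is easy to check with a little arithmetic. The sketch below assumes the whole backup ends up as one image file and compares its size against the FAT32 per-file limit of 4 GiB minus one byte. The helper name `image_fits` is made up for illustration, and the sizes are just examples.

```python
# FAT32 caps a single file at 4 GiB minus one byte
FAT32_MAX_FILE_BYTES = 2**32 - 1

GB = 10**9  # decimal gigabyte, as disk vendors count


def image_fits(image_bytes, max_file_bytes=FAT32_MAX_FILE_BYTES):
    """Return True if a single image file of this size fits under the limit."""
    return image_bytes <= max_file_bytes


print(image_fits(40 * GB))  # False: a 40 GB image is far over the FAT32 limit
print(image_fits(3 * GB))   # True: a 3 GB image would still fit
```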
There are some disadvantages to SuperDuper!, of course:
- Not free: It costs $28 for the full version, but the free, unregistered version still lets you do full disk copies, just not the smart update and some other features. So even for free, it’s quite usable to clone disks.
- The copy engine cannot be used from the command line, though this is not a huge disaster.
- Because network backups write to a DMG, there is a lot more traffic than with rsync. It’s not a problem over a fast LAN, but it is rather slow on my WiFi network (admittedly, I have a lot of interference from my neighbors) and probably won’t be fast enough to use over a cable modem to a remote server.
I’ll keep rsync in my box o’ tools still, but SuperDuper! is now my preferred solution for backing up my laptop.
