31 May 2007

Why Use RCS for Backups?

Indirect response to Subversion-for-backup critics

A little while ago I wrote a series on using Subversion along with some shell scripts and other fun things to provide versioned backups of files. Subversion and other Revision Control Systems have some key features that make them very well suited for this task. After all, keeping versions of files is at the core of what RCS do.

RCS aren't always the best choice for this type of thing. Dedicated, versioned backup systems are much better at this core task. They are more specialized, carry less cruft, and have less chance of breaking. All that said, sometimes doing things the unnecessarily hard way has huge benefits. What are they?

Tagging


A common feature RCS offer that one won't find in backup offerings is tagging. Tagging allows you to mark a copy of your data, or a subset thereof, in some meaningful way. This is a bit different from just marking a copy of your backup from some specific point in time. Tagging allows you to be selective about what gets marked, and marked items can span different time periods.

As an example, you can mark a selection of family photos in your backups as “Pictures for Grandma”. Exporting images with that tag will allow you to quickly put together a set, and store that set with very little overhead. When your aunt sees grandma's new photo-filled DVD and requests a copy, it's a simple matter to re-export that set and send it off. Photo album software can do this, but what if you want to include video clips as well? How about the poetry your kid wrote in school? Tagging the files lets you mix and match data of any type.
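To make the “Pictures for Grandma” idea concrete, here is a minimal sketch that builds the Subversion commands for tagging a hand-picked set of files and exporting them later. The repository URL, the trunk/tags layout, and the function names are my own assumptions for illustration, and copying several sources in one `svn copy` needs Subversion 1.5 or newer.

```python
from typing import List

# Hypothetical repository location and layout (trunk/ and tags/); adjust to taste.
REPO_URL = "file:///backups/repo"


def tag_commands(tag: str, paths: List[str], repo: str = REPO_URL) -> List[List[str]]:
    """Build svn commands that copy the chosen files into tags/<tag>."""
    tag_url = f"{repo}/tags/{tag}"
    sources = [f"{repo}/trunk/{p}" for p in paths]
    return [
        # Create the tag directory, and tags/ itself if it is missing.
        ["svn", "mkdir", "--parents", tag_url, "-m", f"Create tag {tag}"],
        # Cheap repository-side copy of every selected file into the tag.
        ["svn", "copy", *sources, tag_url, "-m", f"Tag files as {tag}"],
    ]


def export_command(tag: str, dest: str, repo: str = REPO_URL) -> List[str]:
    """Build the svn command that exports a tag as plain, unversioned files."""
    return ["svn", "export", f"{repo}/tags/{tag}", dest]
```

Each returned list can be handed to `subprocess.run(cmd, check=True)`. Because the selected files all land directly in the tag directory, two files with the same name in different folders would collide, so a real script would want to check for that first.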

Branching


Branching allows you to maintain multiple, parallel data sets. This feature lets you back up the originals of your files and work on copies separately, while maintaining backups of both. This is less useful in a backup scenario than tagging because you can always roll back files to specific versions with a regular backup setup. The real advantage comes into play when you want to have easy access to both the original and modified data sets at the same time.

One of the more obvious applications is photo editing. Photo sets may be thumbed through, with a limited number of shots making it through to the next stage, be it another elimination, basic manipulation, cataloging, or other actions. With branching, you can back up both the raw, original set and each intermediate step without wasting storage space on disk or in backups, because a Subversion branch records only what differs from its source. If you have five rounds of eliminations or selections, and want to preserve each round, you don't have to store your final selections five times.
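As a rough sketch of the elimination rounds described above, each round can start as a branch copied from the previous one. The `round-N` naming, the repository URL, and the function name are assumptions of mine, not anything Subversion prescribes.

```python
from typing import List

# Hypothetical repository location; branches live under branches/round-N.
REPO_URL = "file:///backups/repo"


def next_round_command(round_number: int, repo: str = REPO_URL) -> List[str]:
    """Build the svn command that starts a selection round as a new branch."""
    if round_number < 1:
        raise ValueError("round_number must be 1 or greater")
    # Round 1 branches from the untouched originals; later rounds
    # branch from whatever survived the previous round.
    if round_number == 1:
        source = f"{repo}/trunk"
    else:
        source = f"{repo}/branches/round-{round_number - 1}"
    target = f"{repo}/branches/round-{round_number}"
    # --parents creates branches/ if it does not exist yet (svn 1.5+).
    return ["svn", "copy", "--parents", source, target,
            "-m", f"Start selection round {round_number}"]
```

After running the command for a round, you would check out that branch, `svn delete` the rejected shots, and commit. The originals stay untouched in trunk, and every earlier round remains available at its own URL.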

Multiple Working Copies


One of the biggest benefits of using an RCS for backups is the ability to have multiple working copies. You can mirror your data on multiple systems, editing what you want, where you want. Changes at all locations are merged back into your backup data, and can be propagated back out to all working copies. Simply put, you can have the same data anywhere, with changes at all locations backed up in an efficient manner.

In the very common case of having a work and a home computer, or having one desktop machine and one laptop, the convenience of multiple working copies becomes apparent immediately. So long as you keep up with minor housekeeping, the work you do on one computer is available on another. Should one machine be unavailable for any reason, you can pick up where you left off elsewhere.
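The “minor housekeeping” boils down to a short routine run on each machine. The sketch below builds those commands for one working copy; the function name and the order of the steps are my own choices, and it deliberately ignores files you deleted by hand and any merge conflicts.

```python
from typing import List


def sync_commands(working_copy: str, message: str) -> List[List[str]]:
    """Build the svn commands for one housekeeping pass on a working copy."""
    return [
        # Schedule any new, unversioned files for addition.
        ["svn", "add", "--force", working_copy],
        # Pull in changes committed from the other machines.
        ["svn", "update", working_copy],
        # Send this machine's changes back to the repository.
        ["svn", "commit", working_copy, "-m", message],
    ]
```

Running each list through `subprocess.run(cmd, check=True)` stops at the first failure, which is what you want if an update leaves conflicts behind that need resolving before the commit.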

Portable Backup Data


Many snapshot-preserving backup methods use file system-specific tricks or features, making it difficult if not impossible to move backup data to a new system or volume without losing key information. A repository from an RCS like Subversion can be quite happily picked up and moved to a different location, regardless of disk format or operating system. The ability to store this data easily on portable, rewritable media like a USB drive is a very convenient feature as well, since you can have all of your versions readily accessible wherever you go.
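One way to move a repository between systems is Subversion's own dump format, which is independent of the file system it was created on. This is a minimal sketch using `svnadmin dump` and `svnadmin load`; the function names and the injectable `run` parameter (there only to make the code easy to try without a real repository) are my assumptions.

```python
import subprocess


def dump_repository(repo_path: str, dump_file: str, run=subprocess.run) -> None:
    """Write the full history of a repository to a portable dump file."""
    with open(dump_file, "wb") as out:
        # svnadmin dump writes the dump stream to standard output.
        run(["svnadmin", "dump", "--quiet", repo_path], stdout=out, check=True)


def load_repository(dump_file: str, new_repo_path: str, run=subprocess.run) -> None:
    """Create a fresh repository and replay a dump file into it."""
    run(["svnadmin", "create", new_repo_path], check=True)
    with open(dump_file, "rb") as src:
        # svnadmin load reads the dump stream from standard input.
        run(["svnadmin", "load", "--quiet", new_repo_path], stdin=src, check=True)
```

Calling `dump_repository` on the old machine and `load_repository` on the new one carries every revision across, including tags and branches.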

Hopefully I have shed some light on the craze regarding using revision control systems for backing up data. In most situations, it's more trouble than it's worth. But for those of us with special needs, nothing else will do.
