05 May 2007

Silhouette Clone: Part 1

Impromptu

A feature of a server operating system I will not name enables administrators to create snapshots of network shares automagically. This goes above and beyond storing data on a server to facilitate regular backups. Users who clobber each other's work can view the share as it was previously and recover missing or modified documents themselves. If this sounds familiar, it should. It's simply automated version control.

Subversion is a great tool for storing versioned copies of filesystems. Combined with Apache HTTPD and a few modules, it can serve versioned, writable repositories over the network, but that setup is not ideal for many situations. To prevent data loss, many applications do a three-step shuffle when saving files. That would be fine, except that the shuffle doesn't always happen efficiently over WebDAV, the protocol used to share Subversion repositories in this manner.
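The post doesn't spell out the three steps, so the following is only a sketch of one common form of the save shuffle: write a temporary file, move the old file aside, then rename the temporary file into place. The function name safe_save and the backup suffix are hypothetical, chosen for illustration.

```shell
# safe_save FILE: replace FILE with the data read from standard input
# using a write, back up, rename sequence (hypothetical helper).
safe_save() {
    target=$1
    tmp="$target.tmp.$$"
    # Step 1: write the new contents to a temporary file next to the target.
    cat > "$tmp" || return 1
    # Step 2: move the existing file aside as a backup, if there is one.
    if [ -e "$target" ]; then
        mv "$target" "$target~" || return 1
    fi
    # Step 3: rename the temporary file into place.
    mv "$tmp" "$target"
}
```

Something like `echo "new text" | safe_save report.txt` exercises all three steps. On a local disk this is cheap; on a network share, each step is a separate operation against the server.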

A decent compromise is to share data over the network as one normally would, and automatically commit changes to a Subversion repository at regular intervals.

Setting up a Subversion repository is quite simple.

svnadmin create /path/to/repository


After that, simply check out your (empty) repository.

svn checkout /path/to/repository /path/to/working/copy


Then share the working copy via the method of your choice. You may want to restrict access to the .svn directory so that users cannot accidentally delete or damage Subversion's metadata.

A simple shell script can be run from cron at regular intervals to add new documents to the repository. Documents that are already versioned and have been modified need no extra step; the commit command picks them up on its own.

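#!/bin/bash
# Add unversioned files (status "?"). NUL-delimited so names with spaces survive;
# -r (GNU xargs) skips svn add when nothing is new.
svn status /path/to/working/copy | sed -n 's/^[?] *//p' | tr '\n' '\0' | xargs -0 -r svn add
# Commit new and modified files as one snapshot.
svn commit -m "Automatic snapshot" /path/to/working/copy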
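For the cron side, here is a minimal sketch of a crontab entry. The script location and the 15-minute interval are assumptions; pick whatever path and frequency suit your share.

```shell
# Hypothetical location of the snapshot script; adjust to wherever you saved it.
SNAPSHOT_SCRIPT=/usr/local/bin/svn-snapshot.sh

# Crontab entry that runs the script every 15 minutes (interval is an assumption).
CRON_LINE="*/15 * * * * $SNAPSHOT_SCRIPT"
echo "$CRON_LINE"

# To install it, append the line to the current user's crontab:
#   (crontab -l 2>/dev/null; echo "$CRON_LINE") | crontab -
```

The install command is left commented out so that running the snippet only prints the entry. The script must be executable and run as a user who can write to both the working copy and the repository.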


Once your cron job starts firing, you will have point-in-time snapshots of your share. In future posts, we'll deal with deletes and simple remote access to the versioned data.
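Until then, someone with shell access to the working copy can already pull an older version of a file that still exists. The sketch below uses the stock svn log and svn cat commands; the function names and the WC variable are hypothetical.

```shell
# Hypothetical working-copy path, matching the placeholder used above.
WC=/path/to/working/copy

# list_snapshots FILE: show the revisions in which FILE changed.
list_snapshots() {
    svn log --quiet "$WC/$1"
}

# restore_snapshot REV FILE DEST: write FILE as of revision REV to DEST
# without touching the shared copy.
restore_snapshot() {
    svn cat -r "$1" "$WC/$2" > "$3"
}
```

For example, `restore_snapshot 12 report.txt /tmp/report.old.txt` writes the file as it stood at revision 12. The -r option also accepts a date in braces, such as "{2007-05-05}", which suits point-in-time snapshots.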

2 comments:

Christopher Ingram said...

Don't want to use Subversion? Perhaps you want a solution designed from the ground up for this sort of thing? Check out Dave Harding's version of this solution, which uses rdiff-backup. Note that Subversion lets you do branching and tagging efficiently, and rdiff-backup doesn't. Chances are you won't need these features, but some of us do.

Christopher Ingram said...

Part 2