Windows Backup Solution
I have found a lot of backup solutions out there on the web. Some of them seem great, but they are locked into some format. One such solution came with the YellowMachine that I purchased for work: Dantz Retrospect. But before I get into what that particular software package was lacking, let me talk about the YellowMachine. It is a quaint little bright yellow box with four 250 GB IDE hard drives (RAID 5) and an ARM processor, and with the YellowMachine image it runs Debian. This is all well and good, but as a backup server I would feel much more comfortable with OpenBSD. But hey, you all saw that coming. After a significant amount of time trying to figure out how to exchange the Debian image for my own image (be it Debian or OpenBSD), I decided instead to uninstall all of the unneeded services and apt-get my way to being fully updated.
Lesson learned: build your own from scratch.
The YellowMachine is a nice solution, but at the end of the day the company doesn't seem to exist anymore, except in the form of a user group. So on to why I didn't want to use the Dantz software… the same reason. Why should I trust a proprietary backup scheme when, 5 years from now when I need my data, nothing may be left that can read it? I needed a long-term solution.
By the way, I have 100+ GB of data.
The first long-term solution I looked into was RAR. I could easily break the archive into network-manageable sizes. The problem was apparent right away: any change in the directory structure or any added file means re-RAR'ing the entire backed-up directory tree, which is a task of a few days.
Long story short: Cygwin + ssh + rsync. I started off by installing Cygwin. This is a fairly straightforward install. Make sure you install ssh and rsync; otherwise just the base system is enough. There is only one gotcha. In my work environment Cygwin made the home directory m:/, which is a network drive that I do not have write permission on. A quick fix was to create a local home directory, then edit /etc/profile and add the HOME lines below:
$ mkdir -p /cygdrive/c/cygwin/home/jsidabra
$ vi /etc/profile
HOME=/cygdrive/c/cygwin/home/jsidabra
export HOME
Now when you log in to Cygwin again you should be in your new home. This is needed for the next step: passwordless login. If you follow my advice above and build your own machine, this should be straightforward following these directions.
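One common way to set up key-based login is sketched below; it may differ from the linked directions. The key path, the empty passphrase, and the function names are assumptions made for illustration, and the host name is simply the one used later in this post. An empty passphrase lets a scheduled job run unattended, at the cost that anyone who can read the key file can log in.

```shell
#!/usr/bin/bash
# Sketch only: key-based ssh login from Cygwin to the backup box.
# KEY and REMOTE are assumptions; adjust them to your own setup.
KEY="$HOME/.ssh/id_rsa"
REMOTE="jsidabra@yellowmachine"

# Create a key pair unless one already exists.
# -N "" sets an empty passphrase so a scheduled job can use the key unattended.
make_key() {
    mkdir -p "$(dirname "$KEY")"
    [ -f "$KEY" ] || ssh-keygen -t rsa -N "" -f "$KEY"
}

# Append the public key to authorized_keys on the server.
# This asks for the account password one last time.
install_key() {
    ssh "$REMOTE" 'mkdir -p ~/.ssh && chmod 700 ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys' < "$KEY.pub"
}

# Succeeds only if the login works without any password prompt.
check_login() {
    ssh -o BatchMode=yes -i "$KEY" "$REMOTE" true
}
```

After sourcing the file, running `make_key && install_key && check_login` from the Cygwin shell should finish silently once the key is in place.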
This tutorial assumes you have created a directory on the Linux/BSD box that is writable by your user. The next step is to create an rsync_script.sh file to do all of the work:
#!/usr/bin/bash
C="/cygdrive/c"
FILES="$C/Mail $C/Math $C/Maxwell $C/Misc"
INCLUDE="$C/Misc/includes.rsync"
DIR="/mnt/backup/jsidabra"
SERVER="jsidabra@yellowmachine:$DIR"

rsync -av --delete-during --progress --include-from=$INCLUDE \
$C/Maxwell/ $SERVER/Maxwell_Files/
rsync -av --delete-during --progress $FILES $SERVER/
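The --include-from option names a file of filter rules, used here to pick out the file extensions that you want. In my case I grab all of the files (the second rsync command), but additionally I copy all of my "project files", which are simple text files, into their own directory (the first rsync command). The include file looks like this: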
+ */
+ *cls
+ *mxwl
+ *hfss
- *
This include file keeps the directory structure (*/) and all *cls, *mxwl and *hfss files, then excludes the rest (- *). Order matters, because rsync applies the first rule that matches! Place rsync_script.sh in your $PATH and you should be set.
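To see why order matters, here is a deliberately simplified illustration of first-match-wins filtering. It is not rsync's real matcher: it only looks at plain file names with bash globs, leaves out the directory rule (+ */), and the `RULES` array and `decide` function are hypothetical names invented for this sketch.

```shell
#!/usr/bin/bash
# Simplified illustration of first-match-wins; NOT rsync's actual matcher.
# Rules are checked top to bottom, as in the include file.
RULES=("+ *cls" "+ *mxwl" "+ *hfss" "- *")

# Print "include" or "exclude" for a file name, using the first rule that matches.
decide() {
    local name="$1" rule sign pattern
    for rule in "${RULES[@]}"; do
        sign="${rule%% *}"      # "+" or "-"
        pattern="${rule#* }"    # the glob after the sign
        # An unquoted $pattern on the right of == is treated as a glob by [[ ]]
        if [[ "$name" == $pattern ]]; then
            [ "$sign" = "+" ] && echo include || echo exclude
            return
        fi
    done
    echo include   # a name that no rule matches is transferred
}

decide design.mxwl   # prints: include
decide notes.txt     # prints: exclude
```

If the `- *` rule were moved to the top of the list, every name would match it first and nothing would be included.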
The last trick is to schedule your backup. There is a way to use the Cygwin cron daemon, but I wasn't in the mood to figure it out. So I used the Windows scheduler under Start->Control Panel->Scheduled Tasks and pointed it at a batch file named rsync.bat:
@echo off
C:
chdir C:\cygwin\bin
bash --login -i rsync_script.sh
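A scheduled task runs with nobody watching, so it can help to keep a log of each run. The wrapper below is a minimal sketch of that idea; the log location and the `run_logged` name are assumptions, not part of the original script.

```shell
#!/usr/bin/bash
# Sketch only: record each backup run in a log file.
# LOG is an assumed location; change it to taste.
LOG="${LOG:-$HOME/rsync_backup.log}"

# Run a command, append its output and exit status to $LOG,
# and hand the exit status back to the caller.
run_logged() {
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') start: $*" >> "$LOG"
    "$@" >> "$LOG" 2>&1
    local status=$?
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') exit status: $status" >> "$LOG"
    return "$status"
}
```

Inside rsync_script.sh, each rsync line could then be prefixed with `run_logged`, and a non-zero exit status in the log flags a run worth looking at.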
I have been running this scheme for 2 months now with no known problems. The initial rsync took 2 days for 100+ GB of data, but after that it only takes a few minutes to transfer the changes. With the correct permissions on your Linux/BSD box you have a nice secure way to store your data. I currently have three computers backing up to the YellowMachine this way.
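Because --delete-during removes files on the server that no longer exist locally, it can be worth previewing a run before trusting it. The sketch below uses rsync's --dry-run option, which lists what would be transferred or deleted without changing anything. The `rsync_opts` and `preview_backup` names and the `DRY_RUN` switch are assumptions made for this example.

```shell
#!/usr/bin/bash
# Sketch only: preview the Maxwell sync without touching the server.
C="/cygdrive/c"
SERVER="jsidabra@yellowmachine:/mnt/backup/jsidabra"

# Print the rsync options; setting DRY_RUN=no switches to a real run.
rsync_opts() {
    local opts="-av --delete-during"
    if [ "${DRY_RUN:-yes}" != "no" ]; then
        opts="$opts --dry-run"
    fi
    echo "$opts"
}

# Show what rsync would copy or delete for the Maxwell directory.
preview_backup() {
    # $(rsync_opts) is left unquoted on purpose so it splits into separate options
    rsync $(rsync_opts) "$C/Maxwell/" "$SERVER/Maxwell_Files/"
}
```

Calling `preview_backup` prints the file list only; `DRY_RUN=no preview_backup` performs the actual transfer.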
Some of you may have noticed that the YellowMachine is RAID 5, which only tolerates 1 hard drive failure. Well, don't worry your pretty little heads. Each workstation has RAID 10, which survives 2 hard drive failures as long as the two failed drives are not both halves of the same mirrored pair. I'd calculate the probability, but right now I hate probability.
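For the curious, here is a rough sketch of that calculation under loudly stated assumptions: two-way mirrors, exactly two drives failing at once, and every pair of drives equally likely to be the unlucky one. The drive count is hypothetical, since the post does not say how many drives each workstation has, and `fatal_pair_odds` is a name made up for this example.

```shell
#!/usr/bin/bash
# Sketch only: given exactly two simultaneous failures in a RAID 10 of $1 drives,
# print (fatal pairs)/(all possible pairs) as an unreduced fraction.
fatal_pair_odds() {
    local drives="$1"
    local fatal=$(( drives / 2 ))                  # one fatal pair per two-way mirror
    local total=$(( drives * (drives - 1) / 2 ))   # all ways to pick two drives
    echo "$fatal/$total"
}

fatal_pair_odds 4   # prints 2/6, i.e. 1 in 3
```

Under those assumptions a four-drive array loses data in one out of three double failures, and the odds improve as mirrored pairs are added.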
March 13th, 2008 at 1:22 am
Thanks for this great tutorial. The only problem I can see for me is that you need a Unix box somewhere on the network… Could be a problem.
March 13th, 2008 at 8:33 am
That is true. But most NAS boxes are Linux boxes, and they allow ssh or telnet access. So setting up the other end is pretty straightforward.