Offsite backups using rsync

 · Systeemkabouter

Linux/Unix has a lot of good software. Some of the tools really stand out for being so very useful. rsync is one of them IMHO.

rsync lets you synchronise two directories by exchanging only the differences between files and directories. It works locally from disk to disk, or remotely using its own daemon or an alternate transport such as ssh.
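
To make that concrete, here is a minimal sketch using two throwaway directories. The scratch directories, the `demo/` target path and the daemon module name `backup` are made up for illustration; the host and user names are the placeholders from the script further down.

```shell
#!/bin/bash
# Scratch directories created only for this demonstration.
SRC=$(mktemp -d)
DST=$(mktemp -d)
echo "first version" > "$SRC/a.txt"
echo "unchanged" > "$SRC/b.txt"

# First run copies everything. The trailing slash on "$SRC/" means
# "the contents of SRC" rather than the directory itself.
rsync -a "$SRC/" "$DST/"

# Change one file, then do a dry run (-n) that only lists what would be sent.
echo "second version" > "$SRC/a.txt"
CHANGED=$(rsync -a -n --out-format='%n' "$SRC/" "$DST/")
echo "would transfer: $CHANGED"

# Real run: only the changed file is copied again.
rsync -a "$SRC/" "$DST/"

# The same command works remotely (not executed here):
#   over ssh:      rsync -a -e ssh "$SRC/" mybackupuser@offsite.server.somewhere:demo/
#   rsync daemon:  rsync -a "$SRC/" offsite.server.somewhere::backup/demo/
```

A dry run like this is also a cheap way to check what an option such as --delete would do before letting it loose on real data.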

The net result is that I am able to keep a current backup of all my websites (15 GB or so), mail (2 GB), files (?? GB) and config files (small but complicated) while using an 8100/1024 ADSL connection. So even if my apartment were destroyed by fire or some other disaster, I would still have a pretty good backup at a remote location. I don't think a lot of people have the luxury of a fully automated backup at home.

For the purpose of backups, I got myself a Sun Ultra 5, installed a 160 GB disk, installed Debian GNU/Linux, installed rsync and ssh and I was ready to go.

Sure, the initial copy of the data took forever (3 weeks or so, with some interruptions), and I should have made it while the system was still connected to my LAN ;). But since then I've been running rsync jobs multiple times a day without major problems.
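
A rough back-of-the-envelope estimate shows why the first copy hurts. This sketch assumes the 1024 in "8100/1024" is the upstream rate in kbit/s, uses decimal units, and ignores protocol overhead, compression and interruptions. The function name `estimate_hours` is made up for this example.

```shell
#!/bin/bash
# estimate_hours SIZE_GB RATE_KBIT
# Prints the whole hours needed to move SIZE_GB gigabytes at RATE_KBIT kbit/s.
# Assumes 1 GB = 10^9 bytes and 1 kbit = 1000 bits.
estimate_hours() {
    local size_gb=$1 rate_kbit=$2
    # GB -> bits is * 8 * 10^9; dividing by (rate * 10^3) leaves 8 * 10^6 / rate.
    local seconds=$(( size_gb * 8 * 1000000 / rate_kbit ))
    echo $(( seconds / 3600 ))
}

# 15 GB of websites plus 2 GB of mail over a 1024 kbit/s upstream.
estimate_hours 17 1024   # prints 36
```

So the websites and mail alone need about a day and a half at full line speed, before counting the files of unknown size. If a long transfer is interrupted, rerunning rsync skips the files that already arrived, and its --partial option keeps partially transferred files around for the next attempt.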

The big hurdle may be finding a cheap place to store your data remotely. But maybe you can find a co-worker or a friend who is willing to host your off-site backup if you return the favor. Get some old computers, add a couple of cheap IDE drives and off you go.

For your enjoyment, I will paste part of the script I'm running on all my machines here. I actually wrote it today to replace a bunch of separate scripts on different hosts:

#!/bin/bash
#
# By SyncAddict
# January 1st, 2006
#
# This script tries to be the one
# script used for all backups / rsyncs
# in my home.
#
# This program is free software. You can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation; either version 2 of the License.       #
# More information about GNU can be obtained at http://www.gnu.org/    #
# The latest version of the source code should be available at         #
# http://prutsclub.nl/.                            #


BCK_SVR=offsite.server.somewhere
BCK_USERID=mybackupuser
DATE=`date +%Y_%m_%d_%H`
BASE_DIR=/home/backup
HOSTNAME=`hostname -s`

umask 077
cd /

if [[ ! -e $BASE_DIR/$DATE ]];
then
mkdir -p $BASE_DIR/$DATE
fi

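## just for my convenience when manually
## cd-ing in to restore something.
## Remove a stale 'latest' symlink if there is one, then
## always (re)create it, so it also exists after the first run.
if [[ -h $BASE_DIR/latest ]];
then
rm $BASE_DIR/latest
fi
ln -s $BASE_DIR/$DATE $BASE_DIR/latest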


# if rsync is not installed, apt-get it
# (-y so an unattended run does not hang on a confirmation prompt)
if [[ ! -x /usr/bin/rsync ]];
then
apt-get install -y rsync
fi

### Files in backup regardless of the host

### save a list of packages installed on the system
### this will allow for easier recovery
### packages can be installed on a new (debian based) system
### using the 'dpkg --set-selections < packages_file' command
### and then running 'apt-get dselect-upgrade'
if [[ -x /usr/bin/dpkg ]];
then
/usr/bin/dpkg --get-selections | gzip > $BASE_DIR/$DATE/$HOSTNAME.dpkg_packagelist.gz
fi

tar -czf $BASE_DIR/$DATE/$HOSTNAME.basefiles.tgz \
etc/passwd etc/shadow etc/group etc/motd \
etc/resolv.conf etc/nsswitch.conf etc/hostname \
etc/network/interfaces \
etc/fstab etc/hosts etc/nagios/nrpe.cfg \
var/spool/cron/crontabs etc/crontab \
etc/ssh/ etc/profile


if [[ -e /etc/network/iptables ]];
then
tar -czf $BASE_DIR/$DATE/$HOSTNAME.firewall.tgz etc/init.d/iptables \
etc/network/iptables
fi

if [[ -d /etc/openvpn ]];
then
tar -czf $BASE_DIR/$DATE/$HOSTNAME.openvpn.tgz etc/openvpn/
fi

if [[ -d /etc/postfix ]];
then
tar -czf $BASE_DIR/$DATE/$HOSTNAME.postfix.tgz etc/postfix/ etc/aliases
fi

if [[ -d /etc/apache2 ]];
then
tar -czf $BASE_DIR/$DATE/$HOSTNAME.apache.tgz etc/apache2
fi

if [[ -d /etc/php4 ]];
then
tar -czf $BASE_DIR/$DATE/$HOSTNAME.php4.tgz etc/php4
fi

# Debian keeps the BIND configuration in /etc/bind
if [[ -d /etc/bind ]];
then
tar -czf $BASE_DIR/$DATE/$HOSTNAME.bind.tgz etc/bind etc/default/bind9
fi

### end of backups regardless of host


if [ $HOSTNAME = "myhost1" ];
then
CONFIGURED=1
tar -czf $BASE_DIR/$DATE/$HOSTNAME.dhcp.tgz etc/dhcp3/ \
etc/default/dhcp3-server var/lib/dhcp3/

tar -czf $BASE_DIR/$DATE/$HOSTNAME.nagios.tgz etc/nagios/
fi

if [ $HOSTNAME = "myotherhost" ];
then
CONFIGURED=1
tar -czf $BASE_DIR/$DATE/$HOSTNAME.samba.tgz etc/samba/ etc/default/samba
fi

if [ $HOSTNAME = "xendomain3" ];
then
CONFIGURED=1
tar -czf $BASE_DIR/$DATE/$HOSTNAME.mailserver.tgz etc/courier etc/amavis \
etc/spamassassin etc/clamav
RSYNC_DATA="/home/vmail"
fi

if [ $HOSTNAME = "mywebserver" ];
then
CONFIGURED=1
tar -czf $BASE_DIR/$DATE/$HOSTNAME.other.tgz etc/proftpd.conf \
etc/imapproxy.conf etc/ftpusers  etc/webalizer

RSYNC_DATA="/home/websites /var/log/apache2"
fi


# mkdir -p is a no-op when the directory already exists, and unlike
# [[ ]] it does not depend on the remote login shell being bash
ssh $BCK_USERID@$BCK_SVR "mkdir -p hosts/$HOSTNAME/settings"

/usr/bin/rsync -e "/usr/bin/ssh -l $BCK_USERID" -tvruz --delete \
$BASE_DIR/$DATE/ $BCK_SVR:hosts/$HOSTNAME/settings/ >> /var/log/sync-thishost.log

if [ ! $CONFIGURED ];
then
echo "Script not configured for host $HOSTNAME!"
exit 1
fi

if [[ $RSYNC_DATA ]];
then
ssh $BCK_USERID@$BCK_SVR "mkdir -p hosts/$HOSTNAME/data"

/usr/bin/rsync -e "/usr/bin/ssh -l $BCK_USERID" -tvruz --delete \
$RSYNC_DATA $BCK_SVR:hosts/$HOSTNAME/data/ >> /var/log/sync-thishost.log
fi

exit 0;
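
As the list of hosts grows, the chain of if blocks could be folded into a single case statement. This is only a sketch of that idea, not what the script does today: the function name `data_dirs_for_host` is made up, and the host names and paths are the placeholders from the script.

```shell
#!/bin/bash
# data_dirs_for_host NAME
# Prints the directories to rsync for a host (possibly nothing);
# returns 1 when the host is not configured at all.
data_dirs_for_host() {
    case "$1" in
        myhost1|myotherhost) echo "" ;;   # settings only, no bulk data
        xendomain3)          echo "/home/vmail" ;;
        mywebserver)         echo "/home/websites /var/log/apache2" ;;
        *)                   return 1 ;;
    esac
}

# The exit status of the assignment is that of the command substitution,
# so an unknown host ends up in the else branch.
if RSYNC_DATA=$(data_dirs_for_host "$(hostname -s)"); then
    echo "data directories: ${RSYNC_DATA:-none}"
else
    echo "Script not configured for host $(hostname -s)!" >&2
fi
```

The return value takes over the job of the CONFIGURED flag, so a host cannot be added to the list without also stating what data it syncs.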