Incremental Backups with RSYNC in Windows



I was pointed to the cool rsync-backup script, which uses Cygwin's rsync and NTFS hardlinks to do incremental backups. It is probably the best method I have seen so far. The effect is as follows: whenever you back up your data to the backup location (somewhere on an NTFS drive), the only data actually copied belongs to the files that have changed since the last backup. Nevertheless, a new folder appears in your backup location containing all the files and folders you just backed up. The files that did not change have not been copied; they have been hardlinked to the old files instead.

Now here comes the best part: the contents of a file with more than one hardlink are deleted only after the last link to it has been removed. Thus, you can delete old backups, and the only data that actually disappears is the data that was replaced in later backups.

I modified the script slightly to better suit my needs. First of all, I only back up a single folder. If you want to back up multiple folders to a single location, this is not for you; use the original rsync-backup instead. The download is at the bottom, but let me explain how it works first. You invoke it like this:
cscript rsyncbackup.vbs /src:D:\data /dst:B:\backup
and it will back up D:\data incrementally, in the way described above, to B:\backup. There is a twist: I did not really like the way excludes were handled in rsync-backup, so I use rsync's --exclude-from option instead. In practice, this means you have to create a file D:\data\exclude, which in my case looks something like this:
code/eclipse/.metadata
mailboxes
*.aux
*.toc
*.bbl
*.blg
*.synctex.gz
*.obj
*.o
That's because I do not want to back up the metadata from Eclipse's project directory, the local data stored by Thunderbird (mailboxes), or the temporary files created by LaTeX and gcc. Finally, I made myself a little batch file, rsyncbackup.bat, which sits next to rsyncbackup.vbs in the directory D:\data\backup and looks like this:
@echo off&setlocal
for %%i in ("%~dp0..") do set "folder=%%~fi"
cscript rsyncbackup.vbs /dst:B:\backup /src:"%folder%"
pause
The second line of the batch file takes argument %0, which is the path of the batch file itself (D:\data\backup\rsyncbackup.bat), and keeps only its drive and directory; that is what %~dp0 does. Thus, %~dp0.. is the path D:\data\backup\.., and %%~fi lets the for loop canonicalize it for me, so the environment variable %folder% ends up being the string D:\data.

Why is this great? Because I synchronize my data across several computers, and on each of them I can simply double-click this batch file, as long as my backup location is mapped to B:\backup. You can now download my altered rsync-backup script.
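
As a rough cross-check of the path handling in the batch file, the same computation can be sketched in Python. The helper name `parent_of_script_dir` is made up for illustration, and `ntpath.normpath` is only assumed to stand in for what `%%~fi` does with an absolute path; it is not how cmd itself works internally.

```python
import ntpath  # Windows path rules, usable on any platform


def parent_of_script_dir(script_path: str) -> str:
    """Mimic %~dp0.. followed by %%~fi for an absolute Windows path."""
    # %~dp0 is the drive and directory of the script, with a trailing backslash.
    script_dir = ntpath.dirname(script_path) + "\\"
    # Appending ".." and normalizing collapses "backup\.." into the parent folder.
    return ntpath.normpath(script_dir + "..")


print(parent_of_script_dir(r"D:\data\backup\rsyncbackup.bat"))  # prints D:\data
```

Because the result depends only on where the batch file lives, the same file works unchanged on a machine where the data folder sits on another drive or under another path.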

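Going back to the hardlink behaviour described at the start: the following minimal sketch, which assumes a filesystem that supports hardlinks (such as NTFS), shows why deleting an old backup does not lose unchanged files. It does not call rsync or the rsync-backup script; the snapshot folder names and the helper `demo_hardlink_backup` are hypothetical.

```python
import os
import tempfile


def demo_hardlink_backup() -> dict:
    """Simulate two backup snapshots that share one unchanged file via a hardlink."""
    with tempfile.TemporaryDirectory() as root:
        old = os.path.join(root, "backup-1")  # hypothetical snapshot folders
        new = os.path.join(root, "backup-2")
        os.mkdir(old)
        os.mkdir(new)

        old_file = os.path.join(old, "notes.txt")
        new_file = os.path.join(new, "notes.txt")
        with open(old_file, "w") as f:
            f.write("unchanged since the first backup")

        # The file did not change, so the second snapshot gets a link, not a copy.
        os.link(old_file, new_file)
        links_while_both_exist = os.stat(new_file).st_nlink
        same_file = os.path.samefile(old_file, new_file)

        # Deleting the old snapshot removes one name; the data stays reachable.
        os.remove(old_file)
        with open(new_file) as f:
            content_after_delete = f.read()
        links_after_delete = os.stat(new_file).st_nlink

    return {
        "links_while_both_exist": links_while_both_exist,
        "same_file": same_file,
        "content_after_delete": content_after_delete,
        "links_after_delete": links_after_delete,
    }


if __name__ == "__main__":
    print(demo_hardlink_backup())
```

The link count drops from two to one when the old snapshot's copy is removed, and the contents are still readable through the newer snapshot. The data would only be freed once that last link is deleted as well.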