The Best File Backup Scheme

Executive Summary
The "best" backup? By "best" I mean that the scheme described here offers the most protection for the least amount of work on the part of the person owning the computer. While all backups offer protection from data loss, this scheme makes both local and off-site backups at the same time, keeps multiple copies of files that change often and encrypts the files that were copied. And, if the backups are made in the middle of the night (a very cool trick described below), then the user's involvement is limited to reviewing the activity log to ensure the backup ran without error. Less than 30 seconds a day is all that's needed for this log review.

Every day the files that were changed or created that day are backed up. These files are first combined into a single file, then compressed, encrypted, password protected and date/time stamped. Then they are sent off-site via FTP, leaving the local copy behind. Backups can be scheduled to run in the middle of the night (very cool), at system startup, at system shutdown or they can be run manually at any time. After the backup, the log file is shown to the user for review. When backups are made in the middle of the night, the log file is shown to the user the next time the computer is started.
 
Introduction

I teach a class on Backing Up Your Computer that, despite being scheduled for four hours, never seems to provide enough time. So, as a sort of cheat sheet, here I'll explain what I consider to be the "best" backups. Needless to say, "best" is subjective and dependent on your particular needs, but this scheme has a lot going for it: it makes both local and remote backups at the same time, the backups are encrypted and password protected and, after having been set up, it involves the least possible effort on your part.

This article is for people with a serious need for backing up files on their computers. It's a bit complex to set up and involves commercial software, so the time and expense are not warranted unless you have irreplaceable files on your computer. Someone who works at home would be a good candidate for this backup scheme.

The goal here is to:

  1. Schedule the backups to run automatically at a time that is guaranteed not to conflict with anything else running on the computer
  2. For security, encrypt and password protect the backed up files
  3. To minimize transmission time and storage space, compress the backed up files
  4. Send the backups off-site

This procedure best applies to data files. Things like a multi-gigabyte collection of MP3 files don't lend themselves to off-site storage. Likewise, videos and email can be large enough to require a scheme other than the one described here. For huge amounts of data, local backups are probably best, but on-line off-site backups (like the scheme described here) may still be appropriate for backing up one day's work at a time.

 
Scheduling

You should never back up files while they are in use, so your backups should run when nothing else is going on. The best options are just after turning on the computer, just before turning it off or the middle of the night when you're sleeping. A potential problem with running them at startup is that they may interfere with the normal startup of Windows, and you have to wait for the backup to finish before using the computer. A problem with shutdown time is that the Windows scheduler cannot schedule programs to run just before shutdown (although other software can).

To me, the middle of the night is preferable. The problem here is that it can mean leaving the computer on all the time - a very bad thing (for a number of reasons). But there is an elegant solution - you can configure the computer to wake itself up in the middle of the night, run your backups and then turn itself off. Just the sort of thing the word "cool" was invented for. 

This is actually pretty easy. If you look in the BIOS setup for your computer you may find a Schedule or Startup section. In many computers the BIOS is capable of waking the machine up at a set time. And this is a true wake-up, from a totally off state, not from a Windows hibernating or suspended state. Suppose, for example, you configure the BIOS to wake up the computer at 3 AM. Then Windows should be finished booting by 3:10 AM. 

But what if your computer boots to a logon screen instead of the Windows desktop? A default userid can be configured using the "control userpasswords2" command in Windows XP. It's even easier to do this in Windows 2000.

I haven't tested whether the Windows scheduler will run a scheduled program while Windows is sitting at the login screen.

Now that it's 3:10 AM and Windows is running, we need to schedule the backups. This is easily done with the Windows Scheduler (Control Panel -> Scheduled Tasks). Just schedule the backups for 3:15 AM.

But after the backups run, we want the computer to turn itself off. Assuming the backups take a maximum of 15 minutes, schedule Windows to shut down at 3:45 AM to ensure they have finished. Windows XP includes a "shutdown" command (not in Windows 2000) that can be put into a .BAT file, and the .BAT file can be scheduled for 3:45 AM. Point one above is done.
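The timings in this section are simple arithmetic, and a small Python sketch makes the slack explicit. The function name and the separate 15-minute safety margin are assumptions made for illustration; the 3 AM wake-up, the 3:15 backup and the 15-minute maximum run time come from the example above.

```python
from datetime import datetime, timedelta

def nightly_schedule(wake="03:00", backup_delay_min=15,
                     max_backup_min=15, margin_min=15):
    """Return (backup_time, shutdown_time) as HH:MM strings.

    backup_delay_min: minutes after the BIOS wake-up to start the backup
    max_backup_min:   longest the backup is expected to run
    margin_min:       extra slack before shutdown (an assumed value)
    """
    wake_at = datetime.strptime(wake, "%H:%M")
    backup_at = wake_at + timedelta(minutes=backup_delay_min)
    shutdown_at = backup_at + timedelta(minutes=max_backup_min + margin_min)
    return backup_at.strftime("%H:%M"), shutdown_at.strftime("%H:%M")

print(nightly_schedule())  # prints ('03:15', '03:45')
```

If your backups grow, raise max_backup_min rather than eating into the margin.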

 
Backing Up

As for points 2 through 4, there are probably many backup programs capable of these tasks. The only one I have used is WinZip. Yes, that WinZip. It's now at version 11 and costs $50, but there is a 30 day free trial. Other potential programs, that I have not tried, are Handy Backup, SyncBack, Backup For All, CoreFTP, Power Archiver ($20 as of September 2007) and Urgent Backup. Robo FTP claims to be scriptable, so a computer nerd should be able to make it dance and sing. But at $150, it's pricey. No doubt there are many more, but the devil is in the details. Upcoming are some of those details.

Note: It's interesting that these programs are from three different product categories: Backup, FTP and Zip. The approach described here does not fit neatly into one category.

Obviously, WinZip compresses files. It also combines all the files to be backed up into a single file and can encrypt and password protect the file. In fact, WinZip can do a lot of processing under the control of a "job". This is a common term in backup programs and refers to the description of what to back up, where, when, how, etc.

For selecting files, WinZip has an option to back up only files with the archive flag on. And it can do this for a single folder or multiple folders. And it can include or exclude files based on their names or types. And it can either reset the archive flag or not - up to you. How you deal with the archive flag depends on your other backups. If you have no other backups, then have WinZip turn it off. If you use replication type backups, then turn it off. If your local backups also depend on the archive flag, then WinZip should not change it.
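To make the archive flag less mysterious, here is a rough Python sketch of the selection logic. This is not how WinZip does it internally; the function names are made up for illustration, and the flag is only readable on Windows (elsewhere the sketch treats it as off).

```python
import os
import stat

def archive_flag_set(attributes):
    """True when the Windows archive bit is on in an attribute bitmask."""
    return bool(attributes & stat.FILE_ATTRIBUTE_ARCHIVE)

def changed_files(folder):
    """Yield files under folder whose archive flag is on."""
    for root, _dirs, names in os.walk(folder):
        for name in names:
            path = os.path.join(root, name)
            # st_file_attributes exists only on Windows; use 0 elsewhere
            attrs = getattr(os.stat(path), "st_file_attributes", 0)
            if archive_flag_set(attrs):
                yield path

def should_reset_flag(other_backups_use_flag):
    """Rule of thumb from the text: leave the flag alone only when
    another backup also relies on it."""
    return not other_backups_use_flag
```

The last function is just the three cases above boiled down to one question: does anything else depend on the flag?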

For sending files off-site WinZip supports FTP. As for naming the backup file, it can automatically append the current date. Thus, your backup files might be named:

MyDocuments.MyNovel.TodaysFiles.2006-12-29.zip

Every day, WinZip changes the date at the end of the file name. It can also add the time, if you prefer. 
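For anyone scripting their own backups, the same naming scheme is easy to reproduce. A minimal Python sketch, assuming a year-month-day stamp like the example above; the function name and the ".HHMM" time format are my own choices, not WinZip's.

```python
from datetime import datetime

def backup_name(prefix, when, with_time=False):
    """Append a year-first date (and optionally the time) to a backup name."""
    stamp = when.strftime("%Y-%m-%d")
    if with_time:
        stamp += when.strftime(".%H%M")  # assumed format, for illustration
    return f"{prefix}.{stamp}.zip"

print(backup_name("MyDocuments.MyNovel.TodaysFiles", datetime(2006, 12, 29)))
# prints MyDocuments.MyNovel.TodaysFiles.2006-12-29.zip
```

A side benefit of putting the year first: sorting the files by name also sorts them by date.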

And of course the backup Zip file is created on your computer before being sent off-site with FTP. For convenience, you can have WinZip put all your backups in a single folder, thus making both a local and remote backup at the same time. Don't want local backups? WinZip can delete the backup file after uploading it.
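The zip-locally-then-upload sequence can be sketched in a few lines of Python. This is only an illustration of the flow, not a replacement for WinZip: Python's standard zipfile module cannot create encrypted archives, so the encryption step is missing here. The function names are made up, and storing files by base name only is a simplification (two files with the same name in different folders would collide).

```python
import os
import zipfile
from ftplib import FTP

def make_zip(zip_path, files):
    """Combine and compress files into one Zip archive (no encryption)."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            archive.write(path, arcname=os.path.basename(path))
    return zip_path

def upload(zip_path, host, user, password, keep_local=True):
    """Send the archive off-site over plain FTP.

    With keep_local=False the local copy is deleted after the upload.
    """
    with FTP(host, timeout=60) as ftp:
        ftp.login(user, password)
        with open(zip_path, "rb") as handle:
            ftp.storbinary("STOR " + os.path.basename(zip_path), handle)
    if not keep_local:
        os.remove(zip_path)
```

Leaving keep_local at its default gives you the local and remote copies in one pass.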

WinZip backups can be scheduled using the Windows Scheduler. Each job exists as a ".WJF" file. Simply schedule this file directly; it works just like a .BAT file in this regard. For example, point the scheduler to the .wjf file like so:

c:\somefolder\somesubfolder\myfavoritewinzipjob.wjf

Logging

As for auditing, WinZip can log its activity. In fact, the program has many good logging features. In particular 

Where to keep the logs on your computer? I like to use a single dedicated folder for all the logs from all WinZip jobs; it makes it easier to find them. I also like to keep the local backups created by WinZip. Specifically, I suggest a single folder for all the local backups and a subfolder there for the logs. Something like this: 

E:\winzipbackups
  MyDocuments.MyNovel.TodaysFiles.2006-12-28.zip
  MyDocuments.MyNovel.TodaysFiles.2006-12-29.zip
  MyDocuments.MyNovel.TodaysFiles.2006-12-30.zip
E:\winzipbackups\logs
  MyDocuments.MyNovel.TodaysFiles.2006-12-28.log.txt
  MyDocuments.MyNovel.TodaysFiles.2006-12-29.log.txt
  MyDocuments.MyNovel.TodaysFiles.2006-12-30.log.txt

Ideally, this "official" WinZip backup folder will reside on a different partition from the one with the original files.

But at 8 AM when you turn on your computer how do you know if the middle of the night backups worked? It's easy to forget to review the activity logs. WinZip can email you a copy of the log, but I prefer a simpler scheme. The Windows scheduler can schedule Windows Explorer to run at start-up time and go directly to the folder where the WinZip log files reside. If sorted by date, the most recent logs will always be at the top, and this also lets you review more than one log at a time.

To do this, first schedule Windows Explorer (c:\windows\explorer.exe) to run "When my computer starts". Then, go back and modify the scheduled task and in the "Run" box add the folder name at the end, after a space. For example, it should look something like this:

c:\windows\explorer.exe "c:\winziplogfilefolder" 
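If you would rather have a script tell you which logs are the newest, the "sort by date" step looks roughly like this in Python. It is a sketch only: the function name is made up, and it lists the log files without trying to interpret their contents.

```python
import os

def newest_logs(folder, count=3):
    """Return the names of the most recently modified files, newest first."""
    with os.scandir(folder) as entries:
        files = [entry for entry in entries if entry.is_file()]
    files.sort(key=lambda entry: entry.stat().st_mtime, reverse=True)
    return [entry.name for entry in files[:count]]
```

Calling newest_logs(r"c:\winziplogfilefolder") would return up to three log names, most recent first.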

At this point, we are backing up all new and updated files every day. Depending on how big your important files are, you may want to schedule a full backup to run once a week, every other week or once a month. This requires a new WinZip job since we are selecting all files, not just the updated ones, but the job definitions are simple text files, so copying them is easy.

Of course, backing up all files is much slower than just the files updated/created in a single day. This complicates the scheduling of shutting down the computer. Instead of scheduling the shutdown for a set time, we need it to run after the full backup completes regardless of how long the full backup takes. We can do this by scheduling .BAT files. Simply make a .BAT file with the WinZip job and the "shutdown" command. For example:  

c:\somefolder\somesubfolder\myfavoritewinzipjob.wjf
shutdown -s
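The same run-then-shut-down idea can be written as a Python sketch, which makes the ordering explicit. Two assumptions here: that launching the job file through "start /wait" really does block until WinZip finishes (I have not verified this for .wjf files), and the 60-second delay, which is my own addition so the shutdown can still be cancelled with "shutdown -a".

```python
import subprocess

def backup_then_shutdown(job_path, run=subprocess.call):
    """Run the backup job, wait for it to end, then power the machine off."""
    # "start /wait" asks cmd.exe to block until the launched program exits;
    # the empty string is the window title that start expects first.
    result = run(["cmd", "/c", "start", "/wait", "", job_path])
    # -s shuts down; -t 60 leaves a 60-second window to cancel
    run(["shutdown", "-s", "-t", "60"])
    return result
```

The run parameter exists only so the function can be tried out with a stand-in that records the commands instead of executing them.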

Off-site Storage

As for where to store backup files, there are hundreds of website hosting companies that offer online storage that can be used for either a website or backup files or both. There are also dedicated file storage companies, but not all support FTP uploads and many require you to install their software (not a good thing for many reasons). The best bargain I've seen so far is from 1and1.com which provides 100 gigabytes of storage space for $5/month. I've been using them for a while and can attest that every time I try to connect with FTP it works. And they support secure FTP, specifically SSH/FTP (there are other types). 

Many of the online storage companies provide software that you have to install to interface with them. I'm not a big fan of this approach, mostly because the less software running on a PC the more stable it will be. This especially rules out any new startup, as I prefer dealing with mature software in the hopes that most of the bugs have been found. Very often this software is constantly running in the background, something else I don't like. And it may watch for updated files and automatically upload them, which sounds good but means that the software is interfacing with Windows on a low level, which also scares me. Some of the software tries to back up only the portions of a file that have changed as opposed to the whole file, which is not a game I want to play with my backups. And, there is always the chance that the software is spying on you (it's constantly running and gets a free pass through any and all firewalls). And what if you delete a file in a folder that is being monitored for changes? If the software is not configured correctly, the deletion could be mirrored at the backup site too. Not good.

And if none of that worries you, then there is the issue of encryption. If the same company is both storing your files and encrypting them, the possibility exists for them to be able to decrypt your files. After all, their software does the encryption. I prefer getting my encryption and data storage from different sources.

At some point, old backups should be deleted. There is no single right answer as to when, though. It depends on your particular backup needs, the local backups you make and the size of your files vs. the total amount of backup storage space available. This is not something WinZip can do - I'm thinking of writing a script for this.
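A clean-up script of this kind might look like the following Python sketch. It assumes the backups follow the name.YYYY-MM-DD.zip pattern used earlier and that 90 days is a sensible retention period; both are placeholders to adjust, and the function names are made up. It only handles a local folder, not the off-site copies.

```python
import os
import re
from datetime import date, timedelta

DATE_IN_NAME = re.compile(r"\.(\d{4})-(\d{2})-(\d{2})\.zip$")

def expired_backups(names, today, keep_days=90):
    """Return the backup names whose embedded date is older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    old = []
    for name in names:
        match = DATE_IN_NAME.search(name)
        if not match:
            continue  # not one of our dated backups; never touch it
        try:
            stamp = date(*map(int, match.groups()))
        except ValueError:
            continue  # digits that do not form a real date
        if stamp < cutoff:
            old.append(name)
    return old

def prune(folder, today=None, keep_days=90):
    """Delete expired backups from a local folder; return what was removed."""
    doomed = expired_backups(os.listdir(folder), today or date.today(), keep_days)
    for name in doomed:
        os.remove(os.path.join(folder, name))
    return doomed
```

Run expired_backups on its own first to see what would be deleted before trusting prune with real files.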

Any experienced computer nerd knows about the need to back up the backup. In general, high end FTP programs (costing about $30 to $50) are capable of moving files from one remote site directly to another. And needless to say, make multiple backups of WinZip (or whatever backup program is used) since it is needed to recover the encrypted files.
 

Summary

To recap, the computer turns itself on in the middle of the night, runs the backups and then turns itself off. Each day we back up only files that were changed or created that day. Backups are combined into a single file, compressed, encrypted, password protected and sent off-site via FTP. The file name has the date appended at the end so we can easily tell which day it was created. The backed up files can reside both locally and remotely. Whenever the computer starts up, it opens Windows Explorer positioned at the folder with the log files so we can easily verify that the middle of the night backups ran successfully. We can also make occasional full backups.

Whew.
 

Cons

There are two sides to every coin, and this scheme has its downsides too.

For one, recovery does not lend itself to quickly finding the last version of a specific file. If you don't know the day when the needed file was last updated, then you have to look in the file for yesterday, then the file for the day before, etc. But making periodic full copies limits the number of daily files that need to be searched to recover a single file. And there is an upside to this too: if you update a file every day, there are multiple copies of the file to fall back to. And, I would trade ease of recovery for ease of backup every time. To me, the highest priority has to be making the backup process simple, easy and quick.
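The day-by-day search can be automated. Here is a minimal Python sketch that walks the local backup folder from the newest Zip to the oldest and reports the first one containing the file. It assumes all the backups share one name prefix with a year-first date (so sorting by name is sorting by date) and that the file names inside the archives can be listed without the password; the function name is made up.

```python
import os
import zipfile

def find_latest(backup_folder, wanted):
    """Return the newest dated Zip that contains wanted, or None."""
    zips = sorted((name for name in os.listdir(backup_folder)
                   if name.endswith(".zip")),
                  reverse=True)  # year-first dates sort newest first
    for name in zips:
        with zipfile.ZipFile(os.path.join(backup_folder, name)) as archive:
            if wanted in archive.namelist():
                return name
    return None
```

This only tells you which backup to open; extracting the encrypted file is still a job for WinZip.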

Also, the scheme requires WinZip 11 to restore files. I'd prefer to use software that generated self-extracting files, but WinZip charges an extra $50 for this feature, which seems over-priced to me.

There is one annoying feature to WinZip's logging; there is either too much information or not enough. What I want to see in the log is a list of the files that were copied and any errors. It can't do this. In summary mode, it offers only the errors. In detail mode it displays every folder that was processed, even if no files in that folder were copied. In my case, the number of files copied on a daily basis is small but the number of folders examined to find those files is huge. Finding the few files that were copied when using detailed mode is like finding a needle in a haystack. 

Sometimes, when the WinZip window gets covered up and then re-displayed, it is unaware of the fact that it needs to refresh itself. The result is the blank status window shown here. This is only a display bug; the program ran fine.

And WinZip is lacking when it comes to FTP. The FTP protocol is not secure. While this does not matter for the data files being uploaded (they are encrypted before the upload), it does matter for the FTP userid/password. There are many types of secure FTP, but WinZip does not support any of them. WinZip also can't re-try failed connection attempts, which I would think any FTP program can do. I needed this because when I first started using this scheme, connections to the offsite location I was using failed very often. One day, FTP connections were hanging due to a problem on the server side. Eventually, the WinZip upload failed with an error code 12030. Not WinZip's fault at all. However, the nature of the problem was that the connection to the FTP server hung and WinZip waited about 15 minutes or so before failing. The big gripe is that while it was waiting/hung there was no user interface at all from WinZip. That is, there was nothing to see other than Task Manager showing the program running and doing nothing.
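For comparison, this is roughly what retrying with a timeout looks like when scripted by hand in Python. It is a sketch of the behavior I wished for, not something WinZip offers; the function name, the three attempts, the 60-second pause and the 30-second timeout are all assumed values. Note that it still uses plain, insecure FTP.

```python
import time
from ftplib import FTP, all_errors

def connect_with_retry(host, user, password, attempts=3, wait_s=60,
                       timeout_s=30, connect=FTP, sleep=time.sleep):
    """Try to log in several times; give up after the last failure."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # the timeout keeps a hung server from stalling us for long
            ftp = connect(host, timeout=timeout_s)
            ftp.login(user, password)
            return ftp
        except all_errors as error:
            last_error = error
            if attempt < attempts:
                sleep(wait_s)  # pause before the next attempt
    raise ConnectionError(
        f"FTP login failed after {attempts} attempts") from last_error
```

The connect and sleep parameters are there so the retry logic can be exercised without a real server or real waiting.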

I don't like the error handling defaults in WinZip and, as far as I know, there is no way to change this. One day, when there was an error creating the .zip file, the job stopped at that point and did not make the offsite copy. The error was not being able to open a file for reading. Then again, another time when it detected that a file was in use while being copied, it produced an error message in the log but kept on running and finished the job. I haven't researched this in detail.

While WinZip is running, both while creating the Zip file and while FTPing it off-site, there is no entry for WinZip in the taskbar. It does have a window showing the progress of each operation, but this window easily gets hidden behind others and, without an entry in the taskbar, it's easy to lose track of things.

As noted above, WinZip has no facility for automatically deleting very old backups.

NEW: After using this scheme for over two years, I ran into a new problem with it - the update/archive flag gets reset when the sharing status of a folder changes. That is, if you are backing up a folder that was not previously shared on a network, and now it is shared, the first backup after this change in the share status will back up all the files in the folder.