Background
I have suffered massive data loss twice in my life.
Once was in high school, and I knew backing up my files was a good idea. I bought a second hard drive. As I was about to copy my files onto it, someone tripped on the power cord and knocked the spinning hard drive containing all of my data onto the tile floor. Angry sounds ensued, and all my data was gone.
The second time was near the end of my second-to-last semester of undergrad. I used a MacBook at the time and was using the encrypted file system option that came with OS X 10.6. One day, right before finals week (Murphy's Law), the computer just lost the ability to decrypt it... leaving me with a massive, encrypted .sparsebundle that was essentially useless. Well, not essentially: actually useless.
Data Backups Are NOT Optional!
In both of the major data losses I experienced, I had only partial, unorganized backups. A DVD of homework here, a flash drive of music and movies there, but nothing cohesive. Nothing deserving of the name "backup". For about the past year, I've been regularly backing up my data. I prioritize backing up data now. I plan for it. If I upgrade the hard drive in my laptop, I'll only do so when I can afford to upgrade the backup drive as well. Internalizing the importance of backups is vital.
When I say that I've been regularly backing up my data, I mean backing up all of my files to an external hard drive: weekly for my desktop and, in the worst case, monthly for my laptop. The reason the laptop is monthly is, frankly, that it can be hard to remember to plug in the external hard drive. Moreover, the projects I work on regularly are copied at somewhat irregular intervals to flash drives, and files are emailed back and forth, so recent changes are often in several places. Not ideal, but I think that this system allows me to say "Yes, I do back up my data."
Or do I?
The current state of my backups
Right now I'm using the built-in "Backup and Restore" feature in Windows 7 and KBackup on Linux.
On Linux, I know that I'm just backing up my home directory and some configuration files I've edited. (Whenever I edit a configuration file, I copy it and the original to /root - this is included in the backups.)
The Windows 7 backup options seem to have some pros and cons (about which I could be mistaken). If a system image is created, ostensibly everything is backed up... but it can only be used in an all-or-nothing fashion. If individual files are backed up, they can be individually restored in the case of accidental deletion but are only useful individually. On my laptop in Windows I back up "files in libraries and personal folders for all users and system image". On my desktop I back up "All local data files". (I got my desktop before my laptop and initially shied away from the system image option because I thought it was akin to the System Restore Points introduced with Windows Me.)
Right now, at the start of my Ph.D. studies, I feel pretty good about how responsibly I back things up... but I will NOT lose my data again without a fight. I will not passively accept my backups as infallible.
The stress test
I decided to delete my data. To simulate a hard drive failure I used DBAN on my desktop: my hard drive was completely erased, just as if I had replaced it with a brand new one. I primarily use my desktop for Netflix, casual gaming (Morrowind, Minecraft, etc.), and browsing the internet; all of my teaching and research is done on my laptop. Before wiping anything, I checked that my laptop could read my desktop backup, and I was able to access the files through the Restore option: if everything went wrong, I could still recover what I wanted to. But what would everything going wrong look like?
I guess I should describe what I want from a backup solution at some point, and now is as good a time as any. My main purpose for backing up is to be covered in the case of hardware failure or another major data loss event. I'm not too concerned with losing individual files (knock on wood) because I rarely delete anything, though being able to restore those would be a nice secondary goal. In a crunch, I would like to be able to open up the external hard drive enclosure, swap the backup drive for the original, and boot directly from the backup drive to continue working.
Previously I had created a Windows 7 Repair Disk, so I thought that booting from it would allow me to restore from my backup. While I think this would work, I wasn't able to test it. After the graphical system had loaded and the mouse worked I received Error 0x4001100200001012. I'm not sure what this error is, nor do I care at this point. After retrying it two or three times I gave up and concluded that the disk wasn't working. Lesson learned: always check your repair/recovery disks to make sure they work before you need them!
The Windows 7 Repair Disk didn't work, but I did have Dell Windows 7 System Restore Disks that I had previously checked. Basically the same thing, right?
Wrong. While these disks give me the option to restore my computer to the factory image (which is what I expected them to do), that isn't my goal. After booting the Recovery Disk, I noticed that these disks also allow one to restore the computer from backups. Success! I thought naïvely. But the disks seemed to only look for backups made with Dell's DataSafe software, not the Windows Backup and Restore feature. Lesson learned: don't assume options to restore backups refer to backups made with the system you use!
Fine, I begrudgingly thought. A full system restore to factory settings: at least that would get me to the Windows Backup and Restore software (and a usable computer). A few minutes and a disk change later and I was at the Windows desktop. I opened the Backup and Restore feature and chose to restore all of the files from my external hard drive to their original locations on my desktop... and was immediately struck with the realization that the backup options I had been using for this were unsuitable for restoring all files at once. I had incorrectly chosen the individual-files-only option. As it was copying the files, I had to choose to Copy and Replace all of the system files. I also ran into Error 0x80070020, indicating that the computer was trying to replace files currently in use. Not ideal. I had it skip those files and move on.
After all of the files finished copying there was a very clear problem: not all of the files copied. Essentially, all files related to installed applications weren't copied. While none of the data was lost, this backup did not allow me to quickly get up and running again as none of the programs I use work.
Moving forward
This stress test of my backup solution taught me that, while my data are sufficiently backed up, they are not as accessible as I would like. Not nearly so. In the end, it took several hours to get to this point, which is too slow, and the restored system is still not where I want it to be.
My desired ability to boot from my backup in a crunch also implies that I should be able to read the files individually when the drive is mounted in another computer. This last point is key, because neither option that Windows Backup and Restore has allows this. Backing up files individually results in many individual zip files each containing many files without much clear organization. The system image option results in one large file which is difficult (impossible?) to use to access individual files. I had known about the latter situation before, but I learned the former during this ordeal. Lesson learned: always understand what the backup options actually mean!
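To make a pile of backup zip files a little easier to navigate, one option is to build an index of which archive holds which file. The following is only a rough sketch using Python's standard zipfile module. The function names and the idea of pointing it at the root of the backup drive are my own assumptions for illustration, not anything that Windows Backup and Restore provides, and I have not checked how it copes with every archive the tool produces.

```python
import zipfile
from pathlib import Path


def index_backup_zips(backup_root):
    """Map each archived file name to the zip archive(s) that contain it.

    backup_root is a hypothetical folder holding the backup's zip files;
    subfolders are searched as well.
    """
    index = {}
    for zip_path in sorted(Path(backup_root).rglob("*.zip")):
        with zipfile.ZipFile(zip_path) as archive:
            for member in archive.namelist():
                if member.endswith("/"):
                    continue  # skip directory entries, keep only files
                index.setdefault(member, []).append(str(zip_path))
    return index


def find_in_backup(index, needle):
    """Return (file name, archives) pairs whose name contains needle.

    The match is a case-insensitive substring match on the archived name.
    """
    needle = needle.lower()
    return [
        (name, archives)
        for name, archives in sorted(index.items())
        if needle in name.lower()
    ]
```

With an index like this, finding a single document means searching for part of its name and then opening only the archives listed for it, instead of opening the zip files one at a time.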
I want to change my backup solution for both Windows and Linux. Essentially, I want to be able to clone the hard drive on a regular basis, but not maintain parity with my computer every minute. Keeping the backup identical to the original all the time (like RAID 1) doesn't allow me to restore files in the case of an accidental deletion (a secondary goal). At the same time, creating a clone of the hard drive every week would undoubtedly be a drain on system resources when dealing with large hard drives.
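One common way to sit between a constant mirror and a full weekly clone is a dated snapshot in which unchanged files are hard links to the previous snapshot, so each snapshot looks complete but only changed files take up new space. The sketch below is not a solution I have adopted, just an illustration of the idea. The function name, the date-style snapshot names, and the folder layout are assumptions. It ignores symlinks, permissions, and files that are in use, it requires a file system that supports hard links, and it does not produce a bootable drive.

```python
import filecmp
import os
import shutil
from pathlib import Path


def snapshot(source, dest_root, name):
    """Create dest_root/name as a complete, browsable copy of source.

    Files identical to the most recent earlier snapshot are hard-linked
    instead of copied. Snapshot names are assumed to sort chronologically
    (for example "2012-01-08").
    """
    source, dest_root = Path(source), Path(dest_root)
    dest_root.mkdir(parents=True, exist_ok=True)
    target = dest_root / name
    if target.exists():
        raise FileExistsError(f"snapshot already exists: {target}")

    # Pick the newest snapshot older than the one being created, if any.
    earlier = sorted(
        p for p in dest_root.iterdir() if p.is_dir() and p.name < name
    )
    previous = earlier[-1] if earlier else None

    for dirpath, _dirnames, filenames in os.walk(source):
        rel = Path(dirpath).relative_to(source)
        (target / rel).mkdir(parents=True, exist_ok=True)
        for filename in filenames:
            src = Path(dirpath) / filename
            dst = target / rel / filename
            old = previous / rel / filename if previous else None
            # shallow=False compares contents, which is slow but safe.
            if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)  # unchanged: share the existing data
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy
    return target
```

Because older snapshots are left alone, a file deleted by accident this week is still present in last week's folder, which is the secondary goal a RAID 1 style mirror cannot meet.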
At the moment, I don't know my ideal backup solution, but I will keep looking for it and will post my findings. For now, I'll keep using Windows Backup and Restore and KBackup while having a better understanding of the very real pros and cons associated with them. A summary of my wants and lessons learned is below.
What I want in a backup solution
- Backup of my entire hard drive occurs regularly (but not a RAID 1 style mirror)
- Backup is bootable
- Backup is readable when mounted by another computer
- Backup is created while the computer is in use (I would prefer to not have to use an external system to periodically mirror my hard drive)
- Backup software is Free Open Source Software (technically this is optional, but I doubt I'll go with any solution that isn't open source)
Lessons learned
- Always check critical media before it is needed
- Always make sure the tools you have do what you think they do
- Always make sure you fully understand the meaning and implications of the options you have selected
- Testing your backup system before you depend on it is a good thing
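A small, non-destructive way to act on these lessons is to compare a restored (or directly readable) copy of the backup against the original files before the original is ever lost. The following is a minimal sketch that assumes both trees are available as ordinary folders. The function names are hypothetical, and it only checks file presence and contents, not permissions or whether installed programs will actually run.

```python
import hashlib
import os
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source, backup):
    """Compare every file under source with its counterpart under backup.

    Returns (missing, different): sorted relative paths that are absent
    from the backup, and those whose contents do not match.
    """
    source, backup = Path(source), Path(backup)
    missing, different = [], []
    for dirpath, _dirnames, filenames in os.walk(source):
        for filename in filenames:
            src = Path(dirpath) / filename
            rel = src.relative_to(source)
            copy = backup / rel
            if not copy.is_file():
                missing.append(str(rel))
            elif file_digest(src) != file_digest(copy):
                different.append(str(rel))
    return sorted(missing), sorted(different)
```

A check like this would have flagged the application files that never made it back onto my desktop, without my having to erase the drive to find out.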