Backups prevent loss of data by ensuring it can be recovered.
Backup and recovery go together: a backup is only useful if it supports a recovery, and a recovery cannot be completed without a backup to recover from. For this reason, many backup methods focus on the recovery side, as that is the most vital part of any backup scheme.
Several backup methods are available, ranging from bare-metal backup (and recovery) to record-based backups in a database system. Backup solutions offer features such as time-based snapshots, deduplication, and the ability to recover single modified files or whole directory trees.
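Deduplication, one of the features mentioned above, can be illustrated with a small content-addressed store. This is only a minimal sketch and not how any particular backup product works: the function name store_deduplicated and the layout of the store directory are assumptions made for the example.

```python
import hashlib
import shutil
from pathlib import Path


def store_deduplicated(files, store_dir):
    """Copy each file into store_dir, named after the SHA-256 of its content.

    Returns a mapping of original path -> digest. Files with identical
    content share a single stored copy. Hypothetical helper, for
    illustration only.
    """
    store = Path(store_dir)
    store.mkdir(parents=True, exist_ok=True)
    index = {}
    for path in map(Path, files):
        # Reads the whole file into memory; fine for a sketch, but large
        # files would need to be hashed in chunks.
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        target = store / digest
        if not target.exists():  # identical content is stored only once
            shutil.copyfile(path, target)
        index[str(path)] = digest
    return index
```

The returned index is what a restore would need in order to map each original path back to its stored content.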
Bare metal recovery
With bare-metal recovery, the backup software runs without being installed on the operating system that is being backed up or recovered (for example, from separate boot media). The result of a bare-metal restore is a fully bootable system.
Most of these recovery solutions are based on partition imaging (as with dd, CloneZilla, PartImage, or FSArchiver), although in Gentoo, stage4 snapshots can also serve as a kind of bare-metal recovery solution (they capture files, but not disk partition data).
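The idea behind partition imaging can be sketched as a block-by-block copy of raw data into an image file, similar in spirit to what dd does. The function name copy_image and its parameters are assumptions made for this example; real imaging tools add compression, error handling, and awareness of unused blocks.

```python
import hashlib


def copy_image(source, destination, block_size=4 * 1024 * 1024):
    """Copy raw bytes from source to destination in fixed-size blocks.

    Returns the SHA-256 hex digest of the copied data so the image can
    be checked later. Hypothetical helper, for illustration only.
    """
    digest = hashlib.sha256()
    with open(source, "rb") as src, open(destination, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # end of the device or file
                break
            dst.write(block)
            digest.update(block)
    return digest.hexdigest()
```

A call such as copy_image("/dev/sdX1", "/mnt/backup/sdX1.img") would image a partition, where both paths are placeholders. Reading a block device normally requires root privileges, and the partition should not be mounted read-write while it is being imaged.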
File and directory recovery
For a more selective approach, a file- and directory-based backup/recovery model is used. In this model, on-system software regularly takes copies (or patches/diffs) of a predefined list of files and directories. Many solutions exist, such as Bacula or BackupPC, but simple schemes can also be built with rsync or even plain copies.
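A scheme based on plain copies could look roughly like the following, which copies a predefined list of files and directories into a new timestamped snapshot directory. The function name snapshot, the label format, and the directory layout are assumptions for illustration; every snapshot is a full copy, so this sketch does none of the space saving that dedicated tools provide.

```python
import shutil
import time
from pathlib import Path


def snapshot(paths, backup_root, label=None):
    """Copy the listed files and directories into a new snapshot directory.

    Hypothetical helper, for illustration only. Items are stored under
    their base name, so two items with the same base name would collide.
    """
    label = label or time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / label
    target.mkdir(parents=True)  # raises if this snapshot already exists
    for item in map(Path, paths):
        destination = target / item.name
        if item.is_dir():
            # Keep symbolic links as links instead of following them.
            shutil.copytree(item, destination, symlinks=True)
        else:
            shutil.copy2(item, destination)  # copy2 also keeps timestamps
    return target
```

Run regularly (for instance from cron), each call leaves a separate point-in-time copy under backup_root.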
Some applications offer a more specific approach to backup and restore. Databases are a prime example (as their job is to safeguard data), but others, such as version control systems, often have their own backup/restore routines too.
When hosting one or more services, it is wise to look at the backup/restore routines for each service and implement them on top of the other backup schemes.
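As an example of an application-specific routine, SQLite exposes an online backup API that Python's sqlite3 module makes available as Connection.backup (Python 3.7 and later). SQLite is chosen here purely as an assumption for illustration, and the function name backup_sqlite is hypothetical; other database systems ship their own dump and restore tools.

```python
import sqlite3


def backup_sqlite(source_path, backup_path):
    """Write a consistent copy of an SQLite database to backup_path.

    Uses the database's own backup routine rather than copying the file,
    so the copy stays consistent even if the database is in use.
    """
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        source.backup(target)  # copies all pages of the main database
    finally:
        target.close()
        source.close()
```

The resulting file is a regular SQLite database, so it can be picked up afterwards by the file and directory backup like any other file.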
A few principles need to be closely followed when implementing backups:
- Always verify that the backups can be used to restore. Either restore to another location or system, or restore immediately after taking a backup. Too often, users skip this step and are severely disappointed to find that their daily backups captured the wrong data (e.g. the wrong directory) or cannot be restored.
- Keep backups in a safe location. Try to keep them off premises: move them regularly to a family member's home, or send them over the Internet to a cloud storage provider (password-based encryption schemes can be used to protect the confidentiality of off-site backups).
- Mix backup methods. Take a full system (bare-metal) backup once in a while, file and directory backups more regularly, and application-level backups as often as possible (since those are what clients/users will most likely be angry about if lost).
- Mirroring is not a backup. Mirroring keeps two sides in sync, so mistakes such as accidental deletions are replicated as well, whereas a backup is a snapshot of data at a point in time.
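The first principle, verifying that a backup can actually be restored, can be partly automated by restoring to another location and comparing checksums. The sketch below assumes the backup has already been restored into a separate directory; the function names tree_digests and verify_restore are hypothetical.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its content."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def verify_restore(original_root, restored_root):
    """Compare two directory trees by content.

    Returns three sorted lists of relative paths: files missing from the
    restore, files only present in the restore, and files whose content
    differs. Three empty lists mean the trees match.
    """
    original = tree_digests(original_root)
    restored = tree_digests(restored_root)
    missing = sorted(set(original) - set(restored))
    extra = sorted(set(restored) - set(original))
    changed = sorted(
        path for path in original.keys() & restored.keys()
        if original[path] != restored[path]
    )
    return missing, extra, changed
```

Because live data keeps changing after a backup is taken, it can be more reliable to save the output of tree_digests at backup time and compare the restored tree against that saved manifest instead of against the live system.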
Tools that can be used to implement these schemes include:
- dd: a utility used to copy raw data from source to sink, where source and sink can be a block device, file, or piped input/output.
- etckeeper: a collection of tools to let /etc be stored in a Git, Mercurial, Bazaar, or Darcs repository.
- rdiff-backup: a GPL-licensed incremental backup utility based on librsync; it stores changes to files instead of full duplicates.
- Rsnapshot: an automated backup tool based on rsync and written in Perl.
- SparkleShare: a cross-platform, free, open source, Dropbox-like, Git-based collaboration and file sharing tool.