Backing Up Your Data Securely with Restic
Restic
Restic is a fast, secure, and efficient backup program that supports a variety of backends, including local storage, SFTP, S3, and more. It is designed to be easy to use, reliable, and secure. Restic uses strong encryption to protect your data and ensures that only you have access to your backups. In this post, we will cover how to install and configure Restic, create backup scripts, and schedule automated backups.
Installing Restic
On Debian, there’s a package called restic which can be installed from the official repos, e.g. with apt:
sudo apt install restic
Configuring Backends, i.e. 'Repositories'
The place where your backups will be saved is called a "repository". This is simply a directory containing a set of subdirectories and files created by restic to store your backups, some corresponding metadata and encryption keys. Restic supports a variety of backends, including local storage, SFTP, S3, and more:
- Local storage: A directory on your local filesystem.
- SFTP: A remote server accessible via SSH.
- S3: Amazon S3-compatible object storage services.
- Google Cloud Storage: Google Cloud Storage service (not Google Drive).
- Backblaze B2: Backblaze B2 cloud storage service.
- and many more; see official documentation for a complete list
In our previous post, we used Google Drive (not to be confused with Google Cloud Storage). Restic, however, does not support Google Drive natively; it can only reach it indirectly, for example through the rclone backend. Nonetheless, we want our system to be as simple as possible, as there is power in simplicity. For this reason, we will use Amazon S3 as the backend for our backups. Amazon S3 is a highly durable and scalable object storage service that is widely used for storing backups and other data. It is also quite affordable, especially for small to medium-sized backups. If you have never registered an AWS account, you get a free tier for the first year, which is more than enough for our purposes.
AWS S3 Configuration
If you do not have an AWS account, you can create one here. Once you have an account, you can create an S3 bucket to store your backups. You can follow the official documentation to create an S3 bucket.
Configuring Access
There are different ways to authenticate Restic with AWS. The access key and secret key are the most common. However, creating these keys for the root user (i.e. the user with unlimited privileges) is very bad security practice. Instead, we will create an IAM user with a single permission: reading and writing to the S3 bucket we have created for the backups. This way, even if the access key and secret key are compromised, the attacker will not be able to do anything other than read and write to that S3 bucket. To create an IAM user, follow the official documentation. Do not assign any permissions during creation. Once you have created the user, attach the following policy to it:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::your-backup-bucket-name",
        "arn:aws:s3:::your-backup-bucket-name/*"
      ]
    }
  ]
}
Attaching this policy will allow the user to read, write, and delete objects in the S3
bucket. Replace your-backup-bucket-name with the name of the bucket you have created.
Once you have created the user, create an access key and secret key for it: select the user in the IAM console, open the Security credentials tab, and click the Create access key button. Save the access key and secret key in a secure place, as you will not be able to see the secret key again. With this approach, we have the following advantages:
- Principle of least privilege: The user will only have access to what is necessary for the backup task.
- Easier to manage and rotate credentials: We can easily update or revoke access without affecting other parts of our AWS setup.
- Better auditing: We can track actions performed specifically by this user.
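If you prefer the command line, the console steps above can be sketched with the AWS CLI. This assumes the CLI is installed and configured with administrator credentials; the user name restic-backup and the policy.json file name are examples, and the policy file is the JSON document shown above:

```shell
# Create the dedicated backup user (name is an example)
aws iam create-user --user-name restic-backup

# Attach the bucket-scoped policy shown above, saved locally as policy.json
aws iam put-user-policy --user-name restic-backup \
    --policy-name restic-s3-access \
    --policy-document file://policy.json

# Generate the access key pair; note the SecretAccessKey in the output,
# as it cannot be retrieved again later
aws iam create-access-key --user-name restic-backup
```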
Creating a Repo in Restic
Now we can create a repository in Restic. Remember that a "repository" is simply a directory containing a set of subdirectories and files created by restic to store your backups, some corresponding metadata and encryption keys. We will create a repository in the S3 bucket we have created, but we need to pass the credentials to Restic. We can do this by setting the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables. We do not want these to persist beyond the current session (we will explain why soon), so we will export them only in the current shell rather than adding them to a shell profile:
export AWS_ACCESS_KEY_ID=MY_ACCESS_KEY
export AWS_SECRET_ACCESS_KEY=MY_SECRET_ACCESS_KEY
Now we can initialize a repository in the S3 bucket:
restic -r s3:s3.amazonaws.com/your-backup-bucket-name init
It will ask for a password to encrypt the repository. Choose a strong password and keep it in a secure place. You will need this password to restore your backups. Remember: if you lose this password, you will not be able to restore your backups.
A Few Words About Repos
Restic uses a repository to store backups. A repository is a directory containing a set of subdirectories and files created by restic to store your backups, some corresponding metadata and encryption keys. The repository is encrypted with the password you provide when you initialize the repository. This means that even if someone gains access to your repository, they will not be able to read the contents without the password. This is why it is crucial to choose a strong password and keep it in a secure place.
Restic uses a content-addressable storage model. This means that identical data is stored only once in the repository; duplicate files and chunks are deduplicated. This can save a lot of space, especially if you have many similar files. Restic also supports snapshots (the contents of a directory at a specific point in time), which allow you to restore your data to a specific point in time. Snapshots are read-only: they cannot be modified after creation, although they can be removed deliberately with the forget command. This protects your backups from accidental or silent tampering.
When restic encounters a file that has already been backed up, whether in the current backup or a previous one, it makes sure the file’s content is only stored once in the repository. To do so, it normally has to scan the entire content of the file. Because this can be very expensive, restic also uses a change detection rule based on file metadata to determine whether a file is likely unchanged since a previous backup. If it is, the file is not scanned again.
Snapshots can have one or more tags, which can be used to group snapshots together. This can be useful if you want to keep track of different versions of your data, or if you want to organize your snapshots in a specific way. For example, you could tag all snapshots of your home directory with the tag "home", and all snapshots of your work directory with the tag "work". This makes it easy to find and restore specific snapshots when you need them.
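Concretely, tags are assigned at backup time with the --tag flag and can later be used to filter snapshot listings; the repository URL and paths below are the examples used throughout this post:

```shell
# Tag the snapshot at creation time
restic -r s3:s3.amazonaws.com/your-backup-bucket-name backup --tag home /home/user

# List only the snapshots carrying that tag
restic -r s3:s3.amazonaws.com/your-backup-bucket-name snapshots --tag home
```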
A repository can be thought of as a vault storing your directories and files. You can put many directories and/or files into a single repository, and take them out whenever you need them. You need multiple repositories if you want to store directories and/or files in different vaults. This way, you can have multiple copies of the same directory and/or file in different vaults, adhering to the famous (infamous?) 3-2-1 backup rule. You can list all the snapshots in a repository with the following command:
restic -r s3:s3.amazonaws.com/your-backup-bucket-name snapshots
enter password for repository:
ID Date Host Tags Directory Size
-------------------------------------------------------------------------
40dc1520 2015-05-08 21:38:30 kasimir /home/user/work 20.643GiB
79766175 2015-05-08 21:40:19 kasimir /home/user/work 20.645GiB
bdbd3439 2015-05-08 21:45:17 luigi /home/art 3.141GiB
590c8fc8 2015-05-08 21:47:38 kazik /srv 580.200MiB
9f0bc19e 2015-05-08 21:46:11 luigi /srv 572.180MiB
For a variety of things you can do with snapshots, see the official documentation.
Manual Backups
If you want to perform a manual backup, you simply use the `backup` command. Remember that this command requires the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to be set as environment variables:
restic -r s3:s3.amazonaws.com/your-backup-bucket-name backup --verbose /path/to/your/data
open repository
enter password for repository:
repository a14e5863 opened (version 2, compression level auto)
load index files
start scan on [/home/user/work]
start backup on [/home/user/work]
scan finished in 1.837s: 5307 files, 1.720 GiB
Files: 5307 new, 0 changed, 0 unmodified
Dirs: 1867 new, 0 changed, 0 unmodified
Added to the repository: 1.200 GiB (1.103 GiB stored)
processed 5307 files, 1.720 GiB in 0:12
snapshot 40dc1520 saved
The output shows that restic successfully created a backup of the directory in a short time.
Each backup snapshot is given a unique hexadecimal identifier, in this case "40dc1520".
Restic processed 1.720 GiB of data from the local directory. However, only 1.200 GiB was
added to the repository, indicating that restic efficiently handled duplicate data. Further
compression reduced the stored data to 1.103 GiB.
Without the "--verbose" option, restic provides less detailed output but still displays a
real-time status. It's important to note that this live status shows the number of processed
files, not the amount of data transferred. The actual transferred volume may differ due to
factors like de-duplication, potentially being lower or higher than the processed amount. If
you run the backup command again, restic will create another snapshot of your data, but this
time it will be even faster, and almost no new data will be added to the repository (since the
data is already there). This is because restic uses a content-addressable storage model, which
means that identical data is only stored once in the repository. This can save a lot of space,
especially if you have many similar files.
You can add multiple directories to the backup command, and restic will back them up in a single snapshot. For example, to back up both the /home/user/work and /srv directories, you would run:
restic -r s3:s3.amazonaws.com/your-backup-bucket-name backup /home/user/work /srv
If you wish, you can also create different snapshots for each directory by running the
backup command separately for each directory. This can be useful if you want to restore only
a specific directory without affecting the others.
Automatic Backups
All is good up to now, but nobody has time for manual backups. We want our backups to be automated. This is also why we don't permanently add our keys as environment variables: we will rarely need them interactively, so there is no reason to expose them in every shell we open. However, restic does not support scheduled backups natively. Instead, we will use an environment variables file together with a script, and then use systemd to schedule the script.
Keys and Secrets
First, we create a directory for our scripts and environment variables file. We will create the /etc/restic directory, and create a file named restic-env (or any other name you like) in this directory. This file will contain the following necessary environment variables:
export RESTIC_REPOSITORY='s3:s3.xxxregionxxx.amazonaws.com/xxxx'
export RESTIC_PASSWORD='xxxxxxxxx'
export AWS_ACCESS_KEY_ID='xxxxxxxxx'
export AWS_SECRET_ACCESS_KEY='xxxxxxxx'
export SLACK_WEBHOOK_URL='https://hooks.slack.com/services/xxx'
The SLACK_WEBHOOK_URL is optional; it is used to send a message to a Slack channel after
the backup is completed. You can create a webhook for your Slack channel here.
Important Note: In this simplified approach, we are storing the AWS access key and secret key in plain text. This is OK in our case for two reasons:
- Our server is secure and only accessible by us.
- Even if someone gains access to the server, they will not be able to do anything other than reading and writing to the S3 bucket, as we are using an IAM account that only has access to a single bucket.
Make sure the environment file is owned by root and readable only by root:
sudo chown root:root /etc/restic/restic-env
sudo chmod 600 /etc/restic/restic-env
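With the permissions locked down, a quick sanity check confirms that restic can reach the repository using this file, without exporting anything into your interactive shell:

```shell
# Source the variables in a root subshell, only for this one command
sudo bash -c 'source /etc/restic/restic-env && restic snapshots'
```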
Backup Script
The major difficulty in creating a backup script is that many containers do not use simple files; most of them have databases (e.g. MySQL, PostgreSQL, MongoDB) and other services running. Some applications have specific backup scripts; you cannot just dump a database or a volume. For this reason, it is imperative to check the documentation of the applications that will be backed up. In my case, the following applications need to be backed up, with the corresponding files/folders:
- Caddy: Project folder (includes Caddyfile).
- Memos: Project folder.
- Calibre-web: We will back up the project folder, which also includes the library. Only the first backup will take a while; the following snapshots will be faster thanks to deduplication.
- Freshrss: Project folder.
- Paperless: Project folder. We set it up to use SQLite, so the directory-mounted volumes will include the database files as well.
- Radicale: Project folder. This will include the DAV files within the directory-mounted volumes.
- Searxng: Project folder.
- Silverbullet: Project folder. All the notes are stored within the directory-mounted volumes as markdown files.
- Vaultwarden: Project folder. All the encrypted vault files are stored within the directory-mounted volumes.
- Firefly-III: Firefly requires a special backup script to be run. We will run this script, which creates a single compressed file, and back up that file.
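The dump-and-prune pattern that Firefly needs (keep only the newest few dump files) is worth understanding in isolation. Here is a self-contained sketch using dummy files; the file names are illustrative:

```shell
# Work in a throwaway directory with five dummy dump files
dir=$(mktemp -d)
cd "$dir"
for i in 1 2 3 4 5; do
    touch "firefly_backup_$i.sql.gz"
    sleep 0.05  # ensure distinct modification times so -t sorting is stable
done

# ls -1t lists newest first; tail -n +4 selects everything past the 3 newest;
# xargs -r skips rm entirely when there is nothing to delete
ls -1t firefly_backup_*.sql.gz | tail -n +4 | xargs -r rm -f

ls -1 firefly_backup_*.sql.gz  # the 3 most recent files remain
```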
As seen, we have many different types of backup implementations. For this, we will create a single extensible script that can handle all possible scenarios. We will create a script named `restic-backup.sh` in the `/usr/local/bin` directory. This script will contain the following:
#!/bin/bash
set -e

# Load environment variables
source /etc/restic/restic-env

# Initialize the message variables
MESSAGE="🚀 Restic Backup Operation Results:\n"
ERRORS=""

# Function to execute backup and handle messaging
execute_backup() {
    local dir="$1"
    local preprocess_cmd="$2"

    echo "Backing up $dir"

    # Run preprocessing command if provided; the if-guard keeps a failing
    # command from aborting the whole script under "set -e"
    if [ -n "$preprocess_cmd" ]; then
        echo "Running preprocessing command: $preprocess_cmd"
        if ! eval "$preprocess_cmd"; then
            MESSAGE="$MESSAGE\n❌ Failure: Preprocessing for $dir failed."
            ERRORS="$ERRORS\nFailure: Preprocessing for $dir failed. Command: $preprocess_cmd"
            return
        fi
    fi

    # Capture restic's output; testing the assignment inside "if" records
    # the exit status without tripping "set -e"
    if output=$(restic backup "$dir" 2>&1); then
        MESSAGE="$MESSAGE\n✅ Success: Backup of $dir completed."
        echo "✅ Success: Backup of $dir completed."
    else
        MESSAGE="$MESSAGE\n❌ Failure: Backup of $dir failed."
        ERRORS="$ERRORS\nFailure: Backup of $dir failed. Output:\n$output"
        echo "Failure: Backup of $dir failed. Output:\n$output"
    fi
}

# List of directories to backup with optional preprocessing commands
declare -A BACKUP_DIRS
BACKUP_DIRS["/home/user/containers/caddy"]=""
BACKUP_DIRS["/home/user/containers/memos"]=""
BACKUP_DIRS["/home/user/containers/calibreweb"]=""
BACKUP_DIRS["/home/user/containers/freshrss"]=""
BACKUP_DIRS["/home/user/containers/paperless"]=""
BACKUP_DIRS["/home/user/containers/radicale"]=""
BACKUP_DIRS["/home/user/containers/searxng"]=""
BACKUP_DIRS["/home/user/containers/silverbullet"]=""
BACKUP_DIRS["/home/user/containers/vaultwarden"]=""
BACKUP_DIRS["/home/user/containers/firefly/backup"]="
    BACKUP_DIR=/home/user/containers/firefly/backup
    mkdir -p \$BACKUP_DIR
    BACKUP_FILE=\$BACKUP_DIR/firefly_backup_\$(date +%Y%m%d_%H%M%S).sql.gz
    sudo docker exec firefly_iii_db sh -c 'MYSQL_PWD=\$MYSQL_PASSWORD mariadb-dump -u\$MYSQL_USER \$MYSQL_DATABASE' | gzip > \"\$BACKUP_FILE\"
    cd \"\$BACKUP_DIR\" || exit
    # Keep only the three newest dumps; -r stops xargs from running rm with no input
    ls -1t firefly_backup_*.sql.gz | tail -n +4 | xargs -r rm -f
"

# Perform backups
for dir in "${!BACKUP_DIRS[@]}"; do
    execute_backup "$dir" "${BACKUP_DIRS[$dir]}"
done

# Run restic forget to maintain retention policy
echo "Running forget command to maintain retention policy"
if forget_output=$(restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 6 2>&1); then
    MESSAGE="$MESSAGE\n✅ Success: Retention policy applied."
else
    MESSAGE="$MESSAGE\n❌ Failure: Error applying retention policy."
    ERRORS="$ERRORS\nFailure: Error applying retention policy. Output:\n$forget_output"
fi

# Write errors to log file, if any
if [ -n "$ERRORS" ]; then
    mkdir -p /home/user/backup/log
    echo -e "$ERRORS" > "/home/user/backup/log/errors.$(date +"%Y-%m-%d_%H:%M:%S")"
    MESSAGE="$MESSAGE\n\n⚠️ Errors occurred during backup. Check the log file for details."
fi

# Wrap it up
MESSAGE="$MESSAGE\n\nBackup operation completed. Have a nice day! 🎉"

# Send the accumulated message to Slack
if [ -n "$SLACK_WEBHOOK_URL" ]; then
    curl -X POST -H 'Content-type: application/json' --data "{\"text\":\"$MESSAGE\"}" "$SLACK_WEBHOOK_URL"
else
    echo "SLACK_WEBHOOK_URL is not set. Skipping Slack notification."
fi

echo "Backup process completed"
Let's review this script step-by-step:
- The script starts by ensuring that any error will cause the script to exit immediately.
- It then sources the environment variables file to load the necessary variables.
- It initializes two main variables:
- MESSAGE: This variable will accumulate the results of the backup operation and will be sent to Slack at the end.
- ERRORS: This variable will accumulate any errors that occur during the backup operation.
- The heart of the script is the execute_backup function. It takes two parameters: the directory to back up, and an optional preprocessing command. Here is what it does:
- It prints a message indicating that it is backing up the directory.
- If a preprocessing command is provided, it runs this command. If the command fails, it adds an error message to the MESSAGE and ERRORS variables.
- It runs the restic backup command on the directory. If the command succeeds, it adds a success message to the MESSAGE variable. If it fails, it adds an error message to the MESSAGE and ERRORS variables.
- Next, the script defines an associative array named BACKUP_DIRS. This array contains the directories to back up as keys and optional preprocessing commands as values.
- The script iterates through the BACKUP_DIRS array, calling execute_backup for each directory.
- After backing up all directories, the script runs the restic forget command to maintain the retention policy:
- The forget command removes old snapshots according to the retention policy. In this case, it keeps 7 daily, 4 weekly, and 6 monthly snapshots.
- If the forget command succeeds, it adds a success message to the MESSAGE variable. If it fails, it adds an error message to the MESSAGE and ERRORS variables.
- If any errors occurred during the backup operation, the script writes them to a log file, with a timestamp in the file name.
- The script then sends the accumulated message to a Slack channel using a webhook URL. If the SLACK_WEBHOOK_URL variable is not set, it skips the Slack notification.
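Before trusting a retention policy, it is worth previewing what it would delete; restic's forget command accepts a --dry-run flag for exactly this. Note also that forget only removes snapshot references; the space is actually reclaimed when prune runs. The repository URL is the example used throughout:

```shell
# Preview which snapshots the policy would remove, without touching anything
restic -r s3:s3.amazonaws.com/your-backup-bucket-name forget \
    --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --dry-run

# Apply the policy and reclaim the freed space in one go
restic -r s3:s3.amazonaws.com/your-backup-bucket-name forget \
    --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --prune
```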
Systemd Service File
Now that we have our backup script, we need to create a systemd service file to run it on a schedule. We will create a file named restic-backup.service in the /etc/systemd/system directory. This file will contain the following:
[Unit]
Description=Restic Backup Service
After=network.target
[Service]
Type=oneshot
ExecStart=/usr/local/bin/restic-backup.sh
User=root
Group=root
[Install]
WantedBy=multi-user.target
Note that the service runs the script as root (via User=root). This is necessary because the script needs to access the restic binary and the environment variables file, which are owned by root, and to back up the directories owned by other users (mainly docker). Since systemd already starts the service as the configured user, there is no need to wrap the script in sudo.
Timer File
Finally, we need to create a systemd timer file to schedule the backup service. We will create a file named restic-backup.timer in the /etc/systemd/system directory. This file will contain the following:
[Unit]
Description=Run Restic Backup nightly at 03:30
[Timer]
OnCalendar=*-*-* 03:30:00
Persistent=true
[Install]
WantedBy=timers.target
This timer file will run the backup service every night at 03:30. The
Persistent=true
option ensures that the timer will catch up on missed runs if
the system was down at the scheduled time.
Enabling and Starting the Timer
Once you have created the timer file, you need to enable and start the timer:
sudo systemctl daemon-reload
sudo systemctl enable restic-backup.timer
sudo systemctl start restic-backup.timer
You can list the existing timers and verify that the timer is active with the following
command:
sudo systemctl list-timers --all
Restoring Backups
Restoring backups is as easy as creating them. To restore a backup, you need to know the snapshot ID of the backup you want to restore. You can list all snapshots in the repository with the following command:
restic -r s3:s3.amazonaws.com/your-backup-bucket-name snapshots
Once you have the snapshot ID, you can restore the backup with the following command:
restic -r s3:s3.amazonaws.com/your-backup-bucket-name restore snapshot-id --target /path/to/restore
Restic will restore the backup to the specified directory. Note that the --target option is required; if you want to restore the backup to its original location, use --target /.
Restic also supports restoring individual files or directories from a snapshot. You can use the following command to list the contents of a snapshot:
restic -r s3:s3.amazonaws.com/your-backup-bucket-name ls snapshot-id
Once you have the snapshot ID and the path of the file or directory you want to restore, you can use the --include option to restore only that path:
restic -r s3:s3.amazonaws.com/your-backup-bucket-name restore snapshot-id --target /path/to/restore --include /path/to/file-or-directory
Restic will restore the specified file or directory from the snapshot to the given location. Again, pass --target / to restore it to its original path.
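For quick one-off recoveries of a single file, restic also provides the dump command, which streams one file from a snapshot straight to stdout; combined with the special snapshot ID latest (the most recent snapshot), no target directory is needed at all. The file path below is an example:

```shell
# Stream a single file from the most recent snapshot into a local file
restic -r s3:s3.amazonaws.com/your-backup-bucket-name dump latest \
    /home/user/work/report.txt > report.txt
```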
Restic also supports mounting the repository as a FUSE filesystem. This allows you to browse the contents of your snapshots without restoring them. You can use the following command to mount the repository:
restic -r s3:s3.amazonaws.com/your-backup-bucket-name mount /path/to/mount
Restic will serve the repository at the specified directory, with each snapshot appearing as a subdirectory under snapshots/. You can then browse the contents as if they were a regular filesystem. When you are done, you can unmount with the following command:
fusermount -u /path/to/mount
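While the mount is active, individual files can be copied out directly; the snapshots/ directory includes a convenient latest entry pointing at the most recent snapshot. The mount point and file paths below are examples:

```shell
# Browse the available snapshots
ls /path/to/mount/snapshots/

# Copy a single file out of the most recent snapshot
cp /path/to/mount/snapshots/latest/home/user/work/report.txt ~/report.txt
```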
Conclusions
Restic is a powerful and versatile backup tool that makes it easy to back up your data securely. By following the steps outlined in this post, you can set up automated backups to an Amazon S3 bucket, ensuring that your data is safe and secure. Restic's deduplication and encryption features make it an excellent choice for backing up your data, and its support for a variety of backends makes it easy to integrate with your existing infrastructure. Whether you are a home user looking to back up your personal files or a business looking to protect your critical data, Restic has you covered.