Check that Fedora Copr Backups are OK¶
This document explains how Fedora Copr backups are performed, so we can periodically verify that everything is in place and functioning properly. For disaster recovery, refer to Recovery from backups.
Copr Backend¶
The backend storage uses a complex RAID setup to provide redundancy directly on the server (in EC2). Backups are then synchronized periodically to the storinator01 host as incremental backups via rsnapshot. To verify backend backups, you should:
Confirm the timestamp of the most recent backup start.
Choose a random build that completed just before that time.
Verify that this build was successfully backed up to storinator01.
Here comes more detailed guide:
SSH into the
copr-be
machine and review the/var/log/cron
file. You may want to check thecrontab -l
output first to confirm the backup schedule (typically Fridays though) and open an older compressed Cron log file:$ xz -d < /var/log/cron-20241101.xz | grep '(copr) CMD' ... Nov 1 03:00:02 copr-be CROND[3482216]: (copr) CMD (ionice --class=idle /usr/local/bin/rsnapshot_copr_backend >/dev/null) ...
The last backup started on November 1, 3:00 AM.
The backup process typically takes several days. If there’s no corresponding
CMDEND
entry in the cron logs, it indicates that the backup is still in progress, and the build ID we’re trying to verify as backed up may not yet be included. Wait for it to complete. Or check the previous backup increment instead (that means poke at/var/log/cron-20241025.xz
).Find an appropriate build ID that finished “just before” that time above. For instance in the
@copr/copr-pull-requests
or@copr/copr-dev
projects. Good candidate is 8185411.SSH into the storinator01 box and locate the latest incremental backup (note that the sub-projects matter,
copr-pull-requests:pr:3473
in our case):$ find /srv/nfs/copr-be/copr-be-copr-user/backup/.sync/var/lib/copr/public_html/results/@copr/copr-pull-requests:pr:3473 | grep 8185411 | grep rpm$ /srv/nfs/copr-be/copr-be-copr-user/backup/.sync/var/lib/copr/public_html/results/@copr/copr-pull-requests:pr:3473/epel-8-x86_64/08185411-copr-rpmbuild/copr-builder-1.1-1.git.3.8adcc0d.el8.x86_64.rpm /srv/nfs/copr-be/copr-be-copr-user/backup/.sync/var/lib/copr/public_html/results/@copr/copr-pull-requests:pr:3473/epel-8-x86_64/08185411-copr-rpmbuild/copr-rpmbuild-1.1-1.git.3.8adcc0d.el8.src.rpm /srv/nfs/copr-be/copr-be-copr-user/backup/.sync/var/lib/copr/public_html/results/@copr/copr-pull-requests:pr:3473/epel-8-x86_64/08185411-copr-rpmbuild/copr-rpmbuild-1.1-1.git.3.8adcc0d.el8.x86_64.rpm /srv/nfs/copr-be/copr-be-copr-user/backup/.sync/var/lib/copr/public_html/results/@copr/copr-pull-requests:pr:3473/epel-9-x86_64/08185411-copr-rpmbuild/copr-builder-1.1-1.git.3.8adcc0d.el9.x86_64.rpm ...
This confirms the backups are working correctly. While you’re on storinator,
ensure there is adequate free space on the filesystem by running
df -h /srv/nfs/copr-be
.
Copr Frontend¶
For Frontend, we backup the PostgreSQL database (hourly). Check
/etc/cron.d/cron-backup-database-coprdb
cron config, and the corresponding
/backups
directory. That one should have the current timestamp, like:
[root@copr-fe ~][PROD]# ls -alh /backups/
total 662M
drwxr-xr-x. 1 postgres root 50 Nov 5 01:21 .
dr-xr-xr-x. 1 root root 160 Nov 28 2023 ..
-rw-r--r--. 1 postgres postgres 662M Nov 5 01:21 coprdb-2024-11-05.dump.xz
If we provide such an updated tarball, rdiff-backup periodically comes and pulls the backups “out”; as long as the box is in an appropriate Ansible group and we configure the backup dir.
For Frontend data volume, we also do automatic volume snapshots (see Copr Keygen info below for more info).
Copr Keygen¶
We don’t do filesystem backups (rsync) there. The important data —keypairs— are
stored on a separate volume /var/lib/copr-keygen
, and periodically
snapshotted in EC2. Check for the volume.
We do snapshots to Ohio, us-east-2
! Volume snapshots may be filtered
with tag FedoraGroup=copr
.
Copr DistGit¶
Due to Copr’s design (see architecture), Copr DistGit data is extensive, measuring in terabytes, yet it’s not critical enough to require formal backups. We anyway at least do periodic snapshots like with Copr Keygen above. In the event of a complete failure, we would restore from there — or simply initialize a new, empty volume.