HNM backup & migration routines
This document spells out the routines for backing up and migrating the websites managed by HNM AS.
Table of contents
- Introduction
- Migration between production and staging sites
- Migration of a production site to a new web server
- Migration of a subsite
- AWS S3 overview
- Destinations for my backups
- Types of sites
- Drupal code base backup
- Drupal file assets backup
- Drupal database backup
- Tools
- Final word
Drupal project mentioned in this chapter: Backup and Migrate.
Introduction
This is an operational note about the backup routines and migration procedures (e.g. between staging and production) for all websites managed by HNM AS.
The spreadsheet websites.xlsx is used to keep track of backups and configuration details. All references to "the spreadsheet" below refer to this. It is stored in @HNM-PC.
These routines are based on the “3-2-1” backup practice, with Backup and Migrate used for scheduled and manual local database backups, and s3cmd used to copy local database backups, and public and private files, to the offsite backup location. Most sites also have a staging site, which may have its own backups, but staging is not always kept in sync with production.
All offsite backups are kept at AWS S3 “EU (Frankfurt)”. To browse this archive, use S3 Browser (an MS Windows GUI program).
To log in to external web services, use one of the links and credentials found in my Web services link section.
Automatic backup to offsite destinations is done by cron. The command to view the active crontab is crontab -l.
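Scheduled offsite syncs are ordinary cron jobs. A crontab entry for a nightly sync at 20:02 (the schedule used for the pvn.no sites described below) might look like the following; the path to the script is an assumption for illustration:

```
# m  h   dom mon dow  command
2    20  *   *   *    /home/gisle/bin/backup_pvnfiles.sh
```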
Internal documentation
In the spreadsheet the “HNM” tab lists all sites and their databases, except the databases belonging to SNP, Karde and staging sites. Those belonging to SNP are found under the “SNP” tab. Those belonging to Karde are found under the “Karde” tab. The “Staging” tab lists staging sites.
Columns:
- Site: Site identifier.
- Loc: The location of the site's server.
- DB: Database name.
- DB user: Database user.
- px: DB prefix.
- Sched.: DB backup schedule.
- 1. BU loc. (DB): Local backup directory or “usit”.
- 2. BU loc. (DB): .
- Status: Issues, actions, or “OK”.
- BU Size - Files: .
- BU Size - DB: .
- BU loc. files: .
To enable offsite backups, do the following:
- Create an AWS S3 bucket for the backup. Use the site identifier in the spreadsheet. For instance, for the site “norren.no”, name the bucket “norren.no”.
- Create a new folder: “database” to hold the database dumps. The copied files will inherit the names from the original subdirectories.
- Backup of public and private files needs to be tuned for each site; see the examples below.
Basic scripted instructions for copying the database dumps and files to AWS S3 may look like the example below. In addition to the database dumps, each subdirectory below the public and private file system that contains original files should be copied. If the files live in the root, exclude subdirectories.
# Copy the database dump.
/usr/local/bin/s3cmd sync --skip-existing \
  /var/private/example/backup_migrate/ s3://example.org/database/

# Copy the "public://" (root public file) directory, but no subdirectories.
/usr/local/bin/s3cmd sync --skip-existing --exclude="/*" \
  /www/example.org/html/sites/default/files/ s3://example.org/files/

# Copy the "public://attachments" directory if it exists.
/usr/local/bin/s3cmd sync --skip-existing \
  /www/example.org/html/sites/default/files/attachments s3://example.org/

# Copy the "public://pictures" directory if it exists.
/usr/local/bin/s3cmd sync --skip-existing \
  /www/example.org/html/sites/default/files/pictures/ s3://example.org/

# Copy the "public://thumbnails/image" directory if it exists.
/usr/local/bin/s3cmd sync --skip-existing \
  /www/example.org/html/sites/default/files/thumbnails/image s3://example.org/

# Copy the "private://files" directory if it exists.
/usr/local/bin/s3cmd sync --skip-existing \
  /var/private/example/pfiles/ s3://example.org/pfiles/

See https://stackoverflow.com/q/21891045/1837734 for how to exclude folders.
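Before putting such commands into cron, each sync can be checked with s3cmd's dry-run mode, which lists what would be transferred without uploading anything:

```
# Show what the first sync would upload, without transferring anything.
/usr/local/bin/s3cmd sync --dry-run --skip-existing \
  /var/private/example/backup_migrate/ s3://example.org/database/
```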
Migration between production and staging sites
To copy the database, copy the SQL dump file created by Backup and Migrate, and roll it back in on the destination.
Single files can be copied over the Internet with scp. Examples:
$ scp gisle@copymarks.no:/home/gisle/z_tegn.txt .
$ scp myproject.tar.gz gisle@copymarks.no:/home/gisle/myproject.tar.gz
Directories can be copied recursively over the Internet with rsync -r. For example, provided the siteroot directories for vhosts are located in the /var/www directory, executing the following two commands on the destination will first change directory, and then recursively copy all the files that make up the site “example.com” to the destination.
$ cd /var/www
$ rsync -r user@example.com:/var/www/example.com .
Below are two more rsync examples to run from the destination. The first does the same as the two commands above. The second recursively copies all the files in the directory foo in the user's home directory on the source to the current directory, and uses archive mode (-a), which copies symlinks as symlinks.
$ rsync -r user@example.com:/var/www/example.com /var/www
$ rsync -ra user@example.com:/home/username/foo .
Make sure that no firewall blocks access.
When copying public files, note that private files that live outside the siteroot must be copied separately.
This command, when run from a staging server, will list directories and files on the production server that are missing from the staging server, or that differ (-n makes it a dry run, -c compares by checksum, -v is verbose). It will not list material that exists only on the staging server.
$ rsync -rvnc user@production.com:/var/www/vhost/html /var/www/vhost
Two shell scripts have been prepared for Titan. They need to be edited before use.
- rsync_cpfromprod.sh: copy code and public files from the production server.
- mysql_createdb.sh: set up a database on the staging server.
First, at the destination (staging site), copy the code and the public files from the production server:
- Edit rsync_cpfromprod.sh to define PROD, WRP, WRS and SITE.
- Run rsync_cpfromprod.sh.
- If a multisite, move the site's settings directory to default. This also moves the public files subdirectory.
Next, migrate the database and the private files:
- Create the private files directory in the same location as on the production site.
- Back up the database and transfer the gzipped database SQL dump to the staging site.
- Edit mysql_createdb.sh to create the database for the staging site.
- Run mysql_createdb.sh.
- Use gunzip to unpack a copy of the SQL dump.
- Use mysql -u gisle -p mydatabase < mydatabase.mysql to import the database.
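The gunzip and mysql steps above can be sketched as follows. The database and file names are placeholders, and a dummy dump stands in for the real one (in a real migration, the gzipped dump comes from Backup and Migrate on production):

```shell
# Create a dummy gzipped SQL dump to stand in for the real one.
echo 'CREATE TABLE t (id INT);' > mydatabase.mysql
gzip -f mydatabase.mysql            # leaves only mydatabase.mysql.gz

# Unpack a copy of the SQL dump (-k keeps the .gz archive).
gunzip -k mydatabase.mysql.gz

# Import it into the staging database (prompts for the password):
# mysql -u gisle -p mydatabase < mydatabase.mysql
```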
Finish:
- Edit settings.php to contain the correct database name and database credentials.
- Run fixperms to fix permissions and ownerships.
You should now be able to visit the URL of the staging site and inspect the copy.
If there is a WSOD and running any drush command produces: “Fatal error: Call to undefined function cache_get()”, there may be a syntax error in settings.php.
If the site is a multisite (e.g. one of the SNP sites), some image file references may be wrong, as they have the site-specific path hardwired. You can edit the source to fix the path. The first two lines below show two different ways to refer to an image in the public file directory on a multisite, and may be replaced by the last on the migrated staging site (not a multisite).
/vvnf/sites/se.vvnf/files/image.jpg
/sites/org.vvnf/files/image.jpg
/sites/default/files/image.jpg

…or fix it with symlinks. Example of how to do this for vvnf:
First in the webroot, link the root to the subsite name:
$ ln -s . vvnf
Incidentally, this makes the site a subsite of itself.
Then, in the sites directory, symlink all aliases that are in use to default:
$ ln -s default no.vvnf
$ ln -s default se.vvnf
$ ln -s default org.NNF.vvnf
Migration of a production site to a new web server
The instructions below describe what I believe is best practice for moving a production website to a new web server.
- Make sure that the destination is not blocked by a firewall.
- Make a manual backup of the source database and copy it to the destination sink directory.
- On the source web server, replace Drupal with a static web page to tell people that the site is currently being migrated. There is a template in /var/www/0_parked/migrated.html on do20.
- You may now change the zone file to point to the new destination. It typically takes at least one hour from when you do this until the new configuration replaces the old.
- Set up DNS for a staging Drupal website pointing to the new destination (the DNS name may be “staging.” followed by the production site domain name). We are going to use “example.com” as the production site domain name in the examples below. The database name will be “example”.
- Rsync the code base and public files. Remember to also migrate private files if they exist. Do this on the destination. (See above for the meaning of the options.)
- Copy the “settings.php” from the migrated website to the ~/configfiles directory and prefix the filename with the database name.
- Create a static HTML staging site:
- Configure apache2 for the staging site and enable it:
- On the destination, test and reload apache2 for the domain and check that static HTML works.
- Run drupal7_cleaninst.sh to create a clean Drupal installation on the staging site. You may use the database name of the migrated site. You now have a clean install of Drupal 7 on the staging site.
- Log in as the super admin. Set up the private file system on the staging site.
- Make a backup to the manual backups directory to create the backup directory.
- On the staging, replace the newly installed Drupal site with the migrated codebase.
- Roll back the content from the backup into the staging site.
- Set up apache2 for the production site on the destination server.
- Monitor the front page of the migrated site to see when the DNS change takes effect, then restore the code base.
- After the new DNS becomes valid, if you use TLS, the site is going to look broken until you set up TLS on the destination anyway.
- Delete any TLS certificates for the source site. If TLS is enabled, disable it first (before deleting the certificate). Otherwise, apache2 will become confused by the missing certificate.
- Then delete the certificates on the source site:
If the destination website should use TLS, set it up.
Certbot will create a numbered list of all the domains enabled for your Apache web server. You may pick more than one domain, but all the domains you pick will share a single certificate with the same common name (CN). You will typically pick just the domain name and “www.” followed by the domain name. You will also be asked whether to set up redirects.
Below is an example, where we pick two websites (5 and 7) from the numbered list (not shown), and want redirects (option 2):
sudo certbot --apache
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Plugins selected: Authenticator apache, Installer apache
Which names would you like to activate HTTPS for?
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
…
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Select the appropriate numbers separated by commas and/or spaces, or leave input
blank to select all options shown (Enter 'c' to cancel): 5,7
…
Please choose whether or not to redirect HTTP traffic to HTTPS, removing HTTP access.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
1: No redirect - Make no further changes to the web server configuration.
2: Redirect - Make all requests redirect to secure HTTPS access. Choose this for
new sites, or if you're confident your site works on HTTPS. You can undo this
change by editing your web server's configuration.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Select the appropriate number [1-2] then [enter] (press 'c' to cancel): 2
…
- Finally, fix file permissions:
$ rsync -r user@example.com:/var/www/example.com /var/www
$ rsync -ra user@example.com:/var/private/example /var/private
$ cp /var/www/example.com/html/sites/default/settings.php \ ~/configfiles/example.settings.php
$ cp -r 0_parked staging.example.com
$ sudo a2ensite example.com.conf
$ sudo apache2ctl configtest
$ sudo systemctl reload apache2
If the site loads without styling, check the PHP variable $base_url in settings.php.
$ mv example.com staging.example.com
$ mv staging.example.com example.com
If the site was set up with TLS, visiting the URL will typically produce a warning that says: “Warning: Potential Security Risk Ahead”. It will remain in place until TLS is enabled on the destination.
$ sudo a2dissite example.com.conf
$ sudo a2dissite example.com-le-ssl.conf
$ sudo apache2ctl configtest
$ sudo systemctl reload apache2
$ sudo certbot delete
$ cd /var/www/example.com
$ fixperms
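The “monitor the DNS change” step above can also be done from the command line with dig (assuming dig and watch are installed on the destination):

```
# Print the A record the resolvers currently return for the domain.
$ dig +short example.com A
# Repeat the query every 5 minutes until the destination's IP address shows up:
$ watch -n 300 dig +short example.com A
```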
Migration of a subsite
After migrating the default site of a multisite, a slightly different procedure is used to migrate all the subsites sharing the same codebase.
- Use Backup and Migrate to dump the database.
- Copy the database dump to the destination.
- Create an empty database with the right name using the HNM script.
- Gunzip the dump.
- Use mysql to populate the database.
$ sudo mysql database < dump.mysql
- Navigate to the site settings directory.
- Copy the settings file from either ~/configfiles or a previous subsite sharing the same database.
- Edit "settings.php": database, prefix, $drupal_hash_salt.
- Check the site's status report. Fix all problems. To fix character sets, position yourself in the directory holding the "settings.php" of the site and use the following command:
$ drush8 utf8mb4-convert-databases --collation=utf8mb4_danish_ci
- Repeat steps 6 to 9 for all subsites sharing the same database.
- When all sites sharing the same database have been migrated, make a backup.
AWS S3 overview
Current archives:
- norren.no: Daily.
  - attachments: Content of public://attachments. Managed attachments.
  - database: Backup and Migrate SQL dump.
  - image: Content of public://thumbnails/image. Those are the original images.
- pvnext: Daily.
  - attachments: Content of public://attachments. Managed attachments.
  - database: Backup and Migrate SQL dump.
  - image: Content of public://thumbnails/image. Those are the original images.
- snp-hnm: For schedule, see spreadsheet.
  - anbf: private.
  - database: Backup and Migrate SQL dumps for all subsite databases.
  - files: Content of public:// for all subsites (no subdirectories).
  - NNF: private.
  - pictures: Content of public://pictures for all subsites.
  - se.Bergslegen: private.
  - Svanrevet: private.
  - tecknat: drawing library.
  - xxx: private.
Legacy archives:
- gisle-privat-backup: Eline, Photo Realestate. Private stuff, mostly photos.
- hnm_backup: pvn2 (documents uploaded 2017-02-09, goes back to 2014), vedelminner (documents uploaded 2017-02-09), files for SNP and friend sites (from 2018).
- hnm-db-backups: scheduled database backups up to 2020-03-01 from most sites on pvn.no.
- personvernnemnda.no: intranet files, including BM stuff. Nothing newer than February.
- scandinavianaturist.org: Backup of files below sites and private:// up to 2020-04-22. Replaced by snp-hnm.
See also this note about using s3cmd with --delete-removed: https://serverfault.com/questions/517474/s3cmd-with-delete-removed
Destinations for my backups
BM means Backup and Migrate (Drupal module). “-a” is automatic and “-m” is manual.
An overview of my backup locations is also in the spreadsheet. The following locations exist and the following backup methods are used:
- s3-a: Files: Cron job adds new files to bucket daily. DB: Cron job adds dated SQL created by BM (with date as part of file name) daily. So far, no smart delete.
- bm-a: DB backup is to the Scheduled Backups Directory on pvn.no; the schedule is set up in BM on the site (daily). Sometimes manually copied to Ifi (below ~/www_docs/staging2/Backups/Db).
- bm-m: DB backup is to the Manual Backups Directory on pvn.no (manually). Sometimes manually copied to Ifi (below ~/www_docs/staging2/Backups/Db).
- ifi-a: File tree overview, and file assets (most below public://files) from various VIP sites on pvn.no are copied to Ifi daily using cron on pvn.no (below ~/www_docs/staging2/Backups/Files).
- ifi-m: File tree overview, and file assets (most below public://files) from various fairly stable sites on pvn.no are copied to Ifi manually (below ~/www_docs/staging2/Backups/Files).
- s3-low: File assets belonging to low priority sites have been stored in tarballs on Amazon S3, in the bucket hnm-backup. A total DB dump is in bar-20170210.sql. No scheduled backup; the last backup was 2017-02-10.
- tree: File tree overview only.
- cdrom: everything is on a CD-ROM that is kept in my office.
Types of sites
The following lists the sites on pvn.no and classifies them into types that use different backup locations and routines.
There is no backup of PHP and JS unless explicitly mentioned. For Drupal on pvn.no, the code base can be recreated by consulting tree.txt (stored on Ifi as part of the daily backup) to see what files make up the code base, and then reconstructed by downloading fresh copies of contributed modules from Drupal.org, extracting custom modules from my own repo and libraries from elsewhere.
Contract sites
DB: BM is set up to create an on-site timestamped SQL dump every 1 day:
- intranet.pvn.no → ~/pvn2/backup_migrate/scheduled/PVNintranet-date.mysql.gz. The destination is within the private file system for the intranet.
- personvernnemnda.no → ~/db_backups/backup_migrate/scheduled/Personvernnemnda-date.mysql.gz. The destination is the default SQL backup directory (private file system for everything else).
These timestamped mysql dumps are synced to AWS S3 at 20:02 every day (cron) by the script backup_pvnfiles.sh. The destination bucket is personvernnemnda.no.
These things are synced in this bucket:
- extranet_files: New managed image files in the public file system. Only the original image is backed up.
- intranet_files: New intranet files in the private file system. This includes the SQL dump in the private file system. The path is intranet_files/pvn2/. The top folder is for attachment files from the private file system. There are two subfolders: backup_migrate with daily SQL dumps and pictures with pictures from the public file system.
The public website (personvernnemnda.no) XXXX
- All new files in the default SQL backup directory (this includes the extranet DB backup).
Check that this works!
VIP Drupal sites
- casamargarita.no
- hannemyr.no (check if backed up).
- larsvik.no
- minner.vedel.no
- norren.no
- terjerasmussen.no
Check that this works!
Daily scheduled on-site BU of DB by means of BM.
These are synced to AWS S3 at 20:02 every day (cron) by the script backup_pvnfiles.sh, because all new files in the default SQL backup directory are synced.
File content (media assets) are automatically copied to Ifi as tarballs daily.
TODO: Backup files to AWS S3 using conditional backup. Autodelete old DB backups on AWS S3.
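A minimal sketch of the autodelete mentioned in the TODO, assuming the "s3cmd ls" output format "DATE TIME SIZE s3://bucket/key" (the bucket name and cutoff below are examples, not the real configuration):

```shell
# Print the keys of backups dated before a cutoff (YYYY-MM-DD).
# Reads "s3cmd ls"-style lines on stdin; dates in this format
# compare correctly as plain strings.
older_than() {
  awk -v cutoff="$1" '$1 < cutoff { print $4 }'
}

# Real usage (commented out; requires s3cmd and AWS credentials):
# /usr/local/bin/s3cmd ls s3://norren.no/database/ \
#   | older_than "$(date -d '30 days ago' +%F)" \
#   | xargs -r -n1 /usr/local/bin/s3cmd del
```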
Legacy sites:
These are temporarily running on DO vhosts because they require PHP 5 (the others run PHP 7).
Manual file tree backup (htdocs and below) dated 2017-07-10 to ifi-m (all 3).
Manual DB backup dated 2017-07-10 to ifi-m (dbanswers).
Manual DB backup dated 2017-07-10 to ifi-m (roomsaxs & tolfa).
TODO: Auto backup of DB. Auto backup of images/upload (tolfa). Longer term: convert to Drupal.
Low priority (LP) sites with own DB
On pvn.no:
- mjovik.com (stale)
On pvn.no:
- bolig-sameie.no (static demo)
- cc-arkiv.ngoweb.no (frozen archive).
- elvegaarden.net (stale)
- hannemyr.com (stale)
- i18n.no (placeholder, no real content)
- pet.roztr.org (fairly stale)
- predictive-policing (fairly stale)
- nemo-project.org (stale)
- roztr.com (notebook for me only)
I only do manual BU of these.
Sites on pvn.no with only s3-low backup:
- drupalprimer.com (low priority project)
A DB backup from 2017-02-10 exists on Amazon S3. These do not have file content.
HTML sites on pvn.no:
- kristennygaard.org cdrom
- kristianvedel.dk ifi-m: 2017-07-10
- vedel.no ifi-m: 2017-07-10
Sites on pvn.no with no backup:
- copymarks.no (D8 test site, no real content, broken)
- copymarks.org (D7 test site, no real content)
Parked sites on pvn.no:
Drupal code base backup
- pvn.no|contracted: tree.
- pvn.no|VIP: tree.
- pvn.no|LP: tree.
Drupal file assets backup
- pvn.no|contracted: s3-a (daily).
- pvn.no|VIP: ifi-a (daily).
- pvn.no|LP: ifi-m (2017-07-10).
Drupal database backup
- pvn.no|contracted: s3-a (daily).
- pvn.no|VIP: s3-a (daily).
- pvn.no|LP: s3-low (2017-02-10).
Tools
Tools to interact with Amazon AWS S3
Main: AWS management console.
MS Windows: S3 browser. Credentials are in clipperz.
CLI: aws and s3cmd.
See Amazon AWS S3 for instructions about usage.
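For a quick look at what is in a bucket, the CLI tools can list contents; for example (assuming credentials are configured):

```
# List all buckets, then one bucket's contents.
$ aws s3 ls
$ aws s3 ls s3://personvernnemnda.no/ --recursive --human-readable
$ s3cmd ls s3://personvernnemnda.no/
```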
Buckets kept at AWS S3:
- gisle-privat-backup: Some private files. Backed up in 2017-02.
- hnm-backup: All my websites. Backed up in 2017-02.
- hnm-db-backups: Conditional sync of ~/db_backups/…
- personvernnemnda.no: Conditional sync of files for intranet and extranet.
Drupal Backup and Migrate
The Drupal Backup and Migrate (BM) module should be enabled for all contract and VIP Drupal sites at pvn.no, do18, do19 and do20, with a minimal profile and the path to the private file system set to /var/private/identifier (most sites) or bar:/home/gisle/PVN2 (PVN).
- Name: Local
- Backup source: Default database
- Settings profile: minimal
- Most: Active, run every 1 days (see "Sched." in the spreadsheet for details).
- Automatically delete old backups. Simple delete. Keep 10.
- Backup destination: Scheduled backups directory: private://backup_migrate/scheduled.
The configuration for BM, including the name of the backup file, is kept in the database. This means that the configuration is overwritten when a staging site is updated with the latest production snapshot, or a staging database is transferred to production. For now, the only remedy is to correct it manually.
Fix: Provide for a suffix (prod/staging) to be set in settings.php.
Final word
[TBA]
Last update: 2021-01-31 [gh].