Installing pgBackRest on OSX


If you’ve followed my previous posts (here and here), then you already have one or more versions of PostgreSQL installed on your Mac. Maybe these are solely for test or dev purposes and you don’t really care about any of the data therein, but if you do, let me guide you to pgBackRest.

pgBackRest aims to be a simple, reliable backup and restore system that can seamlessly scale up to the largest databases and workloads.

Instead of relying on traditional backup tools like tar and rsync, pgBackRest implements all backup features internally and uses a custom protocol for communicating with remote systems. Removing reliance on tar and rsync allows for better solutions to database-specific backup challenges. The custom remote protocol allows for more flexibility and limits the types of connections that are required to perform a backup which increases security.

pgBackRest is written in Perl, but don’t hold that against it. As of the 1.19 release, pgBackRest can use S3 buckets as the storage backend. I really like pgBackRest and tend to use it for myself and my customers over any of the other tools in the PostgreSQL ecosystem. So, let’s get started by downloading the latest release from their site, and then installing it. For some reason, no one has added pgBackRest to Homebrew yet (someone, pls!) so let’s do it the manual way:

(Keep in mind that I already had Perl set up to connect to PostgreSQL for other uses. You might need to install DBD::Pg.)

Now that pgBackRest is installed, let’s configure it. First, we’ll want to set some of the global properties that affect all pgBackRest operations:
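A minimal sketch of what that [global] section might look like, assuming the option names from the 1.x releases (2.x renamed the repository options with a repo1- prefix). The bucket name is the one used in this post; the endpoint, region, retention count, and key values are placeholders, and the scratch filename is arbitrary. pgBackRest 1.x reads /etc/pgbackrest.conf by default, so the contents would end up there:

```shell
# Write a sample [global] section to a scratch file for review.
# Placeholder values: endpoint, region, retention count, and both keys.
conf_file=pgbackrest-global.sample.conf
cat > "$conf_file" <<'EOF'
[global]
log-level-console=info
repo-s3-bucket=hunleyd-pgbackrest
repo-s3-endpoint=s3.amazonaws.com
repo-s3-key=YOUR_ACCESS_KEY
repo-s3-key-secret=YOUR_SECRET_KEY
repo-s3-region=us-east-1
repo-type=s3
retention-full=2
start-fast=y
EOF
cat "$conf_file"
```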

As you can see, we set the following:

  • force the log level for all console output to ‘info’
  • define the S3 bucket we want to use
  • define the S3 endpoint to connect to
  • define our S3 key
  • define our S3 secret key
  • set which region our bucket is in
  • tell pgBackRest that we’re using S3 as the backend
  • configure retention of full backups
  • tell pgBackRest to issue a CHECKPOINT so backups can start right away instead of waiting for the next regular checkpoint

Now, we need to tell pgBackRest which instance of PostgreSQL we want to back up and where to find it. Again, if you used my previous posts to install multiple versions via Homebrew, this should look familiar:
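A sketch of the per-cluster sections, again assuming the 1.x option names (2.x uses pg1-path and pg1-port instead of db-path and db-port). The stanza names match the clusters in this post, but the data directory paths and port numbers below are hypothetical, so substitute your own. These sections live in the same pgbackrest.conf as [global]:

```shell
# Write sample stanza sections to a scratch file for review.
# The db-path and db-port values are hypothetical examples.
stanza_file=pgbackrest-stanzas.sample.conf
cat > "$stanza_file" <<'EOF'
[96]
db-path=/usr/local/var/postgres96
db-port=5496
repo-path=/96

[95]
db-path=/usr/local/var/postgres95
db-port=5495
repo-path=/95

[92]
db-path=/usr/local/var/postgres92
db-port=5492
repo-path=/92
EOF
cat "$stanza_file"
```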

You can see for each pg cluster, we define:

  • the path to the $PGDATA directory
  • the port the cluster listens on
  • and the path we want to store the backups in on our backend

When you put this all together, we’ll be connecting to an S3 bucket called, creatively enough, hunleyd-pgbackrest and then we will create a top-level directory (‘96’, ‘95’, etc) to store each cluster’s backups in.

Now that we’ve got our configuration complete, let’s do an initial backup of one of the clusters. First, we have to create the appropriate directories and metadata on the backend:
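Assuming the stanza is named 92 as in the config, this step might look like the following. The guard is only there so the snippet does nothing on a machine without pgBackRest; normally you would just run the command directly:

```shell
stanza=92
cmd="pgbackrest --stanza=$stanza stanza-create"
echo "$cmd"
# Run it only if pgbackrest is on the PATH.
if command -v pgbackrest >/dev/null 2>&1; then
  $cmd
fi
```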

Then, we have pgBackRest verify that everything is properly set up. Note that this includes checking to ensure you tweaked postgresql.conf according to the directions on their site (I’m not going to repeat them here):
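A sketch of the verification step for the same 92 stanza, using pgBackRest's check command. As before, the guard simply skips execution when pgBackRest is not installed:

```shell
stanza=92
cmd="pgbackrest --stanza=$stanza check"
echo "$cmd"
# Run it only if pgbackrest is on the PATH.
if command -v pgbackrest >/dev/null 2>&1; then
  $cmd
fi
```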

And since that all worked, we can take our first actual backup:
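Something like the following should kick off a full backup of the 92 stanza. The --type=full flag is spelled out here for clarity; the guard only skips execution on a machine without pgBackRest:

```shell
stanza=92
cmd="pgbackrest --stanza=$stanza --type=full backup"
echo "$cmd"
# Run it only if pgbackrest is on the PATH.
if command -v pgbackrest >/dev/null 2>&1; then
  $cmd
fi
```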

Neat!

Now, let’s check our S3 bucket, shall we?

[Screenshot: s3_1]

You can see here the top-level contents of my hunleyd-pgbackrest bucket. As stated before, each cluster gets its own sub-dir. Since we just backed up the ‘92’ cluster, let’s look inside its dir.

[Screenshot: s3_2]

You can see that pgBackRest has created a directory for the WALs to be stored in whenever archive_command fires and another directory for the actual cluster backups. Peeking into the archive dir, we see:

[Screenshot: s3_3]

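This shows us some metadata, and shows that pgBackRest creates a directory named for the PostgreSQL version plus an archive ID (a counter that pgBackRest tracks in its archive metadata, not the timeline). Since this is the first archive ID for our 9.2 cluster, we have a 9.2-1 directory, inside of which we find: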

[Screenshot: s3_4]

Our archived WALs have been compressed and uploaded. Hurray!

Now, let’s check inside the backup directory:

[Screenshot: s3_5]

We can see some metadata, and we can see a folder named the same as the backup label that was used when we ran our full backup. Inside that folder, we can see:

[Screenshot: s3_6]

Hey look, more metadata! And another folder! :) So, let’s dive into the pg_data folder where we see:

[Screenshot: s3_7]

Holy crap! It’s a basebackup of our $PGDATA data directory. And all the files have been nicely compressed for us. Rock on, pgBackRest!

And just in case you wanted to see the current backup catalog:
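The catalog comes from pgBackRest's info command. A sketch, with the same guard so it is a no-op where pgBackRest is not installed:

```shell
cmd="pgbackrest info"
echo "$cmd"
# Run it only if pgbackrest is on the PATH.
if command -v pgbackrest >/dev/null 2>&1; then
  $cmd
fi
```

The text output should list each backup with both its database size and its repository size, which is where the compression shows up.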

(look at that compression!)
