Upgrading PostgreSQL 5x faster

Upgrading your PostgreSQL database from one major version (e.g. 9.4.x) to another major version (e.g. 9.5.x) used to be a painful and exceedingly slow process. You essentially had two options: dump / reload the data or use one of the complex logical replication tools.

Thankfully, the PostgreSQL team introduced pg_upgrade back in version 9.0. Because the on-disk format of PostgreSQL’s datafiles rarely changes between major versions, pg_upgrade is able to re-use the existing datafiles (while manipulating some catalog entries) to “short circuit” the upgrade process. While this isn’t (yet) a true “in place upgrade” as done by some other databases, it’s pretty close. And it’s stupid fast. In my testing on my overworked MacBook Pro, it took 1/5 as long to upgrade as a traditional dump and reload. So, let’s look at this process, shall we?
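To make the “re-use the existing datafiles” idea a bit more concrete: pg_upgrade can either copy the old datafiles into the new cluster (its default) or, with the --link option, hard-link them instead. The sketch below is not pg_upgrade itself, just a throwaway demonstration of what a hard link is. The directory layout and the file name 16384 are made up for illustration and are not real PostgreSQL paths.

```shell
# Throwaway demo of hard links, the mechanism behind pg_upgrade --link.
# All paths here are hypothetical scratch files, not a real cluster.
workdir=$(mktemp -d)
mkdir "$workdir/old" "$workdir/new"
printf 'pretend this is a table datafile\n' > "$workdir/old/16384"

# A hard link adds a second directory entry for the same on-disk data.
# Nothing is copied, so it is near-instant regardless of file size.
ln "$workdir/old/16384" "$workdir/new/16384"

# Both names should report the same inode number.
old_inode=$(ls -i "$workdir/old/16384" | awk '{print $1}')
new_inode=$(ls -i "$workdir/new/16384" | awk '{print $1}')
echo "old inode: $old_inode, new inode: $new_inode"

# Clean up the scratch directory.
rm -rf "$workdir"
```

Hard links only work when both directories live on the same filesystem, which is why link mode needs the old and new clusters on one volume.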

First, we assume that we have both PostgreSQL 9.5 and 9.6 installed and both have initialized (empty) clusters (see here if you need to do this). We’re going to use pgbench to create some data in our PostgreSQL 9.5 instance:

doug@Douglass-MacBook-Pro ~/foo » pg 9.5
doug@Douglass-MacBook-Pro ~/foo » createdb bench1; createdb bench2; createdb bench3
doug@Douglass-MacBook-Pro ~/foo » pgbench -i -s 15 bench1 ; pgbench -i -s 70 bench2 ; pgbench -i -s 600 bench3
doug@Douglass-MacBook-Pro ~/foo » pgbench -c 4 -j 2 -T 600 bench1 ; pgbench -c 4 -j 2 -T 600 bench2 ; pgbench -c 4 -j 2 -T 600 bench3
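For anyone unfamiliar with pgbench: -i initializes the test tables and -s sets the scale factor, while the second round of commands runs the benchmark with 4 clients (-c), 2 worker threads (-j) for 600 seconds (-T). Each unit of scale factor adds 100,000 rows to the pgbench_accounts table, so a rough sketch of how much data the three databases start with looks like this (the variable names are just mine):

```shell
# pgbench -i -s N creates N * 100,000 rows in pgbench_accounts.
rows_per_scale=100000
total_rows=0

# Scale factors used for bench1, bench2 and bench3 above.
for scale in 15 70 600; do
  rows=$(( scale * rows_per_scale ))
  total_rows=$(( total_rows + rows ))
  echo "scale $scale -> $rows rows in pgbench_accounts"
done

echo "total: $total_rows rows"
```

This only counts the initial pgbench_accounts rows. The 10-minute benchmark runs add history rows on top of that.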

Now that we’ve got data in our cluster, we can do the dump. If this were a production instance, this is where you’d have to stop your application(s).

doug@Douglass-MacBook-Pro ~/foo » time pg_dumpall > data.sql
pg_dumpall > data.sql  20.57s user 30.63s system 4% cpu 18:43.70 total

We’ve now dumped out all our data, and spent almost 19 minutes with the application(s) down. Let’s restore our data to the PostgreSQL 9.6 cluster now:

doug@Douglass-MacBook-Pro ~/foo » pg 9.6
doug@Douglass-MacBook-Pro ~/foo » time psql -f data.sql
psql -f data.sql  14.53s user 18.30s system 1% cpu 37:48.49 total

After nearly 38 minutes, our data is back and we can start our applications back up. That’s an outage of approximately 56.5 minutes.
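If you want to check my arithmetic, here is a quick sketch that adds up the two wall-clock times reported by time. I’m dropping the fractional seconds, so it lands about a second short of the exact figure.

```shell
# Wall-clock times from the two runs above, converted to whole seconds.
dump_s=$(( 18 * 60 + 43 ))     # pg_dumpall: 18:43.70
restore_s=$(( 37 * 60 + 48 ))  # psql -f:    37:48.49

total_s=$(( dump_s + restore_s ))
echo "total outage: $(( total_s / 60 ))m $(( total_s % 60 ))s"
```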

Now, let’s blow away our PostgreSQL 9.6 cluster and use pg_upgrade to complete the same task. You would do this with the application(s) down as well!

doug@Douglass-MacBook-Pro ~/foo » rm -fr $PGDATA/*
doug@Douglass-MacBook-Pro ~/foo » initdb $PGDATA
doug@Douglass-MacBook-Pro ~/foo » export OPGDATA=$PGDATA/../9.5
doug@Douglass-MacBook-Pro ~/foo » time pg_upgrade -d $OPGDATA -D $PGDATA -b /usr/local/opt/postgresql-9.5/bin -B /usr/local/opt/postgresql-9.6/bin
pg_upgrade -d $OPGDATA -D $PGDATA -b /usr/local/opt/postgresql-9.5/bin -B   0.40s user 12.12s system 1% cpu 10:26.64 total

And we’re done in about 10.5 minutes. That’s roughly 1/5 the outage of the dump / load method. And that’s on my puny dataset with my overworked laptop! Pretty impressive, no?
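For the record, the ratio works out like this. It’s a back-of-the-envelope sketch using the three wall-clock times from the runs above with fractional seconds dropped, and awk for the division since shell arithmetic is integer-only.

```shell
# Dump (18:43) plus restore (37:48) versus pg_upgrade (10:26), in seconds.
dump_reload_s=$(( (18 * 60 + 43) + (37 * 60 + 48) ))
upgrade_s=$(( 10 * 60 + 26 ))

# Shell arithmetic has no floating point, so let awk do the division.
speedup=$(awk -v a="$dump_reload_s" -v b="$upgrade_s" 'BEGIN { printf "%.1f", a / b }')
echo "pg_upgrade was about ${speedup}x faster"
```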

For the curious, the pg_upgrade output that I omitted above for readability’s sake is:
