Upgrading PostgreSQL 5x faster

Upgrading your PostgreSQL database from one major version (e.g. 9.4.x) to another (e.g. 9.5.x) used to be a painful and exceedingly slow process. You essentially had two options: dump and reload the data, or use one of the complex logical-replication tools.

Thankfully, the PostgreSQL team introduced pg_upgrade back in version 9.0. Because the on-disk format of PostgreSQL’s datafiles rarely changes between major versions, pg_upgrade is able to re-use the existing datafiles (while manipulating some catalog entries) to “short circuit” the upgrade process. While this isn’t (yet) a true “in place upgrade” as done by some other databases, it’s pretty close. And it’s stupid fast: in my testing on my overworked MacBook Pro, the upgrade took 1/5 as long as a traditional dump and reload. So, let’s look at the process, shall we?

First, we assume that we have both PostgreSQL 9.5 and 9.6 installed and both have initialized (empty) clusters (see here if you need to do this). We’re going to use pgbench to create some data in our PostgreSQL 9.5 instance:

$ pg 9.5
$ createdb bench1; createdb bench2; createdb bench3
$ pgbench -i -s 15 bench1 ; pgbench -i -s 70 bench2 ; pgbench -i -s 600 bench3
$ pgbench -c 4 -j 2 -T 600 bench1 ; pgbench -c 4 -j 2 -T 600 bench2 ; pgbench -c 4 -j 2 -T 600 bench3
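An aside on that first command: `pg` isn’t a stock PostgreSQL tool, it’s a little helper that flips `PATH` and `PGDATA` over to the chosen version. If you don’t already have something like it, a minimal sketch might look like this (the two paths are assumptions based on a Homebrew-style install; adjust them to your own layout):

```shell
# Hypothetical version-switcher -- `pg` is not a stock PostgreSQL command.
pg() {
    ver="$1"
    export PGDATA="$HOME/pgdata/$ver"                       # assumed per-version data directory
    export PATH="/usr/local/opt/postgresql-$ver/bin:$PATH"  # assumed per-version binaries
}
```

With that in place, `pg 9.5` makes all the usual tools (`psql`, `createdb`, `pgbench`, …) resolve to the 9.5 install and point at the 9.5 data directory.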

Now that we’ve got data in our cluster, we can do the dump. If this were a production instance, this is where you’d have to stop your application(s).

$ time pg_dumpall > data.sql
pg_dumpall > data.sql  20.57s user 30.63s system 4% cpu 18:43.70 total

We’ve now dumped out all our data, having spent almost 19 minutes with the application(s) down. Let’s restore our data to the PostgreSQL 9.6 cluster now:

$ pg 9.6
$ time psql -f data.sql
psql -f data.sql  14.53s user 18.30s system 1% cpu 37:48.49 total

After almost another 38 minutes, our data is back and we can start our applications back up. That’s a total outage of approximately 56.5 minutes.
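If you want to check my math, the 56.5 comes straight from the two `time` outputs above, converting the mm:ss wall-clock figures to seconds:

```shell
# 18:43.70 for pg_dumpall plus 37:48.49 for the psql restore, in whole seconds:
dump_s=$((18 * 60 + 44))            # pg_dumpall, rounded up to 18:44
restore_s=$((37 * 60 + 48))         # psql restore
total_s=$((dump_s + restore_s))
echo "$((total_s / 60)) min $((total_s % 60)) s"    # → 56 min 32 s
```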

Now, let’s blow away our PostgreSQL 9.6 cluster and use pg_upgrade to complete the same task. You would do this with the application(s) down as well!

$ rm -fr $PGDATA/*
$ initdb $PGDATA
$ export OPGDATA=$PGDATA/../9.5
$ time pg_upgrade -d $OPGDATA -D $PGDATA -b /usr/local/opt/postgresql-9.5/bin -B /usr/local/opt/postgresql-9.6/bin
pg_upgrade -d $OPGDATA -D $PGDATA -b /usr/local/opt/postgresql-9.5/bin -B   0.40s user 12.12s system 1% cpu 10:26.64 total

And we’re done in 10.5 minutes. It took 1/5 the outage of the dump / load method. And that’s on my puny dataset with my overworked laptop! Pretty impressive, no?
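One production-minded aside: pg_upgrade also has a `--check` flag that runs only the compatibility checks and leaves both clusters untouched, so you can rehearse the upgrade before taking any downtime. A sketch using the same paths as above (guarded so it’s a no-op on machines where pg_upgrade isn’t on the PATH):

```shell
# Dry run only: report incompatibilities between the clusters, modify nothing.
OPGDATA=$PGDATA/../9.5
if command -v pg_upgrade >/dev/null 2>&1; then
    pg_upgrade --check -d "$OPGDATA" -D "$PGDATA" \
        -b /usr/local/opt/postgresql-9.5/bin \
        -B /usr/local/opt/postgresql-9.6/bin
fi
```

And if even the datafile copy is too slow for you, pg_upgrade’s `--link` mode hard-links the files into the new cluster instead of copying them, at the cost of making the old cluster unusable once the new one starts.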

For the curious, the pg_upgrade output that I omitted above for readability’s sake is:

Performing Consistency Checks
+++++++++++++++++++++++++++--
Checking cluster versions                                   ok
Checking database user is the install user                  ok
Checking database connection settings                       ok
Checking for prepared transactions                          ok
Checking for reg* system OID user data types                ok
Checking for contrib/isn with bigint-passing mismatch       ok
Checking for roles starting with 'pg_'                      ok
Creating dump of global objects                             ok
Creating dump of database schemas
                                                            ok
Checking for presence of required libraries                 ok
Checking database user is the install user                  ok
Checking for prepared transactions                          ok

If pg_upgrade fails after this point, you must re-initdb the
new cluster before continuing.

Performing Upgrade
++++++++++++++++++
Analyzing all rows in the new cluster                       ok
Freezing all rows on the new cluster                        ok
Deleting files from new pg_clog                             ok
Copying old pg_clog to new server                           ok
Setting next transaction ID and epoch for new cluster       ok
Deleting files from new pg_multixact/offsets                ok
Copying old pg_multixact/offsets to new server              ok
Deleting files from new pg_multixact/members                ok
Copying old pg_multixact/members to new server              ok
Setting next multixact ID and offset for new cluster        ok
Resetting WAL archives                                      ok
Setting frozenxid and minmxid counters in new cluster       ok
Restoring global objects in the new cluster                 ok
Restoring database schemas in the new cluster
                                                            ok
Copying user relation files                                 ok
Setting next OID for new cluster                            ok
Sync data directory to disk                                 ok
Creating script to analyze new cluster                      ok
Creating script to delete old cluster                       ok

Upgrade Complete
+++++++++++++++-
Optimizer statistics are not transferred by pg_upgrade so,
once you start the new server, consider running:
    ./analyze_new_cluster.sh

Running this script will delete the old cluster's data files:
    ./delete_old_cluster.sh
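For what it’s worth, on these versions `analyze_new_cluster.sh` is a thin wrapper around vacuumdb’s staged ANALYZE, so you can also call it directly once the new server is up (guarded here so it’s a no-op where vacuumdb isn’t on the PATH):

```shell
# What analyze_new_cluster.sh runs under the hood: a staged ANALYZE that
# produces minimal planner statistics quickly, then refines them.
stats_cmd="vacuumdb --all --analyze-in-stages"
if command -v vacuumdb >/dev/null 2>&1; then
    $stats_cmd
fi
```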