Upgrading your PostgreSQL database from one major version (e.g. 9.4.x) to another major version (e.g. 9.5.x) used to a painful and exceedingly slow process. You essentially had two options: dump / reload the data or use one of the complex logical replication tools.
Thankfully, the PostgreSQL team introduced
pg_upgrade back in version 9.0. Because the way data is stored internally in its datafiles in PostgreSQL rarely changes,
pg_upgrade is able to re-use the existing datafiles (while manipulating some catalog entries) to “short circuit” the upgrade process. While this isn’t (yet) a true “in place upgrade” as done by some other databases, it’s pretty close. And it’s stupid fast. In my testing on my overworked Macbook Pro, it took 1/5 as long to upgrade as a traditional dump and reload. So, let’s look at this process shall we?
First, we assume that we have both PostgreSQL 9.5 and 9.6 installed and both have initialized (empty) clusters (see here if you need to do this). We’re going to use
pgbench to create some data in our PostgreSQL 9.5 instance:
doug@Douglass-MacBook-Pro ~/foo » pg 9.5 doug@Douglass-MacBook-Pro ~/foo » createdb bench1; createdb bench2; createdb bench3 doug@Douglass-MacBook-Pro ~/foo » pgbench -i -s 15 bench1 ; pgbench -i -s 70 bench2 ; pgbench -i -s 600 bench3 doug@Douglass-MacBook-Pro ~/foo » pgbench -c 4 -j 2 -T 600 bench1 ; pgbench -c 4 -j 2 -T 600 bench2 ; pgbench -c 4 -j 2 -T 600 bench3
Now that we’ve got data in our cluster, we can do the dump. If this were a production instance, this is where you’d have to stop your application(s).
doug@Douglass-MacBook-Pro ~/foo » time pg_dumpall > data.sql pg_dumpall > data.sql 20.57s user 30.63s system 4% cpu 18:43.70 total
We’ve now dumped out all our data, and spent 18 minutes with the application(s) down. Let’s restore our data to the PostgreSQL 9.6 cluster now:
doug@Douglass-MacBook-Pro ~/foo » pg 9.6 doug@Douglass-MacBook-Pro ~/foo » time psql -f data.sql psql -f data.sql 14.53s user 18.30s system 1% cpu 37:48.49 total
After 37 minutes, our data is back and we can start our applications back up. An outage of approximately 56.5 minutes.
Now, let’s blow away our PostgreSQL 9.6 cluster and use
pg_upgrade to complete the same task. You would do this with the application(s) down as well!
doug@Douglass-MacBook-Pro ~/foo » rm -fr $PGDATA/* doug@Douglass-MacBook-Pro ~/foo » initdb $PGDATA doug@Douglass-MacBook-Pro ~/foo » export OPGDATA=$PGDATA/../9.5 doug@Douglass-MacBook-Pro ~/foo » time pg_upgrade -d $OPGDATA -D $PGDATA -b /usr/local/opt/postgresql-9.5/bin -B /usr/local/opt/postgresql-9.6/bin pg_upgrade -d $OPGDATA -D $PGDATA -b /usr/local/opt/postgresql-9.5/bin -B 0.40s user 12.12s system 1% cpu 10:26.64 total
And we’re done in 10.5 minutes. It took 1/5 the outage of the dump / load method. And that’s on my puny dataset with my overworked laptop! Pretty impressive, no?
For the curious, the
pg_upgrade output that I omitted above for readability’s sake is: