Logical Replication with Skytools3

UPDATE: My coworker Richard liked this write-up, and Skytools, so much that he threw together a demo script. You can get it here.

I recently had to do a near-zero downtime upgrade from PostgreSQL 8.4.x to PostgreSQL 9.4.x for a customer. I couldn’t use streaming replication because of the change in major version (and because it’s simply not present in 8.x), so that left me looking at logical replication options. Usually, everyone else would be thinking Slony right here. I’ve only messed with Slony a few times, but each time was a PITA, and the whole thing just seemed overly complicated to me. So I decided to give Londiste a look.

Londiste is part of the Skytools suite, originally developed by Skype back when they were a ‘no central node’ setup. As such, the thing was literally born to be “master-master” and assumes nodes come and go at will, so it’s got all the tools to handle bringing nodes up/down, marking them active/inactive, catching them up, etc. It’s written in Python, and uses plain text ini files for configuration.

There are really only two hurdles that I found with using Londiste. First, if you can’t get the RPMs from the PGDG Yum Repo, you’re looking at compiling from Git. Second, the online documentation is hard to find and hard to follow, and so few people have used it that you can’t ask RandomPostgresPerson for help.

Which is exactly why I’m writing this blog post. Here’s what I needed to get me through the migration in question. I hope it helps you, should you consider using Londiste for your own replication needs. To wit:

  • As with other logical replication tools, you must ensure that all the tables to be replicated have a valid primary key. So before you even get started, determine which tables are missing them and then pass that list to your junior DBA and have them create pkeys while you continue on:
  • On the PostgreSQL 9.4.x server that will be receiving the replicated data, we need to ensure that all roles are pre-created. We want all ownerships and grants to be identical when we’re done, right? You can use pg_dumpall -g on the PostgreSQL 8.4.x to get a listing of roles.
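The two preparation steps above can be sketched roughly like this. The host names old-db and new-db and the database name myapp are placeholders of mine, and the catalog query is just one common way to list tables without a primary key, not necessarily the exact one used during this migration.

```shell
#!/bin/sh
# Placeholder connection details (assumptions, adjust to your environment).
OLD_HOST="old-db"     # PostgreSQL 8.4.x server
NEW_HOST="new-db"     # PostgreSQL 9.4.x server
DB_NAME="myapp"

# Save a catalog query that lists ordinary tables lacking a primary key.
cat > missing_pkeys.sql <<'SQL'
SELECT n.nspname AS schema_name, c.relname AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (
        SELECT 1
        FROM pg_constraint con
        WHERE con.conrelid = c.oid
          AND con.contype = 'p'
      )
ORDER BY 1, 2;
SQL

# Run the saved query against the old server.
list_missing_pkeys() {
    psql -h "$OLD_HOST" -U postgres -d "$DB_NAME" -f missing_pkeys.sql
}

# Copy roles (globals only) from the old server to the new one.
copy_roles() {
    pg_dumpall -g -h "$OLD_HOST" -U postgres \
        | psql -h "$NEW_HOST" -U postgres -d postgres
}
```

Nothing touches a database until you call list_missing_pkeys or copy_roles yourself. Expect a harmless error from copy_roles for any role that already exists on the new server, such as postgres.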

  • Again, like Slony, we should pre-build the schema on the PostgreSQL 9.4.x server. I think you can actually get Londiste to do this for you as part of the replication, but I couldn’t find anything online for sure, and I didn’t have time to add more experimentation here (we’re on the customer’s dime here, remember). So, use pg_dump over the network and pipe it to pg_restore to transfer the schema thusly:
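A schema-only transfer along those lines might look like the sketch below. The host and database names are placeholders, and I’m assuming the target database does not exist yet, so it gets created first.

```shell
#!/bin/sh
# Placeholder connection details (assumptions, adjust to your environment).
OLD_HOST="old-db"     # PostgreSQL 8.4.x server
NEW_HOST="new-db"     # PostgreSQL 9.4.x server
DB_NAME="myapp"

# Create the empty target database, then stream a schema-only,
# custom-format dump from the old server straight into pg_restore.
copy_schema() {
    createdb -h "$NEW_HOST" -U postgres "$DB_NAME" &&
    pg_dump -h "$OLD_HOST" -U postgres -Fc --schema-only "$DB_NAME" \
        | pg_restore -h "$NEW_HOST" -U postgres -d "$DB_NAME"
}
```

Running copy_schema from the 9.4.x server means the newer pg_dump does the dumping, which is generally the safer direction for a cross-version move.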

  • Install Skytools on the PostgreSQL 9.4.x server using the PGDG repo:
  • Install Skytools from source on the PostgreSQL 8.4.x server:
  • Restart the PostgreSQL 8.4.x cluster to load the new libs and modules

  • Now we configure the Londiste ticker. Note, we have trust set up for the postgres user in pg_hba.conf, so there is no password= in the connection strings. Adjust to meet your setup:
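A minimal ticker.ini for this step could be generated like so. The host name old-db, the database name myapp, and the log and pid directories are assumptions of mine; the keys themselves are standard pgqd settings.

```shell
#!/bin/sh
# Directories for the log and pid files referenced below (assumed layout).
mkdir -p log pid

# Write a minimal pgqd (ticker) configuration.
cat > ticker.ini <<'INI'
[pgqd]
# Connection string without dbname; pgqd adds the database name itself.
base_connstr = user=postgres host=old-db
# Databases the ticker should look after (placeholder name).
database_list = myapp
logfile = log/ticker.log
pidfile = pid/ticker.pid
INI
```

There is no password= entry because of the trust setup mentioned above; add one (or use a .pgpass file) if your pg_hba.conf requires it.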

  • Start up the ticker, to provide the replication “heartbeat” by running pgqd -d ticker.ini

  • Check the ticker.log to ensure there are no warnings or errors! You can stop the ticker with pgqd -s ticker.ini while you fix things.
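If you’d rather not eyeball the logs, a small helper like this hypothetical check_log can do the looking for you. It assumes problem lines contain the words WARNING, ERROR, or FATAL, which is how the Skytools logs I’ve seen are labelled, so treat the pattern as a starting point.

```shell
#!/bin/sh
# Print any WARNING/ERROR/FATAL lines from the given log file.
# Returns 0 when the log is clean and 1 when something was found.
check_log() {
    if grep -E 'WARNING|ERROR|FATAL' "$1"; then
        return 1
    fi
    return 0
}
```

For example, check_log log/ticker.log after starting the ticker; the same helper works for master.log and slave.log later on.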

  • Now, we tell Londiste about the master node (same note applies about the lack of password in the connection string):

  • We have to actually create the master node as the root node by doing:
  • Check the master.log to see if you have a line like INFO Node "master" initialized for queue "myappq" with type "root"
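For the two master-node steps above, a configuration along these lines is a reasonable starting point. The queue name myappq matches the log line quoted above; the job name, host name, database name, and file locations are placeholders.

```shell
#!/bin/sh
# Directories for the log and pid files referenced below (assumed layout).
mkdir -p log pid

# Write a minimal Londiste configuration for the master (root) node.
cat > master.ini <<'INI'
[londiste3]
# Unique name for this worker process (placeholder).
job_name = master_table
# Connection string for the PostgreSQL 8.4.x database.
db = user=postgres host=old-db dbname=myapp
queue_name = myappq
logfile = log/master.log
pidfile = pid/master.pid
INI
```

With that file in place, the root node would be created with something like londiste3 master.ini create-root master "user=postgres host=old-db dbname=myapp", where the last argument is the connection string other nodes will use to reach this one.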

  • Now, spin up the master’s replication worker process by running londiste3 -d master.ini worker

  • Next, we configure our slave node (same note applies about the lack of password in the connection string):

  • Like the master, we have to create the slave node. I created it as a leaf, but I could have created it as a branch if we were going to cascade replication:
  • Check the slave.log to see if you have a line like INFO Node "slave" initialized for queue "myappq" with type "leaf" (or "branch", if that’s how you created it)
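The slave side mirrors the master. Here is a sketch of the configuration, again with placeholder host, database, and job names, and the same queue name as the master.

```shell
#!/bin/sh
# Directories for the log and pid files referenced below (assumed layout).
mkdir -p log pid

# Write a minimal Londiste configuration for the slave node.
cat > slave.ini <<'INI'
[londiste3]
# Unique name for this worker process (placeholder).
job_name = slave_table
# Connection string for the PostgreSQL 9.4.x database.
db = user=postgres host=new-db dbname=myapp
# Must match the queue name used on the master.
queue_name = myappq
logfile = log/slave.log
pidfile = pid/slave.pid
INI
```

Creating the node as a leaf would then look something like londiste3 slave.ini create-leaf slave "user=postgres host=new-db dbname=myapp" --provider="user=postgres host=old-db dbname=myapp". Swap create-leaf for create-branch if you plan to cascade from this node.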

  • Spin up the slave’s replication worker process by running londiste3 -d slave.ini worker

  • Tell the master node that we want to replicate all the tables in the db (londiste3 master.ini add-table --all) as well as all the sequences (londiste3 master.ini add-seq --all). Note that this only adds the tables that currently exist. If you add new tables to the master db, you need to run londiste3 master.ini add-table tablename to add them to replication. Ditto for new sequences.

  • For the slave node, also replicate all the tables (londiste3 slave.ini add-table --all) and all the sequences (londiste3 slave.ini add-seq --all). Note that this only adds the tables that currently exist. If you add new tables to the master db, you also need to run londiste3 slave.ini add-table tablename on the slave to add them to replication. Ditto for new sequences.

At this point, replication is actually up and running. Any changes occurring on the master node are being replicated to the slave node. That’s all you need to do.

But what about the data that was already in the master db? You don’t need to do anything. It’s already replicating. You can forcibly tell Londiste to ‘catch things up’ by doing londiste3 slave.ini resync --all if you like though.

If you want to check on the replication at any point, simply issue londiste3 slave.ini status or, to be more pedantic, londiste3 slave.ini compare, which will examine row counts and md5sums between master and slave.

Enjoy your new cross-version logical replication!
