TRUNCATE problems with Slony
Slony (http://slony.info/) is great for high availability/load balancing and I have been recommending it to users for years now and haven’t really seen any major problems with it if you have got proper checks on the whole replication setup using proper monitoring with something like Nagios.
But well at times there is this annoying thing that always gave me trouble when a user goes and just runs a TRUNCATE on one of the master tables in order to do maintenance and all of the sudden you start getting errors like this on the slave nodes:
2010-06-01 11:02:09 ESTERROR remoteWorkerThread_1: "insert into "public"."table1" ("a","b") values ('1','one');
" ERROR: duplicate key value violates unique constraint "table1_pkey"
The reason behind this is…
- You did a truncate on a table at master node and assumed that this statement is all replicated to slaves as well
- Truncate event is not triggered like INSERT/UPDATE/DELETE (For PostgreSQL < 8.4) so that means the slave never got it and they still have the old copy of the table
- On master the table is now empty but the slave table has old records, which means on inserting new data you might start getting duplicate key violation error
I have used the following approach in the past to solve it…
Do a TRUNCATE on slaves for the same table and you will then see all those INSERTS/DELETES/UPDATES going through in the Slony logs file that have been queued up since Slony started getting these duplicate key violation problems
In order to avoid these scenarios, you have to keep on checking Slony lag times from sl_status of replication schema and alert if it goes above the dangerous lag limit. If it does, go and check the Slony logs as they usually tell the detailed story.
Another option to avoid doing TRUNCATE is by using table partitioning, which means when you need to drop old data just get rid of the specific partition and drop it out of the Slony cluster as well instead of doing a TRUNCATE. This way it will save you a lot of database routine maintenance like vacuuming as well because you are not deleting data using DELETEs but you are now dropping the whole partition.
I have recently seen trigger support for PostgreSQL 8.4 and above but I am not sure if it’s been implemented in Slony yet or what exactly is the plan on that, but it will be great to have that with Slony. I do have seen Bucardo (http://blog.endpoint.com/2009/07/bucardo-and-truncate-triggers.html) supporting truncate with triggers though.