Making backups of select tables

Every once in a while, especially if you have a fairly large database, you may find the need to back up only certain tables. Your criteria might be based on name, or on how recently data in a table has changed. Below are some of the tricks we use. Some rely on our favorite hack: generating command-line scripts with SQL.

Backup specifically named tables - no tricks

The simple case is when you know exactly which tables you need backed up and there are few enough that it's easy to type them out. Here you just repeat the -t option for each table you need to back up.

pg_dump -h localhost -p 5432 -U postgres -F c -b -v -f "/pgbak/somedb_keytbls.backup" -t someschema.sometable1 -t someschema.sometable2 somedb
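Since the backup is in custom format (-F c), pg_restore can later pull out individual tables rather than the whole archive. A minimal sketch, reusing the backup file and table name from the command above (note that, depending on version, pg_restore's -t matches on the unqualified table name):

```shell
# Restore a single table from the custom-format archive into somedb.
# sometable1 is taken from the example above; drop or truncate the
# existing table first if its data would conflict.
pg_restore -h localhost -p 5432 -U postgres -d somedb -v \
    -t sometable1 "/pgbak/somedb_keytbls.backup"
```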

Generate script to backup specifically named tables based on table name filter using SQL

We use this approach when, say, we have multiple schemas containing similarly named critical tables. We can't use the -n dump option because the tables cross schemas, but we know they all have similar names. You can get fancier still, for example by filtering on information_schema.columns instead. The statement below generates a pg_dump command to back up any table outside public and pg_catalog whose name ends in notes.

-- Simple go-by-name approach using information_schema.tables
-- (note you could use pg_stat_user_tables instead)
SELECT 'pg_dump ' || ' -h localhost -p 5432 -U postgres  -F c -b -v -f "/pgbak/somedb_keytbls.backup" ' ||
  array_to_string(ARRAY(SELECT '-t ' || table_schema || '.' || table_name   
        FROM information_schema.tables 
        WHERE table_name LIKE '%_notes' AND table_schema NOT IN('pg_catalog','public' )
    ), ' ') || ' somedb';
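The statement above only generates the pg_dump command as text; you still have to run it. One way, sketched here assuming the SELECT is saved in a file named gen_backup.sql (a name made up for this example), is to use psql's tuples-only unaligned output and pipe the result to the shell:

```shell
# -t: tuples only (no headers/footers), -A: unaligned output,
# so psql emits just the generated pg_dump command line.
# gen_backup.sql is assumed to hold the generating SELECT from above.
psql -h localhost -p 5432 -U postgres -d somedb -t -A -f gen_backup.sql | bash
```

You could also redirect the output to a .sh file first if you want to inspect the command before running it.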

Backup recently changed tables using stats view

Sometimes we want to back up just the tables that have recently changed. There doesn't seem to be a perfect way of doing this, so we use the autovacuum stats as the next best thing. This trick only works if you have autovacuum on. It relies on the autovacuum process going around and vacuuming/analyzing tables where enough data has changed since its last run; the thresholds that trigger it can be tweaked per table in PostgreSQL 8.4 and up. The example below generates a pg_dump command to back up all tables in somedb that have been auto-analyzed in the past 24 hours.

-- Backup recently updated tables based on auto analyze date
SELECT 'pg_dump ' || ' -h localhost -p 5432 -U postgres  -F c -b -v -f "/pgbak/somedb_keytbls.backup" ' ||
  array_to_string(ARRAY(SELECT '-t ' || schemaname || '.' || relname 
        FROM pg_stat_user_tables 
        WHERE last_autoanalyze > ( CURRENT_TIMESTAMP - (INTERVAL '1 day') )  )
    , ' ') || ' somedb';
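Before trusting the generated command, it can help to eyeball what the stats collector has recorded. A quick sanity-check query against the same view (a NULL timestamp means autovacuum has never analyzed that table, so it would be skipped by the filter above):

```sql
-- Inspect recent autovacuum/autoanalyze activity per table
SELECT schemaname, relname, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY last_autoanalyze DESC NULLS LAST;
```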