One of the big changes in PostGIS 2.0 was that the geometry_columns table became a read only view. This change was not without sacrifice. On the plus it meant a table no longer needed to be maintained which was a relief for those who had to constantly ensure the sideline table was up to date. Also on the plus it meant we could base the views on the current user so that if a user connected, they would only see tables they had access to. The sacrifice made was a bit of speed. In most use cases, the speed difference is hardly noticeable and only noticeable if you have a query constantly polling this table. However if you have a database of 15,000 geometry tables it could take as long as 450ms as noted in this ticket.
So just playing with ideas, could the new PostgreSQL 9.3 event triggers and materialized views provide a solution. Keep in mind this is just food for thought. We're not condoning people go out and hack their PostGIS install. We suspect if we implement such a thing in PostGIS core it will change from what we propose here. If you saw our prior article Caching data with materialized views and statement level triggers you know where this is going. We'll add yet another concept to this cocktail and that is what we'll call schema hiding which by that we mean counting on the order of a search_path to hide a named table/view you don't want.
Step one is to install PostGIS in its own schema and make sure it comes after the public schema. This insures that if you do for some reason create a geometry_columns view or table or what have you in public, it will be used before the postgis packaged one.
CREATE SCHEMA postgis; CREATE EXTENSION postgis SCHEMA postgis; ALTER DATABASE test_gis SET search_path = "$user",public,postgis;
Now we'll create a geometry_columns materialized view in the public scheam. Note that because in step 1 we set the search path such that the geometry_columns view provide by postgis would be used after one in the user's schema or public schema. So now we are in position to put forth an alternative geometry_columns and one that will be used instead of the installed one.
The view we build is almost identical to the one provided with postgis except we took out the logic that limits the list of tables to just those the current user has select rights to. We don't want the materialized view to be user dependent.
CREATE MATERIALIZED VIEW public.geometry_columns AS SELECT current_database()::character varying(256) AS f_table_catalog, n.nspname::character varying(256) AS f_table_schema, c.relname::character varying(256) AS f_table_name, a.attname::character varying(256) AS f_geometry_column, COALESCE(NULLIF(postgis.postgis_typmod_dims(a.atttypmod), 2) , postgis.postgis_constraint_dims(n.nspname::text, c.relname::text, a.attname::text), 2) AS coord_dimension, COALESCE(NULLIF(postgis.postgis_typmod_srid(a.atttypmod), 0) , postgis.postgis_constraint_srid(n.nspname::text, c.relname::text, a.attname::text), 0) AS srid, replace(replace(COALESCE(NULLIF(upper(postgis.postgis_typmod_type(a.atttypmod)), 'GEOMETRY'::text) , postgis.postgis_constraint_type(n.nspname::text, c.relname::text, a.attname::text)::text , 'GEOMETRY'::text), 'ZM'::text, ''::text), 'Z'::text, ''::text)::character varying(30) AS type FROM pg_class c, pg_attribute a, pg_type t, pg_namespace n WHERE t.typname = 'geometry'::name AND a.attisdropped = false AND a.atttypid = t.oid AND a.attrelid = c.oid AND c.relnamespace = n.oid AND (c.relkind = 'r'::"char" OR c.relkind = 'v'::"char") AND NOT pg_is_other_temp_schema(c.relnamespace) AND NOT (c.relname = 'raster_columns'::name) ;
For this magic step, we are going to define a data definition language trigger (DDL) that refreshes our view whenever a new table is created or altered or we add constraints. We ned the constraint one to catch constraint based geometry tables.
CREATE OR REPLACE FUNCTION trig_refresh_geometry_columns() RETURNS event_trigger AS $$ BEGIN IF tg_tag IN('CREATE TABLE','CREATE TABLE AS' , 'CREATE FOREIGH TABLE', 'DROP TABLE', 'ALTER TABLE' , 'CREATE VIEW', 'ALTER VIEW') THEN REFRESH MATERIALIZED VIEW public.geometry_columns; END IF; END; $$ LANGUAGE plpgsql; CREATE EVENT TRIGGER trig_01_refresh_geometry_columns ON ddl_command_end EXECUTE PROCEDURE trig_refresh_geometry_columns();
CREATE TABLE testgeom(gid serial primary key, geom geometry(POINT,4326) ); ALTER TABLE testgeom ADD COLUMN geom2 geometry(POLYGON, 4326); SELECT f_table_name As tname, f_geometry_column As gc, srid, type FROM public.geometry_columns;
tname | gc | srid | type ----------+-------+------+--------- testgeom | geom | 4326 | POINT testgeom | geom2 | 4326 | POLYGON
The script below generates 1000 geometry tables.
DO LANGUAGE 'plpgsql' $$ DECLARE i int; BEGIN FOR i IN 1 .. 1000 LOOP EXECUTE 'CREATE TABLE gtest' || i::text || '(gid serial primary key, geom geometry(POINT,4326))'; END LOOP; END $$
If run with the trigger in place, it takes 63062 ms (a little over a minute to finish). Without the trigger it takes 3,831ms ~ 3seconds). This is because each call to CREATE TABLE causes a REFRESH VIEW. So moral of that story, if you plan to create 1000 tables in one go, then disable the trigger and manually refresh the view yourself.
ALTER EVENT TRIGGER trig_01_refresh_geometry_columns DISABLE; --do your crazy thing - ALTER EVENT TRIGGER trig_01_refresh_geometry_columns ENABLE; REFRESH MATERIALIZED VIEW geometry_columns;
Now to test speed of querying. Since we put the public first in our search_path, when we connect to database, the public.geometry_columns one will be used if we don't schema qualify.
-- 11ms SELECT count(*) from geometry_columns; -- 41ms SELECT count(*) from postgis.geometry_columns; -- this uses the one in public - takes 11 ms SELECT srid, type, count(*) as tot FROM geometry_columns GROUP BY srid,type; --this uses the one in postgis takes 61 ms SELECT srid, type, count(*) as tot FROM postgis.geometry_columns GROUP BY srid,type;