Even though others have blogged about this in the past and its well-documented in the docs, its a frequently enough asked question, that we thought we'd post it here again
with a couple of additional twists.
How to determine the size of a database on disk
SELECT pg_size_pretty(pg_database_size('somedatabase')) As fulldbsize;
How to determine the size of a database table on disk
NOTE: There are two functions in PostgreSQL - pg_relation_size and pg_total_relation_size. The pg_relation_size just measures the size of the actual table where
as the pg_total_relation_size includes both the table and all its toasted tables and indexes.
SELECT pg_size_pretty(pg_total_relation_size('someschema.sometable')) As fulltblsize,
pg_size_pretty(pg_relation_size('someschema.sometable')) As justthetblsize;
How to determine the size of a database schema
When you are doing a lot of data loading for later massaging to dump into your production tables, you will often generate junk tables in the process and worse yet other
people may generate junk tables period. As a general practice, we like to create a schema just for junk. The reason for that is that it is extremely easy to
exclude schemas from being backed up and to load individual schemas. So with this approach we create or move all tables we are not quite sure if they are needed or we only need temporarily
into a schema
called junk, scratch, possiblejunk or whatever you want to call the schema and exclude it from our backups.
Something of the form:
CREATE SCHEMA scratch;
ALTER TABLE someschema.joeyplaying SET SCHEMA scratch;
After saving the world from junk, you would like to know how much space you have saved your backups from backing up. So you would employ a query something of the form:
SELECT pg_size_pretty(pg_database_size('mycurrentdb')) As fullprod,
pg_size_pretty(CAST(pg_database_size('mycurrentdb') - (SELECT SUM(pg_total_relation_size(table_schema || '.' || table_name) )
FROM information_schema.tables WHERE table_schema = 'scratch') As bigint)) As tobebackedup_size,
pg_size_pretty(CAST((SELECT SUM(pg_total_relation_size(table_schema || '.' || table_name) )
WHERE table_schema = 'scratch') As bigint) ) As junk_size;
Size of Geometries in PostGIS tables
PostGIS has a companion function for measuring geometry size which is useful when you want to get a sense of how much space your geometries are taking up on disk
for a set of records.
SELECT ST_Mem_Size(ST_GeomFromText('LINESTRING(220268 150415,220227 150505,220227 150406)'));
SELECT pg_size_pretty(CAST(SUM(ST_Mem_Size(the_geom)) As bigint) ) as totgeomsum
FROM sometable WHERE state = 'MA';
First this is the first year that Ada Byron Lovelace day is celebrated, and somehow I managed to miss it. It was celebrated March 24th, 2009
so I guess I'm a couple of days behind the times. For those who are unfamiliar with who Ada Byron Lovelace is. She was the daughter of the poet Lord Byron
and considered to be the first computer programmer in the world. The fact that she was a woman I consider a side benefit. So I guess this means March 24th is also
computer programmer honor day too.
The OSGEO C-Camp Toronto Sprint was fun, although Leo and I couldn't stay for the whole event. I've never seen people close bugs so quickly.
and Mark Cave-Ayland were on a marathon run in the PostGIS ring. Olivier Courtin was also following not too far behind with SVG bug fixes and so forth. We also
discussed the possiblity of having ST_GeomFromGML, ST_GeomFromGeoJSON, ST_GeomFromKml and so forth and what that would entail.
It was great to meet Pierre Racine of WKT Raster fame in person and chat with Mateusz and Sandro Santilli via IRC. Frank Warmerdam, the GDAL god came to our table to provide his big two cents
about how WKT Raster meta data should be stored, dealing with large RASTERS and other things I didn't understand.
Mark Leslie in Australia did his part too, though he wasn't present -- he would come into IRC when others had fallen asleep. Such is the way with timezones. He has been working on beefing up the
curved support in PostGIS. The FOSS 4G 2009 conference will be in Sydney, Australia.
It was nice to be able to put a face to these people I've talked via newsgroups. It was also strange since most of the clients and many of the people we work
with we have never met, so the idea of meeting in person has become a very foreign concept for us.
To make it up
to Simon, we shall demonstrate 2 features we left out in our prior article.
Pretty SVG graphs -- so I don't have to hear more about SVG from Simon and how I'm caught in the 70s.
Ability to plot a schedule -- schedule item, start date, end date using a multi-column aggregate function. Here is a good use for multi-column aggregates or at least I think so. (Yeh for Multi-column aggregates)
PL/Python is the procedural language for PostgreSQL that allows you to write database stored functions and triggers in Python. Python has proved to be a charming language and when used for where it excels, enhances the power of PostgreSQL quite nicely.
Unfortunately we can't quite capture all its charm in a single pager cheat sheet, but hopefully
this will give you a sense of its usefulness.
Below is a Thumbnail view of the PL\Python cheat sheet. This is by no means comprehensive, but are the features we thought may be useful to know.
PostgreSQL 8.2 and above has this pretty neat feature of allowing you to define aggregate functions
that take more than one column as an input. First we'll start off with a rather pointless but easy to relate to example and then we'll follow up with something
a bit more interesting.
For more examples of creating aggregates in PostgreSQL, check out our other articles: