PLPython Part 3: Using custom classes, pulling data from PostgreSQL

One of the great things about Python is the plethora of free libraries around to do very cool things from Engineering, statistical modeling, to BioInformatics.

In this Part we shall play around with a simple but pretty neat package called ASCII Plotter - we found at Python Package Index which appears to be kind of a CPAN except for Python instead of Perl.

For more details on using PL/Python check out PostgreSQL docs PostgreSQL 8.3 PL/Python docs

The ASCII Plotter package can also be dowloaded directly from http://www.algorithm.co.il/blogs/index.php/ascii-plotter/

Install ASCII Plotter classes

  1. Download aplotter.zip from above website and extract.
  2. This extracts to one file called aplotter.py
  3. To install do the following - copy aplotter.py to your python25/lib folder (on windows its usually C:\Python25\lib and on Linux (/usr/lib/python2.5)

Using Python contribs in PL/Python and embedding classes

Lets quickly test by writing a function:

Now there are a bunch of functions in the aplotter lib, but the easiest to use is the plot function. Unfortunately the plot function does a print instead of a return which writes to stdout. To use in our PL/Python, we need to be able to grab that text instead of outputting to screen. I found this trick that seems to work nicely which is documented at redirecting python's print statements to a file except its a bit of a misnomer in that it tells you how to output to a buffer which is just what the doctor ordered.

Using our highly advanced copy and paste skills with artistic improvisation -- we have this

CREATE OR REPLACE FUNCTION dofunkyplot (param_anum integer)
  RETURNS text
AS $$
 import aplotter
 import sys
 class WritableObject:
	def __init__(self):
		self.content = ''
	def write(self, string):
		self.content = self.content + string
 saveout = sys.stdout
 outbuffer = WritableObject()
 sys.stdout = outbuffer
 #range (param_anum) will return array
 #consisting of 0 to param_num - 1 and formula 
 # gets applied to each element
 # [1.1*x/(2 + x) for x in range(2)] -> [0 1.1/3]
 aplotter.plot([1.1*x/(2 + x) for x in range(param_anum)])
 sys.stdout = saveout
 return outbuffer.content
$$ LANGUAGE plpythonu;

And now we can impress our friends with tricks like this: The below example will draw 2 plots.

SELECT dofunkyplot(n)
FROM generate_series(5,20, 10) As n;
  |
   +0.73                                                                    *
   |                                                               ---------
   |                                                      ---------
   |                                             ---------
   |                                    ---------
   |                                ----
   |                           -----
   |                      -----
   |                 -----
   |               //
   |             //
   |           //
   |         //
   |       //
   |     //
   |   //
   | //
---+0-----------------------------------------------------------------------+---
   +0                                                                     +4

   |
   +0.96                                                               -----*
   |                                              ---------------------
   |                                    ----------
   |                         -----------
   |                    -----
   |               -----
   |             --
   |          ---
   |        //
   |      //
   |    //
   |   /
   |  /
   | /
   | /
   |/
   /
---+0-----------------------------------------------------------------------+---
   +0                                                                    +14

Reading PostgreSQL data from PL/Python

Now plotting functions is nice, but sometimes we are asked to plot data in a database. Go figure why anyone would want to do that. Below makes fake data, function reads data from a PostgreSQL table and plots it.

##Plotting fake data but lets pretend it is real
CREATE TABLE mydata(pt_id serial, pt_observation numeric(10,4));
INSERT INTO mydata(pt_observation)
	SELECT n*pi()*random()*10
		FROM generate_series(1,15) As n;


CREATE OR REPLACE FUNCTION doplotmydata (param_numitems integer)
  RETURNS text
AS $$
 import aplotter
 import sys
 class WritableObject:
	def __init__(self):
		self.content = ''
	def write(self, string):
		self.content = self.content + string
 saveout = sys.stdout
 outbuffer = WritableObject()
 #lets us just plot the first 10 items
 rv = plpy.execute("SELECT pt_id, pt_observation FROM mydata",param_numitems);
 sys.stdout = outbuffer
 #make pt_id x and pt_observation y
 aplotter.plot([x['pt_id'] for x in rv], [y['pt_observation'] for y in rv]);
 sys.stdout = saveout
 return outbuffer.content
$$ LANGUAGE plpythonu;

##and now to test
SELECT doplotmydata(20);

309.87                                            |                            +
                         /|         \ 
                        / |        | \ 
                         /  |        |  \ 
                       //    |       |   \            *
                      /      |      |     \\         /
                      /      |      |      \        /
                       /        |     |       \       /
                       /        |    |         \     /
                      /         |    |          \   /
            \         \       /         |    |           \  /
             /\        / \      /          |  |             \/
            /  \      /   \    /           |  |              /
           /    \    /     \   /           |  |
           //      \  /       \ /             ||
        -----         \/         \/             ||
    ------               /          /             ||
5.5967==-----------------------------------------------|--------------------+--+
   +1                                                                    +15