Postgres OnLine Journal: Jan 2017 - Jun 2017
An in-depth Exploration of the PostgreSQL Open Source Database
 

Table Of Contents

What's new and upcoming in PostgreSQL
PostgreSQL Q & A
Basics
Using PostgreSQL Extensions

What's new and upcoming in PostgreSQL

 

CASE with set returning functions in PostgreSQL 10



One of the changes coming in PostgreSQL 10 is the ability for the CASE .. WHEN statement to return multiple rows if the expression contains a set returning function. To demonstrate the feature, we'll use the classic generate_series function:

Since this article was written, this feature has been removed, because it caused old logic that had sets from doing something different without warning. Refer to this discussion for details.

generate_series with CASE WHEN

SELECT CASE generate_series(1,10) WHEN 1 THEN 'First' ELSE 'Not First' END;

In PostgreSQL 10, this returns:

   case
-----------
 First
 Not First
 Not First
 Not First
 Not First
 Not First
 Not First
 Not First
 Not First
 Not First
(10 rows)

In PostgreSQL 9.6 and below, you are slapped with an error message:

ERROR:  set-valued function called in context that cannot accept a set

Interesting, right, but how is this useful? Why would I ever do that instead of something like this which will work in all versions?

SELECT CASE i WHEN 1 THEN 'First' ELSE 'Not First' END
FROM generate_series(1,10) AS i;

The answer seems reading from the commit notes, that it's a side effect of other optimizations and introduction of new ProjectSet executor node.

As a side effect, the previously prohibited case of multiple set returning arguments to a function, is now allowed. Not because it's particularly desirable, but because it ends up working and there seems to be no argument for adding code to prohibit it.

Currently the behavior for COALESCE and CASE containing SRFs has changed, returning multiple rows from the expression, even when the SRF containing "arm" of the expression is not evaluated. That's because the SRFs are evaluated in a separate ProjectSet node. As that's quite confusing, we're likely to instead prohibit SRFs in those places. But that's still being discussed, and the code would reside in places not touched here, so that's a task for later.

It should be noted though that, the new approach if you wanted to do something kinda crazy like that does seem to be faster than the standard approach.

Here is a quick test I did to compare the results and timing:

New feature way:

SELECT count(*)
FROM (SELECT CASE generate_series(1,10000000) WHEN 1 THEN 'First' ELSE 'Not First' END) AS f;

Uses this new thing called a ProjectSet executor and takes ~1 second

Aggregate  (cost=32.52..32.53 rows=1 width=8) (actual time=1036.048..1036.048 rows=1 loops=1)
  Output: count(*)
  ->  Result  (cost=0.00..20.02 rows=1000 width=32) (actual time=0.004..722.375 rows=10000000 loops=1)
        Output: CASE (generate_series(1, 10000000)) WHEN 1 THEN 'First'::text ELSE 'Not First'::text END
        ->  ProjectSet  (cost=0.00..5.02 rows=1000 width=4) (actual time=0.002..329.822 rows=10000000 loops=1)
              Output: generate_series(1, 10000000)
              ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.001 rows=1 loops=1)
Planning time: 0.034 ms
Execution time: 1036.082 ms

Good old backwards compatible way:

SELECT count(*)
   FROM (SELECT CASE i WHEN 1 THEN 'First' ELSE 'Not First' END
             FROM generate_series(1,10000000) AS i) AS f;
Aggregate  (cost=12.50..12.51 rows=1 width=8) 
	(actual time=1800.258..1800.259 rows=1 loops=1)
  Output: count(*)
  ->  Function Scan on pg_catalog.generate_series i  (cost=0.00..10.00 rows=1000 width=0) 
  		(actual time=863.841..1469.087 rows=10000000 loops=1)
        Output: i.i
        Function Call: generate_series(1, 10000000)
Planning time: 0.032 ms
Execution time: 1818.415 ms

Wow that's much slower.

One thing I love about PostgreSQL that you miss with closed-source and even with other open source projects, is that when you are curious about a new feature or why it's there there is plenty of dialogue on the hacker list and commit logs to keep you entertained.


PostgreSQL Q & A

 

Composite data types why IS NOT NULL is not the same as NOT IS NULL



With most types you deal with in SQL and more specifically PostgreSQL, when you use a condition such as something IS NULL and it returns false, you can be assured that something IS NOT NULL would return true. This is not the case with composite types.

Composite types are types that can contain multiple attributes. A classical case is a table row type that exists for all tables in PostgreSQL. If you have a value something that is a composite type, for example a row of a table, something IS NOT NULL is only true if none of the attributes of the composite are NULL.

Let's see this in action with a built in table in PostgreSQL

SELECT  COUNT(1) AS total, 
    COUNT(1) FILTER (WHERE t IS NULL) AS tnull, 
    COUNT(1) FILTER (WHERE t IS NOT NULL) AS tisnotnull,
    COUNT(1) FILTER (WHERE NOT (t is NULL) ) AS tnotisnull
FROM pg_tables AS t;

Now intuition might be telling you, if the total number of records in pg_tables is 62 and none of them are NULL, then certainly none of them should be IS NOT NULL. You might also be thinking that asking for NOT (something IS NULL) is just another way of asking something IS NOT NULL. The output instead yields something somewhat surprising:

 total | tnull | tisnotnull | tnotisnull
 -------+-------+------------+------------
    62 |     0 |         10 |         62
(1 row)

The other take-aways from this exercise is if you want the anti-thesis of IS NULL, you really want to use NOT (something IS NULL) and not (something IS NOT NULL); using something IS NOT NULL is a terse way of checking if all the fields in your record are filled in.

So taking a peak at some of the records that violate common-sense of IS NOT NULL.

SELECT  *
FROM pg_tables AS t
WHERE NOT (t IS NOT NULL) AND NOT (t IS NULL)
limit 2;

They all have at least one attribute that IS NULL, in this case the tablespace.

 schemaname |  tablename   | tableowner | tablespace | hasindexes | hasrules | hastriggers | rowsecurity
------------+--------------+------------+------------+------------+----------+-------------+-------------
 pg_catalog | pg_statistic | postgres   | NULL       | t          | f        | f           | f
 pg_catalog | pg_type      | postgres   | NULL       | t          | f        | f           | f
(2 rows)

The 10 records that don't violate IS NOT NULL, all have everything filled in

If a composite has all attributes that are NULL, then it is considered to be NULL. You can test this out yourself with this query:

SELECT ROW(NULL::name, NULL::name, NULL::name, NULL::name, NULL::boolean, 
	NULL::boolean, NULL::boolean, NULL::boolean)::pg_tables IS NULL;
SELECT NULL::pg_tables IS NULL;

Both examples above yield true though they have vastly different representations.


Basics

 

Dollar-quoting for escaping single quotes Beginner



PostgreSQL has a feature called dollar-quoting, which allows you to include a body of text without escaping the single quotes. This feature has existed for quite some time. You've probably seen this in action when defining functions for example:

CREATE OR REPLACE FUNCTION hello_world(param_your_name text)
RETURNS text AS
$$
SELECT 'Hello world. My name is ' || param_your_name || '.';
$$
language sql STRICT;

Which is easier to read, than the equivalent escape quoted function:


CREATE OR REPLACE FUNCTION hello_world(param_your_name text)
RETURNS text AS
'
SELECT ''Hello world. My name is '' || param_your_name || ''.'';
'
language sql STRICT;

I had almost forgotten about the usefulness of dollar-quoting outside of defining functions, until Vicky Vergara reminded me when we were working on some examples for our recently published book pgRouting: A Practical Guide. pgRouting applications are a perfect use-case for this feature since pgRouting has a lot of functions that take as input an SQL statement. If you have SQL statements that contain quotes, it's a pain to have to escape all the quotes in them to pass as input to a function.

You can use this feature pretty much anywhere where text is required. My favorites besides the obvious function bodies are DO commands, variables within functions, function inputs, and function/table/column comments.

Did you know that within dollar quoting, you can quote again using named dollar quoting. Named dollar quoting is using text in-between $$s as the delimiter and you can use any text you want. For example $sql$SELECT 'Johnny be good';$sql$. The key is to use a named dollar-quoting delimiter that is different from your outer delimiter. How does this work:

Here is a function that takes an sql expression that should return a text and executes it:

CREATE OR REPLACE FUNCTION sql_expression(param_sql text) 
RETURNS text AS
$$ 
DECLARE var_result text;
BEGIN
    EXECUTE param_sql INTO var_result;
    RETURN var_result;
END;
$$
language 'plpgsql' STABLE STRICT;

We can then use the function as follows:

SELECT sql_expression($sql$SELECT hello_world($phrase$Regina's elephant's dog$phrase$) 
    || $phrase$ I made a cat's meow today.$phrase$ $sql$);

Which outputs:

Hello world. My name is Regina's elephant's dog. I made a cat's meow today.

If you were to do this using escaping quotes, you'd have to escape out the inner quotes twice as follows:

SELECT sql_expression('SELECT hello_world(''Regina''''s elephant''''s dog'') 
    || '' I made a cat''''s meow today.'' ');

Which makes it difficult to just copy and paste the text from your scrap book of phrases.

This topic is one of the PostgreSQLisms covered in our upcoming book PostgreSQL: Up and Running 3rd Edition currently available in O'Reilly Early Release program.


Using PostgreSQL Extensions

 

PostgreSQL JSQuery extension Windows binaries



PostgreSQL JSQuery extension Windows binaries

The JSQuery extension is a PostgreSQL extension developed by Postgres Professional. You can get the source code and instructions for use at https://github.com/postgrespro/jsquery. JSQuery is a fairly easy compile install if you are on a Nix system. It provides more query functionality and additional index operator classes to support for JSONB than you get in built in PostgreSQL. It is supported for PostgreSQL 9.4 and above.

We've built windows binaries for PostgreSQL 64-bit 9.4, 9.5, 9.6, and 10beta1. The 9.4 64-bit will only install on the EDB PostgreSQL 9.4 64-bit distribution. The 9.5 and 9.6 are compatible with both PostgreSQL EDB and BigSQL distributions. The 10 has only been tested on BigSQL, but should work on EDB when it comes out. We should have 32-bit versions later and will link to those here.

Binaries updated May 20th 2017 to include index constants fix discussed in Fixed ! Bug in indexing scalars with jsquery and also inclusion of build for PostgreSQL 10 64-bit

The binaries

  • PostgreSQL 10beta1 64-bit zip, 7z
  • PostgreSQL 9.6 64-bit zip, 7z
  • PostgreSQL 9.5 64-bit zip, 7z
  • PostgreSQL 9.4 64-bit zip, 7z

After you have copied the binary files to your install, you can install the extension with:

CREATE EXTENSION jsquery;

To test do the following query:

SELECT f.data->>'name' AS name
FROM (VALUES ('{"age": 3, "name": "Ellie"}'::jsonb) , 
    ('{"age": 4, "name": "Peger"}'::jsonb) ) AS f(data)
WHERE  f.data @@ '"age" > 3';

Should give you:

 name
-------
 Peger
(1 row)