For these exercises I'm going to use these basic tables, so you can follow along. Be warned I'm using arrays here
just to say I'm using arrays.
Arrays are so fundamental to my suite of tools
that I can't even acknowledge I am using them except when I get stuck working in another relational database that doesn't support them
which is pretty much every other non-PostgreSQL database I have the misfortune of working with.
But I'm going to save my array love fest for another article, and just lightly use them here to create my test tables.
CREATE TABLE members(id bigint GENERATED ALWAYS AS IDENTITY,
name_first varchar(50), name_last varchar(50) );
INSERT INTO members(name_first, name_last)
VALUES ('Jill', 'Hill'), ('Jack', 'Hill');
CREATE TABLE orders(id bigint GENERATED ALWAYS AS IDENTITY, id_member bigint, date_add timestamptz DEFAULT CURRENT_TIMESTAMP);
INSERT INTO orders(id_member)
SELECT i % 2
FROM generate_series(1,10) AS i;
CREATE TABLE orders_items(id bigint GENERATED ALWAYS AS IDENTITY, id_order bigint, item_code varchar(100), quantity integer);
INSERT INTO orders_items(id_order, item_code, quantity)
SELECT i, (ARRAY['orange', 'apple', 'banana', 'chocolate'])[ j ], (random()*6)::integer
FROM generate_series(1,5) AS i CROSS JOIN generate_series(1, (random()*i)::integer + 1 ) AS j;
DISTINCT ON
I have never seen another database offer the DISTINCT ON clause. Sure every relational database
has a DISTINCT clause, but when was the last time you saw a database support the DISTINCT ON clause.
For those who are unfamiliar with this gem of a clause. What differentiates a DISTINCT ON from a DISTINCT
is you have a choice of what columns should be considered distinct and your ORDER BY controls your preferred row to return.
The only caveat is if you have an ORDER BY then the beginning list of columns should include your DISTINCT ON columns.
Here is such an example:
SELECT DISTINCT ON(m.name_last, m.name_first, m.id) m.name_last, m.name_first, o.*, oi.item_code
FROM members AS m INNER JOIN orders AS o ON m.id = o.id_member INNER JOIN orders_items AS oi ON o.id = oi.id_order
ORDER BY m.name_last, m.name_first, m.id, o.date_add DESC;
What does this statement do, it returns each members first and last name, the last order they made, and an item_code from an item on that order.
Return a row as a column
This next PostgreSQLism I consider the foundational piece that makes PostgreSQL object-relational.
It's the trick that you can stuff a whole row into a column. This feature has so many uses you can't imagine.
In its classic form, it looks like this:
SELECT m
FROM members AS m;
Which outputs:
m
---------------
(1,Jill,Hill)
(2,Jack,Hill)
(2 rows)
You might be thinking, so what? How is this useful? Its true value only shows itself when you dare
pass it to functions that understand composite types which is more functions than you realize. Such as to_jsonb, array_agg, and even window functions etc.
For example
SELECT to_jsonb(m) FROM members AS m;
OR more usefully combining with another PostgreSQLism
Ordered Aggregates
SELECT jsonb_agg(m ORDER BY m.name_first, m.name_last) AS output
FROM members AS m;
Output:
output
--------------------------------------------------------------------------------------------------------------
[{"id": 2, "name_last": "Hill", "name_first": "Jack"}
, {"id": 1, "name_last": "Hill", "name_first": "Jill"}]
Yes most PostgreSQL aggregates can have an ORDER BY in the clause. This feature is not limited to built-in aggregates, but any aggregate you create
or you see in an extension. Of course it's useless unless the answer changes depending on order in which you feed items.
Window Aggs with rows
Sure other relational databases have mastered window functions, just as PostgreSQL has mastered window functions, but can you pass a whole row into a window function in any other
relational database you have come across. Try this trick in another relational database and be prepared to be very disappointed. Sometimes I get a little lazy, like for example, I'd like to see Jill's previous order in her current order,
but I don't have time to go around and itemize all the fields in that last order, and hell am I going to write a bunch of lead function calls to get all the pieces I need.
So I do this instead.
WITH a AS (SELECT m.name_last, m.name_first, o,
lead(o) OVER(PARTITION BY m.id ORDER BY o.date_add) AS lead_o
FROM members AS m INNER JOIN orders AS o ON m.id = o.id_member
WHERE m.name_first = 'Jill'
)
SELECT a.name_last, a.name_first, (o).id AS orig_order_id, (lead_o).*
FROM a;
Admittedly that (subtable).field syntax takes a little bit of getting used to but it gets the job done.
Subtracting attributes from a jsonb
I have longed for an SQL syntax
SELECT m.* EXCEPT COLUMNS(name_last)
FROM members AS m
But my wish has not been granted. However there is the next best thing,
which is generally good enough, cause most of my time is spent slugging data at web apis and applications
who only want to be fed JSON for some reason.
So luckily I can do this
SELECT jsonb_agg(to_jsonb(m) - 'name_last' - 'id' ORDER BY name_first)
FROM members AS m;
Or slightly shorter if I've got a bunch of fields to remove
SELECT jsonb_agg(to_jsonb(m) - ARRAY['name_last','id'] ORDER BY name_first)
FROM members AS m;