I know a lot has been said about this beautiful value we affectionately call NULL, which is neither here nor there and that manages to catch many of us
off guard with its casual neither here nor thereness. Database analysts who are really just back seat mathematicians in disguise like to philosophize
about the unknown and pat themselves on the back when they feel they have mastered the unknown better than any one else. Of course database spatial analysts, the worst kind of back seat mathematicians,
like to talk not only about NULL but about EMPTY and compare notes with their brethren and write dissertations
about what to do about something that is neither here nor there
but is more known than the unknown, but not quite as known as the empty string.
Okay getting to the point, one of our clients asked us about a peculiar problem they had with a query, and the strange results they were getting. We admit this still manages to catch us off guard every once in a while.
PostgreSQL has various levels of encryption to choose from. In this article we'll go over the basics built-in and the more advanced provided by the contrib module pgcrypto. When encrypting data, as a general rule the harder you make it to
keep people out of your data, the easier it is for you to lock yourself out of your data. Not only does encryption make it difficult to read data, it
also takes more resources to query and decrypt. With those rules of thumb, its important to pick your encryption strategies based on the sensitivity of your data.
There are two basic kinds of encryption, one way and two way. In one way you don't ever care about decrypting the data into readable form, but you just want to verify the user knows what the underlying secret text is. This is normally used for passwords. In two way encryption, you want the ability to encrypt data as well as allow authorized users to decrypt it into a meaningful form. Data such as credit cards and SSNs would fall in this category.
One way encryption
Normally when people want one way encryption and just want a basic simple level of encryption, they use the md5 function which is built into PostgreSQL by default. The md5 function is equivalent to using the PASSWORD function in MySQL. If you want anything beyond that, you'll want to install
the pgcrypto contrib module.
pgcrypto comes packaged with most PostgreSQL installs including windows, and can be installed into a database by running the script in share/contrib/pgcrypto.sql of your PostgreSQL install. For PostgreSQL 8.4+, this adds 34 someodd functions to your list of options. For maintainability we like to install it in a
separate schema say crypto, and add this schema to our database search path.
For one way encryption, the crypt function packaged in pgcrypto provides an added level of security above the md5 way. The reason is that with md5, you can tell who has the same password because there is no salt so all people with the same password will have the same encoded md5 string.
With crypt, they will be different. To demonstrate lets create a table with two users who have happened to have chosen the same password.
Ordered Aggregates. This is extremely useful for spatial aggregates and ARRAY_AGG, STRING_AGG, and medians where you care about the order of the aggregation. Will have to give it a try.
For example if you are building a linestring using ST_MakeLine, a hack you normally do would be to order your dataset a certain way and then run ST_MakeLine. This will allow you to do
ST_MakeLine(pt_geom ORDER BY track_time) or ARRAY_AGG(student ORDER BY score) This is very very cool. Depesz has some examples of ordered aggregates.
Join removal -- this is a feature that will remove joins from the execution plans where they are not needed. For example where you have a left join that doesn't appear in a where or as a column in select. This is important
for people like us that rely on views to allow less skilled users to be able to write meaningful queries without knowing too much about joins or creating ad-hoc query tools that allow users to pick from multiple tables. Check out Robert Haas why join removal is cool for more use cases.
GRANT/REVOKE ON ALL object IN SCHEMA and ALTER DEFAULT PRIVILEGES. This is just a much simpler user-friendly way of applying permissions. I can't tell you how many times we get beat up by MySQL users who find the PostgreSQL security management tricky and tedious to get right.
Of course you can count on Depesz to have an example of this too Waiting for 9.0 - GRANT ALL
One of the coolest features of PostgreSQL is the ability to write functions using plain old
SQL. This feature it has had for a long time. Even before PostgreSQL 8.2. No other database to our knowledge has this feature. By SQL we mean sans procedural mumbo jumbo like
loops and what not. This is cool for two reasons:
Plain old SQL is the simplest to write and most anyone can write one and is just what the doctor ordered in many cases. PostgreSQL even allows you to write
aggregate functions with plain old SQL. Try to write an aggregate function in SQL Server
you've got to pull out your Visual Studio this and that and do some compiling and loading and you better know C# or VB.NET. Try in MySQL and you better learn C.
Do the same in PostgreSQL (you have a large choice of languages including SQL) and the code is simple to write. Nevermind
with MySQL and SQL Server, you aren't even allowed to do those type of things on a shared server or a server where the IT department is paranoid. The closest
with this much ease would be Oracle, which is unnecessarily verbose.
Most importantly -- since it is just SQL, for simple user-defined functions, a PostgreSQL sql function can often be in-lined into the overall query plan since
it only uses what is legal in plain old SQL.
This inlining feature is part of the secret sauce that makes PostGIS fast and easy to use.
So instead of writing geom1 && geom2 AND Intersects(geom1,geom2) -- a user can write
ST_Intersects(geom1,geom2) . The short-hand is even more striking when you think of the ST_DWithin function.
With an inlined function, the planner has visibility into the function and breaks apart the
spatial index short-circuit test && from the more exhaustive absolute test Intersects(geom1,geom2)
and has great flexibility in reordering the clauses in the plan.