Encrypting data with pgcrypto - Postgres OnLine Journal

PostgreSQL has various levels of encryption to choose from. In this article we'll go over the basics built-in and the more advanced provided by the contrib module pgcrypto. When encrypting data, as a general rule the harder you make it to keep people out of your data, the easier it is for you to lock yourself out of your data. Not only does encryption make it difficult to read data, it also takes more resources to query and decrypt. With those rules of thumb, its important to pick your encryption strategies based on the sensitivity of your data.

There are two basic kinds of encryption, one way and two way. In one way you don't ever care about decrypting the data into readable form, but you just want to verify the user knows what the underlying secret text is. This is normally used for passwords. In two way encryption, you want the ability to encrypt data as well as allow authorized users to decrypt it into a meaningful form. Data such as credit cards and SSNs would fall in this category.

One way encryption

Normally when people want one way encryption and just want a basic simple level of encryption, they use the md5 function which is built into PostgreSQL by default. The md5 function is equivalent to using the PASSWORD function in MySQL. If you want anything beyond that, you'll want to install the pgcrypto contrib module.

pgcrypto comes packaged with most PostgreSQL installs including windows, and can be installed into a database by running the script in share/contrib/pgcrypto.sql of your PostgreSQL install. For PostgreSQL 8.4+, this adds 34 someodd functions to your list of options. For maintainability we like to install it in a separate schema say crypto, and add this schema to our database search path.

For one way encryption, the crypt function packaged in pgcrypto provides an added level of security above the md5 way. The reason is that with md5, you can tell who has the same password because there is no salt so all people with the same password will have the same encoded md5 string. With crypt, they will be different. To demonstrate lets create a table with two users who have happened to have chosen the same password.

CREATE TABLE testusers(username varchar(100) PRIMARY KEY, cryptpwd text, md5pwd text);
INSERT INTO testusers(username, cryptpwd, md5pwd) 
    VALUES ('robby', crypt('test', gen_salt('md5')), md5('test')),
        ('artoo', crypt('test',gen_salt('md5')), md5('test'));
        
SELECT username, cryptpwd, md5pwd
    FROM testusers;
    
 username |              cryptpwd              |              md5pwd

----------+------------------------------------+---------------------------------
 robby   | $1$IOchfG/z$bZW1pRFA3wuvn6pAuD.Du/ | 098f6bcd4621d373cade4e832627b4f6
 artoo   | $1$84oZTXI/$yZ6wV5jhJo6aQYrTciMQR/ | 098f6bcd4621d373cade4e832627b4f6

Observe that both users have chosen the same password test. The md5 version is the same for both, but the crypted password is different although they are the same password. When any log in, we do this test.


 
 -- successful login
 SELECT username 
    FROM testusers 
    WHERE username = 'robby' AND cryptpwd = crypt('test', cryptpwd);
-- successful login     
 SELECT username 
    FROM testusers 
    WHERE username = 'artoo' AND cryptpwd = crypt('test', cryptpwd);
    
-- unsuccessful login
 SELECT username 
    FROM testusers 
    WHERE username = 'artoo' AND cryptpwd = crypt('artoo', cryptpwd);
    
-- using md5
SELECT username
    FROM testusers
    WHERE username = 'robby' and md5pwd = md5('test');

In the crypt case we use the encrypted password to determine the unencrypted password is the same as the encrypted. Passing in the encrypted password unsaltifies things so to speak. For md5, we don't need any of that and thus its easier to crack since the same password will yield the same md5 code.

Basics of 2 way encryption decryption with PGP encryption

For data that you care about retrieving, you don't want to know if the two pieces of information are the same, but you don't know that information, and you want only authorized users to be able to retrieve it. Information like this would be things like credit cards, social security numbers or swiss bank account numbers etc.

One of the most useful and easy to use encryption modes provided in pgcrypto is the PGP encryption functions. For these set of exercises, we'll go thru using PGP encryption to encrypt sensitive database data and also how to decrypt it. There are 2 kinds of PGP encryption you can use.

One is public/private also known as asymmetric - We call this the shazam mode of conduct because only the public key can encrypt it and only the private key can read it.
symmetric - the shared secret web of trust. The keys to encrypt and read are the same.

Take the case of the database with sensitive information, if you are worried about the database falling in the wrong hands, you would probably be better to use the public/private key approach. This ensures you can encrypt the data with a public key you store in the database or even as part of a trigger or even in plain site in an application, but Users who need to be able to read this secure information would need the private key to decrypt it. So a person stealing your database even though they can see the public key, it does them no good at trying to get the information.

Making PGP Keys

Before we can use PGP encryption, we need to make keys. Details below:

If you are on some Linux OS you probably have the command line tool called gpg that you can use already available. If you are on windows, you need to download them from somewhere like this page GNU Pg binaries. Way at the bottom of the page you should find gnupg-w32cli-1.4.10b.exe. Download and install that or you can simply extract the folder instead of installing and run from anywhere. By default it installs in folder C:\Program Files\GNU\GnuPG. You can copy these files anywhere. Really no need for installation.

Next we do the following more or less verbatim from the PostgreSQL pgcrypto docs. These steps will work on Linux/Unix/Mac OSX or windows

gpg --gen-key and follow the directions. Note if you don't need super security, you can just click enter to the password phrase thus one less argument you need to pass when decrypting your data.
gpg --list-secret-keys This will provide you a list of keys one being the one you generated. It will look something like:
```
sec   1024R/123ABCD 2010-06-15
uid                  My data key (super data encrypt) 
ssb   1024R/999DEFG 2010-06-15
```
Where the 1024R is the bit strength I chose and 123ABCD is the private key and 999DEFG is the public key.
gpg -a --export 999DEFG > public.key Replacing the 999DEFG with hmm your public key code. This is the key you will need to encrypt data.
gpg -a --export-secret-keys 123ABCD > secret.key again Replacing the 123ABCD with hmm yourprivate key code. This is the key you will need to decrypt data.

As a general note the -a switch is short for armour. Each key by default is a binary ugly looking thing that is hard to handle. The -a converts it to a palpable like piece of text easy for cut and paste. When using it, you will need the PostgreSQL pgcrypto dearmor function to feed it into the relevant encrypt/decrypt functions.

Encrypting/Decrypting data using PGP Public Private

Now that we have our public and private keys, we are ready to encrypt the data. In this set of exercises, we'll demonstrate using the following functions:

pgp_pub_encrypt - this is the function we will use to encrypt our data using our public key. There are two parallel functions pgp_pub_encrypt_bytea for encrypting binary data, pgp_sym_encrypt/pgp_sym_encrypt_bytea for using a symmetric key to encrypt data.
pgp_pub_decrypt - this is the function we will use to decrypt our data using our private key. There are parallel functions pgp_pub_decrypt_bytea for decrypting binary data,pgp_sym_decrypt/pgp_sym_decrypt_bytea for decrypting symmetrically encrypted data.
dearmor - as we stated earlier we used the -a argument in gpg to produce an easy to cut and paste text version of our keys. The keys are binary in nature, so we need dearmor to convert them back to their native format suitable for passing to the encrypt/decrypt functions
pgp_key_id - over time you may use several keys to encrypt your data in the database so some data may be accessible by one key and some by another. You may even use different keys for different set of users so they can't view each others personal data. This function tells you the key that was used to encrypt a given piece of data so that you can pull from your chest of collected keys, the right key to decrypt a certain piece of data.

First encrypting the data

CREATE TABLE testuserscards(card_id SERIAL PRIMARY KEY, username varchar(100), cc bytea); 

-- To encrypt the data
INSERT INTO testuserscards(username, cc)
SELECT robotccs.username, pgp_pub_encrypt(robotccs.cc, keys.pubkey) As cc
FROM (VALUES ('robby', '41111111111111111'),
    ('artoo', '41111111111111112') ) As robotccs(username, cc)
     CROSS JOIN (SELECT dearmor('-----BEGIN PGP PUBLIC KEY BLOCK-----
super publickey goobly gook goes here
-----END PGP PUBLIC KEY BLOCK-----') As pubkey) As keys;

-- Now if we select our data
SELECT username, cc 
FROM testuserscards;

We will see a whole bunch of encrypted stuff in the cc column that is way too hard to print on this page.

Now we can use pgp_keyid to verify which public key we used to encrypt our data.

SELECT pgp_key_id(dearmor('-----BEGIN PGP PUBLIC KEY BLOCK-----
super publickey goobly gook goes here
-----END PGP PUBLIC KEY BLOCK-----'));

-- gives you something like 
-- where last set of characters is the public key id you got in gpg
E0B086C2999DEFG


--verify our data was encrypted with the above public key
SELECT username, pgp_key_id(cc) As keyweused
FROM testuserscards;

 username |    keyweused
----------+------------------
 robby    | E0B086C2999DEFG
 artoo    | E0B086C2999DEFG

To decrypt the data we pull from our chest of private keys matching the public key we used to encrypt with.

-- To decrpt the data

SELECT username, pgp_pub_decrypt(cc, keys.privkey) As ccdecrypt
FROM testuserscards 
    CROSS JOIN
    (SELECT dearmor('-----BEGIN PGP PRIVATE KEY BLOCK-----
super private key gobbly gook goes here
-----END PGP PRIVATE KEY BLOCK-----') As privkey) As keys;


-- We get --
 username |  ccdecrypt
----------+-------------------
 robby    | 41111111111111111
 artoo    | 41111111111111112