And by “securely,” I mean “in a way that won’t get us sued or thrown in jail.”
To begin with, let’s pin down what we mean by “sensitive” data. This could be anything from passwords and credit card numbers to social security numbers and medical records. Basically, any information that you don’t want falling into the wrong hands.
Now, there are a few different ways to store sensitive data in PostgreSQL, but we’re going to focus on two main methods: encrypting at rest and encrypting in transit. Let’s start with the former.
Encrypting at Rest
When we say “at rest,” we mean data that is sitting in storage rather than moving across the network: the database’s files on disk, plus any backups you write to a file or another server. To protect individual values at this level, PostgreSQL ships with the pgcrypto extension, which gives us two tools: crypt() for one-way hashing and pgp_sym_encrypt() for reversible encryption.
crypt() lets us store sensitive data as a salted hash (i.e., a one-way transformation rather than true encryption). This means that we can’t recover the original value from the hash, but we can still use it for comparison purposes, which is exactly what we want for passwords. For example:
-- Enable the pgcrypto extension
CREATE EXTENSION IF NOT EXISTS pgcrypto;
-- Insert a new row into mytable with a hashed password using crypt()
INSERT INTO mytable (password) VALUES (crypt('mysecretpassword', gen_salt('bf')));
-- Explanation:
-- The first statement enables pgcrypto if it is not already enabled.
-- The second statement inserts a new row into the table with a hashed password.
-- gen_salt('bf') generates a random salt for the Blowfish-based (bcrypt) algorithm, so identical passwords produce different hashes.
In this case, our password will be stored as a hash in the database. Nothing ever gets decrypted. When we need to check if someone’s entered the correct password during login, we hash what they’ve provided, using the stored hash as the salt, and compare the result with the stored value:
-- Retrieve rows from "mytable" whose stored hash matches the password supplied by the user.
SELECT * FROM mytable
WHERE password = crypt('mysecretpassword', password);
-- crypt() reads the algorithm and salt from the stored hash (its second argument) and re-hashes the supplied password with them.
-- If the result equals the stored hash, the password is correct and the row is returned.
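The hash-and-compare idea can also be illustrated outside the database. The sketch below is not what PostgreSQL does internally: it swaps in PBKDF2 from Python’s standard library in place of bcrypt, and the function names and the "salt$digest" storage format are made up for illustration. What it shows is the same pattern: the salt is stored next to the hash, and verification means re-hashing the candidate with that salt.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor, not a tuned recommendation


def hash_password(password: str) -> str:
    """Return 'salt_hex$digest_hex' using a freshly generated random salt."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def verify_password(candidate: str, stored: str) -> bool:
    """Re-hash the candidate with the stored salt and compare the digests."""
    salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", candidate.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    # compare_digest avoids leaking, through timing, where the two values differ
    return hmac.compare_digest(digest.hex(), digest_hex)


stored = hash_password("mysecretpassword")
print(verify_password("mysecretpassword", stored))  # True
print(verify_password("wrongpassword", stored))     # False
```

Because the salt is random, hashing the same password twice gives two different stored strings, yet both verify correctly.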
pgp_sym_encrypt(), also part of the pgcrypto extension, lets us store sensitive data in encrypted form. This means that the original value can be decrypted, but only by someone who has the passphrase (which is never stored alongside the data). It suits values the application must read back, such as credit card numbers; passwords are better served by hashing with crypt(). For example:
-- Enable the pgcrypto extension
CREATE EXTENSION IF NOT EXISTS pgcrypto;
-- Insert an encrypted value into mytable
-- pgp_sym_encrypt returns bytea, so this assumes the password column is of type bytea
INSERT INTO mytable (password) VALUES (pgp_sym_encrypt('mysecretpassword', 'mypassphrase'));
-- The statement above encrypts the value 'mysecretpassword' with the passphrase 'mypassphrase' and inserts the ciphertext into the password column of mytable.
In this case, our value will be stored as ciphertext in the database. When we need to read it back or compare it, we can use pgp_sym_decrypt:
-- Retrieve rows from "mytable" where the decrypted value equals 'mysecretpassword'.
-- Every row is decrypted with pgp_sym_decrypt for the comparison, so an ordinary index on the column cannot help here.
SELECT *
FROM mytable
WHERE pgp_sym_decrypt(password, 'mypassphrase') = 'mysecretpassword';
Note that the encryption approach requires us to keep the passphrase somewhere safe (hashing has no key to protect). If someone gains access to the passphrase (either through a security breach or by stealing our server), they’ll be able to decrypt all of our encrypted data. This is why it’s important to keep your keys and passphrases out of the database and in a separate, secure location (such as an encrypted file on another server or a dedicated secrets manager).
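One way to keep the passphrase out of both the database and the SQL text is to load it from the application’s environment and send it as a bound parameter. The following is a minimal sketch under stated assumptions: the environment variable name MYTABLE_PASSPHRASE and both function names are hypothetical, and the %s placeholder follows the style used by drivers such as psycopg. No driver is imported here; the code only builds the query and its parameters.

```python
import os

# Hypothetical variable name; any secret store that lives outside the database would do.
PASSPHRASE_ENV_VAR = "MYTABLE_PASSPHRASE"


def load_passphrase(environ=os.environ) -> str:
    """Fetch the passphrase from the environment, failing loudly if it is absent."""
    passphrase = environ.get(PASSPHRASE_ENV_VAR, "")
    if not passphrase:
        raise RuntimeError(f"{PASSPHRASE_ENV_VAR} is not set")
    return passphrase


def build_decrypt_query(passphrase: str):
    """Return (sql, params) with the passphrase bound as a parameter, not inlined."""
    sql = "SELECT pgp_sym_decrypt(password, %s) FROM mytable"
    return sql, (passphrase,)


# Demonstration with a stand-in environment instead of the real os.environ
sql, params = build_decrypt_query(load_passphrase({PASSPHRASE_ENV_VAR: "mypassphrase"}))
print(sql)
```

Keeping the passphrase out of the SQL string matters because statement text can end up in server logs, depending on the logging configuration.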
Encrypting in Transit
Now let’s turn to encrypting data that’s being transmitted between the server and its clients (or between servers). This is done using SSL/TLS, which is a standard protocol for securing network traffic. To enable SSL/TLS in PostgreSQL, we give the server a certificate and private key, and then decide which client connections must use it:
# Server configuration (postgresql.conf)
# Set ssl to on to enable SSL/TLS for server connections
ssl = on
# Specify the path to the server certificate file
ssl_cert_file = '/path/to/server-cert.crt'
# Specify the path to the server private key file (it must not be readable by other users)
ssl_key_file = '/path/to/server-key.pem'
# Client authentication rules (pg_hba.conf, which lives on the server)
# Accept TCP connections from any IPv4 address, but only over SSL/TLS, using password authentication
# (scram-sha-256 needs PostgreSQL 10 or later; older servers would use md5)
hostssl all all 0.0.0.0/0 scram-sha-256
# Accept SSL/TLS connections from localhost using password authentication
hostssl all all 127.0.0.1/32 scram-sha-256
# Allow Unix-socket connections using peer authentication
# peer relies on the operating system's user and is only valid for local (non-TCP) connections
local all all peer
In this case, we’re enabling SSL/TLS on the server and pointing it at a certificate and private key (a self-signed certificate is fine for testing purposes). The pg_hba.conf rules then decide which connections are accepted and how each client must authenticate. A hostssl rule matches only connections that use SSL/TLS, whereas a plain host rule matches encrypted and unencrypted connections alike.
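Since the difference between host and hostssl is easy to overlook, a small check can help when reviewing a pg_hba.conf file. The sketch below is a deliberately simplified parser, not a full implementation of the file format: it ignores quoted fields and include directives, and the function name is made up for illustration. It flags TCP rules that would accept a connection without SSL/TLS.

```python
def rules_allowing_plaintext(hba_text: str):
    """Return TCP rules that would accept a connection without SSL/TLS."""
    flagged = []
    for raw in hba_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        fields = line.split()
        # "host" matches SSL and non-SSL connections; "hostnossl" matches only non-SSL ones.
        # A rule whose method is "reject" refuses the connection, so it is not flagged.
        if fields[0] in ("host", "hostnossl") and "reject" not in fields[4:]:
            flagged.append(" ".join(fields))
    return flagged


sample = """
# TYPE  DATABASE  USER  ADDRESS       METHOD
host    all       all   0.0.0.0/0     md5
hostssl all       all   127.0.0.1/32  md5
"""
print(rules_allowing_plaintext(sample))  # ['host all all 0.0.0.0/0 md5']
```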
To connect to this server, we can use the psql command:
# Connect to the PostgreSQL server using the psql command
# PGSSLMODE=require makes the client refuse to connect unless SSL/TLS is used
# -h specifies the host name or IP address, in this case, localhost (127.0.0.1)
# -p specifies the port number, in this case, the default port for PostgreSQL (5432)
# -d specifies the database to connect to, in this case, "mydatabase"
# -U specifies the username to use for authentication, in this case, "myuser"
# --password forces a password prompt
# -c specifies the command to be executed, in this case, "SELECT * FROM mytable;"
PGSSLMODE=require psql -h localhost -p 5432 -d mydatabase -U myuser --password -c "SELECT * FROM mytable;"
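Note that this will prompt us for our password, which is a good thing! Keep in mind, though, that the prompt is about authentication; on its own it doesn’t mean that the data being transmitted between our client and server is encrypted. Encryption comes from the SSL/TLS negotiation, which the client’s sslmode setting controls (the default, prefer, uses SSL/TLS when the server offers it but silently falls back to plaintext otherwise). To confirm that a session is encrypted, run \conninfo in an interactive psql session or query the pg_stat_ssl view.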
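For application code, the same requirement can be expressed in the connection string. Here is a minimal sketch that assembles a libpq keyword/value connection string with sslmode=verify-full, which checks the server certificate and host name as well as encrypting the traffic. The helper name build_conninfo and the certificate path are assumptions for illustration; host, port, dbname, user, sslmode, and sslrootcert are standard libpq parameters.

```python
def build_conninfo(**params) -> str:
    """Build a libpq keyword/value connection string, quoting each value."""
    parts = []
    for key, value in params.items():
        # Inside a quoted value, libpq expects backslashes and single quotes to be escaped
        escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
        parts.append(f"{key}='{escaped}'")
    return " ".join(parts)


conninfo = build_conninfo(
    host="localhost",
    port=5432,
    dbname="mydatabase",
    user="myuser",
    sslmode="verify-full",            # refuse to connect without a verified TLS session
    sslrootcert="/path/to/root.crt",  # hypothetical path to the CA certificate
)
print(conninfo)
```

The resulting string can be handed to psql or to any libpq-based driver. The password is left out on purpose, so that it can come from a prompt or a secret store.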
Conclusion
And there you have it: two methods for storing sensitive data securely in PostgreSQL. Remember to keep your keys and passphrases safe, and always use SSL/TLS when transmitting data over a network.