You have a system of products and categories and you want a product to be allowed to be in multiple categories, but you want a product to only be allowed to be in one main category.
How do you enforce this rule in the database?
Some people will say -- why can't you just deal with this in your application logic. Our general reason is that much of our updating doesn't happen at our application level. We like enforcing rules at the database
level because it saves us from ourselves. We are in the business of massaging data. For this particular kind of example we wanted to make sure the database would provide us a nice safety net so that
we wouldn't accidentally assign a product in two main categories.
There are two approaches we thought of. One is the obvious have a primary category column and a bridge table that has secondary categories. That is an ugly solution because when you do a query you have to do a union
and always treat the secondary categories as different from the main. For most use-cases we don't usually care about distinguisihing primary from secondary category.
The solution we finally settled on was to have one bridge table with a boolean field for if its the main category. We enforce the only one main category requirement using a partial index. Now not all databases support partial indexes
This is one major value of using PostgreSQL that you have so many more options for implementing logic.
As some people noted in the comments and the reddit entry. SQL Server 2008 has a similar feature called Filtered Indexes. Though PostgreSQL has had partial indexes for as far back as I can remember. This brings up an interesting point which I have observed -- if you were using PostgreSQL before, you would already know how to use the Filtered Indexes, Multi row inserts introduced in SQL Server 2008, and the SEQUENCES feature coming in SQL Server 2010. So we should all use PostgreSQL, because it teaches us how to use the newer versions of SQL Server before they come out. :)
So how does the partial index solution look: NOTE for simplicity, we are leaving out all the complimentary tables and the foreign key constraints that we also have in place.
CREATE TABLE products_categories
category_id integer NOT NULL,
product_id integer NOT NULL,
main boolean NOT NULL DEFAULT false,
orderby integer NOT NULL DEFAULT 0,
CONSTRAINT products_categories_pkey PRIMARY KEY (category_id, product_id)
CREATE UNIQUE INDEX idx_products_categories_primary
WHERE main = true;
Testing it out. It saves us and gives us a nice informative message to boot.
INSERT INTO products_categories(category_id, product_id, main)
VALUES (1,2,true), (3,2,false), (3,3,true), (4,2,true);
ERROR: duplicate key value violates unique constraint "idx_products_categories_primary"
DETAIL: Key (product_id)=(2) already exists.