Liberal use of views is key to good database design, according to the PostgreSQL documentation. They might be a bit of a hassle to use properly, but they have the potential to greatly simplify your application layer.
An example schema
To illustrate the usage of views, let’s use a simple example schema to play around with. Consider a web shop selling products. It serves international customers, so it presents products in multiple languages. Its database might have these tables:
create table products ( code character varying primary key, price integer not null, archived_at timestamp without time zone ); create table product_contents ( language character varying, code character varying references products(code), title character varying not null, description text, constraint product_content_pkey primary key (language, code) );
We’ve got a
products table containing our product code, unit price in cents and a timestamp to record when a product was archived. After this date, we should no longer show it in our shop. The
product_contents table has a title and description for a product in a particular language, denoted by
Let’s add a few records:
insert into products (code, price, archived_at) values ('abc123', 1099, null), ('def456', 899, '2018-12-31 12:00:00'); insert into product_contents (language, code, title, description) values ('en-GB', 'abc123', 'Squeaky toy', 'Your dog will love it.'), ('nl-NL', 'abc123', 'Piepspeelgoed', 'Je hond zal het geweldig vinden.');
Automatically updatable views
Simple views in PostgreSQL are automatically updatable. That means that you can
delete on them, and PostgreSQL will figure out how to translate your queries to the actual table for you. Let’s see it in action with a simple view to get only the non-archived products:
create view current_products as select code, price, archived_at from products where archived_at is null;
Of course, we can
select from this view and only ever get records where
null. But we can also
insert a new row into our table:
insert into current_products (code, price) values ('ghi789', 499);
Our new row got inserted into our table, with
null set for our unspecified
archived_at column (not because
archived_at was not specified, but because our view uses
null in its condition):
=# select * from products; code │ price │ archived_at ────────┼───────┼───────────────────── abc123 │ 1099 │ NULL def456 │ 899 │ 2018-12-31 12:00:00 ghi789 │ 499 │ NULL (3 rows)
You might wonder what happens when you do specify a value for
archived_at in this query. Let’s see:
insert into current_products (code, price, archived_at) values ('archprod', 499, '2018-01-01 12:00');
This doesn’t really make sense from a logical point of view, but technically, it works! PostgreSQL will happily insert the row for you:
=# select * from products; code │ price │ archived_at ─────────────────┼───────┼───────────────────── abc123 │ 1099 │ NULL def456 │ 899 │ 2018-12-31 12:00:00 ghi789 │ 499 │ NULL archivedproduct │ 499 │ 2018-01-01 12:00:00 (4 rows)
Depending on your application, you may or may not like this behaviour. You can control it using the
with local check option or
with cascaded check option on your view definition:
create view current_products as select code, price, archived_at from products where archived_at is null with cascaded check option;
This clause will tell PostgreSQL to check whether the inserted row will be visible in the view. If not, the operation is rejected:
=# insert into current_products (code, price, archived_at) values ('archivedproduct2', 499, '2018-01-01 12:00'); ERROR: new row violates check option for view "current_products" DETAIL: Failing row contains (archivedproduct2, 499, 2018-01-01 12:00:00).
Views are not automatically updatable but read-only when they reference multiple tables or use aggregates (see the exact list of requirements). Let’s create a view that is read-only:
create view current_product_versions as select code, price, language, title, description from current_products join product_contents using (code);
It will give us a row for each translated product:
=# select * from current_product_versions; code │ price │ language │ title │ description ────────┼───────┼──────────┼───────────────┼────────────────────────────────── abc123 │ 1099 │ en-GB │ Squeaky toy │ Your dog will love it. abc123 │ 1099 │ nl-NL │ Piepspeelgoed │ Je hond zal het geweldig vinden. (2 rows)
Also note how
current_product_versions references our earlier
current_products view, rather than the original table! But it’s the
join clause that makes
current_product_versions read-only. If we try to write to it, PostgreSQL will raise an error:
=# insert into current_product_versions (code, price, language, title, description) values ('xxx', 999, 'en-US', 'Squeaky toy', 'Yo dawg will dig it'); ERROR: cannot insert into view "current_product_versions" DETAIL: Views that do not select from a single table or view are not automatically updatable. HINT: To enable inserting into the view, provide an INSTEAD OF INSERT trigger or an unconditional ON INSERT DO INSTEAD rule.
We can still simulate writing to it using triggers to replace the default behaviour of
delete operations. For example, here’s how we might replace inserts on our view:
create or replace function insert_current_product_version() returns trigger as $$ declare begin insert into current_products (code, price) select NEW.code, NEW.price; insert into product_contents (language, code, title, description) select NEW.language, NEW.code, NEW.title, NEW.description; return NEW; end; $$ language plpgsql; create trigger insert_current_product_version_trg instead of insert on current_product_versions for each row execute function insert_current_product_version();
We define a trigger
insert_current_product_version_trg to call function
insert_current_product_version instead of an insert operation. The trigger function
insert_current_product_version can access the virtual row to be inserted using the special
NEW variable and can use it to perform two different inserts into actual tables.
With this trigger set up, we can insert rows:
=# insert into current_product_versions (code, price, language, title, description) values ('xxx', 999, 'en-US', 'Squeaky toy', 'Yo dawg will dig it'); INSERT 0 1
We can follow the same approach for update and delete operations. But there are a few important caveats to keep in mind:
- although technically not required for PostgreSQL, many ORMs depend on the returned number of affected rows for these operations. Make sure to not just
return nullfrom your trigger functions.
- your triggers run before your statement has gotten a chance to reach to target table and use its default values. Therefore, your triggers are unaware of the underlying column defaults, unless you explicitly add them to your view using
- views in PostgreSQL do not have primary key constraints. If your ORM tries to auto-detect a primary key, it will mostly likely fail unless you explicitly tell it which column(s) should be treated as primary key.
Most of these caveats are also further explained in the PostgreSQL documentation about trigger functions.
When to use views
Views are a powerful way to design how your application reads and writes data, independent from the optimal storage/normalisation solution for your data. A well-crafted view can greatly simplify your application code. Then again, not every query generated by your ORM should be replaced by a view in the database. I try to use this guideline: how should data access work across applications (perhaps written with different languages and frameworks) all accessing the same database? I use it to keep my application logic in my application, and my data design in my database.
Why use views at all?
Dealing with read-only views via triggers and functions can be a hassle. Are they worth it?
First, lots of views you’ll write will indeed be automatically updatable, so that removes most of the hassle. Introducing views will enrich your domain vocabulary and help your application deal more with business logic than with storage logic.
Second, views are helpful when you’re not the only client for a data store. Data has a longer lifespan than code, so chances are your application will get replaced at some point, while your database lives on. Using views instead of extensive application logic keeps the data logic where it belongs.
Third, using views can be helpful when performing incremental upgrades to the underlying data model. For example, when splitting or combining tables to make future changes easier, it is helpful to introduce a temporary view so the application can keep functioning like before. This lets you make small steps and keep the application in a working state, rather than embarking on a massive rewrite where you can only hope you’ll get it back in a working state at the end.
Lastly, the hassle that comes with using views is simple enough. It’s not necessarily easy, and functionality in database functions and triggers are not directly obvious, but it’s all still rather straightforward — once you know the pitfalls. Now you do, so have fun!