Cheatsheet : PostgresQL

     

psql

Postgres cli.

To enter psql prompt in th image, type:

psql -U postgres

to use postgres (default) user.

To quit, simply type

\q

To display help

psql --help

SQL

Since 1979.
Case insensitive.

CREATE database

CREATE DATABASE test;

Note: ; is necessary to execute the command

CONNECT to database

psql -h localhost -p 5432 -U postgres test

or in psql, list the databases with l and then \c test to connect to the database test.

DROP (delete) to database

DROP DATABASE test;

WARNING: Never do that on a production server. The data would be LOST.

CREATE tables

Without constraints

In psql prompt, line by line:

CREATE TABLE person (
    id int,
    firtst_name VARCHAR(50),
    last_name VARCHAR(50),
    gender VARCHAR(6),
    date_of_birth DATE );

If wou want to see the tables, use the describe command \d.

postgres=# \d
         List of relations
 Schema |  Name  | Type  |  Owner
--------+--------+-------+----------
 public | person | table | postgres
(1 row)

You can also see the descrption for a particular table \d person.

postgres=# \d person
                         Table "public.person"
    Column     |         Type          | Collation | Nullable | Default
---------------+-----------------------+-----------+----------+---------
 id            | integer               |           |          |
 first_name    | character varying(50) |           |          |
 last_name     | character varying(50) |           |          |
 gender        | character varying(7)  |           |          |
 date_of_birth | date                  |           |          |

With constraints

In psql prompt, line by line:

CREATE TABLE person (
    id bigserial NOT NULL PRIMARY KEY, /* increments by itself */
    firtst_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    gender VARCHAR(6) NOT NULL,
    date_of_birth DATE NOT NULL);

We can't create a user without specifying all columns anymore.

Besides, if we use the describe command, you'll see that the database has now 1 table and 1 sequence, wich is the incrementing storage of id, due to bigserial.

Note: do not confuse bigserial with bigint. The type is identical but the sequence is only created for bigserial.

postgres=# \d
              List of relations
 Schema |     Name      |   Type   |  Owner
--------+---------------+----------+----------
 public | person        | table    | postgres
 public | person_id_seq | sequence | postgres
(2 rows)

postgres=# \d person
                                       Table "public.person"
    Column     |          Type          | Collation | Nullable |              Default
---------------+------------------------+-----------+----------+------------------------------------
 id            | bigint                 |           | not null | nextval('person_id_seq'::regclass)
 first_name    | character varying(50)  |           | not null |
 last_name     | character varying(50)  |           | not null |
 gender        | character varying(7)   |           | not null |
 date_of_birth | date                   |           | not null |
Indexes:
    "person_pkey" PRIMARY KEY, btree (id)

Note: You can chose to only view tables by using \dt.

INSERT data in tables

1 entry

INSERT INTO person (
    first_name,
    last_name,
    gender,
    date_of_birth)
VALUES ('Cem', 'Soyding', 'Male', DATE '1985-04-10');

Note 1: Thanks to bigserial, we don't have to indicate the primary key.
Note 2: Date value is not a string. You have to indicate DATE followed by a string representation of the date.

To visualize the table entries, type SELECT * FROM person;.

postgres=# select * from person
postgres-# ;
 id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
  1 | Cem        | Soyding   | Male   | 1985-04-10    |
  2 | Mete       | Soyding   | Male   | 1988-12-20    |

1000 entries

You can use (Mockaroo)[https://mockaroo.com/] to generate fake data for you.
Fill the form with the table structure and select SQL output. The tool is very useful as it supports the generation of dummy names, last names, countries, dates, emails, etc ...
Download the data and you'll get a *.sql file.

Open the file, delete the section where the table is created.

You can either copy/paste the insert instruction in psql, or (better), use the insert command \i c:\documents\data.sql.

Note: The file must be located on the host. If like me, you're running postgres in docker, you'll have to share a folder with the image or simply use the copy/paste method.

SELECT records from tables

SELECT lets you pick columns from tables. It can also be used to apply functions on columns and display their results as solumns in the output.

All columns

Use * to select all columns.

SELECT * FROM person;

Calling SELECT FROM person; will also work, but you'll have no column shown :

postgres=# select from person;
--
(3 rows)

Some columns

Calling SELECT first_name FROM person; will also work, and you'll have 1 column printed out.

 first_name
------------
 Cem
 Mete
 Tom
(3 rows)

Calling SELECT first_name, last_name FROM person; will print 2 columns and so on.

ORDER BY results

You can order the results ascending (default, ASC) or descending (DESC);

SELECT * FROM person ORDER BY date_of_birth;
or
SELECT * FROM person ORDER BY date_of_birth ASC;

will return

postgres=# SELECT * FROM person ORDER BY date_of_birth;
 id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
  3 | Tom        | Sawyer    | Male   | 1952-01-23    |
  1 | Cem        | Soyding   | Male   | 1985-04-10    |
  2 | Mete       | Soyding   | Male   | 1988-12-20    |

It is also possible to sort with 2 columns we it is highly discouraged.

DISTINCT to select values once

If you want each value of a column to appear only once, use DISTINCT.

SELECT DISTINCT last_name FROM person;
postgres=# SELECT DISTINCT last_name FROM person;
 last_name
-----------
 Soyding
 Sawyer
(2 rows)

WHERE clause to filter results

SELECT * FROM person WHERE last_name = 'Soyding';
postgres=# SELECT * FROM person WHERE last_name = 'Soyding';
 id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
  1 | Cem        | Soyding   | Male   | 1985-04-10    |
  2 | Mete       | Soyding   | Male   | 1988-12-20    |
(2 rows)

You can also have multiple conditions with the AND keyword.

SELECT * FROM person WHERE last_name = 'Soyding' AND first_name = 'Cem';
postgres=# SELECT * FROM person WHERE last_name = 'Soyding' AND first_name = 'Cem';
 id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
  1 | Cem        | Soyding   | Male   | 1985-04-10    |
(1 row)

Use parenthesis with AND/OR to combine more conditions

SELECT * FROM person WHERE last_name = 'Soyding' AND (first_name = 'Cem' OR first_name = 'Mete');
postgres=# SELECT * FROM person WHERE last_name = 'Soyding' AND (first_name = 'Cem' OR first_name = 'Mete');
 id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
  1 | Cem        | Soyding   | Male   | 1985-04-10    |
  2 | Mete       | Soyding   | Male   | 1988-12-20    |
(2 rows)

You can chain as much as you wish WHERE <condition> AND <condition> AND (<condition> or <condition>) ....

Note: SELECT <condition> is a valid command. It will return true or false in the prompt.

postgres=# SELECT 1 = 1;
 ?column?
----------
 t
(1 row)

postgres=# SELECT 1 = 2;
 ?column?
----------
 f
(1 row)

postgres=#

The following comparison operators work : =, <, >, <=, >=, <> (not equal).

Note: string comparison is case sensitive.

LIMIT or OFFSET to select n entries

SELECT * FROM person LIMIT 2;

We only have the first 2 entries:

postgres=# SELECT * FROM person LIMIT 2;
 id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
  1 | Cem        | Soyding   | Male   | 1985-04-10    |
  2 | Mete       | Soyding   | Male   | 1988-12-20    |
(2 rows)
SELECT * FROM person OFFSET 1;

We have skipped the first entry:

postgres=# SELECT * FROM person OFFSET 1;
 id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
  2 | Mete       | Soyding   | Male   | 1988-12-20    |
  3 | Tom        | Sawyer    | Male   | 1952-01-23    |
(2 rows)
SELECT * FROM person OFFSET 1 LIMIT 1;

We have skipped the 1 entry and have selected the first entry out of results:

postgres=# SELECT * FROM person OFFSET 1 LIMIT 1;
 id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
  2 | Mete       | Soyding   | Male   | 1988-12-20    |
(1 row)
or FETCH

LIMIT is not pat of the official SQL language. It was introduced by some databases and became popular but the original way of limiting results was with FETCH keyword.

SELECT * FROM person FETCH FIRST 2 ROW ONLY;

The output is identical to LIMIT 2.

IN to verify inclusion

This is another operator that can be used with WHERE.
Instead of doing WHERE <column> = <value> OR <column> = <value> and have repetition of <column>, we can have a shorter statement thanks to IN.

SELECT * FROM person WHERE first_name IN ('Cem', 'Tom', 'Sandra');
postgres=# SELECT * FROM person WHERE first_name IN ('Cem', 'Tom', 'Sandra');
 id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
  1 | Cem        | Soyding   | Male   | 1985-04-10    |
  3 | Tom        | Sawyer    | Male   | 1952-01-23    |
(2 rows)

BETWEEN to select data in a range

SELECT * FROM person WHERE date_of_birth BETWEEN DATE '1985-01-01' AND '1989-01-01';
postgres=# SELECT * FROM person WHERE date_of_birth BETWEEN DATE '1985-01-01' AND '1989-01-01';
 id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
  1 | Cem        | Soyding   | Male   | 1985-04-10    |
  2 | Mete       | Soyding   | Male   | 1988-12-20    |
(2 rows)

LIKE operator to match patterns

The wildcard in sql is %.

SELECT * FROM person WHERE first_name LIKE '%et%';
postgres=# SELECT * FROM person WHERE first_name LIKE '%et%';
 id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
  2 | Mete       | Soyding   | Male   | 1988-12-20    |
(1 row)

It is also possible to match single characters in the pattern with _:

  • '__e33%' : has 2 characters followed by e33 and any character(s) at the end
  • '___' : has 3 characters (would match Cem and Tom)
ILIKE for case insensistiveness
SELECT * FROM person WHERE first_name ILIKE 'ce%';
postgres=# SELECT * FROM person WHERE first_name ILIKE 'ce%';
 id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
  1 | Cem        | Soyding   | Male   | 1985-04-10    |
(1 row)

GROUP BY to sort results based on a particular column

GROUP BY has more or less the same effect as DISTINCT as it will group the identical results together for a column. However, it is capable of running aggregation routines for each result, such as COUNT for instance.

SELECT last_name from person GROUP BY last_name;
same as 
SELECT DISTINCT last_name from person;

but is we want some additional data for each result, we have to specify the function(s) to be executed as columns.

SELECT last_name, COUNT(*) from person GROUP BY last_name;
postgres=# SELECT last_name, COUNT(*) from person GROUP BY last_name;
 last_name | count
-----------+-------
 Soyding   |     2
 Sawyer    |     1
(2 rows)

Note that you can append ORDER BY last_name to the sql statement to have results in alphabetical order.

HAVING to filter grouped items

HAVING lets you specify a function to filter the results of a GROUP BY statement.

SELECT last_name, COUNT(*) from person GROUP BY last_name HAVING COUNT(*) > 1;
postgres=# SELECT last_name, COUNT(*) from person GROUP BY last_name HAVING COUNT(*) > 1;
 last_name | count
-----------+-------
 Soyding   |     2
(1 row)

Note 1: HAVING must be used right after GROUP BY
Note 2: COUNT(*) in the HAVING statement is a function, not the column value. Actually, running SELECT last_name from person GROUP BY last_name HAVING COUNT(*) > 1;  would also work but you just wouldn't see the count column in the output.

Aggrgegate functions

As seen above, COUNT(expression) is an aggregagte function but there a plenty others.
All the aggregate functions supported by postgres are available (here)[https://www.postgresql.org/docs/current/functions-aggregate.html].

Let's review the most common ones. For this demo, use the car.sql document to create a table with car models.

MIN, MAX, AVG
SELECT MAX(price) from car;
SELECT MIN(price) from car;
SELECT AVG(price) from car;
postgres=# SELECT MAX(price) from car;
   max
----------
 99851.57
(1 row)

You can also ROUND the result

SELECT ROUND(AVG(price)) from car;
postgres=# SELECT ROUND(AVG(price)) from car;
 round
-------
 53237
(1 row)

Now let's do something more advanced. Say we want to have the minimum price per car models.

SELECT make, model, MIN(price) from car GROUP BY make, model;
postgres=# SELECT make, model, MIN(price) from car GROUP BY make, model;
     make      |          model          |   min
---------------+-------------------------+----------
 Kia           | Amanti                  | 85644.87
 Oldsmobile    | Silhouette              | 56568.37
 Subaru        | Forester                | 29907.65
 Ferrari       | 612 Scaglietti          | 14094.20
 Mercedes-Benz | SLS AMG                 | 41724.84
 Jeep          | Grand Cherokee          | 40795.13
 Honda         | Civic Si                | 70029.97
 Hyundai       | Accent                  | 17258.76
 Lincoln       | Aviator                 | 22223.19
SUM

Let's sum the prices of cars for each car maker.

SELECT make, SUM(price) from car GROUP BY make;
postgres=# SELECT make, SUM(price) from car GROUP BY make;
     make      |    sum
---------------+------------
 Fillmore      |   87016.60
 Ford          | 3928319.24
 Smart         |  149249.17
 Maserati      |  435203.45
 Dodge         | 2493273.86
Arithmetic operators

In postgres, you can perform the following mathematical operations:

  • 1 + 2
  • 1 - 2
  • 1 * 2
  • 1 / 2
  • 1^2 (1 power 2)
  • 5! (factorial of 5 = 120)
  • 10 % 3 (10 modulo 3 = 1)

Operator's precedence is equivalent to any programming language so you can chain them (1/2-1)

Now let's say we're offering a 10% discount on all prices. Let's see what's the gain for the clients.

SELECT id, make, model, price, price *.10  FROM car;
postgres=# SELECT id, make, model, price, price *.10  FROM car;
  id  |     make      |          model          |  price   | ?column?
------+---------------+-------------------------+----------+-----------
    1 | Oldsmobile    | Silhouette              | 56568.37 | 5656.8370
    2 | GMC           | Savana 2500             | 55473.36 | 5547.3360
    3 | Ford          | Focus                   | 14106.87 | 1410.6870
    4 | GMC           | 1500 Club Coupe         | 69277.36 | 6927.7360
    5 | Volkswagen    | Scirocco                | 37747.81 | 3774.7810
    6 | Mercury       | Grand Marquis           | 53602.37 | 5360.2370 

Ok now I want to see the price before and after the discount.

SELECT id, make, model, price, ROUND(price - price *.10, 2)  FROM car;
postgres=# SELECT id, make, model, price, ROUND(price - price *.10, 2)  FROM car;
  id  |     make      |          model          |  price   |  round
------+---------------+-------------------------+----------+----------
    1 | Oldsmobile    | Silhouette              | 56568.37 | 50911.53
    2 | GMC           | Savana 2500             | 55473.36 | 49926.02
    3 | Ford          | Focus                   | 14106.87 | 12696.18
    4 | GMC           | 1500 Club Coupe         | 69277.36 | 62349.62

Note: ROUND si optional, it just helps reading the result better.

AS to name function columns

By default, postgres names the columns with the aggregate function that was used. If non, the column will have ? characters.
AS (alias) lets you override the name of these columns.

SELECT id, make, model, price, ROUND(price - price *.10, 2) AS discounted FROM car;
postgres=# SELECT id, make, model, price, ROUND(price - price *.10, 2) AS discounted FROM car;
  id  |     make      |          model          |  price   | discounted
------+---------------+-------------------------+----------+------------
    1 | Oldsmobile    | Silhouette              | 56568.37 |   50911.53
    2 | GMC           | Savana 2500             | 55473.36 |   49926.02
    3 | Ford          | Focus                   | 14106.87 |   12696.18

nulls handling

COALESCE to have a fallback value

COALESCE lets you define fallback values when requested data is null.

SELECT COALESCE(null, 1);

In this example, first parameter is null, so the result will be 1.
We can have more parameters if needed:

SELECT COALESCE(null, null, 1);

Since first value if null, the second one is picked. It is null as well, so the third will be returned.

Let's go back to our person database and add 1 more row by specifying an email.

postgres=# insert into person (first_name, last_name, gender, date_of_birth, email)
postgres-# values ('Rodrigo', 'Medina', 'Male', DATE '1963-10-23', 'r.medina@gmail.com');

Ok now we want to select all email addresses, and if it does not exist, print 'no email':

SELECT COALESCE(email, 'no email') FROM person;
postgres=# select coalesce(email, 'no email') from person;
      coalesce
--------------------
 no email
 no email
 no email
 r.medina@gmail.com
 (4 rows)
NULLIF to prevent against errors

The following example will throw an error

SELECT 10 / 0;
ERROR:  division by zero

but this works:

SELECT 10 / NULL;
 ?column?
----------

(1 row)

So I should be careful and turn any 0 value into null when dividing. This is one of the use cases where NULLIF(a, b) is useful. If a == b, it returns null

SELECT 10 / NULLIF(0, 0);

NOW() : Dates and timestamps

Get current timestamp.

SELECT NOW();
              now
-------------------------------
 2023-05-18 12:31:06.258372+00
(1 row)

Date only

We can cast the result of NOW() into a DATE with ::.

SELECT NOW()::DATE;
    now
------------
 2023-05-18
(1 row)

Time only

Same with Time with ::.

SELECT NOW()::TIME;
       now
-----------------
 12:33:21.034418
(1 row)

INTERVAL to add/remove time

INTERVAL lets you declare a quantity of time.

SELECT NOW() - INTERVAL '1 YEAR';
           ?column?
-------------------------------
 2022-05-18 12:35:38.589708+00
(1 row)

You can use the followings time qualifiers:

  • CENTURY/CENTURIES
  • YEAR/YEARS
  • MONTH/MONTHS
  • DAY/DAYS
  • SECOND/SECONDS
  • MILLISECOND/MILLISECONDS

and combine them :

SELECT NOW() - INTERVAL '1 YEAR 2 DAYS';
           ?column?
-------------------------------
 2022-05-16 12:37:59.856092+00
(1 row)

EXTRACT parts

Use extract the get specific values out of a datetime.

SELECT EXTRACT(MONTH FROM NOW());
 extract
---------
       5
(1 row)

You can use any of the time qualifiers listed for INTERVAL.

AGE function

AGE computes the number of years between now and a timestamp.

SELECT first_name, age(date_of_birth) FROM person;
 first_name |           age
------------+-------------------------
 Cem        | 38 years 1 mon 8 days
 Mete       | 34 years 4 mons 29 days
 Tom        | 71 years 3 mons 26 days
 Rodrigo    | 59 years 6 mons 26 days
(4 rows)

Primary keys

Primary keys (or PK) or the unique identifiers of a row in a table.

ALTER tables

DROP CONSTRAINT to delete column constraints

Here is how you can remove the PRIMARY KEY constraint from a column.

\d person
                                       Table "public.person"
    Column     |          Type          | Collation | Nullable |              Default
---------------+------------------------+-----------+----------+------------------------------------
 id            | bigint                 |           | not null | nextval('person_id_seq'::regclass)
 first_name    | character varying(50)  |           | not null |
 last_name     | character varying(50)  |           | not null |
 gender        | character varying(7)   |           | not null |
 date_of_birth | date                   |           | not null |
 email         | character varying(150) |           |          |
Indexes:
    "person_pkey" PRIMARY KEY, btree (id)

person_pkey is the indexer of primary keys and represents the PRIMARY KEY constraint here.
To remove it, simply do

ALTER TABLE person DROP CONSTRAINT person_pkey;
                                       Table "public.person"
    Column     |          Type          | Collation | Nullable |              Default
---------------+------------------------+-----------+----------+------------------------------------
 id            | bigint                 |           | not null | nextval('person_id_seq'::regclass)
 first_name    | character varying(50)  |           | not null |
 last_name     | character varying(50)  |           | not null |
 gender        | character varying(7)   |           | not null |
 date_of_birth | date                   |           | not null |
 email         | character varying(150) |           |          |

Indexes are gone now and we can now enter several rows with the same id.

ADD constraints

PRIMARY KEY
ALTER TABLE person ADD PRIMARY KEY (id);
                                       Table "public.person"
    Column     |          Type          | Collation | Nullable |              Default
---------------+------------------------+-----------+----------+------------------------------------
 id            | bigint                 |           | not null | nextval('person_id_seq'::regclass)
 first_name    | character varying(50)  |           | not null |
 last_name     | character varying(50)  |           | not null |
 gender        | character varying(7)   |           | not null |
 date_of_birth | date                   |           | not null |
 email         | character varying(150) |           |          |
Indexes:
    "person_pkey" PRIMARY KEY, btree (id)

person_pkey is back.

Note 1: You can have multiple rows selected in PRIMARY KEY if required.
Note 2: If your table has several rows with the same id, the command won't work until you've cleaned your table. Delete the duplicates or re-assign ids and retry.

UNIQUE

Say I only want the first_name to have unique values.

With default name
ALTER TABLE person ADD UNIQUE(first_name);
                                       Table "public.person"
    Column     |          Type          | Collation | Nullable |              Default
---------------+------------------------+-----------+----------+------------------------------------
 id            | bigint                 |           | not null | nextval('person_id_seq'::regclass)
 first_name    | character varying(50)  |           | not null |
 last_name     | character varying(50)  |           | not null |
 gender        | character varying(7)   |           | not null |
 date_of_birth | date                   |           | not null |
 email         | character varying(150) |           |          |
Indexes:
    "person_pkey" PRIMARY KEY, btree (id)
    "person_first_name_key" UNIQUE CONSTRAINT, btree (first_name)

The constraint's name has been set to person_first_name_key by postgres.

With custom name

I can create a constraint with a custom name (here unique_first_name).

ALTER TABLE person ADD CONSTRAINT unique_first_name UNIQUE(first_name);
                                       Table "public.person"
    Column     |          Type          | Collation | Nullable |              Default
---------------+------------------------+-----------+----------+------------------------------------
 id            | bigint                 |           | not null | nextval('person_id_seq'::regclass)
 first_name    | character varying(50)  |           | not null |
 last_name     | character varying(50)  |           | not null |
 gender        | character varying(7)   |           | not null |
 date_of_birth | date                   |           | not null |
 email         | character varying(150) |           |          |
Indexes:
    "person_pkey" PRIMARY KEY, btree (id)
    "unique_first_name" UNIQUE CONSTRAINT, btree (first_name)

Note 1: You can have multiple rows selected in UNIQUE if required, too have unique couples of values.

Now I can't add any person with a first_name that is already used.

CHECK

Let's you define conditional validation on column values.
For instance, let's limit the possible values for the gender column to Male and Female.

ALTER TABLE person ADD CONSTRAINT gender_constraint CHECK(gender = 'Female' OR gender = 'Male');
                                       Table "public.person"
    Column     |          Type          | Collation | Nullable |              Default
---------------+------------------------+-----------+----------+------------------------------------
 id            | bigint                 |           | not null | nextval('person_id_seq'::regclass)
 first_name    | character varying(50)  |           | not null |
 last_name     | character varying(50)  |           | not null |
 gender        | character varying(7)   |           | not null |
 date_of_birth | date                   |           | not null |
 email         | character varying(150) |           |          |
Indexes:
    "person_pkey" PRIMARY KEY, btree (id)
    "person_first_name_key" UNIQUE CONSTRAINT, btree (first_name)
Check constraints:
    "gender_constraint" CHECK (gender::text = 'Female'::text OR gender::text = 'Male'::text)

DELETE

A row

DELETE FROM person WHERE id = 3;
 id | first_name | last_name | gender | date_of_birth |       email
----+------------+-----------+--------+---------------+--------------------
  1 | Cem        | Soyding   | Male   | 1985-04-10    |
  2 | Mete       | Soyding   | Male   | 1988-12-20    |
  4 | Rodrigo    | Medina    | Male   | 1963-10-23    | r.medina@gmail.com
(3 rows)

All rows

Warning: Wipes out the entire table !. 99% of the time, you're gonna need a WHERE clause.

DELETE FROM person;

UPDATE

A row

UPDATE person SET email = 'noemail@gmail.com' WHERE id = 1;
 id | first_name | last_name | gender | date_of_birth |       email
----+------------+-----------+--------+---------------+--------------------
  2 | Mete       | Soyding   | Male   | 1988-12-20    |
  4 | Rodrigo    | Medina    | Male   | 1963-10-23    | r.medina@gmail.com
  1 | Cem        | Soyding   | Male   | 1985-04-10    | noemail@gmail.com

Note that you can list the columns to be set:

UPDATE person SET first_name = 'Gerry', last_name = 'Smith' WHERE id = 1;

All rows (e.g. entire column)

Warning: This is usually NOT what you want to do. 99% of the time, you're gonna need a WHERE clause.

UPDATE person SET email = 'noemail@gmail.com';

ON CONFLICT for conflicts handling

Whenever you try to do something that violates a UNIQUE constraint, you'll get error. For instance you can't add an entry with an already used primary key.

Now if you want to deal with conflicts when executing an operation, you can use ON CONFLICT.

DO NOTHING

INSERT INTO person (id, first_name, last_name, gender, date_of_birth)
VALUES (1, 'Fred', 'Garcia', 'Male', DATE '1992-02-04')
ON CONFLICT(id) DO NOTHING;

The statement cleary says that in case there is a conflict on the id column (which is the case), do not throw an error and silently ignore the insertion. Executing it woul return

RETURN 0 0

Note: ON CONFLICT only work on columns with a UNIQUE constraint (or primary keys). Using it on any other column will throw an error.

DO UPDATE

DO UPDATE lets you apply some of the changes on an conflicting insertion (a sort of Upsert operation).

INSERT INTO person (id, first_name, last_name, gender, date_of_birth)
VALUES (1, 'Louise', 'Muller', 'Female', DATE '1992-02-04')
ON CONFLICT(id) DO UPDATE SET gender = EXCLUDED.gender;

This statement says that if id=1 is already taken, apply the gender of this statement anyway. EXCLUDED points to the conflicted operation, e.g. the current one.

postgres=# select * from person;
 id | first_name | last_name | gender | date_of_birth |       email
----+------------+-----------+--------+---------------+--------------------
  2 | Mete       | Soyding   | Male   | 1988-12-20    |
  4 | Rodrigo    | Medina    | Male   | 1963-10-23    | r.medina@gmail.com
  1 | Cem        | Soyding   | Female | 1985-04-10    | noemail@gmail.com
(3 rows)

We can see the that the gender was updated.
Note that just like UPDATE, you can list several columns to be set after SET.

Relationships

In a relational database, tables can be related to each other. This is fundamental to clearly separate the entities represented in our tables and avoid repetition of data.

In our database, we have 2 tables : one for cars and one for persons. We did well by separating these entities. Now if we want a person to own a car, we can achieve this by building a relationship.

Reliationships use primary keys to bind rows of different tables together. This relation is expressed in the owning entity thanks to a FOREIGN KEY, which is a new column pointing on a foreign PRIMARY KEY.

ADD COLUMN

Contract:

  • 1 person can only have 1 car.
  • 2 persons cannot own the same car.
  • 1 car can only be owned by 1 person.
ALTER TABLE person ADD COLUMN car_id bigint REFERENCES car(id) UNIQUE;
                                       Table "public.person"
    Column     |          Type          | Collation | Nullable |              Default
---------------+------------------------+-----------+----------+------------------------------------
 id            | bigint                 |           | not null | nextval('person_id_seq'::regclass)
 first_name    | character varying(50)  |           | not null |
 last_name     | character varying(50)  |           | not null |
 gender        | character varying(7)   |           | not null |
 date_of_birth | date                   |           | not null |
 email         | character varying(150) |           |          |
 car_id        | bigint                 |           |          |
Indexes:
    "person_pkey" PRIMARY KEY, btree (id)
    "person_car_id_key" UNIQUE CONSTRAINT, btree (car_id)
    "person_first_name_key" UNIQUE CONSTRAINT, btree (first_name)
Check constraints:
    "gender_constraint" CHECK (gender::text = 'Female'::text OR gender::text = 'Male'::text)
Foreign-key constraints:
    "person_car_id_fkey" FOREIGN KEY (car_id) REFERENCES car(id)

We use a bigint type because it is the natural type of a bigserial primary key. Aditionnally, we indicate to postgres that the new column car_id should point on car table's id column. Finally, we define a UNIQUE constraint to never have the same car appearing in car_id.

SET FOREIGN KEY columns

Now let's assign a few cars to some of our persons.

UPDATE person SET car_id = 22 WHERE id = 1;
UPDATE person SET car_id = 41 WHERE id = 2;
 id | first_name | last_name | gender | date_of_birth |       email        | car_id
----+------------+-----------+--------+---------------+--------------------+--------
  4 | Rodrigo    | Medina    | Male   | 1963-10-23    | r.medina@gmail.com |
  1 | Cem        | Soyding   | Male   | 1985-04-10    | noemail@gmail.com  |     22
  2 | Mete       | Soyding   | Male   | 1988-12-20    |                    |     41
(3 rows)

Now it's not super convenient to read the car model. We have their ids but what we really need is a human readable information. That'y what inner joins are for.

INNER JOINS

An inner join takes 2 tables (A and B), and if there is a relation found between their records, join creates a new record (C, i.e. the result).

Now let's use our car_id:

SELECT * from person
JOIN car ON person.car_id = car.id;

JOIN indicates which table to join and ON maps the foreign key on the primary key.

 id | first_name | last_name | gender | date_of_birth |       email       | car_id | id |  make   |   model   |  price
----+------------+-----------+--------+---------------+-------------------+--------+----+---------+-----------+----------
  1 | Cem        | Soyding   | Male   | 1985-04-10    | noemail@gmail.com |     22 | 22 | Toyota  | Celica    | 55918.50
  2 | Mete       | Soyding   | Male   | 1988-12-20    |                   |     41 | 41 | Lincoln | Mark VIII | 28281.25
(2 rows)

Tip: To visualize large table entries better, use \x to activate the expanded mode (it's a toggle, so run the comand again to disable it). Now if you run the query again:

-[ RECORD 1 ]-+------------------
id            | 1
first_name    | Cem
last_name     | Soyding
gender        | Male
date_of_birth | 1985-04-10
email         | noemail@gmail.com
car_id        | 22
id            | 22
make          | Toyota
model         | Celica
price         | 55918.50
-[ RECORD 2 ]-+------------------
id            | 2
first_name    | Mete
last_name     | Soyding
gender        | Male
date_of_birth | 1988-12-20
email         |
car_id        | 41
id            | 41
make          | Lincoln
model         | Mark VIII
price         | 28281.25

Not all the entries of person appear in the output. Remember, if the foreign key is null, a record has no relation and hence, the JOIN operation does not include it in the result.

It is also possible to select the columns you want to appear in the result:

SELECT person.first_name, person.last_name, car.make, car.model from person
JOIN car ON person.car_id = car.id;
-[ RECORD 1 ]---------
first_name | Cem
last_name  | Soyding
make       | Toyota
model      | Celica
-[ RECORD 2 ]---------
first_name | Mete
last_name  | Soyding
make       | Lincoln
model      | Mark VIII
Left joins

If you want to include record that don't have a foreign key relationship in your join, simply use LEFT JOIN. The syntax is pretty much the same:

SELECT * from person
LEFT JOIN car ON person.car_id = car.id;
 id | first_name | last_name | gender | date_of_birth |       email        | car_id | id |  make   |   model   |  price
----+------------+-----------+--------+---------------+--------------------+--------+----+---------+-----------+----------
  1 | Cem        | Soyding   | Male   | 1985-04-10    | noemail@gmail.com  |     22 | 22 | Toyota  | Celica    | 55918.50
  2 | Mete       | Soyding   | Male   | 1988-12-20    |                    |     41 | 41 | Lincoln | Mark VIII | 28281.25
  4 | Rodrigo    | Medina    | Male   | 1963-10-23    | r.medina@gmail.com |        |    |         |       |
(3 rows)

but this time, the result includes people that do not have a car. Note that the car columns are all null for Rodrigo. We can confirm this by typing:

SELECT * from person
LEFT JOIN car ON person.car_id = car.id
WHERE car.* IS NULL;
 id | first_name | last_name | gender | date_of_birth |       email        | car_id | id | make | model| price
----+------------+-----------+--------+---------------+--------------------+--------+----+------+-------+-------
  4 | Rodrigo    | Medina    | Male   | 1963-10-23    | r.medina@gmail.com |        |    |      |  |
(1 row)

DELETE records with FOREIGN KEY

If you try to delete a record that is involved in a relationship, postgres will throw an error.

DELETE FROM car WHERE id = 22;
ERROR:  update or delete on table "car" violates foreign key constraint "person_car_id_fkey" on table "person"
DETAIL:  Key (id)=(22) is still referenced from table "person".

We have the following options:

  • Delete the relationship by setting car_id to null in person's record and then delete the car
  • Delete the person record first and then delete the car
Using CASCADE

We could have defined an ON DELETE constraint on the foreign key with CASCADE.

ALTER TABLE person ADD CONSTRAINT car_cascade_delete bigint REFERENCES car(id) UNIQUE ON DELETE CASCADE;

Now, when deleting a person record, if it has a relationship with car, the person and the car will be deleted.

This is considered a bad practice so use it with care.

Export to CSV

psql lets you copy the result of a query:

\copy (SELECT * from person LEFT JOIN car ON person.car_id = car.id) TO '/Users/root/Desktop/results.csv' ',' CSV HEADER;

Exploring

BIGSERIAL and Sequences

Let's take a look at what is inside the sequence table auto-created by postgres when we used bigserial type.

SELECT *  FROM person_id_seq;
 last_value | log_cnt | is_called
------------+---------+-----------
          4 |      32 | t

As you can see, the table contains the last value of the counter 4 and other data for diagnostics.

Now remember how the person table was constructed :

\d person
                                       Table "public.person"
    Column     |          Type          | Collation | Nullable |              Default
---------------+------------------------+-----------+----------+------------------------------------
 id            | bigint                 |           | not null | nextval('person_id_seq'::regclass)
 first_name    | character varying(50)  |           | not null |
 last_name     | character varying(50)  |           | not null |
 gender        | character varying(7)   |           | not null |
 date_of_birth | date                   |           | not null |
 email         | character varying(150) |           |          |
 car_id        | bigint                 |           |          |

Let's grab nextval('person_id_seq'::regclass) because this is just a function and do:

SELECT nextval('person_id_seq'::regclass);
 nextval
---------
       5
(1 row)

Now if we print the sequence table again, here is what we see:

 last_value | log_cnt | is_called
------------+---------+-----------
          5 |      32 | t
(1 row)

So my next insertion in person table will now have the id 6.

But you can always alter the counter and restart to a previous number::

ALTER SEQUENCE person_id_seq RESTART WITH 4;
 last_value | log_cnt | is_called
------------+---------+-----------
          4 |       0 | f
(1 row)

Extensions

postgres functionalities can be extended by installing extensions