Cheatsheet : PostgresQL
psql
Postgres cli.
To enter psql
prompt in th image, type:
psql -U postgres
to use postgres
(default) user.
To quit, simply type
\q
To display help
psql --help
SQL
Since 1979.
Case insensitive.
CREATE database
CREATE DATABASE test;
Note: ;
is necessary to execute the command
CONNECT to database
psql -h localhost -p 5432 -U postgres test
or in psql
, list the databases with l
and then \c test
to connect to the database test
.
DROP (delete) to database
DROP DATABASE test;
WARNING: Never do that on a production server. The data would be LOST.
CREATE tables
Without constraints
In psql
prompt, line by line:
CREATE TABLE person (
id int,
firtst_name VARCHAR(50),
last_name VARCHAR(50),
gender VARCHAR(6),
date_of_birth DATE );
If wou want to see the tables, use the describe command \d
.
postgres=# \d
List of relations
Schema | Name | Type | Owner
--------+--------+-------+----------
public | person | table | postgres
(1 row)
You can also see the descrption for a particular table \d person
.
postgres=# \d person
Table "public.person"
Column | Type | Collation | Nullable | Default
---------------+-----------------------+-----------+----------+---------
id | integer | | |
first_name | character varying(50) | | |
last_name | character varying(50) | | |
gender | character varying(7) | | |
date_of_birth | date | | |
With constraints
In psql
prompt, line by line:
CREATE TABLE person (
id bigserial NOT NULL PRIMARY KEY, /* increments by itself */
firtst_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
gender VARCHAR(6) NOT NULL,
date_of_birth DATE NOT NULL);
We can't create a user without specifying all columns anymore.
Besides, if we use the describe command, you'll see that the database has now 1 table and 1 sequence, wich is the incrementing storage of id
, due to bigserial
.
Note: do not confuse bigserial
with bigint
. The type is identical but the sequence is only created for bigserial
.
postgres=# \d
List of relations
Schema | Name | Type | Owner
--------+---------------+----------+----------
public | person | table | postgres
public | person_id_seq | sequence | postgres
(2 rows)
postgres=# \d person
Table "public.person"
Column | Type | Collation | Nullable | Default
---------------+------------------------+-----------+----------+------------------------------------
id | bigint | | not null | nextval('person_id_seq'::regclass)
first_name | character varying(50) | | not null |
last_name | character varying(50) | | not null |
gender | character varying(7) | | not null |
date_of_birth | date | | not null |
Indexes:
"person_pkey" PRIMARY KEY, btree (id)
Note: You can chose to only view tables by using \dt
.
INSERT data in tables
1 entry
INSERT INTO person (
first_name,
last_name,
gender,
date_of_birth)
VALUES ('Cem', 'Soyding', 'Male', DATE '1985-04-10');
Note 1: Thanks to bigserial
, we don't have to indicate the primary key.
Note 2: Date value is not a string. You have to indicate DATE
followed by a string representation of the date.
To visualize the table entries, type SELECT * FROM person;
.
postgres=# select * from person
postgres-# ;
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
1 | Cem | Soyding | Male | 1985-04-10 |
2 | Mete | Soyding | Male | 1988-12-20 |
1000 entries
You can use (Mockaroo)[https://mockaroo.com/] to generate fake data for you.
Fill the form with the table structure and select SQL
output. The tool is very useful as it supports the generation of dummy names, last names, countries, dates, emails, etc ...
Download the data and you'll get a *.sql
file.
Open the file, delete the section where the table is created.
You can either copy/paste the insert
instruction in psql
, or (better), use the insert command \i c:\documents\data.sql
.
Note: The file must be located on the host. If like me, you're running postgres in docker, you'll have to share a folder with the image or simply use the copy/paste method.
SELECT records from tables
SELECT
lets you pick columns from tables. It can also be used to apply functions on columns and display their results as solumns in the output.
All columns
Use *
to select all columns.
SELECT * FROM person;
Calling SELECT FROM person;
will also work, but you'll have no column shown :
postgres=# select from person;
--
(3 rows)
Some columns
Calling SELECT first_name FROM person;
will also work, and you'll have 1 column printed out.
first_name
------------
Cem
Mete
Tom
(3 rows)
Calling SELECT first_name, last_name FROM person;
will print 2 columns and so on.
ORDER BY results
You can order the results ascending (default, ASC
) or descending (DESC
);
SELECT * FROM person ORDER BY date_of_birth;
or
SELECT * FROM person ORDER BY date_of_birth ASC;
will return
postgres=# SELECT * FROM person ORDER BY date_of_birth;
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
3 | Tom | Sawyer | Male | 1952-01-23 |
1 | Cem | Soyding | Male | 1985-04-10 |
2 | Mete | Soyding | Male | 1988-12-20 |
It is also possible to sort with 2 columns we it is highly discouraged.
DISTINCT to select values once
If you want each value of a column to appear only once, use DISTINCT
.
SELECT DISTINCT last_name FROM person;
postgres=# SELECT DISTINCT last_name FROM person;
last_name
-----------
Soyding
Sawyer
(2 rows)
WHERE clause to filter results
SELECT * FROM person WHERE last_name = 'Soyding';
postgres=# SELECT * FROM person WHERE last_name = 'Soyding';
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
1 | Cem | Soyding | Male | 1985-04-10 |
2 | Mete | Soyding | Male | 1988-12-20 |
(2 rows)
You can also have multiple conditions with the AND
keyword.
SELECT * FROM person WHERE last_name = 'Soyding' AND first_name = 'Cem';
postgres=# SELECT * FROM person WHERE last_name = 'Soyding' AND first_name = 'Cem';
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
1 | Cem | Soyding | Male | 1985-04-10 |
(1 row)
Use parenthesis with AND
/OR
to combine more conditions
SELECT * FROM person WHERE last_name = 'Soyding' AND (first_name = 'Cem' OR first_name = 'Mete');
postgres=# SELECT * FROM person WHERE last_name = 'Soyding' AND (first_name = 'Cem' OR first_name = 'Mete');
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
1 | Cem | Soyding | Male | 1985-04-10 |
2 | Mete | Soyding | Male | 1988-12-20 |
(2 rows)
You can chain as much as you wish WHERE <condition> AND <condition> AND (<condition> or <condition>) ...
.
Note: SELECT <condition>
is a valid command. It will return true
or false
in the prompt.
postgres=# SELECT 1 = 1;
?column?
----------
t
(1 row)
postgres=# SELECT 1 = 2;
?column?
----------
f
(1 row)
postgres=#
The following comparison operators work : =
, <
, >
, <=
, >=
, <>
(not equal).
Note: string comparison is case sensitive.
LIMIT or OFFSET to select n entries
SELECT * FROM person LIMIT 2;
We only have the first 2 entries:
postgres=# SELECT * FROM person LIMIT 2;
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
1 | Cem | Soyding | Male | 1985-04-10 |
2 | Mete | Soyding | Male | 1988-12-20 |
(2 rows)
SELECT * FROM person OFFSET 1;
We have skipped the first entry:
postgres=# SELECT * FROM person OFFSET 1;
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
2 | Mete | Soyding | Male | 1988-12-20 |
3 | Tom | Sawyer | Male | 1952-01-23 |
(2 rows)
SELECT * FROM person OFFSET 1 LIMIT 1;
We have skipped the 1 entry and have selected the first entry out of results:
postgres=# SELECT * FROM person OFFSET 1 LIMIT 1;
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
2 | Mete | Soyding | Male | 1988-12-20 |
(1 row)
or FETCH
LIMIT
is not pat of the official SQL language. It was introduced by some databases and became popular but the original way of limiting results was with FETCH
keyword.
SELECT * FROM person FETCH FIRST 2 ROW ONLY;
The output is identical to LIMIT 2
.
IN to verify inclusion
This is another operator that can be used with WHERE
.
Instead of doing WHERE <column> = <value> OR <column> = <value>
and have repetition of <column>
, we can have a shorter statement thanks to IN
.
SELECT * FROM person WHERE first_name IN ('Cem', 'Tom', 'Sandra');
postgres=# SELECT * FROM person WHERE first_name IN ('Cem', 'Tom', 'Sandra');
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
1 | Cem | Soyding | Male | 1985-04-10 |
3 | Tom | Sawyer | Male | 1952-01-23 |
(2 rows)
BETWEEN to select data in a range
SELECT * FROM person WHERE date_of_birth BETWEEN DATE '1985-01-01' AND '1989-01-01';
postgres=# SELECT * FROM person WHERE date_of_birth BETWEEN DATE '1985-01-01' AND '1989-01-01';
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
1 | Cem | Soyding | Male | 1985-04-10 |
2 | Mete | Soyding | Male | 1988-12-20 |
(2 rows)
LIKE operator to match patterns
The wildcard in sql is %
.
SELECT * FROM person WHERE first_name LIKE '%et%';
postgres=# SELECT * FROM person WHERE first_name LIKE '%et%';
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
2 | Mete | Soyding | Male | 1988-12-20 |
(1 row)
It is also possible to match single characters in the pattern with _
:
'__e33%'
: has 2 characters followed bye33
and any character(s) at the end'___'
: has 3 characters (would matchCem
andTom
)
ILIKE for case insensistiveness
SELECT * FROM person WHERE first_name ILIKE 'ce%';
postgres=# SELECT * FROM person WHERE first_name ILIKE 'ce%';
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+-------
1 | Cem | Soyding | Male | 1985-04-10 |
(1 row)
GROUP BY to sort results based on a particular column
GROUP BY
has more or less the same effect as DISTINCT
as it will group the identical results together for a column. However, it is capable of running aggregation routines for each result, such as COUNT
for instance.
SELECT last_name from person GROUP BY last_name;
same as
SELECT DISTINCT last_name from person;
but is we want some additional data for each result, we have to specify the function(s) to be executed as columns.
SELECT last_name, COUNT(*) from person GROUP BY last_name;
postgres=# SELECT last_name, COUNT(*) from person GROUP BY last_name;
last_name | count
-----------+-------
Soyding | 2
Sawyer | 1
(2 rows)
Note that you can append ORDER BY last_name
to the sql statement to have results in alphabetical order.
HAVING to filter grouped items
HAVING
lets you specify a function to filter the results of a GROUP BY
statement.
SELECT last_name, COUNT(*) from person GROUP BY last_name HAVING COUNT(*) > 1;
postgres=# SELECT last_name, COUNT(*) from person GROUP BY last_name HAVING COUNT(*) > 1;
last_name | count
-----------+-------
Soyding | 2
(1 row)
Note 1: HAVING
must be used right after GROUP BY
Note 2: COUNT(*)
in the HAVING
statement is a function, not the column value. Actually, running SELECT last_name from person GROUP BY last_name HAVING COUNT(*) > 1;
would also work but you just wouldn't see the count
column in the output.
Aggrgegate functions
As seen above, COUNT(expression)
is an aggregagte function but there a plenty others.
All the aggregate functions supported by postgres are available (here)[https://www.postgresql.org/docs/current/functions-aggregate.html].
Let's review the most common ones. For this demo, use the car.sql
document to create a table with car models.
MIN, MAX, AVG
SELECT MAX(price) from car;
SELECT MIN(price) from car;
SELECT AVG(price) from car;
postgres=# SELECT MAX(price) from car;
max
----------
99851.57
(1 row)
You can also ROUND
the result
SELECT ROUND(AVG(price)) from car;
postgres=# SELECT ROUND(AVG(price)) from car;
round
-------
53237
(1 row)
Now let's do something more advanced. Say we want to have the minimum price per car models.
SELECT make, model, MIN(price) from car GROUP BY make, model;
postgres=# SELECT make, model, MIN(price) from car GROUP BY make, model;
make | model | min
---------------+-------------------------+----------
Kia | Amanti | 85644.87
Oldsmobile | Silhouette | 56568.37
Subaru | Forester | 29907.65
Ferrari | 612 Scaglietti | 14094.20
Mercedes-Benz | SLS AMG | 41724.84
Jeep | Grand Cherokee | 40795.13
Honda | Civic Si | 70029.97
Hyundai | Accent | 17258.76
Lincoln | Aviator | 22223.19
SUM
Let's sum the prices of cars for each car maker.
SELECT make, SUM(price) from car GROUP BY make;
postgres=# SELECT make, SUM(price) from car GROUP BY make;
make | sum
---------------+------------
Fillmore | 87016.60
Ford | 3928319.24
Smart | 149249.17
Maserati | 435203.45
Dodge | 2493273.86
Arithmetic operators
In postgres, you can perform the following mathematical operations:
1 + 2
1 - 2
1 * 2
1 / 2
1^2
(1 power 2)5!
(factorial of 5 = 120)10 % 3
(10 modulo 3 = 1)
Operator's precedence is equivalent to any programming language so you can chain them (1/2-1
)
Now let's say we're offering a 10% discount on all prices. Let's see what's the gain for the clients.
SELECT id, make, model, price, price *.10 FROM car;
postgres=# SELECT id, make, model, price, price *.10 FROM car;
id | make | model | price | ?column?
------+---------------+-------------------------+----------+-----------
1 | Oldsmobile | Silhouette | 56568.37 | 5656.8370
2 | GMC | Savana 2500 | 55473.36 | 5547.3360
3 | Ford | Focus | 14106.87 | 1410.6870
4 | GMC | 1500 Club Coupe | 69277.36 | 6927.7360
5 | Volkswagen | Scirocco | 37747.81 | 3774.7810
6 | Mercury | Grand Marquis | 53602.37 | 5360.2370
Ok now I want to see the price before and after the discount.
SELECT id, make, model, price, ROUND(price - price *.10, 2) FROM car;
postgres=# SELECT id, make, model, price, ROUND(price - price *.10, 2) FROM car;
id | make | model | price | round
------+---------------+-------------------------+----------+----------
1 | Oldsmobile | Silhouette | 56568.37 | 50911.53
2 | GMC | Savana 2500 | 55473.36 | 49926.02
3 | Ford | Focus | 14106.87 | 12696.18
4 | GMC | 1500 Club Coupe | 69277.36 | 62349.62
Note: ROUND
si optional, it just helps reading the result better.
AS to name function columns
By default, postgres names the columns with the aggregate function that was used. If non, the column will have ?
characters.AS
(alias) lets you override the name of these columns.
SELECT id, make, model, price, ROUND(price - price *.10, 2) AS discounted FROM car;
postgres=# SELECT id, make, model, price, ROUND(price - price *.10, 2) AS discounted FROM car;
id | make | model | price | discounted
------+---------------+-------------------------+----------+------------
1 | Oldsmobile | Silhouette | 56568.37 | 50911.53
2 | GMC | Savana 2500 | 55473.36 | 49926.02
3 | Ford | Focus | 14106.87 | 12696.18
null
s handling
COALESCE to have a fallback value
COALESCE
lets you define fallback values when requested data is null
.
SELECT COALESCE(null, 1);
In this example, first parameter is null
, so the result will be 1.
We can have more parameters if needed:
SELECT COALESCE(null, null, 1);
Since first value if null
, the second one is picked. It is null
as well, so the third will be returned.
Let's go back to our person
database and add 1 more row by specifying an email.
postgres=# insert into person (first_name, last_name, gender, date_of_birth, email)
postgres-# values ('Rodrigo', 'Medina', 'Male', DATE '1963-10-23', 'r.medina@gmail.com');
Ok now we want to select all email addresses, and if it does not exist, print 'no email':
SELECT COALESCE(email, 'no email') FROM person;
postgres=# select coalesce(email, 'no email') from person;
coalesce
--------------------
no email
no email
no email
r.medina@gmail.com
(4 rows)
NULLIF to prevent against errors
The following example will throw an error
SELECT 10 / 0;
ERROR: division by zero
but this works:
SELECT 10 / NULL;
?column?
----------
(1 row)
So I should be careful and turn any 0
value into null
when dividing. This is one of the use cases where NULLIF(a, b)
is useful. If a == b
, it returns null
SELECT 10 / NULLIF(0, 0);
NOW() : Dates and timestamps
Get current timestamp.
SELECT NOW();
now
-------------------------------
2023-05-18 12:31:06.258372+00
(1 row)
Date only
We can cast the result of NOW()
into a DATE
with ::
.
SELECT NOW()::DATE;
now
------------
2023-05-18
(1 row)
Time only
Same with Time
with ::
.
SELECT NOW()::TIME;
now
-----------------
12:33:21.034418
(1 row)
INTERVAL to add/remove time
INTERVAL
lets you declare a quantity of time.
SELECT NOW() - INTERVAL '1 YEAR';
?column?
-------------------------------
2022-05-18 12:35:38.589708+00
(1 row)
You can use the followings time qualifiers:
CENTURY
/CENTURIES
YEAR
/YEARS
MONTH
/MONTHS
DAY
/DAYS
SECOND
/SECONDS
MILLISECOND
/MILLISECONDS
and combine them :
SELECT NOW() - INTERVAL '1 YEAR 2 DAYS';
?column?
-------------------------------
2022-05-16 12:37:59.856092+00
(1 row)
EXTRACT parts
Use extract the get specific values out of a datetime.
SELECT EXTRACT(MONTH FROM NOW());
extract
---------
5
(1 row)
You can use any of the time qualifiers listed for INTERVAL
.
AGE function
AGE
computes the number of years between now and a timestamp.
SELECT first_name, age(date_of_birth) FROM person;
first_name | age
------------+-------------------------
Cem | 38 years 1 mon 8 days
Mete | 34 years 4 mons 29 days
Tom | 71 years 3 mons 26 days
Rodrigo | 59 years 6 mons 26 days
(4 rows)
Primary keys
Primary keys (or PK) or the unique identifiers of a row in a table.
ALTER tables
DROP CONSTRAINT to delete column constraints
Here is how you can remove the PRIMARY KEY
constraint from a column.
\d person
Table "public.person"
Column | Type | Collation | Nullable | Default
---------------+------------------------+-----------+----------+------------------------------------
id | bigint | | not null | nextval('person_id_seq'::regclass)
first_name | character varying(50) | | not null |
last_name | character varying(50) | | not null |
gender | character varying(7) | | not null |
date_of_birth | date | | not null |
email | character varying(150) | | |
Indexes:
"person_pkey" PRIMARY KEY, btree (id)
person_pkey
is the indexer of primary keys and represents the PRIMARY KEY
constraint here.
To remove it, simply do
ALTER TABLE person DROP CONSTRAINT person_pkey;
Table "public.person"
Column | Type | Collation | Nullable | Default
---------------+------------------------+-----------+----------+------------------------------------
id | bigint | | not null | nextval('person_id_seq'::regclass)
first_name | character varying(50) | | not null |
last_name | character varying(50) | | not null |
gender | character varying(7) | | not null |
date_of_birth | date | | not null |
email | character varying(150) | | |
Indexes
are gone now and we can now enter several rows with the same id
.
ADD constraints
PRIMARY KEY
ALTER TABLE person ADD PRIMARY KEY (id);
Table "public.person"
Column | Type | Collation | Nullable | Default
---------------+------------------------+-----------+----------+------------------------------------
id | bigint | | not null | nextval('person_id_seq'::regclass)
first_name | character varying(50) | | not null |
last_name | character varying(50) | | not null |
gender | character varying(7) | | not null |
date_of_birth | date | | not null |
email | character varying(150) | | |
Indexes:
"person_pkey" PRIMARY KEY, btree (id)
person_pkey
is back.
Note 1: You can have multiple rows selected in PRIMARY KEY
if required.
Note 2: If your table has several rows with the same id
, the command won't work until you've cleaned your table. Delete the duplicates or re-assign ids and retry.
UNIQUE
Say I only want the first_name
to have unique values.
With default name
ALTER TABLE person ADD UNIQUE(first_name);
Table "public.person"
Column | Type | Collation | Nullable | Default
---------------+------------------------+-----------+----------+------------------------------------
id | bigint | | not null | nextval('person_id_seq'::regclass)
first_name | character varying(50) | | not null |
last_name | character varying(50) | | not null |
gender | character varying(7) | | not null |
date_of_birth | date | | not null |
email | character varying(150) | | |
Indexes:
"person_pkey" PRIMARY KEY, btree (id)
"person_first_name_key" UNIQUE CONSTRAINT, btree (first_name)
The constraint's name has been set to person_first_name_key
by postgres.
With custom name
I can create a constraint with a custom name (here unique_first_name
).
ALTER TABLE person ADD CONSTRAINT unique_first_name UNIQUE(first_name);
Table "public.person"
Column | Type | Collation | Nullable | Default
---------------+------------------------+-----------+----------+------------------------------------
id | bigint | | not null | nextval('person_id_seq'::regclass)
first_name | character varying(50) | | not null |
last_name | character varying(50) | | not null |
gender | character varying(7) | | not null |
date_of_birth | date | | not null |
email | character varying(150) | | |
Indexes:
"person_pkey" PRIMARY KEY, btree (id)
"unique_first_name" UNIQUE CONSTRAINT, btree (first_name)
Note 1: You can have multiple rows selected in UNIQUE
if required, too have unique couples of values.
Now I can't add any person with a first_name
that is already used.
CHECK
Let's you define conditional validation on column values.
For instance, let's limit the possible values for the gender
column to Male
and Female
.
ALTER TABLE person ADD CONSTRAINT gender_constraint CHECK(gender = 'Female' OR gender = 'Male');
Table "public.person"
Column | Type | Collation | Nullable | Default
---------------+------------------------+-----------+----------+------------------------------------
id | bigint | | not null | nextval('person_id_seq'::regclass)
first_name | character varying(50) | | not null |
last_name | character varying(50) | | not null |
gender | character varying(7) | | not null |
date_of_birth | date | | not null |
email | character varying(150) | | |
Indexes:
"person_pkey" PRIMARY KEY, btree (id)
"person_first_name_key" UNIQUE CONSTRAINT, btree (first_name)
Check constraints:
"gender_constraint" CHECK (gender::text = 'Female'::text OR gender::text = 'Male'::text)
DELETE
A row
DELETE FROM person WHERE id = 3;
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+--------------------
1 | Cem | Soyding | Male | 1985-04-10 |
2 | Mete | Soyding | Male | 1988-12-20 |
4 | Rodrigo | Medina | Male | 1963-10-23 | r.medina@gmail.com
(3 rows)
All rows
Warning: Wipes out the entire table !. 99% of the time, you're gonna need a WHERE
clause.
DELETE FROM person;
UPDATE
A row
UPDATE person SET email = 'noemail@gmail.com' WHERE id = 1;
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+--------------------
2 | Mete | Soyding | Male | 1988-12-20 |
4 | Rodrigo | Medina | Male | 1963-10-23 | r.medina@gmail.com
1 | Cem | Soyding | Male | 1985-04-10 | noemail@gmail.com
Note that you can list the columns to be set:
UPDATE person SET first_name = 'Gerry', last_name = 'Smith' WHERE id = 1;
All rows (e.g. entire column)
Warning: This is usually NOT what you want to do. 99% of the time, you're gonna need a WHERE
clause.
UPDATE person SET email = 'noemail@gmail.com';
ON CONFLICT for conflicts handling
Whenever you try to do something that violates a UNIQUE
constraint, you'll get error. For instance you can't add an entry with an already used primary key.
Now if you want to deal with conflicts when executing an operation, you can use ON CONFLICT
.
DO NOTHING
INSERT INTO person (id, first_name, last_name, gender, date_of_birth)
VALUES (1, 'Fred', 'Garcia', 'Male', DATE '1992-02-04')
ON CONFLICT(id) DO NOTHING;
The statement cleary says that in case there is a conflict on the id
column (which is the case), do not throw an error and silently ignore the insertion. Executing it woul return
RETURN 0 0
Note: ON CONFLICT
only work on columns with a UNIQUE
constraint (or primary keys). Using it on any other column will throw an error.
DO UPDATE
DO UPDATE
lets you apply some of the changes on an conflicting insertion (a sort of Upsert operation).
INSERT INTO person (id, first_name, last_name, gender, date_of_birth)
VALUES (1, 'Louise', 'Muller', 'Female', DATE '1992-02-04')
ON CONFLICT(id) DO UPDATE SET gender = EXCLUDED.gender;
This statement says that if id=1
is already taken, apply the gender of this statement anyway. EXCLUDED
points to the conflicted operation, e.g. the current one.
postgres=# select * from person;
id | first_name | last_name | gender | date_of_birth | email
----+------------+-----------+--------+---------------+--------------------
2 | Mete | Soyding | Male | 1988-12-20 |
4 | Rodrigo | Medina | Male | 1963-10-23 | r.medina@gmail.com
1 | Cem | Soyding | Female | 1985-04-10 | noemail@gmail.com
(3 rows)
We can see the that the gender was updated.
Note that just like UPDATE
, you can list several columns to be set after SET
.
Relationships
In a relational database, tables can be related to each other. This is fundamental to clearly separate the entities represented in our tables and avoid repetition of data.
In our database, we have 2 tables : one for cars
and one for persons
. We did well by separating these entities. Now if we want a person to own a car, we can achieve this by building a relationship.
Reliationships use primary keys to bind rows of different tables together. This relation is expressed in the owning entity thanks to a FOREIGN KEY
, which is a new column pointing on a foreign PRIMARY KEY
.
ADD COLUMN
Contract:
- 1 person can only have 1 car.
- 2 persons cannot own the same car.
- 1 car can only be owned by 1 person.
ALTER TABLE person ADD COLUMN car_id bigint REFERENCES car(id) UNIQUE;
Table "public.person"
Column | Type | Collation | Nullable | Default
---------------+------------------------+-----------+----------+------------------------------------
id | bigint | | not null | nextval('person_id_seq'::regclass)
first_name | character varying(50) | | not null |
last_name | character varying(50) | | not null |
gender | character varying(7) | | not null |
date_of_birth | date | | not null |
email | character varying(150) | | |
car_id | bigint | | |
Indexes:
"person_pkey" PRIMARY KEY, btree (id)
"person_car_id_key" UNIQUE CONSTRAINT, btree (car_id)
"person_first_name_key" UNIQUE CONSTRAINT, btree (first_name)
Check constraints:
"gender_constraint" CHECK (gender::text = 'Female'::text OR gender::text = 'Male'::text)
Foreign-key constraints:
"person_car_id_fkey" FOREIGN KEY (car_id) REFERENCES car(id)
We use a bigint
type because it is the natural type of a bigserial
primary key. Aditionnally, we indicate to postgres that the new column car_id
should point on car
table's id
column. Finally, we define a UNIQUE
constraint to never have the same car appearing in car_id
.
SET FOREIGN KEY columns
Now let's assign a few cars to some of our persons.
UPDATE person SET car_id = 22 WHERE id = 1;
UPDATE person SET car_id = 41 WHERE id = 2;
id | first_name | last_name | gender | date_of_birth | email | car_id
----+------------+-----------+--------+---------------+--------------------+--------
4 | Rodrigo | Medina | Male | 1963-10-23 | r.medina@gmail.com |
1 | Cem | Soyding | Male | 1985-04-10 | noemail@gmail.com | 22
2 | Mete | Soyding | Male | 1988-12-20 | | 41
(3 rows)
Now it's not super convenient to read the car model. We have their ids but what we really need is a human readable information. That'y what inner joins are for.
INNER JOINS
An inner join takes 2 tables (A
and B
), and if there is a relation found between their records, join creates a new record (C
, i.e. the result).
Now let's use our car_id
:
SELECT * from person
JOIN car ON person.car_id = car.id;
JOIN
indicates which table to join and ON
maps the foreign key on the primary key.
id | first_name | last_name | gender | date_of_birth | email | car_id | id | make | model | price
----+------------+-----------+--------+---------------+-------------------+--------+----+---------+-----------+----------
1 | Cem | Soyding | Male | 1985-04-10 | noemail@gmail.com | 22 | 22 | Toyota | Celica | 55918.50
2 | Mete | Soyding | Male | 1988-12-20 | | 41 | 41 | Lincoln | Mark VIII | 28281.25
(2 rows)
Tip: To visualize large table entries better, use \x
to activate the expanded mode (it's a toggle, so run the comand again to disable it). Now if you run the query again:
-[ RECORD 1 ]-+------------------
id | 1
first_name | Cem
last_name | Soyding
gender | Male
date_of_birth | 1985-04-10
email | noemail@gmail.com
car_id | 22
id | 22
make | Toyota
model | Celica
price | 55918.50
-[ RECORD 2 ]-+------------------
id | 2
first_name | Mete
last_name | Soyding
gender | Male
date_of_birth | 1988-12-20
email |
car_id | 41
id | 41
make | Lincoln
model | Mark VIII
price | 28281.25
Not all the entries of person
appear in the output. Remember, if the foreign key is null, a record has no relation and hence, the JOIN
operation does not include it in the result.
It is also possible to select the columns you want to appear in the result:
SELECT person.first_name, person.last_name, car.make, car.model from person
JOIN car ON person.car_id = car.id;
-[ RECORD 1 ]---------
first_name | Cem
last_name | Soyding
make | Toyota
model | Celica
-[ RECORD 2 ]---------
first_name | Mete
last_name | Soyding
make | Lincoln
model | Mark VIII
Left joins
If you want to include record that don't have a foreign key relationship in your join, simply use LEFT JOIN
. The syntax is pretty much the same:
SELECT * from person
LEFT JOIN car ON person.car_id = car.id;
id | first_name | last_name | gender | date_of_birth | email | car_id | id | make | model | price
----+------------+-----------+--------+---------------+--------------------+--------+----+---------+-----------+----------
1 | Cem | Soyding | Male | 1985-04-10 | noemail@gmail.com | 22 | 22 | Toyota | Celica | 55918.50
2 | Mete | Soyding | Male | 1988-12-20 | | 41 | 41 | Lincoln | Mark VIII | 28281.25
4 | Rodrigo | Medina | Male | 1963-10-23 | r.medina@gmail.com | | | | |
(3 rows)
but this time, the result includes people that do not have a car. Note that the car
columns are all null for Rodrigo. We can confirm this by typing:
SELECT * from person
LEFT JOIN car ON person.car_id = car.id
WHERE car.* IS NULL;
id | first_name | last_name | gender | date_of_birth | email | car_id | id | make | model| price
----+------------+-----------+--------+---------------+--------------------+--------+----+------+-------+-------
4 | Rodrigo | Medina | Male | 1963-10-23 | r.medina@gmail.com | | | | |
(1 row)
DELETE records with FOREIGN KEY
If you try to delete a record that is involved in a relationship, postgres will throw an error.
DELETE FROM car WHERE id = 22;
ERROR: update or delete on table "car" violates foreign key constraint "person_car_id_fkey" on table "person"
DETAIL: Key (id)=(22) is still referenced from table "person".
We have the following options:
- Delete the relationship by setting
car_id
to null in person's record and then delete the car - Delete the person record first and then delete the car
Using CASCADE
We could have defined an ON DELETE
constraint on the foreign key with CASCADE
.
ALTER TABLE person ADD CONSTRAINT car_cascade_delete bigint REFERENCES car(id) UNIQUE ON DELETE CASCADE;
Now, when deleting a person record, if it has a relationship with car
, the person and the car will be deleted.
This is considered a bad practice so use it with care.
Export to CSV
psql
lets you copy the result of a query:
\copy (SELECT * from person LEFT JOIN car ON person.car_id = car.id) TO '/Users/root/Desktop/results.csv' ',' CSV HEADER;
Exploring
BIGSERIAL and Sequences
Let's take a look at what is inside the sequence table auto-created by postgres when we used bigserial
type.
SELECT * FROM person_id_seq;
last_value | log_cnt | is_called
------------+---------+-----------
4 | 32 | t
As you can see, the table contains the last value of the counter 4
and other data for diagnostics.
Now remember how the person
table was constructed :
\d person
Table "public.person"
Column | Type | Collation | Nullable | Default
---------------+------------------------+-----------+----------+------------------------------------
id | bigint | | not null | nextval('person_id_seq'::regclass)
first_name | character varying(50) | | not null |
last_name | character varying(50) | | not null |
gender | character varying(7) | | not null |
date_of_birth | date | | not null |
email | character varying(150) | | |
car_id | bigint | | |
Let's grab nextval('person_id_seq'::regclass)
because this is just a function and do:
SELECT nextval('person_id_seq'::regclass);
nextval
---------
5
(1 row)
Now if we print the sequence table again, here is what we see:
last_value | log_cnt | is_called
------------+---------+-----------
5 | 32 | t
(1 row)
So my next insertion in person
table will now have the id 6
.
But you can always alter the counter and restart to a previous number::
ALTER SEQUENCE person_id_seq RESTART WITH 4;
last_value | log_cnt | is_called
------------+---------+-----------
4 | 0 | f
(1 row)
Extensions
postgres functionalities can be extended by installing extensions