An introduction to PL/pgSQL

PL/pgSQL is an imperative SQL-like language that offers several features to write a piece of runnable code. Besides SQL, this is the default language used in PostgreSQL to write functions, procedures, triggers, and programs in general.

PostgreSQL has a special statement called DO that accepts a PL/pgSQL piece of code and executes it as an anonymous code block. This is particularly handy to learn features of the language or to execute a code snippet without having to wrap it into a function or a routine. The syntax of the DO statement is quite straightforward and only requires us to specify the block of code to execute:

DO <plpgsql code block>;

In the rest of the chapter, the DO statement will be used to run PL/pgSQL pieces of code and demonstrate all the main features of the language.

A runnable PL/pgSQL piece of code must contain at least one code block. A PL/pgSQL code block is composed of two main parts:

  • DECLARE section, where you declare (and, if needed, initialize) variables
  • BEGIN/END section, where you write the execution flow

The variable declaration part can be omitted if the code block does not require any variables. Otherwise, any referenced variable must appear within this part. It is worth noting that the BEGIN/END control flow part does not define a transaction boundary; the BEGIN keyword has nothing to do with the one used to open a transaction.

Variables are declared using a name, which must be a valid SQL identifier, have a type and can have a default initial value. When a variable is declared, its initial value defaults to the SQL NULL PL. Otherwise, the variable assumes the specified initial value. In this case, it is also possible to declare the variable as a constant, meaning its value cannot be altered. It is important to note that the initial values are computed and assigned when the code block is effectively executed, not when it is defined. This means that dynamic values are computed when the block executes.

The control flow part includes expressions that end with a semicolon. These can be categorized as one of the following:

  • Recognized expressions, such as a variable assignment, conditionals, or loops
  • Non-recognized expressions, which are interpreted as regular SQL statements

There is a special statement, NULL, which does nothing at all. It is therefore a placeholder for the branches where no operation is required or implemented.

Each PL/pgSQL code block is managed as a single SQL case-insensitive text string, which you must write with the appropriate quoting. Since the function can contain other quoted strings, determining the right number of ticks can become a nightmare. Luckily, PostgreSQL provides dollar-quoting, which is where each string is contained between two identical tags, made by a couple of '$' signs. A dollar-quoted string is handled as if it were written between regular quotation marks, ' , but the content of the string itself is easier and more readable. The content of the tags does not matter, it could even be omitted:

$code$   -- opening tag
<plpgsql>
$code$ -- closing tag

$$ -- empty opening tag
<plpgsql>
$$ -- closing tag

Thanks to dollar quoting, writing a PL/pgSQL block of code is easy and only requires us to put the block within two dollar quote tags:

$code$
DECLARE
...
BEGIN
...
END
$code$

When we develop a piece of code, we should think about its maintenance and readability, so we should add comments. SQL-style comments are allowed within a block of code, including the following:

  • A single-line comment that begins with a -- sign
  • A multi-line comment that begins with /* and ends with */

As you might imagine, comments are ignored by the executor.

Neither horizontal or vertical spaces are important in this context, so we can choose to indent our code however we like. The language is also case-insensitive, so we can write a block of code in almost any format. It is, however, recommended to write keywords with an uppercase letter and variable names in lowercase. It is advisable to avoid the camel-case notation for names, since this can lead to name clashing because the block code will be treated in a case-insensitive mode.

Let's take a quick look at the RAISE statement. This is used to throw exceptions, such as signal error conditions, and can be used to print out messages while the code block executes. RAISE accepts an optional level of error reporting that corresponds to the log levels (such as WARNING, INFO, or DEBUG). It also accepts a message that can be dynamically built with positional arguments. The format of a RAISE message uses a single '%' sign as a placeholder for the next positional argument, which will be interpolated as a string. The following statement prints out a text message '1 + 2 = 3' at the DEBUG log level:

RAISE DEBUG '1 + 2 = %', (1 + 2);

We will look at more details about the RAISE statement in the following sections. For now, we can think of RAISE as a possible way to print out messages as the code executes.

It is now time to see plpgsql in action. The quickest way to do this is by means of the DO statement. The code in Listing 1 shows a simple block of code that declares three variables of different types, where i has no initial value and therefore defaults to NULL, and d and t are assigned to an initial value by means of the DEFAULT keyword. After the variable declaration part, the BEGIN/END section wraps the control flow part and a RAISE statement is used to display the variable values:

testdb=> DO
$code$
DECLARE
i int; -- initial value is NULL
t text DEFAULT 'Hello World!';
d date DEFAULT CURRENT_DATE;
BEGIN
RAISE INFO 'Variables are: d = [%], t = [%], i = [%]', d, t, i;
END
$code$;
OUTPUT:
Variables are: d = [2018-07-11], t = [Hello World!], i = [<NULL>].
Listing 1:  The very first block of PL/pgSQL code

We can nest PL/pgSQL blocks, but the language provides a single variable namespace, and therefore the inner block masks out variables if it provides a DECLARE part and re-defines the same variable names, as shown in Listing 2:

testdb=> DO $code$
DECLARE
i int DEFAULT 10;
t text DEFAULT 'Hello World!';
j int DEFAULT 100;
BEGIN
RAISE INFO 'Outer block: i = [%] t = [%] j = [%]', i, t, j;
-- nested block
DECLARE
i text DEFAULT 'PostgreSQL is amazing!';
t int DEFAULT 20;
j int DEFAULT 999;
BEGIN
RAISE INFO 'Inner block i = [%] t = [%] j = [%]', i, t, j;
END;

RAISE INFO 'Outer block: i = [%] t = [%] j = [%]', i, t, j;
END; $code$;

INFO: Outer block: i = [10] t = [Hello World!] j = [100]
INFO: Inner block i = [PostgreSQL is amazing!] t = [20] j = [999]
INFO: Outer block: i = [10] t = [Hello World!] j = [100]
Listing 2:  Nested code block and masked variables