PostgreSQL Tips
Performance and Debugging
Analyzing a Query
It is important to analyze queries, particularly those that are used often or that run in noticeable places within the application. To help inspect the efficiency of a query and of the database design, the EXPLAIN command can be prefixed to any SQL statement. For example:
EXPLAIN
SELECT  comments.title,
        comments.date_created,
        users.name
FROM    comments,
        users
WHERE   comments.user_id = users.id
  AND   comments.date_created >= DATE('2007-01-01')
ORDER BY date_created
LIMIT 20;
			
This does not actually execute the query. Rather, it displays the "query plan", which is how the database intends to execute the query. The plan contains a list of "plan nodes", which are basically the work components of the query: the database engine generally needs to perform multiple tasks in the course of executing a SQL statement. EXPLAIN lists those plan nodes and shows the estimated cost and output of each. The most critical piece of information is the cost. With it, decisions can be made about how to tune the query or the database.
The above example query produces the following query plan:
                                                 QUERY PLAN                                                 
------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..14.72 rows=20 width=31)
   ->  Nested Loop  (cost=0.00..1654.17 rows=2247 width=31)
         ->  Index Scan using comments_date_created on comments  (cost=0.00..864.56 rows=2247 width=26)
               Index Cond: (date_created >= '2007-01-01'::date)
         ->  Index Scan using users_id_key on users  (cost=0.00..0.34 rows=1 width=13)
               Index Cond: (users.id = comments.user_id)
(6 rows)
			
You'll notice that there are three basic things going on here:
  • The LIMIT clause
  • A nested loop
  • A couple of index scans being used to satisfy our WHERE clause
And if you look at each line, there are three pieces of information within the parentheses:
  • cost. The estimated cost of the node, in units that are conventionally scaled so that one sequential disk page fetch costs 1.0. The number to the left of the '..' is the startup cost (the work done before the first row can be returned); the number to the right is the total cost of retrieving all rows.
  • rows. The estimated number of rows output by this plan node if it runs to completion.
  • width. The estimated average width, in bytes, of the rows output by this plan node.
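As mentioned, the "cost" is expressed in units of estimated disk page fetches, with CPU work counted as a fraction of a page fetch. Disk pages may already be in RAM, or they may need to be read from disk. In either case, the cost indicates how much work will need to be done to complete the node, and the cost of a parent node includes the cost of its child nodes.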
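The Limit node in the plan above gives a small worked example of how these numbers relate. The planner expects to stop after 20 of the estimated 2247 rows, so it charges roughly that fraction of the nested loop's total cost. The arithmetic can be checked from psql. This is only an illustration of the estimate, and it assumes a PostgreSQL version that exposes the seq_page_cost and cpu_tuple_cost settings.

```sql
-- Nested Loop: total cost 1654.17 for an estimated 2247 rows.
-- LIMIT 20 is expected to need only 20/2247 of that work.
SELECT round(1654.17 * 20 / 2247, 2) AS limit_cost_estimate;  -- 14.72

-- The cost units come from planner settings; by default one
-- sequential page fetch costs 1.0 and CPU work is a small fraction of it.
SHOW seq_page_cost;
SHOW cpu_tuple_cost;
```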
Some questions to think about when reading EXPLAIN output:
  • Would an index help?
  • Am I using the right kind of index?
  • Do my indexes contain the right columns?
  • Is there a faster way of getting the same data?
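One way to explore these questions is to add a candidate index and compare the plans before and after. The sketch below assumes that an index on comments.user_id might help. The index name comments_user_id_idx is hypothetical and is not part of the schema shown above. Unlike plain EXPLAIN, EXPLAIN ANALYZE actually runs the query and reports the actual time and row counts next to the estimates.

```sql
-- Hypothetical index; the name and column choice are assumptions for illustration.
CREATE INDEX comments_user_id_idx ON comments (user_id);

-- Refresh planner statistics so the estimates reflect the current data.
ANALYZE comments;

-- Run the query for real and show actual time and rows beside the estimates.
EXPLAIN ANALYZE
SELECT  comments.title,
        comments.date_created,
        users.name
FROM    comments,
        users
WHERE   comments.user_id = users.id
  AND   comments.date_created >= DATE('2007-01-01')
ORDER BY date_created
LIMIT 20;

-- Drop the index again if the plan did not improve.
DROP INDEX comments_user_id_idx;
```

If the new plan's total cost and actual time are no better than before, the index only adds write overhead and is probably not worth keeping.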
Query Execution Times and Resource Usage
Consider displaying execution times and resource usage for queries submitted through the psql client. Changing log_statement_stats typically requires superuser privileges:
SET client_min_messages to log;
SET log_statement_stats TO on;
			
For example, using the above query:
LOG:  QUERY STATISTICS
DETAIL:  ! system usage stats:
!       0.007310 elapsed 0.008000 user 0.000000 system sec
!       [0.012000 user 0.004000 sys total]
!       0/0 [0/0] filesystem blocks in/out
!       0/227 [0/1068] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       0/0 [11/3] voluntary/involuntary context switches
! buffer usage stats:
!       Shared blocks:          1 read,          0 written, buffer hit rate = 99.79%
!       Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
!       Direct blocks:          0 read,          0 written
			
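This data can be used to tune the server settings. For example, a consistently low buffer hit rate suggests that the data being queried does not fit in the shared buffer cache.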
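The per-statement figures can be complemented by the cumulative statistics views. The following is a minimal sketch, assuming statistics collection is enabled on the server. It turns the session logging off again and then lists the tables with the most block reads, with a hit percentage computed from the pg_statio_user_tables view.

```sql
-- Turn the per-statement statistics off again for this session.
RESET log_statement_stats;
RESET client_min_messages;

-- Cumulative block reads versus buffer hits per table.
-- NULLIF avoids a division by zero for tables that were never read.
SELECT  relname,
        heap_blks_read,
        heap_blks_hit,
        round(100.0 * heap_blks_hit
              / NULLIF(heap_blks_hit + heap_blks_read, 0), 2) AS hit_pct
FROM    pg_statio_user_tables
ORDER BY heap_blks_read DESC
LIMIT 10;
```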
Text Manipulation
Stripping Text
Replacing Text