October 11, 2024
Configuring PostgreSQL
PostgreSQL has many options that can be configured to optimize performance, security, and resource utilization. This guide will help you understand how to configure PostgreSQL to suit your needs. It goes over the most common settings you might want to change and gives a brief overview of what those settings do.
PostgreSQL can be configured via
- Command line arguments (e.g.,
-c "max_connections=100"
) - Environment variables (e.g.
-e POSTGRES_PASSWORD=mysecretpassword
) - A configuration file,
postgresql.conf
(see previous section for locations of this file)
For a complete reference of all the settings PostgreSQL offers, make sure you check out the official docs for the version you are running at https://www.postgresql.org/docs/current/config-setting.html.
Things to Note About the .conf file
- Comments in .conf files start with a pound sign (
#
) - You may optionally use
=
with or without spaces around it to assign values to parameters. - Any changes made to
postgresql.conf
will only take effect after the server is restarted.
Connection Settings
These settings control things like the number and behavior of connections to the database.
-
max_connections
- The maximum number of concurrent connections to the database. -
listen_addresses
– Specifies the network interfaces the server listens on for connections (default islocalhost
). This can be changed to*
to listen on all interfaces and allow remote connections. -
port
—The TCP port number for incoming connections (the default is5432
). This may be changed to avoid conflicts with other services or instances of PostgreSQL on the same machine. -
idle_in_transaction_session_timeout
– Sets a timeout (in milliseconds) for sessions that remain idle within a transaction to prevent holding locks unintentionally.
Resource Utilization Settings
These settings control how much memory the server uses for various operations.
-
shared_buffers
– The amount of memory the database server uses for caching data in memory. It is one of the most important settings for performance tuning. (Default is usually 128MB; 25% of total system memory is typically recommended.) -
effective_cache_size
- The amount of memory available for the database to use for disk-based tables. This is used by the query planner to estimate the cost of operations. -
work_mem
- The amount of memory that each worker process can use for sorting and hashing. -
maintenance_work_mem
- The amount of memory that the server will use for maintenance operations like VACUUM and CREATE INDEX. -
wal_buffers
– Memory reserved for Write-Ahead Logging (WAL) buffers, which store changes before they are written to disk. Important for write-heavy workloads. Defaults to -1, which means it will use 3% ofshared_buffers
but not more than one WAL segment, typically 16MB. -
effective_io_concurrency
- The estimated number of disk I/O operations per second. -
max_parallel_workers
– The maximum number of workers that the system can launch in parallel queries.
Write-Ahead Logging (WAL)
Allows tuning the WAL to adjust performance, durability, and disk space usage.
(I've put wal_buffers
in the previous section, but it's also a WAL setting.)
-
wal_level
– The level of write-ahead logging that the server uses. Higher levels provide more detailed logging and can impact the amount of disk space required for WAL. Options are asminimal
,replica
, andlogical
:-
minimal
: Generates the least amount of WAL data necessary for crash recovery. Use Case: Suitable for standalone databases where replication and point-in-time recovery (PITR) are not needed. Limitations: Does not support replication or PITR. -
replica
(the default, formerly known asarchive
): Writes enough WAL data to support standby servers and PITR. Use Case: Needed for physical replication (streaming replication) and archiving. -
logical
: Includes additional information required for logical replication and logical decoding. Use Case: Necessary when using logical replication slots or third-party applications that read WAL for data changes.
-
-
synchronous_commit
– Controls when the server commits a transaction.ON
commits immediately,remote_write
commits only after the data is written to the standby server, andremote_apply
commits only after the data is applied to the standby server.
Autovacuum
Controls how the server handles cleaning up unused rows in tables.
-
autovacuum
– Controls whether the server uses the autovacuum daemon to clean up dead tuples in the background. (on
/off
) -
vacuum_cost_delay
– Controls the delay between VACUUM operations. Set to a value in milliseconds.
Replication
Controls how the server handles replication across servers and hot standby for failover.
-
hot_standby
– Controls whether the server can serve as a hot standby. This can impact the server's performance, as the server must handle both applying WAL data and servicing read-only queries. It can be set toON
orOFF
. -
synchronous_commit
– Controls when the server commits a transaction.ON
commits immediately,remote_write
commits only after the data is written to the standby server, andremote_apply
commits only after the data is applied to the standby server.
Logging and Statistics
These settings control how the server logs messages and tracks activity and where logs are written.
-
log_destination
– Specifies where the server should log messages. This can be set tostderr
orcsvlog
. -
log_min_messages
– The minimum severity of messages that will be logged. -
log_min_duration_statement
– The minimum duration of statements that will be logged. -
log_statement
– Controls which SQL statements are logged. It can be set toall
,ddl
,mod
,stat
, ornone
. -
logging_collector
– Controls whether the server logs to a file or to the console. (on
/off
) -
log_filename
– Sometimes adjusted to control how logs are named or rotated in environments requiring fine-grained log management. -
track_activities
– Controls whether the server tracks activity information for each session. (on
/off
) -
track_counts
– Controls whether the server tracks the number of rows processed by each statement. (on
/off
) -
track_io_timing
– Controls whether the server tracks the time spent on I/O operations. (on
/off
)
Query Tuning / Planner
These settings allow you to influence the query planner's cost estimates, which can affect query execution plans. This can allow you to balance performance and resource usage.
-
random_page_cost
– The estimated cost of a random page read. You can adjust this value to influence the query planner's cost estimates (lower values encourage favoring index scans), which can affect query execution plans. -
effective_cache_size
– The amount of memory available for the database to use for disk-based tables. This is used by the query planner to estimate the cost of operations. -
work_mem
– The amount of memory that each worker process can use for sorting and hashing. -
maintenance_work_mem
– The amount of memory that the server will use for maintenance operations like VACUUM and CREATE INDEX. -
max_parallel_workers_per_gather
– Limits the number of parallel workers that can be used by a singleGather
node, which is a key element in parallel query execution. -
random_page_cost
- The estimated cost of a random page read. You can adjust this value to influence the query planner's cost estimates (lower values encourage favoring index scans), which can affect query execution plans.
Security and Authentication
These settings control how the server handles security and authentication.
-
ssl
– Enables or disables SSL connections to enhance security. Turning this on will require SSL certificates to be present. (ssl_cert_file
andssl_key_file
in the same directory as thepostgresql.conf
file.) -
password_encryption
– Controls whether the server encrypts passwords in thepg_authid
system catalog tables. Off is only recommended for development environments. -
authentication_timeout
– The maximum time allowed for authentication attempts. Set to a value in milliseconds. -
ssl_ciphers
– Specifies the SSL ciphers that the server will use for encryption.
Background Writer
These control how the server handles background writes to the disk to optimize I/O performance and reduce write latency spikes.
-
bgwriter_delay
– Controls the delay between background write operations (in ms).
Locale and Formatting
With these, you can customize different formatting for the intended audience of the server and apply locale-specific settings.
-
datestyle
– Controls the display format of date and time values. -
timezone
– The default timezone for the server.
Client Connection Settings
-
statement_timeout
– The maximum time allowed for a single statement to execute in milliseconds.
Helpful Tool: PGTune
Getting your head around all these settings when you're new to PostgreSQL can be daunting. There are tools that can help you configure your PostgreSQL server based on your hardware and environment. One worth checking out is a web application called PGTune that will help you customize your configuration based on your hardware and environment: https://pgtune.leopard.in.ua/
Conclusion
There are so many settings you can tweak to optimize Postgres, and you're not likely going to need most of them, but you can keep coming back to this guide as a handy reference if you're wondering about any of the settings.