Skip to main content
Incident: PostgreSQL Extension Installation Causes Connection Failure
  1. PostgreSQL Posts/

Incident: PostgreSQL Extension Installation Causes Connection Failure

Ruohang Feng
Author
Ruohang Feng
Pigsty Founder, @Vonng
Table of Contents

Author: Vonng (@Vonng)

Today encountered an interesting case where a customer reported database connection issues. The error was:

psql: FATAL:  could not load library "/export/servers/pgsql/lib/pg_hint_plan.so": /export/servers/pgsql/lib/pg_hint_plan.so: undefined symbol: RINFO_IS_PUSHED_DOWN

Obviously, this error shows the plugin wasn’t compiled properly, reporting symbol not found. Therefore, database backend processes crashed with FATAL error and exited directly when attempting to load the pg_hint_plan plugin during startup.

Usually this problem is relatively easy to solve. Such additional extensions are typically specified in shared_preload_libraries - just remove this extension name.

But then…
#

The customer said they enabled the extension via ALTER ROLE|DATABASE SET session_preload_libraries = pg_hint_plan.

These two commands override system default parameters when using specific users or connecting to specific databases to load the pg_hint_plan plugin.

ALTER DATABASE postgres SET session_preload_libraries = pg_hint_plan;
ALTER ROLE postgres SET session_preload_libraries = pg_hint_plan;

If this is the case, it’s still solvable. Usually as long as other users or databases can log in normally, you can remove these two configuration lines via ALTER TABLE statements.

But the bad thing was, all users and databases had this parameter configured, so no connection could connect to the database.

In this situation, the database became vegetative - postmaster was still alive, but any newly created backend server processes would commit suicide due to failed extensions… Even external binary commands like dropdb couldn’t work.

So then…
#

Unable to establish database connections, conventional methods all failed… only dirty hacks remained.

If we could erase user and database level configuration items at binary level, then we could connect to the database and clean up the extensions.

DB and Role level configurations are stored in system catalog pg_db_role_setting, which has fixed OID = 2964, stored in global/2964 under the data directory. Shut down the database, open the pg_db_role_setting file with binary editor:

# Open with vim, use :%!xxd to edit binary
# After editing use :%!xxd -r to convert back to binary, then :wq to save
vi ${PGDATA}/global/2964

Here, replace all pg_hint_plan strings with equal-length ^@ binary zero characters. Of course, if you don’t care about original configurations, the simpler approach is directly truncating this file to zero length.

Restart database, finally could connect again.

Reproduction
#

This problem is very simple to reproduce. Initialize a new database instance:

initdb -D /pg/test -U postgres && pg_ctl -D /pg/test start

Then execute the following statement to experience this sourness:

psql postgres postgres -c 'ALTER ROLE postgres SET session_preload_libraries = pg_hint_plan;'

Lessons…
#

  1. After installing extensions, always verify the extension works properly before enabling it
  2. Always leave a way out: an emergency clean superuser or a pollution-free connectable database would avoid such troubles

Related