git.postgresql.org Git - postgresql.git/log

Add regression test coverage for timetz_izone().

Extend the test added by commit 97957fdba so that it also covers
timetz_izone(), that is the "AT TIME ZONE interval" case.
This is mostly to see if xlc's apparent bug occurs there too,
but more code coverage is always welcome.

Discussion: https://postgr.es/m/2287835.1697464481@sss.pgh.pa.us

Ensure we have a snapshot while dropping ON COMMIT DROP temp tables.

Dropping a temp table could entail TOAST table access to clean out
toasted catalog entries, such as large pg_constraint.conbin strings
for complex CHECK constraints. If we did that via ON COMMIT DROP,
we triggered the assertion in init_toast_snapshot(), because
there was no provision for setting up a snapshot for the drop
actions. Fix that.

(I assume here that the adjacent truncation actions for ON COMMIT
DELETE ROWS don't have a similar problem: it doesn't seem like
nontransactional truncations would need to touch any toasted fields.
If that proves wrong, we could refactor a bit to have the same
snapshot acquisition cover that too.)

The test case added here does not fail before v15, because that
assertion was added in 277692220 which was not back-patched.
However, the race condition the assertion warns of surely
exists further back, so back-patch to all supported branches.

Per report from Richard Guo.

Discussion: https://postgr.es/m/CAMbWs4-x26=_QxxgdJyNbiCDzvtr2WV5ZDso_v-CukKEe6cBZw@mail.gmail.com

Move extra code out of the Pre/PostRestoreCommand() section.

If SIGTERM is received within this section, the startup process
will immediately proc_exit() in the signal handler, so it is
inadvisable to include any more code than is required there (as
such code is unlikely to be compatible with doing proc_exit() in a
signal handler). This commit moves the code recently added to this
section (see 1b06d7bac9 and 7fed801135) to outside of the section.
This ensures that the startup process only calls proc_exit() in its
SIGTERM handler for the duration of the system() call, which is how
this code worked from v8.4 to v14.

Reported-by: Michael Paquier, Thomas Munro
Analyzed-by: Andres Freund
Suggested-by: Tom Lane
Reviewed-by: Michael Paquier, Robert Haas, Thomas Munro, Andres Freund
Discussion: https://postgr.es/m/Y9nGDSgIm83FHcad%40paquier.xyz
Discussion: https://postgr.es/m/20230223231503.GA743455%40nathanxps13
Backpatch-through: 15

Update the documentation on recovering from (M)XID exhaustion.

The old documentation encourages entering single-user mode for no
reason, which is a bad plan in most cases. Instead, discourage users
from doing that, and explain the limited cases in which it may be
desirable.

The old documentation claims that running VACUUM as anyone but the
superuser can't possibly work, which is not really true, because it
might be that some other user has enough permissions to VACUUM all
the tables that matter. Weaken the language just a bit.

The old documentation claims that you can't run any commands
when near XID exhaustion, which is false because you can still
run commands that don't require an XID, like a SELECT without a
locking clause.

The old documentation doesn't clearly explain that it's a good idea
to get rid of prepared transactons, long-running transactions, and
replication slots that are preventing (M)XID horizon advancement.
Spell out the steps to do that.

Also, discourage the use of VACUUM FULL and VACUUM FREEZE in
this type of scenario.

Back-patch to v14. Much of this is good advice on all supported
versions, but before 60f1f09ff44308667ef6c72fbafd68235e55ae27
the chances of VACUUM failing in multi-user mode were much higher.

Alexander Alekseev, John Naylor, Robert Haas, reviewed at various
times by Peter Geoghegan, Hannu Krosing, and Andres Freund.

Discussion: http://postgr.es/m/CA+TgmoYtsUDrzaHcmjFhLzTk1VEv29mO_u-MT+XWHrBJ_4nD8A@mail.gmail.com

List 006_login_trigger.pl test for meson

Reported-by: Thomas Munro
Discussion: https://postgr.es/m/CA%2BhUKGLuqDUaYYhJnA1H1q5Z-k18kQHoEqZ5fiXtTi4038nspg%40mail.gmail.com

worker_spi: Fix test failure with BGWORKER_BYPASS_ROLELOGINCHECK

This is a consequence of 4817da51f69a that has bumped up
max_worker_processes, where now the last worker started by the test
would be able to start by itself a parallel worker because there are
more slots available. This did not show up before as the number of
bgworkers reached exactly 8, as known as the previous limit, at the end
of the test.

Per report from buildfarm member crake, reproducible with
debug_parallel_query = regress in the same fashion as fd4d93d269c0.

Try to handle torn reads of pg_control in frontend.

Some of our src/bin tools read the control file without any kind of
interlocking against concurrent writes from the server.  At least ext4
and ntfs can expose partially modified contents when you do that.

For now, we'll try to tolerate this by retrying up to 10 times if the
checksum doesn't match, until we get two reads in a row with the same
bad checksum.  This is not guaranteed to reach the right conclusion, but
it seems very likely to.  Thanks to Tom Lane for this suggestion.

Various ideas for interlocking or atomicity were considered too
complicated, unportable or expensive given the lack of field reports,
but remain open for future reconsideration.

Back-patch as far as 12.  It doesn't seem like a good idea to put a
heuristic change for a very rare problem into the final release of 11.

Reviewed-by: Anton A. Melnikov <aamelnikov@inbox.ru>
Reviewed-by: David Steele <david@pgmasters.net>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20221123014224.xisi44byq3cf5psi%40awork3.anarazel.de

worker_spi: Bump up max_worker_processes in TAP tests

mamba has detected a failure in the last test that should start a
bgworker while bypassing the role login check. The buildfarm did not
provide any information about its failure in the logs, but I suspect
that this is caused by an exhaustion of the max_worker_processes slots
set at 8 by default.

In "normal" test runs, the number of bgworkers running at this stage of
the test is already 7, so, if one of them spawns for example a parallel
worker all the slots would be taken, preventing the last worker of the
test to start.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/ZSyebsiub88pyJJO@paquier.xyz

Rename 005_login_trigger.pl to 006_login_trigger.pl

In order to avoid numbering collision with 005_sspi.pl.

Fix role names in src/test/authentication/t/005_login_trigger.pl

Per buildfarm member longfin.

Add a few recent commits to .git-blame-ignore-revs

Three commits need to be added compared to the last time this file was
updated:
- e9718b4bd3e4
- 01529c704008
- b6a77c6a6ccf

Fix code indentation violations in e83d1b0c40cc

koel has not reported this one yet, I have just bumped on it while
looking at a different patch.

Fix comment from commit 22655aa231.

Per automated complaint from BF animal koel this needed to be
re-indented, but there was also a typo. Back-patch to 16.

Add support event triggers on authenticated login

This commit introduces trigger on login event, allowing to fire some actions
right on the user connection.  This can be useful for logging or connection
check purposes as well as for some personalization of environment.  Usage
details are described in the documentation included, but shortly usage is
the same as for other triggers: create function returning event_trigger and
then create event trigger on login event.

In order to prevent the connection time overhead when there are no triggers
the commit introduces pg_database.dathasloginevt flag, which indicates database
has active login triggers.  This flag is set by CREATE/ALTER EVENT TRIGGER
command, and unset at connection time when no active triggers found.

Author: Konstantin Knizhnik, Mikhail Gribkov
Discussion: https://postgr.es/m/0d46d29f-4558-3af9-9c85-7774e14a7709%40postgrespro.ru
Reviewed-by: Pavel Stehule, Takayuki Tsunakawa, Greg Nancarrow, Ivan Panchenko
Reviewed-by: Daniel Gustafsson, Teodor Sigaev, Robert Haas, Andres Freund
Reviewed-by: Tom Lane, Andrey Sokolov, Zhihong Yu, Sergey Shinderuk
Reviewed-by: Gregory Stark, Nikita Malakhov, Ted Yu

Acquire ControlFileLock in relevant SQL functions.

Commit dc7d70ea added functions that read the control file, but didn't
acquire ControlFileLock. With unlucky timing, file systems that have
weak interlocking like ext4 and ntfs could expose partially overwritten
contents, and the checksum would fail.

Back-patch to all supported releases.

Reviewed-by: David Steele <david@pgmasters.net>
Reviewed-by: Anton A. Melnikov <aamelnikov@inbox.ru>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20221123014224.xisi44byq3cf5psi%40awork3.anarazel.de

Dissociate btequalimage() from interval_ops, ending its deduplication.

Under interval_ops, some equal values are distinguishable.  One such
pair is '24:00:00' and '1 day'.  With that being so, btequalimage()
breaches the documented contract for the "equalimage" btree support
function.  This can cause incorrect results from index-only scans.
Users should REINDEX any btree indexes having interval-type columns.
After updating, pg_amcheck will report an error for almost all such
indexes.  This fix makes interval_ops simply omit the support function,
like numeric_ops does.  Back-pack to v13, where btequalimage() first
appeared.  In back branches, for the benefit of old catalog content,
btequalimage() code will return false for type "interval".  Going
forward, back-branch initdb will include the catalog change.

Reviewed by Peter Geoghegan.

Discussion: https://postgr.es/m/20231011013317.22.nmisch@google.com

Don't spuriously report FD_SETSIZE exhaustion on Windows.

Starting on 2023-08-03, this intermittently terminated a "pgbench -C"
test in CI.  It could affect a high-client-count "pgbench" without "-C".
While parallel reindexdb and vacuumdb reach the same problematic check,
sufficient client count and/or connection turnover is less plausible for
them.  Given the lack of examples from the buildfarm or from manual
builds, reproducing this must entail rare operating system
configurations.  Also correct the associated error message, which was
wrong for non-Windows.  Back-patch to v12, where the pgbench check first
appeared.  While v11 vacuumdb has the problematic check, reaching it
with typical vacuumdb usage is implausible.

Reviewed by Thomas Munro.

Discussion: https://postgr.es/m/CA+hUKG+JwvTNdcyJTriy9BbtzF1veSRQ=9M_ZKFn9_LqE7Kp7Q@mail.gmail.com

Harden xxx_is_visible() functions against concurrent object drops.

For the same reasons given in commit 403ac226d, adjust these
functions to not assume that checking SearchSysCacheExists can
guarantee success of a later fetch.

This follows the same internal API choices made in the earlier commit:
add a function XXXExt(oid, is_missing) and use that to eliminate
the need for a separate existence check.  The changes are very
straightforward, though tedious.  For the moment I just made the new
functions static in namespace.c, but we could export them if a need
emerges.

Per bug #18014 from Alexander Lakhin.  Given the lack of hard evidence
that there's a bug in non-debug builds, I'm content to fix this only
in HEAD.

Discussion: https://postgr.es/m/18014-28c81cb79d44295d@postgresql.org

Harden has_xxx_privilege() functions against concurrent object drops.

The versions of these functions that accept object OIDs are supposed
to return NULL, rather than failing, if the target object has been
dropped.  This makes it safe(r) to use them in queries that scan
catalogs, since the functions will be applied to objects that are
visible in the query's snapshot but might now be gone according to
the catalog snapshot.  In most cases we implemented this by doing
a SearchSysCacheExists test and assuming that if that succeeds, we
can safely invoke the appropriate aclchk.c function, which will
immediately re-fetch the same syscache entry.  It was argued that
if the existence test succeeds then the followup fetch must succeed
as well, for lack of any intervening AcceptInvalidationMessages call.

Alexander Lakhin demonstrated that this is not so when
CATCACHE_FORCE_RELEASE is enabled: the syscache entry will be forcibly
dropped at the end of SearchSysCacheExists, and then it is possible
for the catalog snapshot to get advanced while re-fetching the entry.
Alexander's test case requires the operation to happen inside a
parallel worker, but that seems incidental to the fundamental problem.
What remains obscure is whether there is a way for this to happen in a
non-debug build.  Nonetheless, CATCACHE_FORCE_RELEASE is a very useful
test methodology, so we'd better make the code safe for it.

After some discussion we concluded that the most future-proof fix
is to give up the assumption that checking SearchSysCacheExists can
guarantee success of a later fetch.  At best that assumption leads
to fragile code --- for example, has_type_privilege appears broken
for array types even if you believe the assumption holds.  And it's
not even particularly efficient.

There had already been some work towards extending the aclchk.c
APIs to include "is_missing" output flags, so this patch extends
that work to cover all the aclchk.c functions that are used by the
has_xxx_privilege() functions.  (This allows getting rid of some
ad-hoc decisions about not throwing errors in certain places in
aclchk.c.)

In passing, this fixes the has_sequence_privilege() functions to
provide the same guarantees as their cousins: for some reason the
SearchSysCacheExists tests never got added to those.

There is more work to do to remove the unsafe coding pattern with
SearchSysCacheExists in other places, but this is a pretty
self-contained patch so I'll commit it separately.

Per bug #18014 from Alexander Lakhin.  Given the lack of hard evidence
that there's a bug in non-debug builds, I'm content to fix this only
in HEAD.  (Perhaps we should clean up the has_sequence_privilege()
oversight in the back branches, but in the absence of field complaints
I'm not too excited about that either.)

Discussion: https://postgr.es/m/18014-28c81cb79d44295d@postgresql.org

Fix bulk table extension when copying into multiple partitions

When COPYing into a partitioned table that does now permit the use of
table_multi_insert(), we could error out with
ERROR: could not read block NN in file "base/...": read only 0 of 8192 bytes

because BulkInsertState->next_free was not reset between partitions. This
problem occurred only when not able to use table_multi_insert(), as a
dedicated BulkInsertState for each partition is used in that case.

The bug was introduced in 00d1e02be24, but it was hard to hit at that point,
as commonly bulk relation extension is not used when not using
table_multi_insert(). It became more likely after 82a4edabd27, which expanded
the use of bulk extension.

To fix the bug, reset the bulk relation extension state in BulkInsertState in
ReleaseBulkInsertStatePin(). That was added (in b1ecb9b3fcf) to tackle a very
similar issue. Obviously the name is not quite correct, but there might be
external callers, and bulk insert state needs to be reset in precisely in the
situations that ReleaseBulkInsertStatePin() already needed to be called.

Medium term the better fix likely is to disallow reusing BulkInsertState
across relations.

Add a test that, without the fix, reproduces #18130 in most
configurations. The test also catches the problem fixed in b1ecb9b3fcf when
run with small shared_buffers.

Reported-by: Ivan Kolombet <enderstd@gmail.com>
Analyzed-by: Tom Lane <tgl@sss.pgh.pa.us>
Analyzed-by: Andres Freund <andres@anarazel.de>
Bug: #18130
Discussion: https://postgr.es/m/18130-7a86a7356a75209d%40postgresql.org
Discussion: https://postgr.es/m/257696.1695670946%40sss.pgh.pa.us
Backpatch: 16-

Improve the naming in wal_sync_method code.

* sync_method is renamed to wal_sync_method.

* sync_method_options[] is renamed to wal_sync_method_options[].

* assign_xlog_sync_method() is renamed to assign_wal_sync_method().

* The names of the available synchronization methods are now
  prefixed with "WAL_SYNC_METHOD_" and have been moved into a
  WalSyncMethod enum.

* PLATFORM_DEFAULT_SYNC_METHOD is renamed to
  PLATFORM_DEFAULT_WAL_SYNC_METHOD, and DEFAULT_SYNC_METHOD is
  renamed to DEFAULT_WAL_SYNC_METHOD.

These more descriptive names help distinguish the code for
wal_sync_method from the code for DataDirSyncMethod (e.g., the
recovery_init_sync_method configuration parameter and the
--sync-method option provided by several frontend utilities).  This
change also prevents name collisions between the aforementioned
sets of code.  Since this only improves the naming of internal
identifiers, there should be no behavior change.

Author: Maxim Orlov
Discussion: https://postgr.es/m/CACG%3DezbL1gwE7_K7sr9uqaCGkWhmvRTcTEnm3%2BX1xsRNwbXULQ%40mail.gmail.com

Update oldextversions test for pg_stat_statements 1.11

Commit 5a3423ad8e updated pg_stat_statements to 1.11 due to added
columns in the view for jit deform counters. Fixing oldextversion
was however missed in that commit. This adds a test for the 1.11
version view definition.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ2Y7c2VEg4S1KDXFcaHjyF8=KZa_rMV6WOe+KwE-KE97g@mail.gmail.com

Doc: Add more links in logical replication pages.

The logical replication pages in the docs mostly have links to
corresponding pub/sub commands whenever they are mentioned, but there were
some omissions. This patch adds the missing links.

Author: Peter Smith
Reviewed-by: Vignesh C, Amit Kapila
Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPu2S4RdzYKR7H5_E7QYWyq5hB0hL4EFrYbP91Qso62jeg%40mail.gmail.com

psql: Add completion support for AT [ LOCAL | TIME ZONE ]

AT TIME ZONE is completed with a list of supported timezones, something
not needed by AT LOCAL.

Author: Dagfinn Ilmari Mannsåker
Reviewed-by: Jim Jones
Discussion: https://postgr.es/m/87jzyzsvgv.fsf@wibble.ilmari.org

Add support for AT LOCAL

When converting a timestamp to/from with/without time zone, the SQL
Standard specifies an AT LOCAL variant of AT TIME ZONE which uses the
session's time zone. This includes three system functions able to do
the work in the same way as the existing flavors for AT TIME ZONE,
except that these need to be marked as stable as they depend on the
session's TimeZone GUC.

Bump catalog version.

Author: Vik Fearing
Reviewed-by: Laurenz Albe, Cary Huang, Michael Paquier
Discussion: https://postgr.es/m/8e25dec4-5667-c1a5-6581-167d710c2182@postgresfriends.org

Add wait events for checkpoint delay mechanism.

When MyProc->delayChkptFlags is set to temporarily block phase
transitions in a concurrent checkpoint, the checkpointer enters a
sleep-poll loop to wait for the flag to be cleared. We should show that
as a wait event in the pg_stat_activity view.

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGL7Whi8iwKbzkbn_1fixH3Yy8aAPz7mfq6Hpj7FeJrKMg%40mail.gmail.com

doc: Mention timezone(zone, time) in section for AT TIME ZONE

timezone(zone, timestamp) is already mentioned as an equivalent of the
two first patterns in the table describing the AT TIME ZONE variants,
but did not mention the third case about "time" and its equivalent as an
SQL function, so let's be consistent here.

Extracted from a larger patch by the same author.

Author: Vik Fearing
Discussion: https://postgr.es/m/8e25dec4-5667-c1a5-6581-167d710c2182@postgresfriends.org

Unify two isLogSwitch tests in XLogInsertRecord.

An upcoming patch wants to introduce an additional special case in
this function. To keep that as cheap as possible, minimize the amount
of branching that we do based on whether this is an XLOG_SWITCH
record.

Additionally, and also in the interest of keeping the overhead of
special-case code paths as low as possible, apply likely() to the
non-XLOG_SWITCH case, since only a very tiny fraction of WAL records
will be XLOG_SWITCH records.

Patch by me, reviewed by Dilip Kumar, Amit Kapila, Andres Freund,
and Michael Paquier.

Discussion: http://postgr.es/m/CA+TgmoYy-Vc6G9QKcAKNksCa29cv__czr+N9X_QCxEfQVpp_8w@mail.gmail.com

Fix runtime partition pruning for HASH partitioned tables

This could only affect HASH partitioned tables with at least 2 partition
key columns.

If partition pruning was delayed until execution and the query contained
an IS NULL qual on one of the partitioned keys, and some subsequent
partitioned key was being compared to a non-Const, then this could result
in a crash due to the incorrect keyno being used to calculate the
stateidx for the expression evaluation code.

Here we fix this by properly skipping partitioned keys which have a
nullkey set. Effectively, this must be the same as what's going on
inside perform_pruning_base_step().

Sergei Glukhov also provided a patch, but that's not what's being used
here.

Reported-by: Sergei Glukhov
Reviewed-by: tender wang, Sergei Glukhov
Discussion: https://postgr.es/m/d05b26fa-af54-27e1-f693-6c31590802fa@postgrespro.ru
Backpatch-through: 11, where runtime partition pruning was added.

Doc: fix grammatical errors for enable_partitionwise_aggregate

Author: Andrew Atkinson
Reviewed-by: Ashutosh Bapat
Discussion: https://postgr.es/m/CAG6XLEnC%3DEgq0YHRic2kWWDs4xwQnQ_kBA6qhhzAq1-pO_9Tfw%40mail.gmail.com
Backpatch-through: 11, where enable_partitionwise_aggregate was added

Fix incorrect step generation in HASH partition pruning

get_steps_using_prefix_recurse() incorrectly assumed that it could stop
recursive processing of the 'prefix' list when cur_keyno was one before
the step_lastkeyno.  Since hash partition pruning can prune using IS
NULL quals, and these IS NULL quals are not present in the 'prefix'
list, then that logic could cause more levels of recursion than what is
needed and lead to there being no more items in the 'prefix' list to
process.  This would manifest itself as a crash in some code that
expected the 'start' ListCell not to be NULL.

Here we adjust the logic so that instead of stopping recursion at 1 key
before the step_lastkeyno, we just look at the llast(prefix) item and
ensure we only recursively process up until just before whichever the last
key is.  This effectively allows keys to be missing in the 'prefix' list.

This change does mean that step_lastkeyno is no longer needed, so we
remove that from the static functions.  I also spent quite some time
reading this code and testing it to try to convince myself that there
are no other issues.  That resulted in the irresistible temptation of
rewriting some comments, many of which were just not true or inconcise.

Reported-by: Sergei Glukhov
Reviewed-by: Sergei Glukhov, tender wang
Discussion: https://postgr.es/m/2f09ce72-315e-2a33-589a-8519ada8df61@postgrespro.ru
Backpatch-through: 11, where partition pruning was introduced.

Add option to bgworkers to allow the bypass of role login check

This adds a new option called BGWORKER_BYPASS_ROLELOGINCHECK to the
flags available to BackgroundWorkerInitializeConnection() and
BackgroundWorkerInitializeConnectionByOid().

This gives the possibility to bgworkers to bypass the role login check,
making possible the use of a role that has no login rights while not
being a superuser. PostgresInit() gains a new flag called
INIT_PG_OVERRIDE_ROLE_LOGIN, taking advantage of the refactoring done in
4800a5dfb4c4.

Regression tests are added to worker_spi to check the behavior of this
new option with bgworkers.

Author: Bertrand Drouvot
Reviewed-by: Nathan Bossart, Michael Paquier, Bharath Rupireddy
Discussion: https://postgr.es/m/bcc36259-7850-4882-97ef-d6b905d2fc51@gmail.com

Reindent comment in GenericXLogFinish().

Restore pgindent cleanliness, per buildfarm member koel.

Fix missed optimization in relation_excluded_by_constraints().

In commit 3fc6e2d7f, I (tgl) argued that we only need to check for
a constant-FALSE restriction clause when there's exactly one
restriction clause, on the grounds that const-folding would have
thrown away anything ANDed with a Const FALSE.  That's true just after
const-folding has been applied, but subsequent processing such as
equivalence class expansion could result in cases where a Const FALSE
is ANDed with some other stuff.  (Compare for instance joinrels.c's
restriction_is_constant_false.)  Hence, tweak this logic to check all
the elements of the baserestrictinfo list, not just one; that's cheap
enough to not be worth worrying about.

There is one existing test case where this visibly improves the plan.
There would not be any savings in runtime, but the planner effort and
executor startup effort will be reduced, and anyway it's odd that
we can detect related cases but not this one.

Richard Guo (independently discovered by David Rowley)

Discussion: https://postgr.es/m/CAMbWs4_x3-CnVVrCboS1LkEhB5V+W7sLSCabsRiG+n7+5_kqbg@mail.gmail.com

Move canAcceptConnections check from ProcessStartupPacket to caller.

The check is not about processing the startup packet, so the calling
function seems like a more natural place. I'm also working on a patch
that moves 'canAcceptConnections' out of the Port struct, and this
makes that refactoring more convenient.

Reviewed-by: Tristan Partin
Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi

Improve some wording in pg_upgrade/IMPLEMENTATION

Author: Gurjeet Singh
Discussion: https://postgr.es/m/CABwTF4VFKtKrb78fBnMXwHvOu4a+-7y86siBSEety2knti2eGA@mail.gmail.com

Refactor InitPostgres() to use bitwise option flags

InitPostgres() has been using a set of boolean arguments to control its
behavior, and a patch under discussion was aiming at expanding it with a
third one.  In preparation for expanding this area, this commit switches
all the current boolean arguments of this routine to a single bits32
argument instead.  Two values are currently supported for the flags:
- INIT_PG_LOAD_SESSION_LIBS to load [session|local]_preload_libraries at
startup.
- INIT_PG_OVERRIDE_ALLOW_CONNS to allow connection to a database even if
it has !datallowconn.  This is used by bgworkers.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/ZSTn66_BXRZCeaqS@paquier.xyz

doc: pg_upgrade: use dynamic new cluster major version numbers

Also update docs to use more recent old version numbers

Reported-by: mark.a.sloan@gmail.com
Discussion: https://postgr.es/m/169506804412.3727336.8571753495127355296@wrigleys.postgresql.org

Backpatch-through: 16

doc: clarify that SSPI and GSSAPI are interchangeable

Reported-by: tpo_deb@sourcepole.ch
Discussion: https://postgr.es/m/167846222574.1803490.15815104179136215862@wrigleys.postgresql.org

Backpatch-through: 11

doc: Move CREATE ROLE's IN GROUP and USER to deprecated

Reported-by: t.kitynski@gmail.com
Discussion: https://postgr.es/m/167473556945.2667294.2003897901995802549@wrigleys.postgresql.org

Backpatch-through: master

doc: foreign servers with pushdown need matching collation

Reported-by: Pete Storer
Discussion: https://postgr.es/m/BL0PR05MB66283C57D72E321591AE4EB1F3CE9@BL0PR05MB6628.namprd05.prod.outlook.com

Backpatch-through: 11

doc: add SSL configuration section reference

Reported-by: Steve Atkins
Discussion: https://postgr.es/m/B82E80DD-1452-4175-B19C-564FE46705BA@blighty.com

Backpatch-through: 11

doc: clarify how the bootstrap user name is chosen

Discussion: https://postgr.es/m/167931662853.3349090.18217722739345182859@wrigleys.postgresql.org

Backpatch-through: 16

doc: document the need to analyze partitioned tables

Autovacuum does not do it.

Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20210913035409.GA10647@telsasoft.com

Backpatch-through: 11

Fix bug in GenericXLogFinish().

Mark the buffers dirty before writing WAL.

Discussion: https://postgr.es/m/25104133-7df8-cae3-b9a2-1c0aaa1c094a@iki.fi
Reviewed-by: Heikki Linnakangas
Backpatch-through: 11

Replace has_multiple_baserels() with a bitmap test on all_baserels.

Since we added the PlannerInfo.all_baserels set, it's not really
necessary to grovel over the rangetable to count baserels in the
current query. So let's drop has_multiple_baserels() in favor
of a bms_membership() test. This might be microscopically
faster, but the main point is to remove some unnecessary code.

Richard Guo

Discussion: https://postgr.es/m/CAMbWs4_8RcSbbfs1ASZLrMuL0c0EQgXWcoLTQD8swBRY_pQQiA@mail.gmail.com

test_decoding: Remove unnecessary table in twophase test

The end of this test is dropping all the relations created but forgot
about this one. This is not critical, but let's be clean, and the test
expects a cleanup, as documented.

Author: Nishant Sharma
Discussion: https://postgr.es/m/CADrsxdb0ueGV9nrC6s8zvXLkGUhnEjx7Ou_p5wo38TvmSvF83A@mail.gmail.com

pg_resetwal: Corrections around -c option

The present pg_resetwal code hardcodes the minimum value for -c as 2,
which is FrozenTransactionId, but it's not clear why that is allowed.
After some research, it was probably a mistake in the original patch.
Change it to FirstNormalTransactionId, which matches other xid-related
options in pg_resetwal.

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/d09f0e91-8757-642b-1a92-da9a52f5589a%40eisentraut.org

Add const to values and nulls arguments

This excludes any changes that would change the external AM APIs.

Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://www.postgresql.org/message-id/flat/14c31f4a-0347-0805-dce8-93a9072c05a5%40eisentraut.org

Fix possible crash in add_paths_to_append_rel()

While working on a8a968a82, I failed to consider that
cheapest_startup_path can be NULL when there is no non-parameterized
path in the pathlist. This is well documented in set_cheapest(), I just
failed to notice.

Here we adjust the code to just check if the RelOptInfo has a
cheapest_startup_path set before adding it to the startup_subpaths list.

Reported-by: Richard Guo
Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs49w3t03V69XhdCuw+GDwivny4uQUxrkVp6Gejaspt0wMQ@mail.gmail.com

Revert "Optimize various aggregate deserialization functions"

This reverts commit 608fd198def5390c3490bfe903730207dfd8eeb4.

On 2nd thoughts, the StringInfo API requires that strings are NUL
terminated and pointing directly to the data in a bytea Datum isn't NUL
terminated.

Discussion: https://postgr.es/m/CAApHDvorfO3iBZ=xpiZvp3uHtJVLyFaPBSvcAhAq2HPLnaNSwQ@mail.gmail.com

worker_spi: Fix another stability issue with BGWORKER_BYPASS_ALLOWCONN

worker_spi_launch() may report that a worker stopped when it fails to
connect on a database that does not allow connections if the worker
exits before the SQL function checks for the current status of the
worker.  The test is switched to use Cluster::psql instead of
safe_psql() so as it does not fail hard when this query errors.  While
on it, this removes a query that looks at pg_stat_activity to simplify
the test, as a check on the contents of the server logs achieves the
same when the worker cannot connect to the database without
datallowconn.

Per buildfarm members kestrel, mamba and serinus.  Bonus thanks to Tom
Lane for providing the logs of the failure from mamba that the buildfarm
was not able to show up.  Note that I have reproduced the failure with a
hardcoded stop point.

Discussion: https://postgr.es/m/3365937.1696801735@sss.pgh.pa.us

Doc: use CURRENT_USER not USER in plpgsql trigger examples.

While these two built-in functions do exactly the same thing,
CURRENT_USER seems preferable to use in documentation examples.
It's easier to look up if the reader is unsure what it is.
Also, this puts these examples in sync with an adjacent example
that already used CURRENT_USER.

Per question from Kirk Parker.

Discussion: https://postgr.es/m/CANwZ8rmN_Eb0h0hoMRS8Feftaik0z89PxVsKg+cP+PctuOq=Qg@mail.gmail.com

Rename StartBackgroundWorker() to BackgroundWorkerMain().

The comment claimed that it is "called from postmaster", but it is
actually called in the child process, pretty early in the process
initialization. I guess you could interpret "called from postmaster"
to mean that, but it seems wrong to me. Rename the function to be
consistent with other functions with similar role.

Reviewed-by: Thomas Munro
Discussion: https://www.postgresql.org/message-id/4f95c1fc-ad3c-7974-3a8c-6faa3931804c@iki.fi

Allocate Backend structs in PostmasterContext.

The child processes don't need them. By allocating them in
PostmasterContext, the memory gets free'd and is made available for
other stuff in the child processes.

Reviewed-by: Thomas Munro
Discussion: https://www.postgresql.org/message-id/4f95c1fc-ad3c-7974-3a8c-6faa3931804c@iki.fi

Clarify the checks in RegisterBackgroundWorker.

In EXEC_BACKEND or single-user mode, we process
shared_preload_libraries at postmaster startup as usual, but also at
backend startup. When a library calls RegisterBackgroundWorker() when
being loaded into a backend process, we go through the motions to add
the worker to BackgroundWorkerList, even though that is a
postmaster-private data structure. Make it return early when called in
a backend process, without changing BackgroundWorkerList.

You could argue that it was intentional: In non-EXEC_BACKEND mode, the
backend processes inherit BackgroundWorkerList at fork(), so it does
make some sense to initialize it to the same state in EXEC_BACKEND
mode, too. It's clearly a postmaster-private structure, though, and
all the functions that use it are clearly marked as "should only be
called in postmaster".

You could also argue that libraries should not call
RegisterBackgroundWorker() during backend startup. It's too late to
correctly register any static background workers at that stage. But
it's a common pattern in extensions, and it doesn't seem worth the
churn to require all extensions to change it.

Another sloppiness was the exception for "internal" background
workers. We checked that RegisterBackgroundWorker() was called during
shared_preload_libraries processing, or the background worker was an
internal one. That exception was made in commit 665d1fad99 to allow
postmaster to register the logical apply launcher in
ApplyLauncherRegister(). The way the check was written, it would not
complain if you registered an internal background worker in a regular
backend process. But it would complain if postmaster registered a
background worker defined in a shared library, outside
shared_preload_libraries processing. I think the correct rule is that
you can only register static background workers in the postmaster
process, and only before the bgworker shared memory array has been
initialized. Check for that more directly.

Reviewed-by: Thomas Munro
Discussion: https://www.postgresql.org/message-id/4f95c1fc-ad3c-7974-3a8c-6faa3931804c@iki.fi

Optimize various aggregate deserialization functions

The serialized representation of an internal aggregate state is a bytea
value.  In each deserial function, in order to "receive" the bytea value
we appended it onto a short-lived StringInfoData using
appendBinaryStringInfo.  This was a little wasteful as it meant having to
palloc memory, copy a (possibly long) series of bytes then later pfree
that memory.  Instead of going to this extra trouble, we can just fake up
a StringInfoData and point the data directly at the bytea's payload.  This
should help increase the performance of internal aggregate
deserialization.

Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAApHDvr=e-YOigriSHHm324a40HPqcUhSp6pWWgjz5WwegR=cQ@mail.gmail.com

Remove duplicate words in docs and code comments.

Additionally, add a missing "the" in a couple of places.

Author: Vignesh C, Dagfinn Ilmari Mannsåker
Discussion: http://postgr.es/m/CALDaNm28t+wWyPfuyqEaARS810Je=dRFkaPertaLAEJYY2cWYQ@mail.gmail.com

Strip off ORDER BY/DISTINCT aggregate pathkeys in create_agg_path

1349d2790 added code to adjust the PlannerInfo.group_pathkeys so that
ORDER BY / DISTINCT aggregate functions could obtain pre-sorted inputs
to allow faster execution.  That commit forgot to adjust the pathkeys in
create_agg_path().  Some code in there assumed that it was always fine
to make the AggPath's pathkeys the same as its subpath's.  That seems to
have been ok up until 1349d2790, but since that commit adds pathkeys for
columns which are within the aggregate function, those columns won't be
available above the aggregate node.  This can result in "could not find
pathkey item to sort" during create_plan().

The fix here is to strip off the additional pathkeys added by
adjust_group_pathkeys_for_groupagg().  It seems that the pathkeys here
will only ever be group_pathkeys, so all we need to do is check if the
length of the pathkey list is longer than the num_groupby_pathkeys and
get rid of the additional ones only if we see extras.

Reported-by: Justin Pryzby
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/ZQhYYRhUxpW3PSf9%40telsasoft.com
Backpatch-through: 16, where 1349d2790 was introduced

Remove debug_print_rel and replace usages with pprint

Going by c4a1933b4, b33ef397a and 05893712c (to name just a few), it seems
that maintaining debug_print_rel() is often forgotten.  In the case of
c4a1933b4, it was several years before anyone noticed that a path type
was not handled by debug_print_rel().  (debug_print_rel() is only
compiled when building with OPTIMIZER_DEBUG).

After a quick survey on the pgsql-hackers mailing list, nobody came
forward to admit that they use OPTIMIZER_DEBUG.  So to prevent any future
maintenance neglect, let's just remove debug_print_rel() and have
OPTIMIZER_DEBUG make use of pprint() instead (as suggested by Tom Lane).
If anyone wants to come forward to claim they make use of
OPTIMIZER_DEBUG in a way that they need debug_print_rel() then they have
around 10 months remaining in the v17 cycle where we could revert this.
If nobody comes forward in that time, then we can likely safely declare
debug_print_rel() as not worth keeping.

Discussion: https://postgr.es/m/CAApHDvoCdjo8Cu2zEZF4-AxWG-90S+pYXAnoDDa9J3xH-OrczQ@mail.gmail.com

Fix another typo in e0b1ee17dc

Reported-by: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4_kHMJDak75y1kBTirv-drS1-knT-7Mpg5LprAjqRJDVA%40mail.gmail.com

Restore proper linkage of pg_char_to_encoding() and friends.

Back in the 8.3 era we discovered that it was problematic if
libpq.so had encoding ID assignments different from the backend,
which is possible because on some platforms libpq.so might be
of a different major version from the calling programs.
psql should use libpq's assignments, but initdb has to use the
backend's, else it will put wrong values into pg_database.
The solution devised in commit 8468146b0 relied on giving initdb
its own copy of encnames.c rather than relying on the functions
exported by libpq.  Later, that metamorphosed into ensuring that
libpgcommon got linked before libpq -- which made things OK for
initdb but broke psql.  We didn't notice for lack of any changes
in enum pg_enc since then.  Commit 06843df4a reversed that, fixing
the latent bug in psql but adding one in initdb.  The meson build
infrastructure is also not being sufficiently careful about link
order, and trying to make it so would be equally fragile.

Hence, let's use a new scheme based on giving the libpq-exported
symbols different real names than the same functions exported from
libpgcommon.a or libpgcommon_srv.a.  (We could distinguish those
two cases as well, but there seems no need to.)  libpq gets the
official names to avoid an ABI break for libpq clients, while the
other cases use #define's to make the real names "xxx_private"
rather than "xxx".  By controlling where the #define's are
applied, we can force any particular client program to use one
set or the other of the encnames.c functions.

We cannot back-patch this, since it'd be an ABI break for backend
loadable modules, but there seems little need to.  We're just
trying to ensure that the world is safe for hypothetical future
additions to enum pg_enc.

In passing this should fix "duplicate symbol" linker warnings
that we've been seeing on AIX buildfarm members since commit
06843df4a.  It's not very clear why that linker is complaining
now, when there were strictly *more* duplicates visible before,
but in any case this should remove the reason for complaint.

Patch by me; thanks to Andres Freund for review.

Discussion: https://postgr.es/m/2385119.1696354473@sss.pgh.pa.us

Fix typos in e0b1ee17dc

Reported-by: Alexander Lakhin

Add test for checking the line length of --help output

There was some discussion what the line length should be. Most output
currently clearly targets around 80 columns, but the maximum in use
currently is 95, so we set that as the current maximum. This just
ensures that there is some guidance and there are no wild deviations.

based on patch by Atsushi Torikoshi <torikoshia@oss.nttdata.com>

Discussion: https://www.postgresql.org/message-id/flat/50ca8ff35a8dd8f9ec89963b503571a7@oss.nttdata.com

Remove environment-variable-based defaults in psql --help

This seemed inconsistent with the --help output of other tools.
Depending on the values, it can cause ugly formatting. Also, we're
not getting the defaults from libpq, we're just emulating the methods
libpq uses to derive these values, so they might not be 100% correct.

Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Discussion: https://www.postgresql.org/message-id/flat/50ca8ff35a8dd8f9ec89963b503571a7@oss.nttdata.com

Remove extra parenthesis from comment.

Skip checking of scan keys required for directional scan in B-tree

Currently, B-tree code matches every scan key to every item on the page.
Imagine the ordered B-tree scan for the query like this.

SELECT * FROM tbl WHERE col > 'a' AND col < 'b' ORDER BY col;

The (col > 'a') scan key will be always matched once we find the location to
start the scan.  The (col < 'b') scan key will match every item on the page
as long as it matches the last item on the page.

This patch implements prechecking of the scan keys required for directional
scan on beginning of page scan.  If precheck is successful we can skip this
scan keys check for the items on the page.  That could lead to significant
acceleration especially if the comparison operator is expensive.

Idea from patch by Konstantin Knizhnik.

Discussion: https://postgr.es/m/079c3f8e-3371-abe2-e93c-fc8a0ae3f571%40garret.ru
Reviewed-by: Peter Geoghegan, Pavel Borisov

Fix crash on syslogger startup

When syslogger starts up, ListenSockets is still NULL. Don't try to
pfree it. Oversight in commit e29c464395.

Reported-by: Michael Paquier
Discussion: https://www.postgresql.org/message-id/ZR-uNkgL7m60lWUe@paquier.xyz

worker_spi: Fix test failure with BGWORKER_BYPASS_ALLOWCONN

A bgworker can spawn parallel workers of its own when executing queries,
and if the worker uses BGWORKER_BYPASS_ALLOWCONN while the database it
is connected to does not allow connections, a parallel worker would fail
to startup. In the case of this module, the step checking for the
presence of the schema to create was spawning a worker, failing the last
test introduced by 991bb0f9653c.

This issue could be reproduced with debug_parallel_query = 'regress',
for example.

Per buildfarm member crake.

worker_spi: Add tests for BGWORKER_BYPASS_ALLOWCONN

This bgworker flag exists in the core code since eed1ce72e159, but was
never tested. This relies on 4f2994647ff1, that has added a way to
start dynamic workers with this flag enabled.

Reviewed-by: Bertrand Drouvot, Bharath Rupireddy
Discussion: https://postgr.es/m/bcc36259-7850-4882-97ef-d6b905d2fc51@gmail.com

Push attcompression and attstorage handling into BuildDescForRelation()

This was previously handled by the callers but it can be moved into a
common place.

Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org

Move BuildDescForRelation() from tupdesc.c to tablecmds.c

BuildDescForRelation() main job is to convert ColumnDef lists to
pg_attribute/tuple descriptor arrays, which is really mostly an
internal subroutine of DefineRelation() and some related functions,
which is more the remit of tablecmds.c and doesn't have much to do
with the basic tuple descriptor interfaces in tupdesc.c. This is also
supported by observing the header includes we can remove in tupdesc.c.
By moving it over, we can also (in the future) make
BuildDescForRelation() use more internals of tablecmds.c that are not
sensible to be exposed in tupdesc.c.

Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org

Push attidentity and attgenerated handling into BuildDescForRelation()

Previously, this was handled by the callers separately, but it can be
trivially moved into BuildDescForRelation() so that it is handled in a
central place.

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e4da@eisentraut.org

Refactor ListenSocket array.

Keep track of the used size of the array. That avoids looping through
the whole array in a few places. It doesn't matter from a performance
point of view since the array is small anyway, but this feels less
surprising and is a little less code. Now that we have an explicit
NumListenSockets variable that is statically initialized to 0, we
don't need the loop to initialize the array.

Allocate the array in PostmasterContext. The array isn't needed in
child processes, so this allows reusing that memory. We could easily
make the array resizable now, but we haven't heard any complaints
about the current 64 sockets limit.

Discussion: https://www.postgresql.org/message-id/7bb7ad65-a018-2419-742f-fa5fd877d338@iki.fi

Improve JsonLexContext's freeability

Previously, the JSON code didn't have to worry too much about freeing
JsonLexContext, because it was never too long-lived.  With new features
being added for SQL/JSON this is no longer the case.  Add a routine
that knows how to free this struct and apply that to a few places, to
prevent this from becoming problematic.

At the same time, we change the API of makeJsonLexContextCstringLen to
make it receive a pointer to JsonLexContext for callers that want it to
be stack-allocated; it can also be passed as NULL to get the original
behavior of a palloc'ed one.

This also causes an ABI break due to the addition of flags to
JsonLexContext, so we can't easily backpatch it.  AFAICS that's not much
of a problem; apparently some leaks might exist in JSON usage of
text-search, for example via json_to_tsvector, but I haven't seen any
complaints about that.

Per Coverity complaint about datum_to_jsonb_internal().

Discussion: https://postgr.es/m/20230808174110.oq3iymllsv6amkih@alvherre.pgsql

Consider cheap startup paths in add_paths_to_append_rel

6b94e7a6d did this for ordered append paths to allow fast startup
MergeAppends, however, nothing was done for the Append case.

Here we adjust add_paths_to_append_rel() to have it build an AppendPath
containing the cheapest startup paths from each of the child relations
when the append rel has "consider_startup" set.

Author: Andy Fan, David Rowley
Discussion: https://www.postgresql.org/message-id/CAKU4AWrXSkUV=Pt-gRxQT7EbfUeNssprGyNsB=5mJibFZ6S3ww@mail.gmail.com

Fix memory leak in Memoize code

Ensure we switch to the per-tuple memory context to prevent any memory
leaks of detoasted Datums in MemoizeHash_hash() and MemoizeHash_equal().

Reported-by: Orlov Aleksej
Author: Orlov Aleksej, David Rowley
Discussion: https://postgr.es/m/83281eed63c74e4f940317186372abfd%40cft.ru
Backpatch-through: 14, where Memoize was added

Constify crc32_sz

Author: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/e08317a0-a2e7-c60d-c14a-ad9fc34f8f6c%40eisentraut.org

Modernize const handling with readline

The comment

    /* On some platforms, readline is declared as readline(char *) */

is obsolete.  The casting away of const can be removed.

The const in the readline() prototype was added in GNU readline 4.2,
released in 2001.  BSD libedit has also had const in the prototype
since at least 2001.

(The commit that introduced this comment (187e865174) talked about
FreeBSD 4.8, which didn't have readline compatibility in libedit yet,
so it must have been talking about GNU readline in the base system.
This checks out, but already FreeBSD 5 had an updated GNU readline
with const.)

Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/862fc1d4-9a0c-d2b6-5451-ee3dc750bcab%40eisentraut.org

worker_spi: Expand set of options to start workers

A couple of new options are added to this module to provide more control
on the ways bgworkers are started:
- A new GUC called worker_spi.role to control which role to use by
default when starting a worker.
- worker_spi_launch() gains three arguments: a role OID, a database OID
and flags (currently only BGWORKER_BYPASS_ALLOWCONN).  By default, the
role OID and the database OID are InvalidOid, in which case the worker
would use the related GUCs.

Workers loaded by shared_preload_libraries use the default values
provided by the GUCs, with flags at 0.  The options are given to the
main bgworker routine through bgw_extra.  A test case is tweaked to
start two dynamic workers with databases and roles defined by the caller
of worker_spi_launch().

These additions will have the advantage of expanding the tests for
bgworkers, for at least two cases:
- BGWORKER_BYPASS_ALLOWCONN has no coverage in the core tree.
- A new bgworker flag is under discussion, and this eases the
integration of new tests.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/bcc36259-7850-4882-97ef-d6b905d2fc51@gmail.com

dblink: Replace WAIT_EVENT_EXTENSION with custom wait events

Two custom wait events are added here:
- "DblinkConnect", when waiting to establish a connection to a remote
server.
- "DblinkGetConnect", when waiting to establish a connection to a remote
server but it could not be found in the list of already-opened ones.

Author: Masahiro Ikeda
Discussion: https://postgr.es/m/197bce267fa691a0ac62c86c4ab904c4@oss.nttdata.com

postgres_fdw: Replace WAIT_EVENT_EXTENSION with custom wait events

Three custom wait events are added here:
- "PostgresFdwCleanupResult", waiting while cleaning up PQgetResult() on
transaction abort.
- "PostgresFdwConnect", waiting to establish a connection to a remote
server.
- "PostgresFdwGetResult", waiting to receive a result from a remote
server.

Author: Masahiro Ikeda
Discussion: https://postgr.es/m/197bce267fa691a0ac62c86c4ab904c4@oss.nttdata.com

Document that --sync-method takes an argument.

This was missed in commit 8c16ad3b43.

Reported-by: Robert Haas
Reviewed-by: Daniel Gustafsson, Robert Haas, Alvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/CA%2BTgmoZi7pcx-ec3oJLWSr2R%3DDn2Zeiyx3EXQKc_1TTvA6Eepg%40mail.gmail.com

doc: Clarify not-null constraints in information schema

Add a bit of clarification in various places that not-null constraints
are included under check constraints in the information schema.

test_shm_mq: Replace WAIT_EVENT_EXTENSION with custom wait events

Two custom wait events are added here:
- "TestShmMqBgWorkerStartup", when setting up a set of bgworkers in
wait_for_workers_to_become_ready().
- "TestShmMqMessageQueue", when waiting for a queued message in
test_shm_mq_pipelined().

Author: Masahiro Ikeda
Discussion: https://postgr.es/m/197bce267fa691a0ac62c86c4ab904c4@oss.nttdata.com

worker_spi: Rename custom wait event to "WorkerSpiMain"

This naming is more consistent with all the other user-facing wait event
strings. Other in-core modules will use the same naming convention, so
let's be consistent here as well.

Extracted from a larger patch by the same author.

Author: Masahiro Ikeda
Discussion: https://postgr.es/m/197bce267fa691a0ac62c86c4ab904c4@oss.nttdata.com

Doc: suppress "exceed the available area" warning in PDF build.

Allow a line break in example output, as we have done elsewhere.
Overlength output was added in commit 1e68e43d3.

While here, adjust some shaky grammar in an adjacent note
(from a different commit, c9af05465).

Per buildfarm.

Remove RelationGetIndexRawAttOptions()

There was only one caller left, for which this function was overkill.

Also, having it in relcache.c was inappropriate, since it doesn't work
with the relcache at all.

Discussion: https://www.postgresql.org/message-id/flat/f84640e3-00d3-5abd-3f41-e6a19d33c40b@eisentraut.org

Remove IndexInfo.ii_OpclassOptions field

It is unnecessary to include this field in IndexInfo. It is only used
by DDL code, not during execution. It is really only used to pass
local information around between functions in index.c and indexcmds.c,
for which it is clearer to use local variables, like in similar cases.

Discussion: https://www.postgresql.org/message-id/flat/f84640e3-00d3-5abd-3f41-e6a19d33c40b@eisentraut.org

Add some notes about why "ALTER TYPE enum DROP VALUE" is hard.

In hopes of putting these where any would-be implementer is sure to
find them, make a placeholder grammar production for ALTER DROP VALUE
and put them there. This is really just a docs patch, though.

Vik Fearing, with a bit more wordsmithing by me

Discussion: https://postgr.es/m/9fffd149-da0f-0c9c-6745-731fb688642a@postgresfriends.org

In basebackup.c, refactor to create read_file_data_into_buffer.

This further reduces the length and complexity of sendFile(),
hopefully make it easier to understand and modify. In addition
to moving some logic into a new function, I took this opportunity
to make a few slight adjustments to sendFile() itself, including
renaming the 'len' variable to 'bytes_done', since we use it to represent
the number of bytes we've already handled so far, not the total
length of the file.

Patch by me, reviewed by David Steele.

Discussion: http://postgr.es/m/CA+TgmoYt5jXH4U6cu1dm9Oe2FTn1aae6hBNhZzJJjyjbE_zYig@mail.gmail.com

In basebackup.c, refactor to create verify_page_checksum.

If checksum verification fails for a particular page, we reread the
page and try one more time. The code that does this somewhat complex
and difficult to follow. Move some of the logic into a new function
and rearrange the code a bit to try to make it clearer. This way,
we don't need the block_retry Boolean, a couple of other variables
move from sendFile() into the new function, and some code is now less
deeply indented.

Patch by me, reviewed by David Steele.

Discussion: http://postgr.es/m/CA+TgmoYt5jXH4U6cu1dm9Oe2FTn1aae6hBNhZzJJjyjbE_zYig@mail.gmail.com

Avoid memory size overflow when allocating backend activity buffer

The code in charge of copying the contents of PgBackendStatus to local
memory could fail on memory allocation because of an overflow on the
amount of memory to use. The overflow can happen when combining a high
value track_activity_query_size (max at 1MB) with a large
max_connections, when both multiplied get higher than INT32_MAX as both
parameters treated as signed integers. This could for example trigger
with the following functions, all calling pgstat_read_current_status():
- pg_stat_get_backend_subxact()
- pg_stat_get_backend_idset()
- pg_stat_get_progress_info()
- pg_stat_get_activity()
- pg_stat_get_db_numbackends()

The change to use MemoryContextAllocHuge() has been introduced in
8d0ddccec636, so backpatch down to 12.

Author: Jakub Wartak
Discussion: https://postgr.es/m/CAKZiRmw8QSNVw2qNK-dznsatQqz+9DkCquxP0GHbbv1jMkGHMA@mail.gmail.com
Backpatch-through: 12

Fix incorrect format placeholder

Tidy-up some appendStringInfo*() usages

Make a few newish calls to appendStringInfo() which have no special
formatting use appendStringInfoString() instead. Also, adjust usages of
appendStringInfoString() which only append a string containing a single
character to make use of appendStringInfoChar() instead.

This makes the code marginally faster, but primarily this change is so
we use the StringInfo type as it was intended to be used.

Discussion: https://postgr.es/m/CAApHDvpXKQmL+r=VDNS98upqhr9yGBhv2Jw3GBFFk_wKHcB39A@mail.gmail.com

Fail hard on out-of-memory failures in xlogreader.c

This commit changes the WAL reader routines so as a FATAL for the
backend or exit(FAILURE) for the frontend is triggered if an allocation
for a WAL record decode fails in walreader.c, rather than treating this
case as bogus data, which would be equivalent to the end of WAL.  The
key is to avoid palloc_extended(MCXT_ALLOC_NO_OOM) in walreader.c,
relying on plain palloc() calls.

The previous behavior could make WAL replay finish too early than it
should.  For example, crash recovery finishing earlier may corrupt
clusters because not all the WAL available locally was replayed to
ensure a consistent state.  Out-of-memory failures would show up
randomly depending on the memory pressure on the host, but one simple
case would be to generate a large record, then replay this record after
downsizing a host, as Ethan Mertz originally reported.

This relies on bae868caf222, as the WAL reader routines now do the
memory allocation required for a record only once its header has been
fully read and validated, making xl_tot_len trustable.  Making the WAL
reader react differently on out-of-memory or bogus record data would
require ABI changes, so this is the safest choice for stable branches.
Also, it is worth noting that 3f1ce973467a has been using a plain
palloc() in this code for some time now.

Thanks to Noah Misch and Thomas Munro for the discussion.

Like the other commit, backpatch down to 12, leaving out v11 that will
be EOL'd soon.  The behavior of considering a failed allocation as bogus
data comes originally from 0ffe11abd3a0, where the record length
retrieved from its header was not entirely trustable.

Reported-by: Ethan Mertz
Discussion: https://postgr.es/m/ZRKKdI5-RRlta3aF@paquier.xyz
Backpatch-through: 12

Replace use of stat()[7] by -s switch in TAP tests to retrieve file size

The list form of stat() is an inelegant API as it relies on the position
of the file size in the list returned in result. Like in any other
places of the tree, replace that with a -s switch instead.

Another suggestion from Dagfinn is File::Stat, which we've been already
using for some other fields. It really comes down to a matter of taste
to choose that over -s, and the latter is more used in the tree.

Author: Bertrand Drouvot
Reviewed-by: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/b2020df7-d0fc-4ea5-b2a9-7efc6d36b2ac@gmail.com

Fix omission of column-level privileges in selective pg_restore.

In a selective restore, ACLs for a table should be dumped if the
table is selected to be dumped.  However, if the table has both
table-level and column-level ACLs, only the table-level ACL was
restored.  This happened because _tocEntryRequired assumed that
an ACL could have only one dependency (the one on its table),
and punted if there was more than one.  But since commit ea9125304,
column-level ACLs also depend on the table-level ACL if any, to
ensure correct ordering in parallel restores.  To fix, adjust the
logic in _tocEntryRequired to ignore dependencies on ACLs.

I extended a test case in 002_pg_dump.pl so that it purports to
test for this; but in fact the test passes even without the fix.
That's because this bug only manifests during a selective restore,
while the scenarios 002_pg_dump.pl tests include only selective dumps.
Perhaps somebody would like to extend the script so that it can test
scenarios including selective restore, but I'm not touching that.

Euler Taveira and Tom Lane, per report from Kong Man.
Back-patch to all supported branches.

Discussion: https://postgr.es/m/DM4PR11MB73976902DBBA10B1D652F9498B06A@DM4PR11MB7397.namprd11.prod.outlook.com

Remove retry loop in heap_page_prune().

The retry loop is needed because heap_page_prune() calls
HeapTupleSatisfiesVacuum() and then lazy_scan_prune() does the same
thing again, and they might get different answers due to concurrent
clog updates. But this patch makes heap_page_prune() return the
HeapTupleSatisfiesVacuum() results that it computed back to the
caller, which allows lazy_scan_prune() to avoid needing to recompute
those values in the first place. That's nice both because it eliminates
the need for a retry loop and also because it's cheaper.

Melanie Plageman, reviewed by David Geier, Andres Freund, and me.

Discussion: https://postgr.es/m/CAAKRu_br124qsGJieuYA0nGjywEukhK1dKBfRdby_4yY3E9SXA%40mail.gmail.com

Flush WAL stats in bgwriter

bgwriter can write out WAL, but did not flush the WAL pgstat counters,
so the writes were not seen in pg_stat_wal.

Back-patch to v14, where pg_stat_wal was introduced.

Author: Nazir Bilal Yavuz
Reviewed-by: Matthias van de Meent, Kyotaro Horiguchi
Discussion: https://www.postgresql.org/message-id/CAN55FZ2FPYngovZstr%3D3w1KSEHe6toiZwrurbhspfkXe5UDocg%40mail.gmail.com