Amit Kapila [Thu, 24 Jun 2021 03:43:46 +0000 (09:13 +0530)]
Doc: Update caveats in synchronous logical replication.
Reported-by: Simon Riggs
Author: Takamichi Osumi
Reviewed-by: Amit Kapila
Backpatch-through: 9.6
Discussion: https://www.postgresql.org/message-id/
20210222222847.tpnb6eg3yiykzpky@alap3.anarazel.de
Tom Lane [Wed, 23 Jun 2021 22:41:39 +0000 (18:41 -0400)]
Allow non-quoted identifiers as isolation test session/step names.
For no obvious reason, isolationtester has always insisted that
session and step names be written with double quotes. This is
fairly tedious and does little for test readability, especially
since the names that people actually choose almost always look
like normal identifiers. Hence, let's tweak the lexer to allow
SQL-like identifiers not only double-quoted strings.
(They're SQL-like, not exactly SQL, because I didn't add any
case-folding logic. Also there's no provision for U&"..." names,
not that anyone's likely to care.)
There is one incompatibility introduced by this change: if you write
"foo""bar" with no space, that used to be taken as two identifiers,
but now it's just one identifier with an embedded quote mark.
I converted all the src/test/isolation/ specfiles to remove
unnecessary double quotes, but stopped there because my
eyes were glazing over already.
Like
741d7f104, back-patch to all supported branches, so that this
isn't a stumbling block for back-patching isolation test changes.
Discussion: https://postgr.es/m/759113.
1623861959@sss.pgh.pa.us
Tom Lane [Wed, 23 Jun 2021 18:27:13 +0000 (14:27 -0400)]
Doc: fix confusion about LEAKPROOF in syntax summaries.
The syntax summaries for CREATE FUNCTION and allied commands
made it look like LEAKPROOF is an alternative to
IMMUTABLE/STABLE/VOLATILE, when of course it is an orthogonal
option. Improve that.
Per gripe from aazamrafeeque0. Thanks to David Johnston for
suggestions.
Discussion: https://postgr.es/m/
162444349581.694.
5818572718530259025@wrigleys.postgresql.org
Tom Lane [Wed, 23 Jun 2021 18:01:32 +0000 (14:01 -0400)]
Don't assume GSSAPI result strings are null-terminated.
Our uses of gss_display_status() and gss_display_name() assumed
that the gss_buffer_desc strings returned by those functions are
null-terminated. It appears that they generally are, given the
lack of field complaints up to now. However, the available
documentation does not promise this, and some man pages
for gss_display_status() show examples that rely on the
gss_buffer_desc.length field instead of expecting null
termination. Also, we now have a report that on some
implementations, clang's address sanitizer is of the opinion
that the byte after the specified length is undefined.
Hence, change the code to rely on the length field instead.
This might well be cosmetic rather than fixing any real bug, but
it's hard to be sure, so back-patch to all supported branches.
While here, also back-patch the v12 changes that made pg_GSS_error
deal honestly with multiple messages available from
gss_display_status.
Per report from Sudheer H R.
Discussion: https://postgr.es/m/
5372B6D4-8276-42C0-B8FB-
BD0918826FC3@tekenlight.com
Tom Lane [Wed, 23 Jun 2021 15:12:31 +0000 (11:12 -0400)]
Improve display of query results in isolation tests.
Previously, isolationtester displayed SQL query results using some
ad-hoc code that clearly hadn't had much effort expended on it.
Field values longer than 14 characters weren't separated from
the next field, and usually caused misalignment of the columns
too. Also there was no visual separation of a query's result
from subsequent isolationtester output. This made test result
files confusing and hard to read.
To improve matters, let's use libpq's PQprint() function. Although
that's long since unused by psql, it's still plenty good enough
for the purpose here.
Like
741d7f104, back-patch to all supported branches, so that this
isn't a stumbling block for back-patching isolation test changes.
Discussion: https://postgr.es/m/582362.
1623798221@sss.pgh.pa.us
Alvaro Herrera [Wed, 23 Jun 2021 13:53:18 +0000 (09:53 -0400)]
Add test case for obsoleting slot with active walsender, take 2
The code to signal a running walsender when its reserved WAL size grows
too large is completely uncovered before this commit; this adds coverage
for that case.
This test involves sending SIGSTOP to walsender and walreceiver, then
advancing enough WAL for a checkpoint to trigger, then sending SIGCONT.
There's no precedent for STOP signalling in Perl tests, and my reading
of relevant manpages says it's likely to fail on Windows. Because of
this, this test is always skipped on that platform.
This version fixes a couple of rarely hit race conditions in the
previous attempt
09126984a263; most notably, both LOG string searches
are loops, not just the second one; we acquire the start-of-log position
before STOP-signalling; and reference the correct process name in the
test description. All per Tom Lane.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/
202106102202.mjw4huiix7lo@alvherre.pgsql
Tom Lane [Wed, 23 Jun 2021 01:43:12 +0000 (21:43 -0400)]
Use annotations to reduce instability of isolation-test results.
We've long contended with isolation test results that aren't entirely
stable. Some test scripts insert long delays to try to force stable
results, which is not terribly desirable; but other erratic failure
modes remain, causing unrepeatable buildfarm failures. I've spent a
fair amount of time trying to solve this by improving the server-side
support code, without much success: that way is fundamentally unable
to cope with diffs that stem from chance ordering of arrival of
messages from different server processes.
We can improve matters on the client side, however, by annotating
the test scripts themselves to show the desired reporting order
of events that might occur in different orders. This patch adds
three types of annotations to deal with (a) test steps that might or
might not complete their waits before the isolationtester can see them
waiting; (b) test steps in different sessions that can legitimately
complete in either order; and (c) NOTIFY messages that might arrive
before or after the completion of a step in another session. We might
need more annotation types later, but this seems to be enough to deal
with the instabilities we've seen in the buildfarm. It also lets us
get rid of all the long delays that were previously used, cutting more
than a minute off the runtime of the isolation tests.
Back-patch to all supported branches, because the buildfarm
instabilities affect all the branches, and because it seems desirable
to keep isolationtester's capabilities the same across all branches
to simplify possible future back-patching of tests.
Discussion: https://postgr.es/m/327948.
1623725828@sss.pgh.pa.us
Tom Lane [Tue, 22 Jun 2021 21:48:39 +0000 (17:48 -0400)]
Restore the portal-level snapshot for simple expressions, too.
Commits
84f5c2908 et al missed the need to cover plpgsql's "simple
expression" code path. If the first thing we execute after a
COMMIT/ROLLBACK is one of those, rather than a full-fledged SPI command,
we must explicitly do EnsurePortalSnapshotExists() to make sure we have
an outer snapshot. Note that it wouldn't be good enough to just push a
snapshot for the duration of the expression execution: what comes back
might be toasted, so we'd better have a snapshot protecting it.
The test case demonstrating this fact cheats a bit by marking a SQL
function immutable even though it fetches from a table. That's
nothing that users haven't been seen to do, though.
Per report from Jim Nasby. Back-patch to v11, like the previous fix.
Discussion: https://postgr.es/m/
378885e4-f85f-fc28-6c91-
c4d1c080bf26@amazon.com
Peter Geoghegan [Tue, 22 Jun 2021 16:06:32 +0000 (09:06 -0700)]
Add list of ignorable pgindent commits for git-blame.
Add a .git-blame-ignore-revs file with a list of pgindent, pgperlyidy,
and reformat-dat-files commit hashes. Postgres hackers that configure
git to use the ignore file will get git-blame output that avoids
attributing line changes to the ignored indent commits. This makes
git-blame output much easier to work with in practice.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAH2-Wz=cVh3GHTP6SdLU-Gnmt2zRdF8vZkcrFdSzXQ=WhbWm9Q@mail.gmail.com
Joe Conway [Mon, 21 Jun 2021 21:07:55 +0000 (17:07 -0400)]
Stamp 14beta2.
Andres Freund [Mon, 21 Jun 2021 12:13:46 +0000 (05:13 -0700)]
Use correct horizon when vacuuming catalog relations.
In
dc7420c2c92 I (Andres) accidentally used
RelationIsAccessibleInLogicalDecoding() as the sole condition to use the
non-shared catalog horizon in GetOldestNonRemovableTransactionId(). That is
incorrect, as RelationIsAccessibleInLogicalDecoding() checks whether wal_level
is logical.
The correct check, as done e.g. in GlobalVisTestFor(), is to check
IsCatalogRelation() and RelationIsAccessibleInLogicalDecoding().
The observed misbehavior of this bug was that there could be an endless loop
in lazy_scan_prune(), because the horizons used in heap_page_prune() and the
individual tuple liveliness checks did not match. Likely there are other
potential consequences as well.
A later commit will unify the determination which horizon has to be used, and
add additional assertions to make it easier to catch a bug like this.
Reported-By: Justin Pryzby <pryzby@telsasoft.com>
Diagnosed-By: Matthias van de Meent <boekewurm+postgres@gmail.com>
Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAEze2Wg32Y9+WJfw=aofkRx1ZRFt_Ev6bNPc4PSaz7PjSFtZgQ@mail.gmail.com
David Rowley [Mon, 21 Jun 2021 11:11:23 +0000 (23:11 +1200)]
Fix assert failure in expand_grouping_sets
linitial_node() fails in assert enabled builds if the given pointer is
not of the specified type. Here the type is IntList. The code thought
it should be expecting List, but it was wrong.
In the existing tests which run this code the initial list element is
always NIL. Since linitial_node() allows NULL, we didn't trigger any
assert failures in the existing regression tests.
There is still some discussion as to whether we need a few more tests in
this area, but for now, since beta2 is looming, fix the bug first.
Bug: #17067
Discussion: https://postgr.es/m/17067-
665d50fa321f79e0@postgresql.org
Reported-by: Yaoguang Chen
Peter Eisentraut [Mon, 21 Jun 2021 10:32:14 +0000 (12:32 +0200)]
Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash:
70796ae860c444c764bb591c885f22cac1c168ec
Noah Misch [Mon, 21 Jun 2021 09:48:11 +0000 (02:48 -0700)]
Finish rename of PQtraceSetFlags() to PQsetTraceFlags().
Jie Zhang
Discussion: https://postgr.es/m/TYWPR01MB767844835390EDD8DB276D75F90A9@TYWPR01MB7678.jpnprd01.prod.outlook.com
Peter Eisentraut [Mon, 21 Jun 2021 09:17:49 +0000 (11:17 +0200)]
amcheck: Fix code comments
Code comments were claiming that verify_heapam() was checking
privileges on the relation it was operating on, but it didn't actually
do that. Perhaps earlier versions of the patch did that, but now the
access is regulated by privileges on the function. Remove the wrong
comments.
Bruce Momjian [Mon, 21 Jun 2021 05:09:32 +0000 (01:09 -0400)]
doc: adjust PG 14 relnotes to be current
Bruce Momjian [Mon, 21 Jun 2021 03:53:00 +0000 (23:53 -0400)]
doc: add mention of +4GB windows file handling in PG14 relnotes
Reported-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoCTHyouoGv-xt1qNjjvPbGMErLi0AJncByTvr66Nq7j8g@mail.gmail.com
Peter Geoghegan [Mon, 21 Jun 2021 01:14:00 +0000 (18:14 -0700)]
Remove overzealous VACUUM failsafe assertions.
The failsafe can trigger when index processing is already disabled.
This can happen when VACUUM's INDEX_CLEANUP parameter is "off" and the
failsafe happens to trigger. Remove assertions that assume that index
processing is directly tied to the failsafe.
Oversight in commit
c242baa4, which made it possible for the failsafe to
trigger in a two-pass strategy VACUUM that has yet to make its first
call to lazy_vacuum_all_indexes().
Alvaro Herrera [Sun, 20 Jun 2021 16:28:08 +0000 (12:28 -0400)]
Revert "Add test case for obsoleting slot with active walsender"
This reverts commit
09126984a263; the test case added there failed once
in circumstances that remain mysterious. It seems better to remove the
test for now so that 14beta2 doesn't have random failures built in.
Tom Lane [Sun, 20 Jun 2021 15:48:44 +0000 (11:48 -0400)]
Stabilize test case added by commit
f61db909d.
Buildfarm members ayu and tern have sometimes shown a different
plan than expected for this query. I'd been unable to reproduce
that before today, but I finally realized what is happening.
If there is a concurrent open transaction (probably an autovacuum
run in the buildfarm, but this can also be arranged manually),
then the index entries for the rows removed by the DELETE a few
lines up are not killed promptly, causing a change in the planner's
estimate of the extremal value of ft2.c1, which moves the rowcount
estimate for "c1 > 1100" by enough to change the join plan from
nestloop to hash.
To fix, change the query condition to "c1 > 1000", causing the
hash plan to be preferred whether or not a concurrent open
transaction exists. Since this UPDATE is tailored to be a no-op,
nothing else changes.
Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=ayu&dt=2021-06-09%2022%3A45%3A48
Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=ayu&dt=2021-06-13%2022%3A38%3A18
Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2021-06-20%2004%3A55%3A36
Tom Lane [Sat, 19 Jun 2021 15:44:39 +0000 (11:44 -0400)]
Provide feature-test macros for libpq features added in v14.
We had a request to provide a way to test at compile time for the
availability of the new pipeline features. More generally, it
seems like a good idea to provide a way to test via #ifdef for
all new libpq API features. People have been using the version
from pg_config.h for that; but that's more likely to represent the
server version than the libpq version, in the increasingly-common
scenario where they're different. It's safer if libpq-fe.h itself
is the source of truth about what features it offers.
Hence, establish a policy that starting in v14 we'll add a suitable
feature-is-present macro to libpq-fe.h when we add new API there.
(There doesn't seem to be much point in applying this policy
retroactively, but it's not too late for v14.)
Tom Lane and Alvaro Herrera, per suggestion from Boris Kolpackov.
Discussion: https://postgr.es/m/boris.
20210617102439@codesynthesis.com
Amit Kapila [Sat, 19 Jun 2021 06:00:33 +0000 (11:30 +0530)]
Handle no replica identity index case in RelationGetIdentityKeyBitmap.
Commit
e7eea52b2d has introduced a new function
RelationGetIdentityKeyBitmap which omits to handle the case where there is
no replica identity index on a relation.
Author: Mark Dilger
Reviewed-by: Takamichi Osumi, Amit Kapila
Discussion: https://www.postgresql.org/message-id/
4C99A862-69C8-431F-960A-
81B1151F1B89@enterprisedb.com
Peter Geoghegan [Sat, 19 Jun 2021 03:04:07 +0000 (20:04 -0700)]
Support disabling index bypassing by VACUUM.
Generalize the INDEX_CLEANUP VACUUM parameter (and the corresponding
reloption): make it into a ternary style boolean parameter. It now
exposes a third option, "auto". The "auto" option (which is now the
default) enables the "bypass index vacuuming" optimization added by
commit
1e55e7d1.
"VACUUM (INDEX_CLEANUP TRUE)" is redefined to once again make VACUUM
simply do any required index vacuuming, regardless of how few dead
tuples are encountered during the first scan of the target heap relation
(unless there are exactly zero). This gives users a way of opting out
of the "bypass index vacuuming" optimization, if for whatever reason
that proves necessary. It is also expected to be used by PostgreSQL
developers as a testing option from time to time.
"VACUUM (INDEX_CLEANUP FALSE)" does the same thing as it always has: it
forcibly disables both index vacuuming and index cleanup. It's not
expected to be used much in PostgreSQL 14. The failsafe mechanism added
by commit
1e55e7d1 addresses the same problem in a simpler way.
INDEX_CLEANUP can now be thought of as a testing and compatibility
option.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-By: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/CAH2-WznrBoCST4_Gxh_G9hA8NzGUbeBGnOUC8FcXcrhqsv6OHQ@mail.gmail.com
Alvaro Herrera [Fri, 18 Jun 2021 22:42:00 +0000 (18:42 -0400)]
Add test case for obsoleting slot with active walsender
The code to signal a running walsender when its reserved WAL size grows
too large is completely uncovered before this commit; this adds coverage
for that case.
This test involves sending SIGSTOP to walsender and walreceiver and
running a checkpoint while advancing WAL, then sending SIGCONT. There's
no precedent for this coding in Perl tests, and my reading of relevant
manpages says it's likely to fail on Windows. Because of this, this
test is always skipped on that platform.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/
202106102202.mjw4huiix7lo@alvherre.pgsql
Tom Lane [Fri, 18 Jun 2021 22:00:09 +0000 (18:00 -0400)]
Fix misbehavior of DROP OWNED BY with duplicate polroles entries.
Ordinarily, a pg_policy.polroles array wouldn't list the same role
more than once; but CREATE POLICY does not prevent that. If we
perform DROP OWNED BY on a role that is listed more than once,
RemoveRoleFromObjectPolicy either suffered an assertion failure
or encountered a tuple-updated-by-self error. Rewrite it to cope
correctly with duplicate entries, and add a CommandCounterIncrement
call to prevent the other problem.
Per discussion, there's other cleanup that ought to happen here,
but this seems like the minimum essential fix.
Per bug #17062 from Alexander Lakhin. It's been broken all along,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/17062-
11f471ae3199ca23@postgresql.org
Tom Lane [Fri, 18 Jun 2021 21:05:23 +0000 (17:05 -0400)]
Improve version reporting in pgbench.
Commit
547f04e73 caused pgbench to start printing its version number,
which seems like a fine idea, but it needs a bit more work:
* Print the server version number too, when different.
* Print the PG_VERSION string, not some reconstructed approximation.
This patch copies psql's well-tested code for the same purpose.
Discussion: https://postgr.es/m/
1226654.
1624036821@sss.pgh.pa.us
Tom Lane [Fri, 18 Jun 2021 15:22:58 +0000 (11:22 -0400)]
Centralize the logic for protective copying of utility statements.
In the "simple Query" code path, it's fine for parse analysis or
execution of a utility statement to scribble on the statement's node
tree, since that'll just be thrown away afterwards. However it's
not fine if the node tree is in the plan cache, as then it'd be
corrupted for subsequent executions. Up to now we've dealt with
that by having individual utility-statement functions apply
copyObject() if they were going to modify the tree. But that's
prone to errors of omission. Bug #17053 from Charles Samborski
shows that CREATE/ALTER DOMAIN didn't get this memo, and can
crash if executed repeatedly from plan cache.
In the back branches, we'll just apply a narrow band-aid for that,
but in HEAD it seems prudent to have a more principled fix that
will close off the possibility of other similar bugs in future.
Hence, let's hoist the responsibility for doing copyObject up into
ProcessUtility from its children, thus ensuring that it happens for
all utility statement types.
Also, modify ProcessUtility's API so that its callers can tell it
whether a copy step is necessary. It turns out that in all cases,
the immediate caller knows whether the node tree is transient, so
this doesn't involve a huge amount of code thrashing. In this way,
while we lose a little bit in the execute-from-cache code path due
to sometimes copying node trees that wouldn't be mutated anyway,
we gain something in the simple-Query code path by not copying
throwaway node trees. Statements that are complex enough to be
expensive to copy are almost certainly ones that would have to be
copied anyway, so the loss in the cache code path shouldn't be much.
(Note that this whole problem applies only to utility statements.
Optimizable statements don't have the issue because we long ago made
the executor treat Plan trees as read-only. Perhaps someday we will
make utility statement execution act likewise, but I'm not holding
my breath.)
Discussion: https://postgr.es/m/931771.
1623893989@sss.pgh.pa.us
Discussion: https://postgr.es/m/17053-
3ca3f501bbc212b4@postgresql.org
Andrew Dunstan [Fri, 18 Jun 2021 10:51:12 +0000 (06:51 -0400)]
Don't set a fast default for anything but a plain table
The fast default code added in Release 11 omitted to check that the
table a fast default was being added to was a plain table. Thus one
could be added to a foreign table, which predicably blows up. Here we
perform that check.
In addition, on the back branches, since some of these might have
escaped into the wild, if we encounter a missing value for
an attribute of something other than a plain table we ignore it.
Fixes bug #17056
Backpatch to release 11,
Reviewed by: Andres Freund, Álvaro Herrera and Tom Lane
Fujii Masao [Fri, 18 Jun 2021 08:57:09 +0000 (17:57 +0900)]
Make archiver process handle barrier events.
Commit
d75288fb27 made WAL archiver process an auxiliary process.
An auxiliary process needs to handle barrier events but the commit
forgot to make archiver process do that.
Reported-by: Thomas Munro
Author: Fujii Masao
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/CA+hUKGLah2w1pWKHonZP_+EQw69=q56AHYwCgEN8GDzsRG_Hgw@mail.gmail.com
Michael Paquier [Fri, 18 Jun 2021 05:22:31 +0000 (14:22 +0900)]
doc: Apply markup <productname> to OpenSSL more consistently
Author: Daniel Gustafsson
Discussion: https://postgr.es/m/
CE12DD5C-4BB3-4166-BC9A-
39779568734C@yesql.se
Heikki Linnakangas [Thu, 17 Jun 2021 11:50:42 +0000 (14:50 +0300)]
Tidy up GetMultiXactIdMembers()'s behavior on error
One of the error paths left *members uninitialized. That's not a live
bug, because most callers don't look at *members when the function
returns -1, but let's be tidy. One caller, in heap_lock_tuple(), does
"if (members != NULL) pfree(members)", but AFAICS it never passes an
invalid 'multi' value so it should not reach that error case.
The callers are also a bit inconsistent in their expectations.
heap_lock_tuple() pfrees the 'members' array if it's not-NULL, others
pfree() it if "nmembers >= 0", and others if "nmembers > 0". That's
not a live bug either, because the function should never return 0, but
add an Assert for that to make it more clear. I left the callers alone
for now.
I also moved the line where we set *nmembers. It wasn't wrong before,
but I like to do that right next to the 'return' statement, to make it
clear that it's always set on return.
Also remove one unreachable return statement after ereport(ERROR), for
brevity and for consistency with the similar if-block right after it.
Author: Greg Nancarrow with the additional changes by me
Backpatch-through: 9.6, all supported versions
Amit Kapila [Thu, 17 Jun 2021 04:26:05 +0000 (09:56 +0530)]
Document a few caveats in synchronous logical replication.
In a synchronous logical setup, locking [user] catalog tables can cause
deadlock. This is because logical decoding of transactions can lock
catalog tables to access them so exclusively locking those in transactions
can lead to deadlock. To avoid this users must refrain from having
exclusive locks on catalog tables.
Author: Takamichi Osumi
Reviewed-by: Vignesh C, Amit Kapila
Backpatch-through: 9.6
Discussion: https://www.postgresql.org/message-id/
20210222222847.tpnb6eg3yiykzpky%40alap3.anarazel.de
Tom Lane [Wed, 16 Jun 2021 23:30:17 +0000 (19:30 -0400)]
Fix plancache refcount leak after error in ExecuteQuery.
When stuffing a plan from the plancache into a Portal, one is
not supposed to risk throwing an error between GetCachedPlan and
PortalDefineQuery; if that happens, the plan refcount incremented
by GetCachedPlan will be leaked. I managed to break this rule
while refactoring code in
9dbf2b7d7. There is no visible
consequence other than some memory leakage, and since nobody is
very likely to trigger the relevant error conditions many times
in a row, it's not surprising we haven't noticed. Nonetheless,
it's a bug, so rearrange the order of operations to remove the
hazard.
Noted on the way to looking for a better fix for bug #17053.
This mistake is pretty old, so back-patch to all supported
branches.
Tomas Vondra [Wed, 16 Jun 2021 20:53:31 +0000 (22:53 +0200)]
Fix copying data into slots with FDW batching
Commit
b676ac443b optimized handling of tuple slots with bulk inserts
into foreign tables, so that the slots are initialized only once and
reused for all batches. The data was however copied into the slots only
after the initialization, inserting duplicate values when the slot gets
reused. Fixed by moving the ExecCopySlot outside the init branch.
The existing postgres_fdw tests failed to catch this due to inserting
data into foreign tables without unique indexes, and then checking only
the number of inserted rows. This adds a new test with both a unique
index and a check of inserted values.
Reported-by: Alexander Pyhalov
Discussion: https://postgr.es/m/
7a8cf8d56b3d18e5c0bccd6cd42d04ac%40postgrespro.ru
Tom Lane [Wed, 16 Jun 2021 15:52:05 +0000 (11:52 -0400)]
Improve SQLSTATE reporting in some replication-related code.
I started out with the goal of reporting ERRCODE_CONNECTION_FAILURE
when walrcv_connect() fails, but as I looked around I realized that
whoever wrote this code was of the opinion that errcodes are purely
optional. That's not my understanding of our project policy. Hence,
make sure that an errcode is provided in each ereport that (a) is
ERROR or higher level and (b) isn't arguably an internal logic error.
Also fix some very dubious existing errcode assignments.
While this is not per policy, it's also largely cosmetic, since few
of these cases could get reported to applications. So I don't
feel a need to back-patch.
Discussion: https://postgr.es/m/
2189704.
1623512522@sss.pgh.pa.us
Heikki Linnakangas [Wed, 16 Jun 2021 09:34:32 +0000 (12:34 +0300)]
Fix outdated comment that talked about seek position of WAL file.
Since commit
c24dcd0cfd, we have been using pg_pread() to read the WAL
file, which doesn't change the seek position (unless we fall back to
the implementation in src/port/pread.c). Update comment accordingly.
Backpatch-through: 12, where we started to use pg_pread()
Tom Lane [Tue, 15 Jun 2021 20:11:45 +0000 (16:11 -0400)]
Update another variant expected-result file.
This should have been updated in
533e9c6b0, but it was overlooked.
Given the lack of complaints, I won't bother back-patching.
Tom Lane [Tue, 15 Jun 2021 20:09:14 +0000 (16:09 -0400)]
Remove another orphan expected-result file.
aborted-keyrevoke_2.out was apparently needed when it was added (in
commit
0ac5ad513) to handle the case of serializable transaction mode.
However, the output in serializable mode actually matches the regular
aborted-keyrevoke.out file, and AFAICT has done so for a long time.
There's no need to keep dragging this variant along.
Andrew Dunstan [Tue, 15 Jun 2021 19:30:11 +0000 (15:30 -0400)]
Further refinement of stuck_on_old_timeline recovery test
TestLib::perl2host can take a file argument as well as a directory
argument, so that code becomes substantially simpler. Also add comments
on why we're using forward slashes, and why we're setting
PERL_BADLANG=0.
Discussion: https://postgr.es/m/
e9947bcd-20ee-027c-f0fe-
01f736b7e345@dunslane.net
Alexander Korotkov [Tue, 15 Jun 2021 18:43:17 +0000 (21:43 +0300)]
Revert
29854ee8d1 due to buildfarm failures
Reported-by: Tom Lane
Discussion: https://postgr.es/m/CAPpHfdvcnw3x7jdV3r52p4%3D5S4WUxBCzcQKB3JukQHoicv1LSQ%40mail.gmail.com
Peter Geoghegan [Tue, 15 Jun 2021 15:59:36 +0000 (08:59 -0700)]
Remove unneeded field from VACUUM state.
Bugfix commit
5fc89376 effectively made the lock_waiter_detected field
from vacuumlazy.c's global state struct into private state owned by
lazy_truncate_heap(). Finish this off by replacing the struct field
with a local variable.
Alexander Korotkov [Tue, 15 Jun 2021 13:06:32 +0000 (16:06 +0300)]
Add missing type name "multirange" in docs chapter title
Discussion: https://postgr.es/m/CAFj8pRDioOxiJgmgw9TqQqZ3CxnJC4P5B2Oospf2eMgAjJuewA%40mail.gmail.com
Author: Pavel Stehule, Alexander Korotkov
Reviewed-by: Justin Pryzby, Tom Lane
Alexander Korotkov [Tue, 15 Jun 2021 12:59:20 +0000 (15:59 +0300)]
Support for unnest(multirange) and cast multirange as an array of ranges
It has been spotted that multiranges lack of ability to decompose them into
individual ranges. Subscription and proper expanded object representation
require substantial work, and it's too late for v14. This commit
provides the implementation of unnest(multirange) and cast multirange as
an array of ranges, which is quite trivial.
unnest(multirange) is defined as a polymorphic procedure. The catalog
description of the cast underlying procedure is duplicated for each multirange
type because we don't have anyrangearray polymorphic type to use here.
Catversion is bumped.
Reported-by: Jonathan S. Katz
Discussion: https://postgr.es/m/flat/
60258efe-bd7e-4886-82e1-
196e0cac5433%40postgresql.org
Author: Alexander Korotkov
Reviewed-by: Justin Pryzby, Jonathan S. Katz, Zhihong Yu
Amit Kapila [Tue, 15 Jun 2021 02:58:36 +0000 (08:28 +0530)]
Fix decoding of speculative aborts.
During decoding for speculative inserts, we were relying for cleaning
toast hash on confirmation records or next change records. But that
could lead to multiple problems (a) memory leak if there is neither a
confirmation record nor any other record after toast insertion for a
speculative insert in the transaction, (b) error and assertion failures
if the next operation is not an insert/update on the same table.
The fix is to start queuing spec abort change and clean up toast hash
and change record during its processing. Currently, we are queuing the
spec aborts for both toast and main table even though we perform cleanup
while processing the main table's spec abort record. Later, if we have a
way to distinguish between the spec abort record of toast and the main
table, we can avoid queuing the change for spec aborts of toast tables.
Reported-by: Ashutosh Bapat
Author: Dilip Kumar
Reviewed-by: Amit Kapila
Backpatch-through: 9.6, where it was introduced
Discussion: https://postgr.es/m/CAExHW5sPKF-Oovx_qZe4p5oM6Dvof7_P+XgsNAViug15Fm99jA@mail.gmail.com
Tom Lane [Tue, 15 Jun 2021 01:58:26 +0000 (21:58 -0400)]
Update variant expected-result file.
This should have been updated in
d2d8a229b, but it was overlooked.
According to
31a877f18 which added it, this file is meant to show the
results you get under default_transaction_isolation = serializable.
We've largely lost track of that goal in other isolation tests, but
as long as we've got this one, it should be right.
Noted while fooling about with the isolationtester.
Tom Lane [Tue, 15 Jun 2021 01:28:21 +0000 (21:28 -0400)]
Remove orphaned expected-result file.
This should have been removed in
43e084197, which removed the
corresponding spec file. Noted while fooling about with the
isolationtester.
Noah Misch [Tue, 15 Jun 2021 00:29:37 +0000 (17:29 -0700)]
Remove pg_wait_for_backend_termination().
It was unable to wait on a backend that had already left the procarray.
Users tolerant of that limitation can poll pg_stat_activity. Other
users can employ the "timeout" argument of pg_terminate_backend().
Reviewed by Bharath Rupireddy.
Discussion: https://postgr.es/m/
20210605013236.GA208701@rfd.leadboat.com
Noah Misch [Tue, 15 Jun 2021 00:29:37 +0000 (17:29 -0700)]
Copy-edit text for the pg_terminate_backend() "timeout" parameter.
Revert the pg_description entry to its v13 form, since those messages
usually remain shorter and don't discuss individual parameters. No
catversion bump, since pg_description content does not impair backend
compatibility or application compatibility.
Justin Pryzby
Discussion: https://postgr.es/m/
20210612182743.GY16435@telsasoft.com
Alvaro Herrera [Mon, 14 Jun 2021 20:31:12 +0000 (16:31 -0400)]
Fix logic bug in
1632ea43682f
I overlooked that one condition was logically inverted. The fix is a
little bit more involved than simply negating the condition, to make
the code easier to read.
Fix some outdated comments left by the same commit, while at it.
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/YMRlmB3/lZw8YBH+@paquier.xyz
Bruce Momjian [Mon, 14 Jun 2021 20:14:04 +0000 (16:14 -0400)]
doc: PG 14 relnotes fixes
Items related to logical replication attribution and BRIN indexes
Reported-by: Tomas Vondra, John Naylor
Discussion: https://postgr.es/m/
0db66294-a668-2caa-2b5e-
a8db60b30662@enterprisedb.com, CAFBsxsH21KnteYdk33F1oZu2O726NSD6_XBq51Tn0jytsA1AnA@mail.gmail.com
Bruce Momjian [Mon, 14 Jun 2021 20:03:18 +0000 (16:03 -0400)]
doc: PG 14 relnote updates
Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/
20210612034551.GU16435@telsasoft.com
Bruce Momjian [Mon, 14 Jun 2021 16:49:05 +0000 (12:49 -0400)]
doc: add PG 14 relnote item about array function references
User-defined objects that reference some built-in array functions will
need to be recreated in PG 14.
Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/
20210608225618.GR16435@telsasoft.com
Michael Paquier [Mon, 14 Jun 2021 05:57:22 +0000 (14:57 +0900)]
Improve handling of dropped objects in pg_event_trigger_ddl_commands()
An object found as dropped when digging into the list of objects
returned by pg_event_trigger_ddl_commands() could cause a cache lookup
error, as the calls grabbing for the object address and the type name
would fail if the object was missing.
Those lookup errors could be seen with combinations of ALTER TABLE
sub-commands involving identity columns. The lookup logic is changed in
this code path to get a behavior similar to any other SQL-callable
function by ignoring objects that are not found, taking advantage of
2a10fdc. The back-branches are not changed, as they require this commit
that is too invasive for stable branches.
While on it, add test cases to exercise event triggers with identity
columns, and stress more cases with the event ddl_command_end for
relations.
Author: Sven Klemm, Aleksander Alekseev, Michael Paquier
Discussion: https://postgr.es/m/CAMCrgp2R1cEXU53iYKtW6yVEp2_yKUz+z=3-CTrYpPP+xryRtg@mail.gmail.com
Michael Paquier [Mon, 14 Jun 2021 00:25:50 +0000 (09:25 +0900)]
Remove forced toast recompression in VACUUM FULL/CLUSTER
The extra checks added by the recompression of toast data introduced in
bbe0a81 is proving to have a performance impact on VACUUM or CLUSTER
even if no recompression is done. This is more noticeable with more
toastable columns that contain non-NULL values.
Improvements could be done to make those extra checks less expensive,
but that's not material for 14 at this stage, and we are not sure either
if the code path of VACUUM FULL/CLUSTER is adapted for this job.
Per discussion with several people, including Andres Freund, Robert
Haas, Álvaro Herrera, Tom Lane and myself.
Discussion: https://postgr.es/m/
20210527003144.xxqppojoiwurc2iz@alap3.anarazel.de
Tom Lane [Sun, 13 Jun 2021 18:32:42 +0000 (14:32 -0400)]
Work around portability issue with newer versions of mktime().
Recent glibc versions have made mktime() fail if tm_isdst is
inconsistent with the prevailing timezone; in particular it fails for
tm_isdst = 1 when the zone is UTC. (This seems wildly inconsistent
with the POSIX-mandated treatment of "incorrect" values for the other
fields of struct tm, so if you ask me it's a bug, but I bet they'll
say it's intentional.) This has been observed to cause cosmetic
problems when pg_restore'ing an archive created in a different
timezone.
To fix, do mktime() using the field values from the archive, and if
that fails try again with tm_isdst = -1. This will give a result
that's off by the UTC-offset difference from the original zone, but
that was true before, too. It's not terribly critical since we don't
do anything with the result except possibly print it. (Someday we
should flush this entire bit of logic and record a standard-format
timestamp in the archive instead. That's not okay for a back-patched
bug fix, though.)
Also, guard our only other use of mktime() by having initdb's
build_time_t() set tm_isdst = -1 not 0. This case could only have
an issue in zones that are DST year-round; but I think some do exist,
or could in future.
Per report from Wells Oliver. Back-patch to all supported
versions, since any of them might need to run with a newer glibc.
Discussion: https://postgr.es/m/CAOC+FBWDhDHO7G-i1_n_hjRzCnUeFO+H-Czi1y10mFhRWpBrew@mail.gmail.com
Andrew Dunstan [Sun, 13 Jun 2021 11:10:41 +0000 (07:10 -0400)]
Further tweaks to stuck_on_old_timeline recovery test
Translate path slashes on target directory path. This was confusing old
branches, but is applied to all branches for the sake of uniformity.
Perl is perfectly able to understand paths with forward slashes.
Along the way, restore the previous archive_wait query, for the sake of
uniformity with other tests, per gripe from Tom Lane.
Michael Paquier [Sun, 13 Jun 2021 11:07:39 +0000 (20:07 +0900)]
Ignore more environment variables in pg_regress.c
This is similar to the work done in
8279f68 for TestLib.pm, where
environment variables set may cause unwanted failures if using a
temporary installation with pg_regress. The list of variables reset is
adjusted in each stable branch depending on what is supported.
Comments are added to remember that the lists in TestLib.pm and
pg_regress.c had better be kept in sync.
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/YMNR9GYDn+fHlMta@paquier.xyz
Backpatch-through: 9.6
Tom Lane [Sat, 12 Jun 2021 19:12:10 +0000 (15:12 -0400)]
Restore robustness of TAP tests that wait for postmaster restart.
Several TAP tests use poll_query_until() to wait for the postmaster
to restart. They were checking to see if a trivial query
(e.g. "SELECT 1") succeeds. However, that's problematic in the wake
of commit
11e9caff8, because now that we feed said query to psql
via stdin, we risk IPC::Run whining about a SIGPIPE failure if psql
quits before reading the query. Hence, we can't use a nonempty
query in cases where we need to wait for connection failures to
stop happening.
Per the precedent of commits
c757a3da0 and
6d41dd045, we can pass
"undef" as the query in such cases to ensure that IPC::Run has
nothing to write. However, then we have to say that the expected
output is empty, and this exposes a deficiency in poll_query_until:
if psql fails altogether and returns empty stdout, poll_query_until
will treat that as a success! That's because, contrary to its
documentation, it makes no actual check for psql failure, looking
neither at the exit status nor at stderr.
To fix that, adjust poll_query_until to insist on empty stderr as
well as a stdout match. (I experimented with checking exit status
instead, but it seems that psql often does exit(1) in cases that we
need to consider successes. That might be something to fix someday,
but it would be a non-back-patchable behavior change.)
Back-patch to v10. The test cases needing this exist only as far
back as v11, but it seems wise to keep poll_query_until's behavior
the same in v10, in case we back-patch another such test case in
future. (9.6 does not currently need this change, because in that
branch poll_query_until can't be told to accept empty stdout as
a success case.)
Per assorted buildfarm failures, mostly on hoverfly.
Discussion: https://postgr.es/m/CAA4eK1+zM6L4QSA1XMvXY_qqWwdUmqkOS1+hWvL8QcYEBGA1Uw@mail.gmail.com
Tom Lane [Sat, 12 Jun 2021 17:29:24 +0000 (13:29 -0400)]
Ensure pg_filenode_relation(0, 0) returns NULL.
Previously, a zero value for the relfilenode resulted in
a confusing error message about "unexpected duplicate".
This function returns NULL for other invalid relfilenode
values, so zero should be treated likewise.
It's been like this all along, so back-patch to all supported
branches.
Justin Pryzby
Discussion: https://postgr.es/m/
20210612023324.GT16435@telsasoft.com
Tom Lane [Sat, 12 Jun 2021 16:59:15 +0000 (12:59 -0400)]
Don't use Asserts to check for violations of replication protocol.
Using an Assert to check the validity of incoming messages is an
extremely poor decision. In a debug build, it should not be that easy
for a broken or malicious remote client to crash the logrep worker.
The consequences could be even worse in non-debug builds, which will
fail to make such checks at all, leading to who-knows-what misbehavior.
Hence, promote every Assert that could possibly be triggered by wrong
or out-of-order replication messages to a full test-and-ereport.
To avoid bloating the set of messages the translation team has to cope
with, establish a policy that replication protocol violation error
reports don't need to be translated. Hence, all the new messages here
use errmsg_internal(). A couple of old messages are changed likewise
for consistency.
Along the way, fix some non-idiomatic or outright wrong uses of
hash_search().
Most of these mistakes are new with the "streaming replication"
patch (commit
464824323), but a couple go back a long way.
Back-patch as appropriate.
Discussion: https://postgr.es/m/
1719083.
1623351052@sss.pgh.pa.us
Andrew Dunstan [Sat, 12 Jun 2021 12:37:16 +0000 (08:37 -0400)]
Fix new recovery test for use under msys
Commit
caba8f0d43 wasn't quite right for msys, as demonstrated by
several buildfarm animals, including jacana and fairywren. We need to
use the msys perl in the archive command, but call it in such a way that
Windows will understand the path. Furthermore, inside the copy script we
need to convert a Windows path to an msys path.
Michael Paquier [Sat, 12 Jun 2021 07:29:11 +0000 (16:29 +0900)]
Simplify some code in getObjectTypeDescription()
This routine is designed to never return an empty description or NULL,
providing description fallbacks even if missing objects are accepted,
but it included a code path where this was considered possible. All the
callers of this routine already consider NULL as not possible, so
change a bit the code to map with the assumptions of the callers, and
add more comments close to the callers of this routine to outline the
behavior expected.
This code is new as of
2a10fdc, so no backpatch is needed.
Discussion: https://postgr.es/m/YMNY6RGPBRCeLmFb@paquier.xyz
Michael Paquier [Sat, 12 Jun 2021 06:29:48 +0000 (15:29 +0900)]
Improve log pattern detection in recently-added TAP tests
ab55d74 has introduced some tests with rows found as missing in logical
replication subscriptions for partitioned tables, relying on a logic
with a lookup of the logs generated, scanning the whole file. This
commit makes the logic more precise, by scanning the logs only from the
position before the key queries are run to the position where we check
for the logs. This will reduce the risk of issues with log patterns
overlapping with each other if those tests get more complicated in the
future.
Per discussion with Tom Lane.
Discussion: https://postgr.es/m/YMP+Gx2S8meYYHW4@paquier.xyz
Backpatch-through: 13
Andres Freund [Sat, 12 Jun 2021 04:22:21 +0000 (21:22 -0700)]
Improve and cleanup ProcArrayAdd(), ProcArrayRemove().
941697c3c1a changed ProcArrayAdd()/Remove() substantially. As reported by
zhanyi, I missed that due to the introduction of the PGPROC->pgxactoff
ProcArrayRemove() does not need to search for the position in
procArray->pgprocnos anymore - that's pgxactoff. Remove the search loop.
The removal of the search loop reduces assertion coverage a bit - but we can
easily do better than before by adding assertions to other loops over
pgprocnos elements.
Also do a bit of cleanup, mostly by reducing line lengths by introducing local
helper variables and adding newlines.
Author: zhanyi <w@hidva.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/tencent_5624AA3B116B3D1C31CA9744@qq.com
Bruce Momjian [Sat, 12 Jun 2021 01:01:52 +0000 (21:01 -0400)]
doc: remove extra right angle bracket in PG 14 relnotes
Reported-by: Justin Pryzby
Bruce Momjian [Fri, 11 Jun 2021 23:51:35 +0000 (19:51 -0400)]
docs: fix incorrect indenting in PG 14 relnotes
Alvaro Herrera [Fri, 11 Jun 2021 23:07:32 +0000 (19:07 -0400)]
Report sort phase progress in parallel btree build
We were already reporting it, but only after the parallel workers were
finished, which is visibly much later than what happens in a serial
build.
With this change we report it when the leader starts its own sort phase
when participating in the build (the normal case). Now this might
happen a little later than when the workers start their sorting phases,
but a) communicating the actual phase start from workers is likely to be
a hassle, and b) the sort phase start is pretty fuzzy anyway, since
sorting per se is actually initiated by tuplesort.c internally earlier
than tuplesort_performsort() is called.
Backpatch to pg12, where the progress reporting code for CREATE INDEX
went in.
Reported-by: Tomas Vondra <tomas.vondra@enterprisedb.com>
Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Greg Nancarrow <gregn4422@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/
1128176d-1eee-55d4-37ca-
e63644422adb
Tom Lane [Fri, 11 Jun 2021 20:12:36 +0000 (16:12 -0400)]
Fix multiple crasher bugs in partitioned-table replication logic.
apply_handle_tuple_routing(), having detected and reported that
the tuple it needed to update didn't exist, tried to update that
tuple anyway, leading to a null-pointer dereference.
logicalrep_partition_open() failed to ensure that the
LogicalRepPartMapEntry it built for a partition was fully
independent of that for the partition root, leading to
trouble if the root entry was later freed or rebuilt.
Meanwhile, on the publisher's side, pgoutput_change() sometimes
attempted to apply execute_attr_map_tuple() to a NULL tuple.
The first of these was reported by Sergey Bernikov in bug #17055;
I found the other two while developing some test cases for this
sadly under-tested code.
Diagnosis and patch for the first issue by Amit Langote; patches
for the others by me; new test cases by me. Back-patch to v13
where this logic came in.
Discussion: https://postgr.es/m/17055-
9ba800ec8522668b@postgresql.org
Alvaro Herrera [Fri, 11 Jun 2021 20:05:50 +0000 (16:05 -0400)]
Add 'Portal Close' message to pipelined PQsendQuery()
Commit
acb7e4eb6b1c added a new implementation for PQsendQuery so that
it works in pipeline mode (by using extended query protocol), but it
behaves differently from the 'Q' message (in simple query protocol) used
by regular implementation: the new one doesn't close the unnamed portal.
Change the new code to have identical behavior to the old.
Reported-by: Yura Sokolov <y.sokolov@postgrespro.ru>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/
202106072107.d4i55hdscxqj@alvherre.pgsql
Alvaro Herrera [Fri, 11 Jun 2021 19:48:26 +0000 (15:48 -0400)]
Return ReplicationSlotAcquire API to its original form
Per
96540f80f833; the awkward API introduced by
c6550776394e is no
longer needed.
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/
20210408020913.zzprrlvqyvlt5cyy@alap3.anarazel.de
Tomas Vondra [Fri, 11 Jun 2021 18:19:48 +0000 (20:19 +0200)]
Optimize creation of slots for FDW bulk inserts
Commit
b663a41363 introduced bulk inserts for FDW, but the handling of
tuple slots turned out to be problematic for two reasons. Firstly, the
slots were re-created for each individual batch. Secondly, all slots
referenced the same tuple descriptor - with reasonably small batches
this is not an issue, but with large batches this triggers O(N^2)
behavior in the resource owner code.
These two issues work against each other - to reduce the number of times
a slot has to be created/dropped, larger batches are needed. However,
the larger the batch, the more expensive the resource owner gets. For
practical batch sizes (100 - 1000) this would not be a big problem, as
the benefits (latency savings) greatly exceed the resource owner costs.
But for extremely large batches it might be much worse, possibly even
losing with non-batching mode.
Fixed by initializing tuple slots only once (and reusing them across
batches) and by using a new tuple descriptor copy for each slot.
Discussion: https://postgr.es/m/
ebbbcc7d-4286-8c28-0272-
61b4753af761%40enterprisedb.com
Alvaro Herrera [Fri, 11 Jun 2021 16:16:14 +0000 (12:16 -0400)]
Fix race condition in invalidating obsolete replication slots
The code added to mark replication slots invalid in commit
c6550776394e
had the race condition that a slot can be dropped or advanced
concurrently with checkpointer trying to invalidate it. Rewrite the
code to close those races.
The changes to ReplicationSlotAcquire's API added with
c6550776394e are
not necessary anymore. To avoid an ABI break in released branches, this
commit leaves that unchanged; it'll be changed in a master-only commit
separately.
Backpatch to 13, where this code first appeared.
Reported-by: Andres Freund <andres@anarazel.de>
Author: Andres Freund <andres@anarazel.de>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/
20210408001037.wfmk6jud36auhfqm@alap3.anarazel.de
Michael Paquier [Fri, 11 Jun 2021 06:46:18 +0000 (15:46 +0900)]
Improve psql tab completion for options of subcriptions and publications
The list of options provided by the tab completion was outdated for the
following commands:
- ALTER SUBSCRIPTION
- CREATE SUBSCRIPTION
- ALTER PUBLICATION
- CREATE PUBLICATION
Author: Vignesh C
Reviewed-by: Bharath Rupireddy
Discussion: https://postgr.es/m/CALDaNm18oHDFu6SFCHE=ZbiO153Fx7E-L1MG0YyScbaDV--U+A@mail.gmail.com
Noah Misch [Fri, 11 Jun 2021 04:56:14 +0000 (21:56 -0700)]
Change position of field "transformed" in struct CreateStatsStmt.
Resolve the disagreement with nodes/*funcs.c field order in favor of the
latter, which is better-aligned with the IndexStmt field order. This
field is new in v14.
Discussion: https://postgr.es/m/
20210611045546.GA573364@rfd.leadboat.com
Noah Misch [Fri, 11 Jun 2021 04:56:13 +0000 (21:56 -0700)]
Rename PQtraceSetFlags() to PQsetTraceFlags().
We have a dozen PQset*() functions. PQresultSetInstanceData() and this
were the libpq setter functions having a different word order. Adopt
the majority word order.
Reviewed by Alvaro Herrera and Robert Haas, though this choice of name
was not unanimous.
Discussion: https://postgr.es/m/
20210605060555.GA216695@rfd.leadboat.com
David Rowley [Fri, 11 Jun 2021 01:38:04 +0000 (13:38 +1200)]
Use the correct article for abbreviations
We've accumulated quite a mix of instances of "an SQL" and "a SQL" in the
documents. It would be good to be a bit more consistent with these.
The most recent version of the SQL standard I looked at seems to prefer
"an SQL". That seems like a good lead to follow, so here we change all
instances of "a SQL" to become "an SQL". Most instances correctly use
"an SQL" already, so it also makes sense to use the dominant variation in
order to minimise churn.
Additionally, there were some other abbreviations that needed to be
adjusted. FSM, SSPI, SRF and a few others. Also fix some pronounceable,
abbreviations to use "a" instead of "an". For example, "a SASL" instead
of "an SASL".
Here I've only adjusted the documents and error messages. Many others
still exist in source code comments. Translator hint comments seem to be
the biggest culprit. It currently does not seem worth the churn to change
these.
Discussion: https://postgr.es/m/CAApHDvpML27UqFXnrYO1MJddsKVMQoiZisPvsAGhKE_tsKXquw%40mail.gmail.com
Tom Lane [Thu, 10 Jun 2021 21:11:36 +0000 (17:11 -0400)]
Reconsider the handling of procedure OUT parameters.
Commit
2453ea142 redefined pg_proc.proargtypes to include the types of
OUT parameters, for procedures only. While that had some advantages
for implementing the SQL-spec behavior of DROP PROCEDURE, it was pretty
disastrous from a number of other perspectives. Notably, since the
primary key of pg_proc is name + proargtypes, this made it possible to
have multiple procedures with identical names + input arguments and
differing output argument types. That would make it impossible to call
any one of the procedures by writing just NULL (or "?", or any other
data-type-free notation) for the output argument(s). The change also
seems likely to cause grave confusion for client applications that
examine pg_proc and expect the traditional definition of proargtypes.
Hence, revert the definition of proargtypes to what it was, and
undo a number of complications that had been added to support that.
To support the SQL-spec behavior of DROP PROCEDURE, when there are
no argmode markers in the command's parameter list, we perform the
lookup both ways (that is, matching against both proargtypes and
proallargtypes), succeeding if we get just one unique match.
In principle this could result in ambiguous-function failures
that would not happen when using only one of the two rules.
However, overloading of procedure names is thought to be a pretty
rare usage, so this shouldn't cause many problems in practice.
Postgres-specific code such as pg_dump can defend against any
possibility of such failures by being careful to specify argmodes
for all procedure arguments.
This also fixes a few other bugs in the area of CALL statements
with named parameters, and improves the documentation a little.
catversion bump forced because the representation of procedures
with OUT arguments changes.
Discussion: https://postgr.es/m/
3742981.
1621533210@sss.pgh.pa.us
Tom Lane [Thu, 10 Jun 2021 16:27:27 +0000 (12:27 -0400)]
Rearrange logrep worker's snapshot handling some more.
It turns out that worker.c's code path for TRUNCATE was also
careless about establishing a snapshot while executing user-defined
code, allowing the checks added by commit
84f5c2908 to fail when
a trigger is fired in that context.
We could just wrap Push/PopActiveSnapshot around the truncate call,
but it seems better to establish a policy of holding a snapshot
throughout execution of a replication step. To help with that and
possible future requirements, replace the previous ensure_transaction
calls with pairs of begin/end_replication_step calls.
Per report from Mark Dilger. Back-patch to v11, like the previous
changes.
Discussion: https://postgr.es/m/
B4A3AF82-79ED-4F4C-A4E5-
CD2622098972@enterprisedb.com
Tom Lane [Thu, 10 Jun 2021 15:15:13 +0000 (11:15 -0400)]
Shut down EvalPlanQual machinery when LockRows node reaches the end.
Previously, we left the EPQ sub-executor alone until ExecEndLockRows.
This caused any buffer pins or other resources that it might hold to
remain held until ExecutorEnd, which in some code paths means that
they are held till the Portal is closed. That can cause user-visible
problems, such as blocking VACUUM; and it's unlike the behavior of
ordinary table-scanning nodes, which will have released all buffer
pins by the time they return an EOF indication.
We can make LockRows work more like other plan nodes by calling
EvalPlanQualEnd just before returning NULL. We still need to call it
in ExecEndLockRows in case the node was not run to completion, but in
the normal case the second call does nothing and costs little.
Per report from Yura Sokolov. In principle this is a longstanding
bug, but in view of the lack of other complaints and the low severity
of the consequences, I chose not to back-patch.
Discussion: https://postgr.es/m/
4aa370cb91ecf2f9885d98b80ad1109c@postgrespro.ru
Tom Lane [Thu, 10 Jun 2021 14:45:31 +0000 (10:45 -0400)]
Avoid ECPG test failures in some GSS-capable environments.
Buildfarm member hamerkop has been reporting that two cases in
connect/test5.pgc show different error messages than the test expects,
because since commit
ffa2e4670 libpq's connection failure messages
are exposing the fact that a GSS-encrypted connection was attempted
and failed. That's pretty interesting information in itself, and
I certainly don't wish to shoot the messenger, but we need to do
something to stabilize the ECPG results.
For the second of these two failure cases, we can add the
gssencmode=disable option to prevent the discrepancy. However,
that solution is problematic for the first failure, because the only
unique thing about that case is that it's testing a completely-omitted
connection target; there's noplace to add the option without defeating
the point of the test case. After some thrashing around with
alternative fixes that turned out to have undesirable side-effects,
the most workable answer is just to give up and remove that test case.
Perhaps we can revert this later, if we figure out why the GSS code
is misbehaving in hamerkop's environment.
Thanks to Michael Paquier for exploration of alternatives.
Discussion: https://postgr.es/m/YLRZH6CWs9N6Pusy@paquier.xyz
Peter Eisentraut [Thu, 10 Jun 2021 14:21:48 +0000 (16:21 +0200)]
Add some const decorations
One of these functions is new in PostgreSQL 14; might as well start it
out right.
Robert Haas [Thu, 10 Jun 2021 13:08:30 +0000 (09:08 -0400)]
Adjust new test case to set wal_keep_size.
Per buildfarm member conchuela and Kyotaro Horiguchi, it's possible
for the WAL segment that the cascading standby needs to be removed
too quickly. Hopefully this will prevent that.
Kyotaro Horiguchi
Discussion: http://postgr.es/m/
20210610.101240.
1270925505780628275.horikyota.ntt@gmail.com
David Rowley [Thu, 10 Jun 2021 08:13:44 +0000 (20:13 +1200)]
Fix an asssortment of typos in brin_minmax_multi.c and mcv.c
Discussion: https://postgr.es/m/CAApHDvrbyJNOPBws4RUhXghZ7+TBjtdO-rznTsqZECuowNorXg@mail.gmail.com
Robert Haas [Wed, 9 Jun 2021 20:17:00 +0000 (16:17 -0400)]
Fix corner case failure of new standby to follow new primary.
This only happens if (1) the new standby has no WAL available locally,
(2) the new standby is starting from the old timeline, (3) the promotion
happened in the WAL segment from which the new standby is starting,
(4) the timeline history file for the new timeline is available from
the archive but the WAL files for are not (i.e. this is a race),
(5) the WAL files for the new timeline are available via streaming,
and (6) recovery_target_timeline='latest'.
Commit
ee994272ca50f70b53074f0febaec97e28f83c4e introduced this
logic and was an improvement over the previous code, but it mishandled
this case. If recovery_target_timeline='latest' and restore_command is
set, validateRecoveryParameters() can change recoveryTargetTLI to be
different from receiveTLI. If streaming is then tried afterward,
expectedTLEs gets initialized with the history of the wrong timeline.
It's supposed to be a list of entries explaining how to get to the
target timeline, but in this case it ends up with a list of entries
explaining how to get to the new standby's original timeline, which
isn't right.
Dilip Kumar and Robert Haas, reviewed by Kyotaro Horiguchi.
Discussion: http://postgr.es/m/CAFiTN-sE-jr=LB8jQuxeqikd-Ux+jHiXyh4YDiZMPedgQKup0g@mail.gmail.com
Michael Paquier [Wed, 9 Jun 2021 07:24:52 +0000 (16:24 +0900)]
Fix inconsistencies in psql --help=commands
The set of subcommands supported by \dAp, \do and \dy was described
incorrectly in psql's --help. The documentation was already consistent
with the code.
Reported-by: inoas, from IRC
Author: Matthijs van der Vleuten
Reviewed-by: Neil Chen
Discussion: https://postgr.es/m/
6a984e24-2171-4039-9050-
92d55e7b23fe@www.fastmail.com
Backpatch-through: 9.6
Tom Lane [Tue, 8 Jun 2021 22:40:06 +0000 (18:40 -0400)]
Force NO SCROLL for plpgsql's implicit cursors.
Further thought about bug #17050 suggests that it's a good idea
to use CURSOR_OPT_NO_SCROLL for the implicit cursor opened by
a plpgsql FOR-over-query loop. This ensures that, if somebody
commits inside the loop, PersistHoldablePortal won't try to
rewind and re-read the cursor. While we'd have selected NO_SCROLL
anyway if FOR UPDATE/SHARE appears in the query, there are other
hazards with volatile functions; and in any case, it's silly to
expend effort storing rows that we know for certain won't be needed.
(While here, improve the comment in exec_run_select, which was a bit
confused about the rationale for when we can use parallel mode.
Cursor operations aren't a hazard for nameless portals.)
This wasn't an issue until v11, which introduced the possibility
of persisting such cursors. Hence, back-patch to v11.
Per bug #17050 from Алексей Булгаков.
Discussion: https://postgr.es/m/17050-
f77aa827dc85247c@postgresql.org
Tom Lane [Tue, 8 Jun 2021 21:50:15 +0000 (17:50 -0400)]
Avoid misbehavior when persisting a non-stable cursor.
PersistHoldablePortal has long assumed that it should store the
entire output of the query-to-be-persisted, which requires rewinding
and re-reading the output. This is problematic if the query is not
stable: we might get different row contents, or even a different
number of rows, which'd confuse the cursor state mightily.
In the case where the cursor is NO SCROLL, this is very easy to
solve: just store the remaining query output, without any rewinding,
and tweak the portal's cursor state to match. Aside from removing
the semantic problem, this could be significantly more efficient
than storing the whole output.
If the cursor is scrollable, there's not much we can do, but it
was already the case that scrolling a volatile query's result was
pretty unsafe. We can just document more clearly that getting
correct results from that is not guaranteed.
There are already prohibitions in place on using SCROLL with
FOR UPDATE/SHARE, which is one way for a SELECT query to have
non-stable results. We could imagine prohibiting SCROLL when
the query contains volatile functions, but that would be
expensive to enforce. Moreover, it could break applications
that work just fine, if they have functions that are in fact
stable but the user neglected to mark them so. So settle for
documenting the hazard.
While this problem has existed in some guise for a long time,
it got a lot worse in v11, which introduced the possibility
of persisting plpgsql cursors (perhaps implicit ones) even
when they violate the rules for what can be marked WITH HOLD.
Hence, I've chosen to back-patch to v11 but not further.
Per bug #17050 from Алексей Булгаков.
Discussion: https://postgr.es/m/17050-
f77aa827dc85247c@postgresql.org
Bruce Momjian [Tue, 8 Jun 2021 20:47:14 +0000 (16:47 -0400)]
doc: update release note item about the v2 wire protocol
Protocol v2 was last used in PG 7.3, not 7.2.
Reported-by: Tatsuo Ishii
Discussion: https://postgr.es/m/
20210608.091329.
906837606658882674.t-ishii@sraoss.co.jp
Tomas Vondra [Tue, 8 Jun 2021 18:22:18 +0000 (20:22 +0200)]
Adjust batch size in postgres_fdw to not use too many parameters
The FE/BE protocol identifies parameters with an Int16 index, which
limits the maximum number of parameters per query to 65535. With
batching added to postges_fdw this limit is much easier to hit, as
the whole batch is essentially a single query, making this error much
easier to hit.
The failures are a bit unpredictable, because it also depends on the
number of columns in the query. So instead of just failing, this patch
tweaks the batch_size to not exceed the maximum number of parameters.
Reported-by: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/OS0PR01MB571603973C0AC2874AD6BF2594299%40OS0PR01MB5716.jpnprd01.prod.outlook.com
Tomas Vondra [Tue, 8 Jun 2021 17:24:27 +0000 (19:24 +0200)]
Fix pg_visibility regression failure with CLOBBER_CACHE_ALWAYS
Commit
8e03eb92e9 reverted a bit too much code, reintroducing one of the
issues fixed by
39b66a91bd - a page might have been left partially empty
after relcache invalidation.
Reported-By: Tom Lane
Author: Masahiko Sawada
Discussion: https://postgr.es/m/822752.
1623032114@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAD21AoA%3D%3Df2VSw3c-Cp_y%3DWLKHMKc1D6s7g3YWsCOvgaYPpJcg%40mail.gmail.com
Tom Lane [Tue, 8 Jun 2021 15:59:34 +0000 (11:59 -0400)]
Don't crash on empty statements in SQL-standard function bodies.
gram.y should discard NULL pointers (empty statements) when
assembling a routine_body_stmt_list, as it does for other
sorts of statement lists.
Julien Rouhaud and Tom Lane, per report from Noah Misch.
Discussion: https://postgr.es/m/
20210606044418.GA297923@rfd.leadboat.com
Peter Eisentraut [Tue, 8 Jun 2021 13:37:54 +0000 (15:37 +0200)]
libpq: Fix SNI host handling
Fix handling of NULL host name (possibly by using hostaddr). It
previously crashed. Also, we should look at connhost, not pghost, to
handle multi-host specifications.
Also remove an unnecessary SSL_CTX_free().
Reported-by: Jacob Champion <pchampion@vmware.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/
504c276ab6eee000bb23d571ea9b0ced4250774e.camel@vmware.com
Etsuro Fujita [Tue, 8 Jun 2021 04:45:00 +0000 (13:45 +0900)]
Doc: Further update documentation for asynchronous execution.
Add a note about asynchronous execution by postgres_fdw when applied to
Append nodes that contain synchronous subplan(s) as well. Follow-up for
commit
27e1f1456.
Andrey Lepikhov and Etsuro Fujita
Discussion: https://postgr.es/m/
58fa2aa5-07f5-80b5-59a1-
fec8a349fee7%40postgrespro.ru
Michael Paquier [Mon, 7 Jun 2021 23:53:12 +0000 (08:53 +0900)]
Reorder superuser check in pg_log_backend_memory_contexts()
The use of this function is limited to superusers and the code includes
a hardcoded check for that. However, the code would look for the PGPROC
entry to signal for the memory dump before checking if the user is a
superuser or not, which does not make sense if we know that an error
will be returned. Note that the code would let one know if a process
was a PostgreSQL process or not even for non-authorized users, which is
not the case now, but this avoids taking ProcArrayLock that will most
likely finish by being unnecessary.
Thanks to Julien Rouhaud and Tom Lane for the discussion.
Discussion: https://postgr.es/m/YLxw1uVGIAP5uMPl@paquier.xyz
Peter Eisentraut [Mon, 7 Jun 2021 19:32:53 +0000 (21:32 +0200)]
Add _outTidRangePath()
We have outNode() coverage for all path nodes, but this one was
missed when it was added.
Tom Lane [Mon, 7 Jun 2021 18:52:42 +0000 (14:52 -0400)]
Stabilize contrib/seg regression test.
If autovacuum comes along just after we fill table test_seg with
some data, it will update the stats to the point where we prefer
a plain indexscan over a bitmap scan, breaking the expected
output (as well as the point of the test case). To fix, just
force a bitmap scan to be chosen here.
This has evidently been wrong since commit
de1d042f5. It's not
clear why we just recently saw any buildfarm failures due to it;
but prairiedog has failed twice on this test in the past week.
Hence, backpatch to v11 where this test case came in.
Tom Lane [Mon, 7 Jun 2021 18:15:25 +0000 (14:15 -0400)]
Fix incautious handling of possibly-miscoded strings in client code.
An incorrectly-encoded multibyte character near the end of a string
could cause various processing loops to run past the string's
terminating NUL, with results ranging from no detectable issue to
a program crash, depending on what happens to be in the following
memory.
This isn't an issue in the server, because we take care to verify
the encoding of strings before doing any interesting processing
on them. However, that lack of care leaked into client-side code
which shouldn't assume that anyone has validated the encoding of
its input.
Although this is certainly a bug worth fixing, the PG security team
elected not to regard it as a security issue, primarily because
any untrusted text should be sanitized by PQescapeLiteral or
the like before being incorporated into a SQL or psql command.
(If an app fails to do so, the same technique can be used to
cause SQL injection, with probably much more dire consequences
than a mere client-program crash.) Those functions were already
made proof against this class of problem, cf CVE-2006-2313.
To fix, invent PQmblenBounded() which is like PQmblen() except it
won't return more than the number of bytes remaining in the string.
In HEAD we can make this a new libpq function, as PQmblen() is.
It seems imprudent to change libpq's API in stable branches though,
so in the back branches define PQmblenBounded as a macro in the files
that need it. (Note that just changing PQmblen's behavior would not
be a good idea; notably, it would completely break the escaping
functions' defense against this exact problem. So we just want a
version for those callers that don't have any better way of handling
this issue.)
Per private report from houjingyi. Back-patch to all supported branches.
Michael Paquier [Mon, 7 Jun 2021 09:12:29 +0000 (18:12 +0900)]
Fix portability issue in test indirect_toast
When run on a server using default_toast_compression set to LZ4, this
test would fail because of a consistency issue with the order of the
tuples treated. LZ4 causes one tuple to be stored inline instead of
getting externalized. As the goal of this test is to check after data
stored externally, stick to pglz as the compression algorithm used, so
as all data of this test is stored the way it should.
Analyzed-by: Dilip Kumar
Discussion: https://postgr.es/m/YLrDWxJgM8WWMoCg@paquier.xyz
Amit Kapila [Mon, 7 Jun 2021 04:02:06 +0000 (09:32 +0530)]
Remove two_phase variable from CreateReplicationSlotCmd struct.
Commit
19890a064e added the option to enable two_phase commits via
pg_create_logical_replication_slot but didn't extend the support of same
in replication protocol. However, by mistake, it added the two_phase
variable in CreateReplicationSlotCmd which is required only when we extend
the replication protocol.
Reported-by: Jeff Davis
Author: Ajin Cherian
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/
64b9f783c6e125f18f88fbc0c0234e34e71d8639.camel@j-davis.com
Etsuro Fujita [Mon, 7 Jun 2021 03:45:00 +0000 (12:45 +0900)]
Fix rescanning of async-aware Append nodes.
In cases where run-time pruning isn't required, the synchronous and
asynchronous subplans for an async-aware Append node determined using
classify_matching_subplans() should be re-used when rescanning the node,
but the previous code re-determined them using that function repeatedly
each time when rescanning the node, leading to incorrect results in a
normal build and an Assert failure in an Assert-enabled build as that
function doesn't assume that it's called repeatedly in such cases. Fix
the code as mentioned above.
My oversight in commit
27e1f1456.
While at it, initialize async-related pointers/variables to NULL/zero
explicitly in ExecInitAppend() and ExecReScanAppend(), just to be sure.
(The variables would have been set to zero before we get to the latter
function, but let's do so.)
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/CAPmGK16Q4B2_KY%2BJH7rb7wQbw54AUprp7TMekGTd2T1B62yysQ%40mail.gmail.com