git.postgresql.org Git - postgresql.git/log

Use 0-based numbering in comments about backup blocks.

The macros and functions that work with backup blocks in the redo function
use 0-based numbering, so let's use that consistently in the function that
generates the records too. Makes it so much easier to compare the
generation and replay functions.

Backpatch to 9.0, where we switched from 1-based to 0-based numbering.

Fix non-C89-compatible coding in pgbench.

C89 says that compound initializers may only contain constant expressions;
a restriction violated by commit 89d00cbe.  While we've had no actual field
complaints about this, C89 is still the project standard, and it's not
saving all that much code to break compatibility here.  So let's adhere to
the old restriction.

In passing, replace a bunch of hardwired constants "256" with
sizeof(target-variable), just because the latter is more readable and
less breakable.  And const-ify where possible.

Back-patch to 9.3 where the nonportable code was added.

Andres Freund and Tom Lane

Initialize tsId and dbId fields in WAL record of COMMIT PREPARED.

Commit dd428c79 added dbId and tsId to the xl_xact_commit struct but missed
that prepared transaction commits reuse that struct. Fix that.

Because those fields were left unitialized, replaying a commit prepared WAL
record in a hot standby node would fail to remove the relcache init file.
That can lead to "could not open file" errors on the standby. Relcache init
file only needs to be removed when a system table/index is rewritten in the
transaction using two phase commit, so that should be rare in practice. In
HEAD, the incorrect dbId/tsId values are also used for filtering in logical
replication code, causing the transaction to always be filtered out.

Analysis and fix by Andres Freund. Backpatch to 9.0 where hot standby was
introduced.

Fix unportable setvbuf() usage in initdb.

In yesterday's commit 2dc4f011fd61501cce507be78c39a2677690d44b, I tried
to force buffering of stdout/stderr in initdb to be what it is by
default when the program is run interactively on Unix (since that's how
most manual testing is done).  This tripped over the fact that Windows
doesn't support _IOLBF mode.  We dealt with that a long time ago in
syslogger.c by falling back to unbuffered mode on Windows.  Export that
solution in port.h and use it in initdb.

Back-patch to 8.4, like the previous commit.

Handle duplicate XIDs in txid_snapshot.

The proc array can contain duplicate XIDs, when a transaction is just being
prepared for two-phase commit. To cope, remove any duplicates in
txid_current_snapshot(). Also ignore duplicates in the input functions, so
that if e.g. you have an old pg_dump file that already contains duplicates,
it will be accepted.

Report and fix by Jan Wieck. Backpatch to all supported versions.

Fix race condition in preparing a transaction for two-phase commit.

To lock a prepared transaction's shared memory entry, we used to mark it
with the XID of the backend. When the XID was no longer active according
to the proc array, the entry was implicitly considered as not locked
anymore. However, when preparing a transaction, the backend's proc array
entry was cleared before transfering the locks (and some other state) to
the prepared transaction's dummy PGPROC entry, so there was a window where
another backend could finish the transaction before it was in fact fully
prepared.

To fix, rewrite the locking mechanism of global transaction entries. Instead
of an XID, just have simple locked-or-not flag in each entry (we store the
locking backend's backend id rather than a simple boolean, but that's just
for debugging purposes). The backend is responsible for explicitly unlocking
the entry, and to make sure that that happens, install a callback to unlock
it on abort or process exit.

Backpatch to all supported versions.

In initdb, ensure stdout/stderr buffering behavior is what we expect.

Since this program may print to either stdout or stderr, the relative
ordering of its messages depends on the buffering behavior of those files.
Force stdout to be line-buffered and stderr to be unbuffered, ensuring
that the behavior will match standard Unix interactive behavior, even
when stdout and stderr are rerouted to a file.

Per complaint from Tomas Vondra. The particular case he pointed out is
new in HEAD, but issues of the same sort could arise in any branch with
other error messages, so back-patch to all branches.

I'm unsure whether we might not want to do this in other client programs
as well. For the moment, just fix initdb.

Code review for recent changes in relcache.c.

rd_replidindex should be managed the same as rd_oidindex, and rd_keyattr
and rd_idattr should be managed like rd_indexattr.  Omissions in this area
meant that the bitmapsets computed for rd_keyattr and rd_idattr would be
leaked during any relcache flush, resulting in a slow but permanent leak in
CacheMemoryContext.  There was also a tiny probability of relcache entry
corruption if we ran out of memory at just the wrong point in
RelationGetIndexAttrBitmap.  Otherwise, the fields were not zeroed where
expected, which would not bother the code any AFAICS but could greatly
confuse anyone examining the relcache entry while debugging.

Also, create an API function RelationGetReplicaIndex rather than letting
non-relcache code be intimate with the mechanisms underlying caching of
that value (we won't even mention the memory leak there).

Also, fix a relcache flush hazard identified by Andres Freund:
RelationGetIndexAttrBitmap must not assume that rd_replidindex stays valid
across index_open.

The aspects of this involving rd_keyattr date back to 9.3, so back-patch
those changes.

Initialize padding bytes in btree_gist varbit support.

The code expands a varbit gist leaf key to a node key by copying the bit
data twice in a varlen datum, as both the lower and upper key. The lower key
was expanded to INTALIGN size, but the padding bytes were not initialized.
That's a problem because when the lower/upper keys are compared, the padding
bytes are used compared too, when the values are otherwise equal. That could
lead to incorrect query results.

REINDEX is advised for any btree_gist indexes on bit or bit varying data
type, to fix any garbage padding bytes on disk.

Per Valgrind, reported by Andres Freund. Backpatch to all supported
versions.

Ignore config.pl and buildenv.pl in src/tools/msvc.

config.pl and buildenv.pl can be used to customize build settings when
using MSVC. They should never get committed into the common source tree.

Back-patch to 9.0; it looks like the rules were different in 8.4.

Michael Paquier

Free PQresult on error in pg_receivexlog.

The leak is fairly small and rare, but a leak nevertheless.

Per Coverity report. Backpatch to 9.2, where pg_receivexlog was added.
pg_basebackup shares the code, but it always exits on error, so there is
no real leak.

Accept tcl 8.6 in configure's probe for tclsh.

Usually the search would find plain "tclsh" without any trouble,
but some installations might only have the version-numbered flavor
of that program.

No compatibility problems have been reported with 8.6, so we might
as well back-patch this to all active branches.

Christoph Berg

Get rid of bogus dependency on typcategory in to_json() and friends.

These functions were relying on typcategory to identify arrays and
composites, which is not reliable and not the normal way to do it.
Using typcategory to identify boolean, numeric types, and json itself is
also pretty questionable, though the code in those cases didn't seem to be
at risk of anything worse than wrong output. Instead, use the standard
lsyscache functions to identify arrays and composites, and rely on a direct
check of the type OID for the other cases.

In HEAD, also be sure to look through domains so that a domain is treated
the same as its base type for conversions to JSON. However, this is a
small behavioral change; given the lack of field complaints, we won't
back-patch it.

In passing, refactor so that there's only one copy of the code that decides
which conversion strategy to apply, not multiple copies that could (and
have) gotten out of sync.

Document permissions needed for pg_database_size and pg_tablespace_size.

Back in 8.3, we installed permissions checks in these functions (see
commits 8bc225e7990a and cc26599b7206). But we forgot to document that
anywhere in the user-facing docs; it did get mentioned in the 8.3 release
notes, but nobody's looking at that any more. Per gripe from Suya Huang.

Un-break ecpg test suite under --disable-integer-datetimes.

Commit 4318daecc959886d001a6e79c6ea853e8b1dfb4b broke it. The change in
sub-second precision at extreme dates is normal. The inconsistent
truncation vs. rounding is essentially a bug, albeit a longstanding one.
Back-patch to 8.4, like the causative commit.

Protect against torn pages when deleting GIN list pages.

To-be-deleted list pages contain no useful information, as they are being
deleted, but we must still protect the writes from being torn by a crash
after a partial write. To do that, re-initialize the pages on WAL replay.

Jeff Janes caught this with a test program to test partial writes.
Backpatch to all supported versions.

Include files copied from libpqport in .gitignore

Michael Paquier

Avoid buffer bloat in libpq when server is consistently faster than client.

If the server sends a long stream of data, and the server + network are
consistently fast enough to force the recv() loop in pqReadData() to
iterate until libpq's input buffer is full, then upon processing the last
incomplete message in each bufferload we'd usually double the buffer size,
due to supposing that we didn't have enough room in the buffer to finish
collecting that message.  After filling the newly-enlarged buffer, the
cycle repeats, eventually resulting in an out-of-memory situation (which
would be reported misleadingly as "lost synchronization with server").
Of course, we should not enlarge the buffer unless we still need room
after discarding already-processed messages.

This bug dates back quite a long time: pqParseInput3 has had the behavior
since perhaps 2003, getCopyDataMessage at least since commit 70066eb1a1ad
in 2008.  Probably the reason it's not been isolated before is that in
common environments the recv() loop would always be faster than the server
(if on the same machine) or faster than the network (if not); or at least
it wouldn't be slower consistently enough to let the buffer ramp up to a
problematic size.  The reported cases involve Windows, which perhaps has
different timing behavior than other platforms.

Per bug #7914 from Shin-ichi Morita, though this is different from his
proposed solution.  Back-patch to all supported branches.

Fix failure to set ActiveSnapshot while rewinding a cursor.

ActiveSnapshot needs to be set when we call ExecutorRewind because some
plan node types may execute user-defined functions during their ReScan
calls (nodeLimit.c does so, at least).  The wisdom of that is somewhat
debatable, perhaps, but for now the simplest fix is to make sure the
required context is valid.  Failure to do this typically led to a
null-pointer-dereference core dump, though it's possible that in more
complex cases a function could be executed with the wrong snapshot
leading to very subtle misbehavior.

Per report from Leif Jensen.  It's been broken for a long time, so
back-patch to all active branches.

Fix interval test, which was broken for floating-point timestamps.

Commit 4318daecc959886d001a6e79c6ea853e8b1dfb4b introduced a test that
couldn't be made consistent between integer and floating-point
timestamps.

It was designed to test the longest possible interval output length,
so removing four zeros from the number of hours, as this patch does,
is not ideal. But the test still has some utility for its original
purpose, and there aren't a lot of other good options.

Noah Misch suggested a different approach where we test that the
output either matches what we expect from integer timestamps or what
we expect from floating-point timestamps. That seemed to obscure an
otherwise simple test, however.

Reviewed by Tom Lane and Noah Misch.

Remove tabs after spaces in C comments

This was not changed in HEAD, but will be done later as part of a
pgindent run. Future pgindent runs will also do this.

Report by Tom Lane

Backpatch through all supported branches, but not HEAD

Correct comment in Hot Standby nbtree handling

Logic is correct, matching handling of LP_DEAD elsewhere.

Fix use of free in walsender error handling after a sysid mismatch.

Found via valgrind. The bug exists since the introduction of the walsender,
so backpatch to 9.0.

Andres Freund

Fix handling of array of char pointers in ecpglib.

When array of char * was used as target for a FETCH statement returning more
than one row, it tried to store all the result in the first element. Instead it
should dump array of char pointers with right offset, use the address instead
of the value of the C variable while reading the array and treat such variable
as char **, instead of char * for pointer arithmetic.

Patch by Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>

Fix possible cache invalidation failure in ReceiveSharedInvalidMessages.

Commit fad153ec45299bd4d4f29dec8d9e04e2f1c08148 modified sinval.c to reduce
the number of calls into sinvaladt.c (which require taking a shared lock)
by keeping a local buffer of collected-but-not-yet-processed messages.
However, if processing of the last message in a batch resulted in a
recursive call to ReceiveSharedInvalidMessages, we could overwrite that
message with a new one while the outer invalidation function was still
working on it. This would be likely to lead to invalidation of the wrong
cache entry, allowing subsequent processing to use stale cache data.
The fix is just to make a local copy of each message while we're processing
it.

Spotted by Andres Freund. Back-patch to 8.4 where the bug was introduced.

Fix "quiet inline" configure test for newer clang compilers.

This test used to just define an unused static inline function and check
whether that causes a warning. But newer clang versions warn about
unused static inline functions when defined inside a .c file, but not
when defined in an included header, which is the case we care about.
Change the test to cope.

Andres Freund

Fix yet another corner case in dumping rules/views with USING clauses.

ruleutils.c tries to cope with additions/deletions/renamings of columns in
tables referenced by views, by means of adding machine-generated aliases to
the printed form of a view when needed to preserve the original semantics.
A recent blog post by Marko Tiikkaja pointed out a case I'd missed though:
if one input of a join with USING is itself a join, there is nothing to
stop the user from adding a column of the same name as the USING column to
whichever side of the sub-join didn't provide the USING column. And then
there'll be an error when the view is re-parsed, since now the sub-join
exposes two columns matching the USING specification. We were catching a
lot of related cases, but not this one, so add some logic to cope with it.

Back-patch to 9.3, which is the first release that makes any serious
attempt to cope with such cases (cf commit 2ffa740be and follow-ons).

Fix failure to detoast fields in composite elements of structured types.

If we have an array of records stored on disk, the individual record fields
cannot contain out-of-line TOAST pointers: the tuptoaster.c mechanisms are
only prepared to deal with TOAST pointers appearing in top-level fields of
a stored row.  The same applies for ranges over composite types, nested
composites, etc.  However, the existing code only took care of expanding
sub-field TOAST pointers for the case of nested composites, not for other
structured types containing composites.  For example, given a command such
as

UPDATE tab SET arraycol = ARRAY[(ROW(x,42)::mycompositetype] ...

where x is a direct reference to a field of an on-disk tuple, if that field
is long enough to be toasted out-of-line then the TOAST pointer would be
inserted as-is into the array column.  If the source record for x is later
deleted, the array field value would become a dangling pointer, leading
to errors along the line of "missing chunk number 0 for toast value ..."
when the value is referenced.  A reproducible test case for this was
provided by Jan Pecek, but it seems likely that some of the "missing chunk
number" reports we've heard in the past were caused by similar issues.

Code-wise, the problem is that PG_DETOAST_DATUM() is not adequate to
produce a self-contained Datum value if the Datum is of composite type.
Seen in this light, the problem is not just confined to arrays and ranges,
but could also affect some other places where detoasting is done in that
way, for example form_index_tuple().

I tried teaching the array code to apply toast_flatten_tuple_attribute()
along with PG_DETOAST_DATUM() when the array element type is composite,
but this was messy and imposed extra cache lookup costs whether or not any
TOAST pointers were present, indeed sometimes when the array element type
isn't even composite (since sometimes it takes a typcache lookup to find
that out).  The idea of extending that approach to all the places that
currently use PG_DETOAST_DATUM() wasn't attractive at all.

This patch instead solves the problem by decreeing that composite Datum
values must not contain any out-of-line TOAST pointers in the first place;
that is, we expand out-of-line fields at the point of constructing a
composite Datum, not at the point where we're about to insert it into a
larger tuple.  This rule is applied only to true composite Datums, not
to tuples that are being passed around the system as tuples, so it's not
as invasive as it might sound at first.  With this approach, the amount
of code that has to be touched for a full solution is greatly reduced,
and added cache lookup costs are avoided except when there actually is
a TOAST pointer that needs to be inlined.

The main drawback of this approach is that we might sometimes dereference
a TOAST pointer that will never actually be used by the query, imposing a
rather large cost that wasn't there before.  On the other side of the coin,
if the field value is used multiple times then we'll come out ahead by
avoiding repeat detoastings.  Experimentation suggests that common SQL
coding patterns are unaffected either way, though.  Applications that are
very negatively affected could be advised to modify their code to not fetch
columns they won't be using.

In future, we might consider reverting this solution in favor of detoasting
only at the point where data is about to be stored to disk, using some
method that can drill down into multiple levels of nested structured types.
That will require defining new APIs for structured types, though, so it
doesn't seem feasible as a back-patchable fix.

Note that this patch changes HeapTupleGetDatum() from a macro to a function
call; this means that any third-party code using that macro will not get
protection against creating TOAST-pointer-containing Datums until it's
recompiled.  The same applies to any uses of PG_RETURN_HEAPTUPLEHEADER().
It seems likely that this is not a big problem in practice: most of the
tuple-returning functions in core and contrib produce outputs that could
not possibly be toasted anyway, and the same probably holds for third-party
extensions.

This bug has existed since TOAST was invented, so back-patch to all
supported branches.

Check for interrupts and stack overflow during rule/view dumps.

Since ruleutils.c recurses, it could be driven to stack overflow by
deeply nested constructs. Very large queries might also take long
enough to deparse that a check for interrupts seems like a good idea.
Stick appropriate tests into a couple of key places.

Noted by Greg Stark. Back-patch to all supported branches.

Reduce indentation/parenthesization of set operations in rule/view dumps.

A query such as "SELECT x UNION SELECT y UNION SELECT z UNION ..."
produces a left-deep nested parse tree, which we formerly showed in its
full nested glory and with all the possible parentheses.  This does little
for readability, though, and long UNION lists resulting in excessive
indentation are common.  Instead, let's omit parentheses and indent all
the subqueries at the same level in such cases.

This patch skips indentation/parenthesization whenever the lefthand input
of a SetOperationStmt is another SetOperationStmt of the same kind and
ALL/DISTINCT property.  We could teach the code the exact syntactic
precedence of set operations and thereby avoid parenthesization in some
more cases, but it's not clear that that'd be a readability win: it seems
better to parenthesize if the set operation changes.  (As an example,
if there's one UNION in a long list of UNION ALL, it now stands out like
a sore thumb, which seems like a good thing.)

Back-patch to 9.3.  This completes our response to a complaint from Greg
Stark that since commit 62e666400d there's a performance problem in pg_dump
for views containing long UNION sequences (or other types of deeply nested
constructs).  The previous commit 0601cb54dac14d979d726ab2ebeda251ae36e857
handles the general problem, but this one makes the specific case of UNION
lists look a lot nicer.

Limit overall indentation in rule/view dumps.

Continuing to indent no matter how deeply nested we get doesn't really
do anything for readability; what's worse, it results in O(N^2) total
whitespace, which can become a performance and memory-consumption issue.

To address this, once we get past 40 characters of indentation, reduce
the indentation step distance 4x, and also limit the maximum indentation
by reducing it modulo 40. This latter choice is a bit weird at first
glance, but it seems to preserve readability better than a simple cap
would do.

Back-patch to 9.3, because since commit 62e666400d the performance issue
is a hazard for pg_dump.

Greg Stark and Tom Lane

Fix indentation of JOIN clauses in rule/view dumps.

The code attempted to outdent JOIN clauses further left than the parent
FROM keyword, which was odd in any case, and led to inconsistent formatting
since in simple cases the clauses couldn't be moved any further left than
that.  And it left a permanent decrement of the indentation level, causing
subsequent lines to be much further left than they should be (again, this
couldn't be seen in simple cases for lack of indentation to give up).

After a little experimentation I chose to make it indent JOIN keywords
two spaces from the parent FROM, which is one space more than the join's
lefthand input in cases where that appears on a different line from FROM.

Back-patch to 9.3.  This is a purely cosmetic change, and the bug is quite
old, so that may seem arbitrary; but we are going to be making some other
changes to the indentation behavior in both HEAD and 9.3, so it seems
reasonable to include this in 9.3 too.  I committed this one first because
its effects are more visible in the regression test results as they
currently stand than they will be later.

Add missing SYSTEMQUOTEs

Some popen() calls were missing SYSTEMQUOTEs, which caused initdb and
pg_upgrade to fail on Windows, if the installation path contained both
spaces and @ signs.

Patch by Nikhil Deshpande. Backpatch to all supported versions.

Improve planner to drop constant-NULL inputs of AND/OR where it's legal.

In general we can't discard constant-NULL inputs, since they could change
the result of the AND/OR to be NULL.  But at top level of WHERE, we do not
need to distinguish a NULL result from a FALSE result, so it's okay to
treat NULL as FALSE and then simplify AND/OR accordingly.

This is a very ancient oversight, but in 9.2 and later it can lead to
failure to optimize queries that previous releases did optimize, as a
result of more aggressive parameter substitution rules making it possible
to reduce more subexpressions to NULL constants.  This is the root cause of
bug #10171 from Arnold Scheffler.  We could alternatively have fixed that
by teaching orclauses.c to ignore constant-NULL OR arms, but it seems
better to get rid of them globally.

I resisted the temptation to back-patch this change into all active
branches, but it seems appropriate to back-patch as far as 9.2 so that
there will not be performance regressions of the kind shown in this bug.

Fix two bugs in WAL-logging of GIN pending-list pages.

In writeListPage, never take a full-page image of the page, because we
have all the information required to re-initialize in the WAL record
anyway. Before this fix, a full-page image was always generated, unless
full_page_writes=off, because when the page is initialized its LSN is
always 0. In stable-branches, keep the code to restore the backup blocks
if they exist, in case that the WAL is generated with an older minor
version, but in master Assert that there are no full-page images.

In the redo routine, add missing "off++". Otherwise the tuples are added
to the page in reverse order. That happens to be harmless because we
always scan and remove all the tuples together, but it was clearly wrong.
Also, it was masked by the first bug unless full_page_writes=off, because
the page was always restored from a full-page image.

Backpatch to all supported versions.

Can't completely get rid of #ifndef FRONTEND in palloc.h :-(

pg_controldata includes postgres.h not postgres_fe.h, so utils/palloc.h
must be able to compile in a "#define FRONTEND" context.  It appears that
Solaris Studio is smart enough to persuade us to define PG_USE_INLINE,
but not smart enough to not make a copy of unreferenced static functions;
which leads to an unsatisfied reference to CurrentMemoryContext.  So we
need an #ifndef FRONTEND around that declaration.  Per buildfarm.

Don't #include utils/palloc.h in common/fe_memutils.h.

This breaks the principle that common/ ought not depend on anything in the
server, not only code-wise but in the headers.  The only arguable advantage
is avoidance of duplication of half a dozen extern declarations, and even
that is rather dubious, considering that the previous coding was wrong
about which declarations to duplicate: it exposed pnstrdup() to frontend
code even though no such function is provided in fe_memutils.c.

On the same principle, don't #include utils/memutils.h in the frontend
build of psprintf.c.  This requires duplicating the definition of
MaxAllocSize, but that seems fine to me: there's no a-priori reason why
frontend code should use the same size limit as the backend anyway.

In passing, clean up some rather odd layout and ordering choices that
were imposed on palloc.h to reduce the number of #ifdefs required by
the previous approach.

Per gripe from Christoph Berg.  There's still more work to do to make
include/common/ clean, but this part seems reasonably noncontroversial.

Fix race when updating a tuple concurrently locked by another process

If a tuple is locked, and this lock is later upgraded either to an
update or to a stronger lock, and in the meantime some other process
tries to lock, update or delete the same tuple, it (the tuple) could end
up being updated twice, or having conflicting locks held.

The reason for this is that the second updater checks for a change in
Xmax value, or in the HEAP_XMAX_IS_MULTI infomask bit, after noticing
the first lock; and if there's a change, it restarts and re-evaluates
its ability to update the tuple.  But it neglected to check for changes
in lock strength or in lock-vs-update status when those two properties
stayed the same.  This would lead it to take the wrong decision and
continue with its own update, when in reality it shouldn't do so but
instead restart from the top.

This could lead to either an assertion failure much later (when a
multixact containing multiple updates is detected), or duplicate copies
of tuples.

To fix, make sure to compare the other relevant infomask bits alongside
the Xmax value and HEAP_XMAX_IS_MULTI bit, and restart from the top if
necessary.

Also, in the belt-and-suspenders spirit, add a check to
MultiXactCreateFromMembers that a multixact being created does not have
two or more members that are claimed to be updates.  This should protect
against other bugs that might cause similar bogus situations.

Backpatch to 9.3, where the possibility of multixacts containing updates
was introduced.  (In prior versions it was possible to have the tuple
lock upgraded from shared to exclusive, and an update would not restart
from the top; yet we're protected against a bug there because there's
always a sleep to wait for the locking transaction to complete before
continuing to do anything.  Really, the fact that tuple locks always
conflicted with concurrent updates is what protected against bugs here.)

Per report from Andrew Dunstan and Josh Berkus in thread at
http://www.postgresql.org/message-id/534C8B33.9050807@pgexperts.com

Bug analysis by Andres Freund.

Reset pg_stat_activity.xact_start during PREPARE TRANSACTION.

Once we've completed a PREPARE, our session is not running a transaction,
so its entry in pg_stat_activity should show xact_start as null, rather
than leaving the value as the start time of the now-prepared transaction.

I think possibly this oversight was triggered by faulty extrapolation
from the adjacent comment that says PrepareTransaction should not call
AtEOXact_PgStat, so tweak the wording of that comment.

Noted by Andres Freund while considering bug #10123 from Maxim Boguk,
although this error doesn't seem to explain that report.

Back-patch to all active branches.

Fix incorrect pg_proc.proallargtypes entries for two built-in functions.

pg_sequence_parameters() and pg_identify_object() have had incorrect
proallargtypes entries since 9.1 and 9.3 respectively. This was mostly
masked by the correct information in proargtypes, but a few operations
such as pg_get_function_arguments() (and thus psql's \df display) would
show the wrong data types for these functions' input parameters.

In HEAD, fix the wrong info, bump catversion, and add an opr_sanity
regression test to catch future mistakes of this sort.

In the back branches, just fix the wrong info so that installations
initdb'd with future minor releases will have the right data. We
can't force an initdb, and it doesn't seem like a good idea to add
a regression test that will fail on existing installations.

Andres Freund

Update obsolete comments.

We no longer have a TLI field in the page header.

Fix typos in comment.

pg_stat_statements forgot to let previous occupant of hook get control too.

pgss_post_parse_analyze() neglected to pass the call on to any earlier
occupant of the post_parse_analyze_hook. There are no other users of that
hook in contrib/, and most likely none in the wild either, so this is
probably just a latent bug. But it's a bug nonetheless, so back-patch
to 9.2 where this code was introduced.

Fix unused-variable warning on Windows.

Introduced in 585bca39: msgid is not used in the Windows code path.

Also adjust comments a tad (mostly to keep pgindent from messing it up).

David Rowley

pgcrypto: fix memset() calls that might be optimized away

Specifically, on-stack memset() might be removed, so:

* Replace memset() with px_memset()
* Add px_memset to copy_crlf()
* Add px_memset to pgp-s2k.c

Patch by Marko Kreen

Report by PVS-Studio

Backpatch through 8.4.

pg_upgrade: throw an error for non-existent tablespace directories

Non-existent tablespace directory references can occur if user
tablespaces are created inside data directories and the data directory
is renamed in preparation for running pg_upgrade, and the symbolic links
are not updated.

Backpatch to 9.3.

Set the all-visible flag on heap page before writing WAL record, not after.

If we set the all-visible flag after writing WAL record, and XLogInsert
takes a full-page image of the page, the image would not include the flag.
We will then proceed to set the VM bit, which would then be set without the
corresponding all-visible flag on the heap page.

Found by comparing page images on master and standby, after writing/replaying
each WAL record. (There is still a discrepancy: the all-visible flag won't
be set after replaying the HEAP_CLEAN record, even though it is set in the
master. However, it will be set when replaying the HEAP2_VISIBLE record and
setting the VM bit, so the all-visible flag and VM bit are always consistent
on the standby, even though they are momentarily out-of-sync with master)

Backpatch to 9.3 where this code was introduced.

docs: properly document psql auto encoding mode

In psql, both stdin and stdout must be terminals to get a client
encoding of 'auto'.

Patch by Albe Laurenz

Backpatch to 9.3.

Fix object identities for text search objects

We were neglecting to schema-qualify them.

Backpatch to 9.3, where object identities were introduced as a concept
by commit f8348ea32ec8.

Fix contrib/postgres_fdw's remote-estimate representation of array Params.

We were emitting "(SELECT null::typename)", which is usually interpreted
as a scalar subselect, but not so much in the context "x = ANY(...)".
This led to remote-side parsing failures when remote_estimate is enabled.
A quick and ugly fix is to stick in an extra cast step,
"((SELECT null::typename)::typename)". The cast will be thrown away as
redundant by parse analysis, but not before it's done its job of making
sure the grammar sees the ANY argument as an a_expr rather than a
select_with_parens. Per an example from Hannu Krosing.

Attempt to get plpython regression tests working again for MSVC builds.

This has probably been broken for quite a long time. Buildfarm member
currawong's current results suggest that it's been broken since 9.1, so
backpatch this to that branch.

This only supports Python 2 - I will handle Python 3 separately, but
this is a fairly simple fix.

Use AF_UNSPEC not PF_UNSPEC in getaddrinfo calls.

According to the Single Unix Spec and assorted man pages, you're supposed
to use the constants named AF_xxx when setting ai_family for a getaddrinfo
call.  In a few places we were using PF_xxx instead.  Use of PF_xxx
appears to be an ancient BSD convention that was not adopted by later
standardization.  On BSD and most later Unixen, it doesn't matter much
because those constants have equivalent values anyway; but nonetheless
this code is not per spec.

In the same vein, replace PF_INET by AF_INET in one socket() call, which
wasn't even consistent with the other socket() call in the same function
let alone the remainder of our code.

Per investigation of a Cygwin trouble report from Marco Atzeri.  It's
probably a long shot that this will fix his issue, but it's wrong in
any case.

Fix timeout in LDAP lookup of libpq connection parameters

Bind attempts to an LDAP server should time out after two seconds,
allowing additional lines in the service control file to be parsed
(which provide a fall back to a secondary LDAP server or default options).
The existing code failed to enforce that timeout during TCP connect,
resulting in a hang far longer than two seconds if the LDAP server
does not respond.

Laurenz Albe

check socket creation errors against PGINVALID_SOCKET

Previously, in some places, socket creation errors were checked for
negative values, which is not true for Windows because sockets are
unsigned. This masked socket creation errors on Windows.

Backpatch through 9.0. 8.4 doesn't have the infrastructure to fix this.

Use correctly-sized buffer when zero-filling a WAL file.

I mixed up BLCKSZ and XLOG_BLCKSZ when I changed the way the buffer is
allocated a couple of weeks ago. With the default settings, they are both
8k, but they can be changed at compile-time.

Several fixes to array handling in ecpg.

Patches by Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>

Fix hot standby bug with GiST scans.

Don't reset the rightlink of a page when replaying a page update record.
This was a leftover from pre-hot standby days, when it was not possible to
have scans concurrent with WAL replay. Resetting the right-link was not
necessary back then either, but it was done for the sake of tidiness. But
with hot standby, it's wrong, because a concurrent scan might still need it.

Backpatch all versions with hot standby, 9.0 and above.

Assert that strong-lock count is >0 everywhere it's decremented.

The one existing assertion of this type has tripped a few times in the
buildfarm lately, but it's not clear whether the problem is really
originating there or whether it's leftovers from a trip through one
of the other two paths that lack a matching assertion. So add one.

Since the same bug(s) most likely exist(s) in the back-branches also,
back-patch to 9.2, where the fast-path lock mechanism was added.

Block signals earlier during postmaster startup.

Formerly, we set up the postmaster's signal handling only when we were
about to start launching subprocesses.  This is a bad idea though, as
it means that for example a SIGINT arriving before that will kill the
postmaster instantly, perhaps leaving lockfiles, socket files, shared
memory, etc laying about.  We'd rather that such a signal caused orderly
postmaster termination including releasing of those resources.  A simple
fix is to move the PostmasterMain stanza that initializes signal handling
to an earlier point, before we've created any such resources.  Then, an
early-arriving signal will be blocked until we're ready to deal with it
in the usual way.  (The only part that really needs to be moved up is
blocking of signals, but it seems best to keep the signal handler
installation calls together with that; for one thing this ensures the
kernel won't drop any signals we wished to get.  The handlers won't get
invoked in any case until we unblock signals in ServerLoop.)

Per a report from MauMau.  He proposed changing the way "pg_ctl stop"
works to deal with this, but that'd just be masking one symptom not
fixing the core issue.

It's been like this since forever, so back-patch to all supported branches.

Fix processing of PGC_BACKEND GUC parameters on Windows.

EXEC_BACKEND builds (i.e., Windows) failed to absorb values of PGC_BACKEND
parameters if they'd been changed post-startup via the config file.  This
for example prevented log_connections from working if it were turned on
post-startup.  The mechanism for handling this case has always been a bit
of a kluge, and it wasn't revisited when we implemented EXEC_BACKEND.
While in a normal forking environment new backends will inherit the
postmaster's value of such settings, EXEC_BACKEND backends have to read
the settings from the CONFIG_EXEC_PARAMS file, and they were mistakenly
rejecting them.  So this case has always been broken in the Windows port;
so back-patch to all supported branches.

Amit Kapila

Fix tablespace creation WAL replay to work on Windows.

The code segment that removes the old symlink (if present) wasn't clued
into the fact that on Windows, symlinks are junction points which have
to be removed with rmdir().

Backpatch to 9.0, where the failing code was introduced.

MauMau, reviewed by Muhammad Asif Naeem and Amit Kapila

Allow "-C variable" and "--describe-config" even to root users.

There's no really compelling reason to refuse to do these read-only,
non-server-starting options as root, and there's at least one good
reason to allow -C: pg_ctl uses -C to find out the true data directory
location when pointed at a config-only directory.  On Windows, this is
done before dropping administrator privileges, which means that pg_ctl
fails for administrators if and only if a config-only layout is used.

Since the root-privilege check is done so early in startup, it's a bit
awkward to check for these switches.  Make the somewhat arbitrary
decision that we'll only skip the root check if -C is the first switch.
This is not just to make the code a bit simpler: it also guarantees that
we can't misinterpret a --boot mode switch.  (While AuxiliaryProcessMain
doesn't currently recognize any such switch, it might have one in the
future.)  This is no particular problem for pg_ctl, and since the whole
behavior is undocumented anyhow, it's not a documentation issue either.
(--describe-config only works as the first switch anyway, so this is
no restriction for that case either.)

Back-patch to 9.2 where pg_ctl first began to use -C.

MauMau, heavily edited by me

Fix bogus time printout in walreceiver's debug log messages.

The displayed sendtime and receipttime were always exactly equal, because
somebody forgot that timestamptz_to_str returns a static buffer (thereby
simplifying life for most callers, at the cost of complicating it for those
who need two results concurrently).  Apply the same pstrdup solution used
by the other call sites with this issue.  Back-patch to 9.2 where the
faulty code was introduced.  Per bug #9849 from Haruka Takatsuka, though
this is not exactly his patch.

Possibly we should change timestamptz_to_str's API, but I wouldn't want
to do so in the back branches.

Move multixid allocation out of critical section.

It can fail if you run out of memory.

This call was added in 9.3, so backpatch to 9.3 only.

Avoid allocations in critical sections.

If a palloc in a critical section fails, it becomes a PANIC.

Fix non-equivalence of VARIADIC and non-VARIADIC function call formats.

For variadic functions (other than VARIADIC ANY), the syntaxes foo(x,y,...)
and foo(VARIADIC ARRAY[x,y,...]) should be considered equivalent, since the
former is converted to the latter at parse time.  They have indeed been
equivalent, in all releases before 9.3.  However, commit 75b39e790 made an
ill-considered decision to record which syntax had been used in FuncExpr
nodes, and then to make equal() test that in checking node equality ---
which caused the syntaxes to not be seen as equivalent by the planner.
This is the underlying cause of bug #9817 from Dmitry Ryabov.

It might seem that a quick fix would be to make equal() disregard
FuncExpr.funcvariadic, but the same commit made that untenable, because
the field actually *is* semantically significant for some VARIADIC ANY
functions.  This patch instead adopts the approach of redefining
funcvariadic (and aggvariadic, in HEAD) as meaning that the last argument
is a variadic array, whether it got that way by parser intervention or was
supplied explicitly by the user.  Therefore the value will always be true
for non-ANY variadic functions, restoring the principle of equivalence.
(However, the planner will continue to consider use of VARIADIC as a
meaningful difference for VARIADIC ANY functions, even though some such
functions might disregard it.)

In HEAD, this change lets us simplify the decompilation logic in
ruleutils.c, since the funcvariadic/aggvariadic flag tells directly whether
to print VARIADIC.  However, in 9.3 we have to continue to cope with
existing stored rules/views that might contain the previous definition.
Fortunately, this just means no change in ruleutils.c, since its existing
behavior effectively ignores funcvariadic for all cases other than VARIADIC
ANY functions.

In HEAD, bump catversion to reflect the fact that FuncExpr.funcvariadic
changed meanings; this is sort of pro forma, since I don't believe any
built-in views are affected.

Unfortunately, this patch doesn't magically fix everything for affected
9.3 users.  After installing 9.3.5, they might need to recreate their
rules/views/indexes containing variadic function calls in order to get
everything consistent with the new definition.  As in the cited bug,
the symptom of a problem would be failure to use a nominally matching
index that has a variadic function call in its definition.  We'll need
to mention this in the 9.3.5 release notes.

Fix documentation about joining pg_locks to other views.

The advice to join to pg_prepared_xacts via the transaction column was not
updated when the transaction column was replaced by virtualtransaction.
Since it's not quite obvious how to do that join, give an explicit example.
For consistency also give an example for the adjacent case of joining to
pg_stat_activity. And link-ify the view references too, just because we
can. Per bug #9840 from Alexey Bashtanov.

Michael Paquier and Tom Lane

Fix documentation about size of interval type.

It's been 16 bytes, not 12, for ages. This was fixed in passing in HEAD
(commit 146604ec), but as a factual error it should have been back-patched.
Per gripe from Tatsuhito Kasahara.

Avoid palloc in critical section in GiST WAL-logging.

Memory allocation can fail if you run out of memory, and inside a critical
section that will lead to a PANIC. Use conservatively-sized arrays in stack
instead.

There was previously no explicit limit on the number of pages a GiST split
can produce, it was only limited by the number of LWLocks that can be held
simultaneously (100 at the moment). This patch adds an explicit limit of 75
pages. That should be plenty, a typical split shouldn't produce more than
2-3 page halves.

The bug has been there forever, but only backpatch down to 9.1. The code
was changed significantly in 9.1, and it doesn't seem worth the risk or
trouble to adapt this for 9.0 and 8.4.

Fix assorted issues in client host name lookup.

The code for matching clients to pg_hba.conf lines that specify host names
(instead of IP address ranges) failed to complain if reverse DNS lookup
failed; instead it silently didn't match, so that you might end up getting
a surprising "no pg_hba.conf entry for ..." error, as seen in bug #9518
from Mike Blackwell.  Since we don't want to make this a fatal error in
situations where pg_hba.conf contains a mixture of host names and IP
addresses (clients matching one of the numeric entries should not have to
have rDNS data), remember the lookup failure and mention it as DETAIL if
we get to "no pg_hba.conf entry".  Apply the same approach to forward-DNS
lookup failures, too, rather than treating them as immediate hard errors.

Along the way, fix a couple of bugs that prevented us from detecting an
rDNS lookup error reliably, and make sure that we make only one rDNS lookup
attempt; formerly, if the lookup attempt failed, the code would try again
for each host name entry in pg_hba.conf.  Since more or less the whole
point of this design is to ensure there's only one lookup attempt not one
per entry, the latter point represents a performance bug that seems
sufficient justification for back-patching.

Also, adjust src/port/getaddrinfo.c so that it plays as well as it can
with this code.  Which is not all that well, since it does not have actual
support for rDNS lookup, but at least it should return the expected (and
required by spec) error codes so that the main code correctly perceives the
lack of functionality as a lookup failure.  It's unlikely that PG is still
being used in production on any machines that require our getaddrinfo.c,
so I'm not excited about working harder than this.

To keep the code in the various branches similar, this includes
back-patching commits c424d0d1052cb4053c8712ac44123f9b9a9aa3f2 and
1997f34db4687e671690ed054c8f30bb501b1168 into 9.2 and earlier.

Back-patch to 9.1 where the facility for hostnames in pg_hba.conf was
introduced.

Fix bugs in manipulation of PgBackendStatus.st_clienthostname.

Initialization of this field was not being done according to the
st_changecount protocol (it has to be done within the changecount increment
range, not outside). And the test to see if the value should be reported
as null was wrong. Noted while perusing uses of Port.remote_hostname.

This was wrong from the introduction of this code (commit 4a25bc145),
so back-patch to 9.1.

Fix typo in comment.

Amit Langote

Mark FastPathStrongRelationLocks volatile.

Otherwise, the compiler might decide to move modifications to data
within this structure outside the enclosing SpinLockAcquire /
SpinLockRelease pair, leading to shared memory corruption.

This may or may not explain a recent lmgr-related buildfarm failure
on prairiedog, but it needs to be fixed either way.

Count buffers dirtied due to hints in pgBufferUsage.shared_blks_dirtied.

Previously, such buffers weren't counted, with the possible result that
EXPLAIN (BUFFERS) and pg_stat_statements would understate the true
number of blocks dirtied by an SQL statement.

Back-patch to 9.2, where this counter was introduced.

Amit Kapila

Fix dumping of a materialized view that depends on a table's primary key.

It is possible for a view or materialized view to depend on a table's
primary key, if the view query relies on functional dependency to
abbreviate a GROUP BY list.  This is problematic for pg_dump since we
ordinarily want to dump view definitions in the pre-data section but
indexes in post-data.  pg_dump knows how to deal with this situation for
regular views, by breaking the view's ON SELECT rule apart from the view
proper.  But it had not been taught what to do about materialized views,
and in fact mistakenly dumped them as regular views in such cases, as
seen in bug #9616 from Jesse Denardo.

If we had CREATE OR REPLACE MATERIALIZED VIEW, we could fix this in a
manner analogous to what's done for regular views; but we don't yet,
and we'd not back-patch such a thing into 9.3 anyway.  As a hopefully-
temporary workaround, break the circularity by postponing the matview
into post-data altogether when this case occurs.

Revert "Secure Unix-domain sockets of "make check" temporary clusters."

About half of the buildfarm members use too-long directory names,
strongly suggesting that this approach is a dead end.

Secure Unix-domain sockets of "make check" temporary clusters.

Any OS user able to access the socket can connect as the bootstrap
superuser and in turn execute arbitrary code as the OS user running the
test.  Protect against that by placing the socket in the temporary data
directory, which has mode 0700 thanks to initdb.  Back-patch to 8.4 (all
supported versions).  The hazard remains wherever the temporary cluster
accepts TCP connections, notably on Windows.

Attempts to run "make check" from a directory with a long name will now
fail.  An alternative not sharing that problem was to place the socket
in a subdirectory of /tmp, but that is only secure if /tmp is sticky.
The PG_REGRESS_SOCK_DIR environment variable is available as a
workaround when testing from long directory paths.

As a convenient side effect, this lets testing proceed smoothly in
builds that override DEFAULT_PGSOCKET_DIR.  Popular non-default values
like /var/run/postgresql are often unwritable to the build user.

Security: CVE-2014-0067

Document platform-specificity of unix_socket_permissions.

Back-patch to 8.4 (all supported versions).

Revert "Document that Python 2.3 requires cdecimal module for full functionality."

This reverts commit 2a33b796ee27732c1854f41594329afbb9ffba3d.
The change requiring cdecimal is new in 9.4 (see 7919398bac),
so we should not claim previous branches need it.

Document that Python 2.3 requires cdecimal module for full functionality.

This has been true for some time, but we were leaving users to discover it
the hard way.

Back-patch to 9.2. It might've been true before that, but we were claiming
Python 2.2 compatibility before that, so I won't guess at the exact
requirements back then.

Fix refcounting bug in PLy_modify_tuple().

We must increment the refcount on "plntup" as soon as we have the
reference, not sometime later.  Otherwise, if an error is thrown in
between, the Py_XDECREF(plntup) call in the PG_CATCH block removes a
refcount we didn't add, allowing the object to be freed even though
it's still part of the plpython function's parsetree.

This appears to be the cause of crashes seen on buildfarm member
prairiedog.  It's a bit surprising that we've not seen it fail repeatably
before, considering that the regression tests have been exercising the
faulty code path since 2009.

The real-world impact is probably minimal, since it's unlikely anyone would
be provoking the "TD["new"] is not a dictionary" error in production, and
that's the only case that is actually wrong.  Still, it's a bug affecting
the regression tests, so patch all supported branches.

In passing, remove dead variable "plstr", and demote "platt" to a local
variable inside the PG_TRY block, since we don't need to clean it up
in the PG_CATCH path.

Don't forget to flush XLOG_PARAMETER_CHANGE record.

Backpatch to 9.0 where XLOG_PARAMETER_CHANGE record was instroduced.

Fix typos in pg_basebackup documentation

Joshua Tolley

Fix START_REPLICATION syntax in document.

Enclose "TIMELINE tli" part with brackets because it's optional.

Backport to 9.3 where TIMELINE option was introduced.

Noted by Marko Tiikkaja

Address ccvalid/ccnoinherit in TupleDesc support functions.

equalTupleDescs() neglected both of these ConstrCheck fields, and
CreateTupleDescCopyConstr() neglected ccnoinherit. At this time, the
only known behavior defect resulting from these omissions is constraint
exclusion disregarding a CHECK constraint validated by an ALTER TABLE
VALIDATE CONSTRAINT statement issued earlier in the same transaction.
Back-patch to 9.2, where these fields were introduced.

Properly check for readdir/closedir() failures

Clear errno before calling readdir() and handle old MinGW errno bug
while adding full test coverage for readdir/closedir failures.

Backpatch through 8.4.

Fix memory leak during regular expression execution.

For a regex containing backrefs, pg_regexec() might fail to free all the
sub-DFAs that were created during execution, resulting in a permanent
(session lifespan) memory leak. Problem was introduced by me in commit
587359479acbbdc95c8e37da40707e37097423f5. Per report from Sandro Santilli;
diagnosis by Greg Stark.

Fix pg_dumpall option parsing: -i doesn't take an argument.

This used to work properly, but got fat-fingered in commit
3dee636e0404885d07885d41c0d70e50c784f324. Per bug #9620 from
Nicolas Payart.

Stamp 9.3.4.

Release notes for 9.3.4, 9.2.8, 9.1.13, 9.0.17, 8.4.21.

During index build, check and elog (not just Assert) for broken HOT chain.

The recently-fixed bug in WAL replay could result in not finding a parent
tuple for a heap-only tuple. The existing code would either Assert or
generate an invalid index entry, neither of which is desirable. Throw a
regular error instead.

Fix bug in clean shutdown of walsender that pg_receiving is connecting to.

On clean shutdown, walsender waits for all WAL to be replicated to a standby,
and exits. It determined whether that replication had been completed by
checking whether its sent location had been equal to a standby's flush
location. Unfortunately this condition never becomes true when the standby
such as pg_receivexlog which always returns an invalid flush location is
connecting to walsender, and then walsender waits forever.

This commit changes walsender so that it just checks a standby's write
location if a flush location is invalid.

Back-patch to 9.1 where enough infrastructure for this exists.

plperl: Fix memory leak in hek2cstr

Backpatch all the way back to 9.1, where it was introduced by commit
50d89d42.

Reported by Sergey Burladyan in #9223
Author: Alex Hunsaker

Translation updates

Fix unportable shell-script syntax in pg_upgrade's test.sh.

I discovered the hard way that on some old shells, the locution
    FOO=""   unset FOO
does not behave the same as
    FOO="";  unset FOO
and in fact leaves FOO set to an empty string.  test.sh was inconsistently
spelling it different ways on adjacent lines.

This got broken relatively recently, in commit c737a2e56, so the lack of
field reports to date doesn't represent a lot of evidence that the problem
is rare.

Make punctuation consistent

Fix advertised dispsize for libpq's sslmode connection parameter.

"8" was correct back when "disable" was the longest allowed value, but
since "verify-full" was added, it should be "12". Given the lack of
complaints, I wouldn't be surprised if nobody is actually using these
values ... but still, if they're in the API, they should be right.

Noticed while pursuing a different problem. It's been wrong for quite
a long time, so back-patch to all supported branches.

Update time zone data files to tzdata release 2014a.

DST law changes in Fiji, Turkey; historical changes in Israel, Ukraine.

Prevent interrupts while reporting non-ERROR elog messages.

This should eliminate the risk of recursive entry to syslog(3), which
appears to be the cause of the hang reported in bug #9551 from James
Morton.

Arguably, the real problem here is auth.c's willingness to turn on
ImmediateInterruptOK while executing fairly wide swaths of backend code.
We may well need to work at narrowing the code ranges in which the
authentication_timeout interrupt is enabled. For the moment, though,
this is a cheap and reasonably noninvasive fix for a field-reported
failure; the other approach would be complex and not necessarily
bug-free itself.

Back-patch to all supported branches.

Avoid transaction-commit race condition while receiving a NOTIFY message.

Use TransactionIdIsInProgress, then TransactionIdDidCommit, to distinguish
whether a NOTIFY message's originating transaction is in progress,
committed, or aborted.  The previous coding could accept a message from a
transaction that was still in-progress according to the PGPROC array;
if the client were fast enough at starting a new transaction, it might fail
to see table rows added/updated by the message-sending transaction.  Which
of course would usually be the point of receiving the message.  We noted
this type of race condition long ago in tqual.c, but async.c overlooked it.

The race condition probably cannot occur unless there are multiple NOTIFY
senders in action, since an individual backend doesn't send NOTIFY signals
until well after it's done committing.  But if two senders commit in close
succession, it's certainly possible that we could see the second sender's
message within the race condition window while responding to the signal
from the first one.

Per bug #9557 from Marko Tiikkaja.  This patch is slightly more invasive
than what he proposed, since it removes the now-redundant
TransactionIdDidAbort call.

Back-patch to 9.0, where the current NOTIFY implementation was introduced.