Fix resource management bug with replication=database.
author Robert Haas <rhaas@postgresql.org>
Sat, 4 Apr 2020 02:28:37 +0000 (22:28 -0400)
committer Robert Haas <rhaas@postgresql.org>
Sat, 4 Apr 2020 02:28:37 +0000 (22:28 -0400)
Commit 0d8c9c1210c44b36ec2efcb223a1dfbe897a3661 allowed BASE_BACKUP to
acquire a ResourceOwner without a transaction so that the backup
manifest functionality could use a BufFile, but it overlooked the fact
that when a walsender is used with replication=database, it might have
a transaction in progress, because in that mode, SQL and replication
commands can be mixed.  Try to fix things up so that the two cleanup
mechanisms don't conflict.

Per buildfarm member serinus, which triggered the problem when
CREATE_REPLICATION_SLOT failed from inside a transaction.  It passed
on the subsequent run, so evidently the failure doesn't happen every
time.

src/backend/replication/walsender.c

index 3b117d836733ccdde9d46a46a78120c6f2f5fbc2..9e5611574cc11f971e4e31f6de78b3db33be506d 100644
@@ -315,7 +315,13 @@ WalSndErrorCleanup(void)
 
    replication_active = false;
 
-   WalSndResourceCleanup(false);
+   /*
+    * If there is a transaction in progress, it will clean up our
+    * ResourceOwner, but if a replication command set up a resource owner
+    * without a transaction, we've got to clean that up now.
+    */
+   if (!IsTransactionOrTransactionBlock())
+       WalSndResourceCleanup(false);
 
    if (got_STOPPING || got_SIGUSR2)
        proc_exit(0);
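
The guard added in this hunk can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not PostgreSQL code: ToyOwner and simulate_error are hypothetical names, and the "release" is reduced to a counter. It assumes that, when a transaction is in progress, transaction abort releases the same owner that the walsender error path would otherwise release, so the unguarded path releases it twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a resource owner; not PostgreSQL's ResourceOwner. */
typedef struct ToyOwner
{
    int release_count;      /* how many times cleanup released this owner */
} ToyOwner;

/*
 * Model one error in a walsender and return how often the owner is released.
 *
 * in_txn:  a transaction is in progress (possible with replication=database).
 * guarded: the walsender error path skips its own cleanup inside a
 *          transaction, mirroring the check added in the hunk above.
 */
static int
simulate_error(bool in_txn, bool guarded)
{
    ToyOwner  owner = {0};
    /* Assumption: inside a transaction, the transaction also tracks the owner. */
    ToyOwner *txn_owner = in_txn ? &owner : NULL;
    ToyOwner *current_owner = &owner;

    /* Step 1: walsender error cleanup. */
    if (!guarded || !in_txn)
    {
        if (current_owner != NULL)
        {
            current_owner->release_count++;
            current_owner = NULL;
        }
    }

    /* Step 2: transaction abort cleans up whatever the transaction tracks. */
    if (txn_owner != NULL)
        txn_owner->release_count++;

    /* Every path in this model releases the owner at least once. */
    assert(owner.release_count >= 1);
    return owner.release_count;
}
```

In this model, simulate_error(true, false) reports two releases, which is the conflict between the two cleanup mechanisms, while the guarded variant and both no-transaction cases report exactly one.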