Fix bug in clean shutdown of walsender that pg_receivexlog is connecting to.
author Fujii Masao <fujii@postgresql.org>
Mon, 17 Mar 2014 11:41:12 +0000 (20:41 +0900)
committer Fujii Masao <fujii@postgresql.org>
Mon, 17 Mar 2014 11:41:12 +0000 (20:41 +0900)
On clean shutdown, walsender waits for all WAL to be replicated to a standby,
and then exits. It determined whether replication had completed by checking
whether its sent location was equal to the standby's flush location.
Unfortunately, this condition never becomes true when the connected client is
one such as pg_receivexlog, which always reports an invalid flush location,
so walsender waits forever.

This commit changes walsender so that it just checks a standby's write
location if a flush location is invalid.
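
As a rough illustration of the old and new checks, here is a minimal
standalone sketch. It is not the walsender code: XLogRecPtr is simplified to
a plain 64-bit integer with 0 as the invalid value, and StandbyProgress,
old_check, replicated_ptr and new_check are hypothetical names introduced
only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for illustration; not the real PostgreSQL definitions. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)
#define XLogRecPtrIsInvalid(r) ((r) == InvalidXLogRecPtr)

/* Hypothetical holder for the locations a receiver has reported back. */
typedef struct StandbyProgress
{
    XLogRecPtr  write;          /* last location written by the receiver */
    XLogRecPtr  flush;          /* last location flushed, or invalid */
} StandbyProgress;

/* Old check: compares only against the flush location. */
static bool
old_check(XLogRecPtr sentPtr, const StandbyProgress *s)
{
    return sentPtr == s->flush;
}

/* New rule: fall back to the write location when flush is invalid. */
static XLogRecPtr
replicated_ptr(const StandbyProgress *s)
{
    return XLogRecPtrIsInvalid(s->flush) ? s->write : s->flush;
}

/* New check: compares against whichever location replicated_ptr selects. */
static bool
new_check(XLogRecPtr sentPtr, const StandbyProgress *s)
{
    return sentPtr == replicated_ptr(s);
}
```

With a receiver that reports write = 1000 and an invalid flush, old_check
can never succeed for sentPtr = 1000, while new_check does. For a standby
that reports a valid flush location, the two checks behave identically.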

Back-patch to 9.1 where enough infrastructure for this exists.

src/backend/replication/walsender.c

index 54ed311473b1380a69148dc765e9bd993badc0ff..4abb0007c7c85a445af515b57142df38c0d49c3f 100644
@@ -1053,9 +1053,20 @@ WalSndLoop(void)
             */
            if (walsender_ready_to_stop)
            {
+               XLogRecPtr  replicatedPtr;
+
                /* ... let's just be real sure we're caught up ... */
                XLogSend(&caughtup);
-               if (caughtup && sentPtr == MyWalSnd->flush &&
+
+               /*
+                * Check a write location to see whether all the WAL have
+                * successfully been replicated if this walsender is connecting
+                * to a standby such as pg_receivexlog which always returns
+                * an invalid flush location. Otherwise, check a flush location.
+                */
+               replicatedPtr = XLogRecPtrIsInvalid(MyWalSnd->flush) ?
+                   MyWalSnd->write : MyWalSnd->flush;
+               if (caughtup && sentPtr == replicatedPtr &&
                    !pq_is_send_pending())
                {
                    /* Inform the standby that XLOG streaming is done */