Slow runs of buildfarm members chipmunk, hornet and mandrill saw the
shorter timeouts expire. The 180s timeout in poll_query_until has been
trouble-free since 2a0f89cd717ce6d49cdc47850577823682167e87 introduced it
two years ago, so use 180s more widely. Back-patch to 9.6, where the
first of these timeouts was introduced.
Reviewed by Michael Paquier.
Discussion: https://postgr.es/m/20181209001601.GC2973271@rfd.leadboat.com
deadlock). A test that uses this ability must manually specify valid
permutations, i.e. those that would not expect a blocked session to execute a
command. If a test fails to follow that rule, isolationtester will cancel it
-after 60 seconds. If the cancel doesn't work, isolationtester will exit
-uncleanly after a total of 75 seconds of wait time. Testing invalid
+after 180 seconds. If the cancel doesn't work, isolationtester will exit
+uncleanly after a total of 200 seconds of wait time. Testing invalid
permutations should be avoided because they can make the isolation tests take
a very long time to run, and they serve no useful testing purpose.
td += (int64) current_time.tv_usec - (int64) start_time.tv_usec;
/*
- * After 60 seconds, try to cancel the query.
+ * After 180 seconds, try to cancel the query.
*
* If the user tries to test an invalid permutation, we don't want
* to hang forever, especially when this is running in the
- * buildfarm. So try to cancel it after a minute. This will
- * presumably lead to this permutation failing, but remaining
- * permutations and tests should still be OK.
+ * buildfarm. This will presumably lead to this permutation
+ * failing, but remaining permutations and tests should still be
+ * OK.
*/
- if (td > 60 * USECS_PER_SEC && !canceled)
+ if (td > 180 * USECS_PER_SEC && !canceled)
{
PGcancel *cancel = PQgetCancel(conn);
}
/*
- * After 75 seconds, just give up and die.
+ * After 200 seconds, just give up and die.
*
* Since cleanup steps won't be run in this case, this may cause
* later tests to fail. That stinks, but it's better than waiting
* forever for the server to respond to the cancel.
*/
- if (td > 75 * USECS_PER_SEC)
+ if (td > 200 * USECS_PER_SEC)
{
- fprintf(stderr, "step %s timed out after 75 seconds\n",
+ fprintf(stderr, "step %s timed out after 200 seconds\n",
step->name);
exit_nicely();
}
print "waiting to replay $endpos\n";
my $stdout_recv = $node_master->pg_recvlogical_upto(
- 'postgres', 'test_slot', $endpos, 10,
+ 'postgres', 'test_slot', $endpos, 180,
'include-xids' => '0',
'skip-empty-xacts' => '1');
chomp($stdout_recv);
) or die "slot never became inactive";
$stdout_recv = $node_master->pg_recvlogical_upto(
- 'postgres', 'test_slot', $endpos, 10,
+ 'postgres', 'test_slot', $endpos, 180,
'include-xids' => '0',
'skip-empty-xacts' => '1');
chomp($stdout_recv);
$stdout = $node_replica->pg_recvlogical_upto(
'postgres', 'before_basebackup',
- $endpos, 30,
+ $endpos, 180,
'include-xids' => '0',
'skip-empty-xacts' => '1');