git.postgresql.org Git - postgresql.git/commit

author	Nathan Bossart <nathan@postgresql.org>
	Wed, 31 Jul 2024 15:12:42 +0000 (10:12 -0500)
committer	Nathan Bossart <nathan@postgresql.org>
	Wed, 31 Jul 2024 15:12:42 +0000 (10:12 -0500)
commit	bd15b7db489deadb2d9af7f21d16a6ed4a09465b
tree	a5bbfddad0729130f790fe79eea5141cd386f689
parent	c8b06bb969bf26c01f10b835e59d0aff39b7f516

Improve performance of dumpSequenceData().

As one might guess, this function dumps the sequence data.  It is
called once per sequence, and each such call executes a query to
retrieve the relevant data for a single sequence.  These per-sequence
queries can make pg_dump take significantly longer, especially when
there are many sequences.

This commit improves the performance of this function by gathering
all the sequence data with a single query at the beginning of
pg_dump.  This information is stored in a sorted array that
dumpSequenceData() can bsearch() for what it needs.  This follows an
approach similar to that of previous commits that introduced sorted
arrays for role information, pg_class information, and sequence metadata.
As with those commits, this patch will cause pg_dump to use more
memory, but that isn't expected to be too egregious.
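
The sorted-array lookup described above can be sketched roughly as
follows.  This is a minimal illustration, not the code from the patch:
the SeqItem struct, its field names, and the helper functions
seq_item_cmp(), seq_items_sort(), and seq_items_find() are all
hypothetical.  It assumes the array has already been filled from the
single up-front query and is keyed by the sequence's OID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-sequence record; field names are illustrative only. */
typedef struct SeqItem
{
	uint32_t	oid;			/* sequence OID, used as the sort key */
	int64_t		last_value;
	bool		is_called;
} SeqItem;

/* Three-way comparison on OID, shared by qsort() and bsearch(). */
int
seq_item_cmp(const void *a, const void *b)
{
	uint32_t	x = ((const SeqItem *) a)->oid;
	uint32_t	y = ((const SeqItem *) b)->oid;

	return (x > y) - (x < y);
}

/* Sort once, after the single bulk query has filled the array. */
void
seq_items_sort(SeqItem *items, size_t n)
{
	if (n == 0)
		return;
	assert(items != NULL);
	qsort(items, n, sizeof(SeqItem), seq_item_cmp);
}

/* Binary search by OID; returns NULL if the sequence is not present. */
const SeqItem *
seq_items_find(const SeqItem *items, size_t n, uint32_t oid)
{
	SeqItem		key = {0};

	if (n == 0)
		return NULL;
	key.oid = oid;
	return bsearch(&key, items, n, sizeof(SeqItem), seq_item_cmp);
}
```

With this layout, each per-sequence call costs one binary search over
memory already held by the client instead of one round trip to the
server, which is where the extra memory use comes from.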

Note that we use the brand new function pg_sequence_read_tuple() in
the query that gathers all sequence data, so we must continue to
use the preexisting query-per-sequence approach for versions older
than 18.
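
The version fallback can be pictured as a simple gate on the server's
version number.  The sketch below is an assumption-laden illustration,
not the patch itself: the enum, the PG18_VERSION_NUM macro, and
choose_seq_data_source() are hypothetical names, and the only fact
relied on is that PostgreSQL reports version 18.0 numerically as
180000.

```c
/* Numeric server version for PostgreSQL 18.0 (major * 10000). */
#define PG18_VERSION_NUM 180000

/* Hypothetical labels for the two ways of obtaining sequence data. */
typedef enum SeqDataSource
{
	SEQ_DATA_FROM_CACHE,		/* look up the array built by one bulk query */
	SEQ_DATA_FROM_QUERY			/* run one query per sequence, as before */
} SeqDataSource;

/*
 * Servers older than 18 lack pg_sequence_read_tuple(), so the bulk
 * query cannot be issued against them and the old path must be used.
 */
SeqDataSource
choose_seq_data_source(int remote_version)
{
	if (remote_version >= PG18_VERSION_NUM)
		return SEQ_DATA_FROM_CACHE;
	return SEQ_DATA_FROM_QUERY;
}
```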

Reviewed-by: Euler Taveira, Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20240503025140.GA1227404%40nathanxps13