Documentation for transaction-ID-wraparound changes. Add a chapter to

author Tom Lane <tgl@sss.pgh.pa.us>

Sun, 26 Aug 2001 21:17:12 +0000 (21:17 +0000)

committer Tom Lane <tgl@sss.pgh.pa.us>

Sun, 26 Aug 2001 21:17:12 +0000 (21:17 +0000)
author Tom Lane <tgl@sss.pgh.pa.us>
Sun, 26 Aug 2001 21:17:12 +0000 (21:17 +0000)
committer Tom Lane <tgl@sss.pgh.pa.us>
Sun, 26 Aug 2001 21:17:12 +0000 (21:17 +0000)
diff --git a/doc/src/sgml/admin.sgml b/doc/src/sgml/admin.sgml

index b79146cb99f7a9303d0e9218edb0cb20a66021b0..32c0983b1708492d395e7b44516330259d939f56 100644 (file)
--- a/doc/src/sgml/admin.sgml
+++ b/doc/src/sgml/admin.sgml
@@ -1,5 +1,5 @@
  <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/Attic/admin.sgml,v 1.34 2001/05/12 22:51:34 petere Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/Attic/admin.sgml,v 1.35 2001/08/26 21:17:12 tgl Exp $
  -->
  
  <book id="admin">
@@ -29,6 +29,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/Attic/admin.sgml,v 1.34 2001/05/12 22:51:34
    &charset;
    &manage-ag;
    &user-manag;
+  &maintenance;
    &backup;
    &wal;
    &storage;
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml

index 45451ecabc5ea158b39297903cfbcb4835587205..4b613db8e5b8d9c2cc6c0fe487271b9d4e4bcba2 100644 (file)
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -1,4 +1,4 @@
-<!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.13 2001/06/02 18:25:16 petere Exp $ -->
+<!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.14 2001/08/26 21:17:12 tgl Exp $ -->
  
  <!entity history    SYSTEM "history.sgml">
  <!entity info       SYSTEM "info.sgml">
@@ -43,6 +43,7 @@
  <!entity client-auth   SYSTEM "client-auth.sgml">
  <!entity installation  SYSTEM "installation.sgml">
  <!entity installw      SYSTEM "install-win32.sgml">
+<!entity maintenance   SYSTEM "maintenance.sgml">
  <!entity manage-ag     SYSTEM "manage-ag.sgml">
  <!entity recovery      SYSTEM "recovery.sgml">
  <!entity regress       SYSTEM "regress.sgml">
diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml

new file mode 100644 (file)

index 0000000..7629c5b
--- /dev/null
+++ b/doc/src/sgml/maintenance.sgml
@@ -0,0 +1,371 @@
+<!--
+$Header: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v 1.1 2001/08/26 21:17:12 tgl Exp $
+-->
+
+<chapter id="maintenance">
+ <title>Routine Database Maintenance Tasks</title>
+
+ <sect1 id="maintenance-general">
+  <title>General Discussion</Title>
+
+  <para>
+   There are a few routine maintenance chores that must be performed on
+   a regular basis to keep a <productname>PostgreSQL</productname>
+   installation running smoothly.  The tasks discussed here are repetitive
+   in nature and can easily be automated using standard Unix tools such
+   as <filename>cron</filename> scripts.  But it is the database
+   administrator's responsibility to set up appropriate scripts, and to
+   check that they execute successfully.
+  </para>
+
+  <para>
+   One obvious maintenance task is creation of backup copies of the data on a
+   regular schedule.  Without a recent backup, you have no chance of recovery
+   after a catastrophe (disk failure, fire, mistakenly dropping a critical
+   table, etc).  The backup and recovery mechanisms available in
+   <productname>PostgreSQL</productname> are discussed at length in
+   <xref linkend="backup">.
+  </para>
+
+  <para>
+   The other main category of maintenance task is periodic <quote>vacuuming</>
+   of the database.  This activity is discussed in
+   <xref linkend="routine-vacuuming">.
+  </para>
+
+<!-- probably ought to have something about log rotation here;
+     any other important maintenance tasks?
+-->
+
+  <para>
+   <productname>PostgreSQL</productname> is low-maintenance compared to
+   some other database products.  Nonetheless, appropriate attention to
+   these tasks will go far towards ensuring a pleasant and productive
+   experience with the system.
+  </para>
+
+ </sect1>
+
+ <sect1 id="routine-vacuuming">
+  <title>Routine Vacuuming</title>
+
+  <para>
+   <productname>PostgreSQL</productname>'s <command>VACUUM</> command must be
+   run on a regular basis for several reasons:
+
+    <orderedlist>
+     <listitem>
+      <simpara>To recover disk space occupied by updated or deleted
+      rows.</simpara>
+     </listitem>
+
+     <listitem>
+      <simpara>To update data statistics used by the
+      <productname>PostgreSQL</productname> query planner.</simpara>
+     </listitem>
+
+     <listitem>
+      <simpara>To protect against loss of very old data due to
+      <firstterm>transaction ID wraparound</>.</simpara>
+     </listitem>
+    </orderedlist>
+
+   The frequency and scope of <command>VACUUM</>s performed for each of
+   these reasons will vary depending on the needs of each installation.
+   Therefore, database administrators must understand these issues and
+   develop an appropriate maintenance strategy.  This section concentrates
+   on explaining the high-level issues; for details about command syntax
+   and so on, see the <command>VACUUM</> command reference page.
+  </para>
+
+  <para>
+   Beginning in <productname>PostgreSQL</productname> 7.2, the standard form
+   of <command>VACUUM</> can run in parallel with normal database operations
+   (selects, inserts, updates, deletes, but not changes to table schemas).
+   Routine vacuuming is therefore not nearly as intrusive as it was in prior
+   releases, and it's not as critical to try to schedule it at low-usage
+   times of day.
+  </para>
+
+  <sect2 id="vacuum-for-space-recovery">
+   <title>Recovering disk space</title>
+
+   <para>
+    In normal <productname>PostgreSQL</productname> operation, an UPDATE or
+    DELETE of a row does not immediately remove the old <firstterm>tuple</>
+    (version of the row).  This approach is necessary to gain the benefits
+    of multi-version concurrency control (see the User's Guide): the tuple
+    must not be deleted while 
+    it is still potentially visible to other transactions.  But eventually,
+    an outdated or deleted tuple is no longer of interest to any transaction.
+    The space it occupies must be reclaimed for reuse by new tuples, to avoid
+    infinite growth of disk space requirements.  This is done by running
+    <command>VACUUM</>.
+   </para>
+
+   <para>
+    Clearly, a table that receives frequent updates or deletes will need
+    to be vacuumed more often than tables that are seldom updated.  It may
+    be useful to set up periodic cron tasks that vacuum only selected tables,
+    skipping tables that are known not to change often.  This is only likely
+    to be helpful if you have both large heavily-updated tables and large
+    seldom-updated tables --- the extra cost of vacuuming a small table
+    isn't enough to be worth worrying about.
+   </para>
+
+   <para>
+    The standard form of <command>VACUUM</> is best used with the goal of
+    maintaining a fairly level steady-state usage of disk space.  The standard
+    form finds old tuples and makes their space available for re-use within
+    the table, but it does not try very hard to shorten the table file and
+    return disk space to the operating system.  If you need to return disk
+    space to the operating system you can use <command>VACUUM FULL</> ---
+    but what's the point of releasing disk space that will only have to be
+    allocated again soon?  Moderately frequent standard <command>VACUUM</>s
+    are a better approach than infrequent <command>VACUUM FULL</>s for
+    maintaining heavily-updated tables.
+   </para>
+
+   <para>
+    Recommended practice for most sites is to schedule a database-wide
+    <command>VACUUM</> once a day at a low-usage time of day, supplemented
+    by more frequent vacuuming of heavily-updated tables if necessary.
+    (If you have multiple databases in an installation, don't forget to
+    vacuum each one; the <filename>vacuumdb</> script may be helpful.)
+    Use plain <command>VACUUM</>, not <command>VACUUM FULL</>, for routine
+    vacuuming for space recovery.
+   </para>
+
+   <para>
+    <command>VACUUM FULL</> is recommended for cases where you know you have
+    deleted the majority of tuples in a table, so that the steady-state size
+    of the table can be shrunk substantially with <command>VACUUM FULL</>'s
+    more aggressive approach.
+   </para>
+
+   <para>
+    If you have a table whose contents are deleted completely every so often,
+    consider doing it with <command>TRUNCATE</> rather than using
+    <command>DELETE</> followed by <command>VACUUM</>.
+   </para>
+  </sect2>
+
+  <sect2 id="vacuum-for-statistics">
+   <title>Updating planner statistics</title>
+
+   <para>
+    The <productname>PostgreSQL</productname> query planner relies on
+    statistical information about the contents of tables in order to
+    generate good plans for queries.  These statistics are gathered by
+    the <command>ANALYZE</> command, which can be invoked by itself or
+    as an optional step in <command>VACUUM</>.  It is important to have
+    reasonably accurate statistics, otherwise poor choices of plans may
+    degrade database performance.
+   </para>
+
+   <para>
+    As with vacuuming for space recovery, frequent updates of statistics
+    are more useful for heavily-updated tables than for seldom-updated ones.
+    But even for a heavily-updated table, there may be no need for
+    statistics updates if the statistical distribution of the data is not
+    changing much.  A simple rule of thumb is to think about how much
+    the minimum and maximum values of the columns in the table change.
+    For example, a timestamp column that contains the time of row update
+    will have a constantly-increasing maximum value as rows are added and
+    updated; such a column will probably need more frequent statistics
+    updates than, say, a column containing URLs for pages accessed on a
+    website.  The URL column may receive changes just as often, but the
+    statistical distribution of its values probably changes relatively
+    slowly.
+   </para>
+
+   <para>
+    It is possible to run <command>ANALYZE</> on specific tables and even
+    just specific columns of a table, so the flexibility exists to update some
+    statistics more frequently than others if your application requires it.
+    In practice, however, the usefulness of this feature is doubtful.
+    Beginning in <productname>PostgreSQL</productname> 7.2,
+    <command>ANALYZE</> is a fairly fast operation even on large tables,
+    because it uses a statistical random sampling of the rows of a table
+    rather than reading every single row.  So it's probably much simpler
+    to just run it over the whole database every so often.
+   </para>
+
+   <tip>
+    <para>
+     Although per-column tweaking of <command>ANALYZE</> frequency may not be
+     very productive, you may well find it worthwhile to do per-column
+     adjustment of the level of detail of the statistics collected by
+     <command>ANALYZE</>.  Columns that are heavily used in WHERE clauses
+     and have highly irregular data distributions may require a finer-grain
+     data histogram than other columns.  See <command>ALTER TABLE SET
+     STATISTICS</>.
+    </para>
+   </tip>
+
+   <para>
+    Recommended practice for most sites is to schedule a database-wide
+    <command>ANALYZE</> once a day at a low-usage time of day; this can
+    usefully be combined with a nightly <command>VACUUM</>.  However,
+    sites with relatively slowly changing table statistics may find that
+    this is overkill, and that less-frequent <command>ANALYZE</> runs
+    are sufficient.
+   </para>
+  </sect2>
+
+  <sect2 id="vacuum-for-wraparound">
+   <title>Preventing transaction ID wraparound failures</title>
+
+   <para>
+    <productname>PostgreSQL</productname>'s MVCC transaction semantics
+    depend on being able to compare transaction ID (<firstterm>XID</>)
+    numbers: a tuple with an insertion XID newer than the current
+    transaction's XID is <quote>in the future</> and should not be visible
+    to the current transaction.  But since transaction IDs have limited size
+    (32 bits at this writing) an installation that runs for a long time (more
+    than 4 billion transactions) will suffer <firstterm>transaction ID
+    wraparound</>: the XID counter wraps around to zero, and all of a sudden
+    transactions that were in the past appear to be in the future --- which
+    means their outputs become invisible.  In short, catatrophic data loss.
+    (Actually the data is still there, but that's cold comfort if you can't
+    get at it.)
+   </para>
+
+   <para>
+    Prior to <productname>PostgreSQL</productname> 7.2, the only defense
+    against XID wraparound was to re-initdb at least every 4 billion
+    transactions.  This of course was not very satisfactory for high-traffic
+    sites, so a better solution has been devised.  The new approach allows an
+    installation to remain up indefinitely, without initdb or any sort of
+    restart.  The price is this maintenance requirement: <emphasis>every table
+    in the database must be VACUUMed more often than once every billion
+    transactions</emphasis>.
+   </para>
+
+   <para>
+    In practice this isn't an onerous requirement, but since the consequences
+    of failing to meet it can be complete data loss (not just wasted disk
+    space or slow performance), some special provisions have been made to help
+    database administrators keep track of the time since the last
+    <command>VACUUM</>.  The remainder of this section gives the details.
+   </para>
+
+   <para>
+    The new approach to XID comparison distinguishes two special XIDs, numbers
+    1 and 2 (<literal>BootstrapXID</> and <literal>FrozenXID</>).  These two
+    XIDs are always considered older than every normal XID.  Normal XIDs (those
+    greater than 2) are compared using modulo-2^31 arithmetic.  This means
+    that for every normal XID, there are two billion XIDs that are
+    <quote>older</> and two billion that are <quote>newer</>; another way to
+    say it is that the normal XID space is circular with no endpoint.
+    Therefore, once a tuple has been created with a particular normal XID, the
+    tuple will appear to be <quote>in the past</> for the next two billion 
+    transactions, no matter which normal XID we are talking about.  If the
+    tuple still exists after more than two billion transactions, it would
+    suddenly appear to be in the future.  To prevent data loss, old tuples
+    must be reassigned the XID <literal>FrozenXID</> sometime before they reach
+    the two-billion-transactions-old mark.  Once they are assigned this
+    special XID, they will appear to be <quote>in the past</> to all normal
+    transactions regardless of wraparound issues, and so such tuples will be
+    good until deleted, no matter how long that is.  This reassignment of
+    XID is handled by <command>VACUUM</>.
+   </para>
+
+   <para>
+    <command>VACUUM</>'s normal policy is to reassign <literal>FrozenXID</>
+    to any tuple with a normal XID more than one billion transactions in the
+    past.  This policy preserves the original insertion XID until it is not
+    likely to be of interest anymore (in fact, most tuples will probably
+    live and die without ever being <quote>frozen</>).  This policy means
+    that the maximum safe interval between <command>VACUUM</>s of any table
+    is exactly one billion transactions: if you wait longer, it's possible
+    that a tuple that was not quite old enough to be reassigned last time
+    is now more than two billion transactions old and has wrapped around
+    into the future --- ie, is lost to you.  (Of course, it'll reappear
+    after another two billion transactions, but that's no help.)
+   </para>
+
+   <para>
+    Since periodic <command>VACUUM</>s are needed anyway for the reasons
+    described earlier, it's unlikely that any table would not be vacuumed
+    for as long as a billion transactions.  But to help administrators ensure
+    this constraint is met, <command>VACUUM</> stores transaction ID
+    statistics in the system table <filename>pg_database</>.  In particular,
+    the <filename>datfrozenxid</> field of a database's
+    <filename>pg_database</> row is updated at the completion of any
+    database-wide vacuum operation (ie, <command>VACUUM</> that does not
+    name a specific table).  The value stored in this field is the freeze
+    cutoff XID that was used by that <command>VACUUM</> command.  All normal
+    XIDs older than this cutoff XID are guaranteed to have been replaced by
+    <literal>FrozenXID</> within that database.  A convenient way to
+    examine this information is to execute the query
+    <informalexample>
+<programlisting>
+select datname, age(datfrozenxid) from pg_database;
+</programlisting>
+    </informalexample>
+    The <literal>age</> column measures the number of transactions from the
+    cutoff XID to the current transaction's XID.
+   </para>
+
+   <para>
+    With the standard freezing policy, the <literal>age</> column will start
+    at one billion for a freshly-vacuumed database.  When the <literal>age</>
+    approaches two billion, the database must be vacuumed again to avoid
+    risk of wraparound failures.  Recommended practice is to vacuum each
+    database at least once every half-a-billion (500 million) transactions,
+    so as to provide plenty of safety margin.  To help meet this rule,
+    each database-wide <command>VACUUM</> automatically delivers a warning
+    if there are any <filename>pg_database</> entries showing an
+    <literal>age</> of more than 1.5 billion transactions, for example:
+    <informalexample>
+<programlisting>
+play=# vacuum;
+NOTICE:  Some databases have not been vacuumed in 1613770184 transactions.
+        Better vacuum them within 533713463 transactions,
+        or you may have a wraparound failure.
+VACUUM
+</programlisting>
+    </informalexample>
+   </para>
+
+   <para>
+    <command>VACUUM</> with the <command>FREEZE</> option uses a more
+    aggressive freezing policy: tuples are frozen if they are old enough
+    to be considered good by all open transactions.  In particular, if
+    a <command>VACUUM FREEZE</> is performed in an otherwise-idle database,
+    it is guaranteed that <emphasis>all</> tuples in that database will be
+    frozen.  Hence, as long the database is not modified in any way, it
+    will not need subsequent vacuuming to avoid transaction ID wraparound
+    problems.  This technique is used by <filename>initdb</> to prepare the
+    <filename>template0</> database.  It should also be used to prepare any
+    user-created databases that are to be marked <literal>datallowconn</> =
+    <literal>false</> in <filename>pg_database</>, since there isn't any
+    convenient way to vacuum a database that you can't connect to.  Note
+    that VACUUM's automatic warning message about unvacuumed databases will
+    ignore  <filename>pg_database</> entries with <literal>datallowconn</> =
+    <literal>false</>, so as to avoid giving false warnings about these
+    databases; therefore it's up to you to ensure that such databases are
+    frozen correctly.
+   </para>
+
+  </sect2>
+ </sect1>
+</chapter>
+
+<!-- Keep this comment at the end of the file
+Local variables:
+mode:sgml
+sgml-omittag:nil
+sgml-shorttag:t
+sgml-minimize-attributes:nil
+sgml-always-quote-attributes:t
+sgml-indent-step:1
+sgml-indent-data:t
+sgml-parent-document:nil
+sgml-default-dtd-file:"./reference.ced"
+sgml-exposed-tags:nil
+sgml-local-catalogs:("/usr/lib/sgml/catalog")
+sgml-local-ecat-files:nil
+End:
+-->
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml

index fd340b0c9400f5179e21c50a98e9ea1db01caa79..ea65b3c2831c51736933ae0bfd0d60c9b481d959 100644 (file)
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -1,5 +1,5 @@
  <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.44 2001/08/10 18:57:32 tgl Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.45 2001/08/26 21:17:12 tgl Exp $
  -->
  
  <chapter id="sql-syntax">
@@ -640,7 +640,7 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
      <indexterm>
       <primary>OID</primary>
      </indexterm>
-    The unique identifier (object ID) of a row.  This is a serial number
+    The object identifier (object ID) of a row.  This is a serial number
      that is automatically added by Postgres to all table rows (unless
      the table was created WITHOUT OIDS, in which case this column is
      not present).
@@ -688,7 +688,7 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
     <para>
      The identity (transaction ID) of the deleting transaction,
      or zero for an undeleted tuple.  It is possible for this field
-    to be nonzero in a visible tuple: that indicates that the
+    to be nonzero in a visible tuple: that usually indicates that the
      deleting transaction hasn't committed yet, or that an attempted
      deletion was rolled back.
     </para>
@@ -712,7 +712,8 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
      (block number, tuple index within block) that identifies the
      physical location of the tuple.  Note that although the ctid
      can be used to locate the tuple very quickly, a row's ctid
-    will change each time it is updated or moved by VACUUM.
+    will change each time it is updated or moved by <command>VACUUM
+    FULL</>.
      Therefore ctid is useless as a long-term row identifier.
      The OID, or even better a user-defined serial number, should
      be used to identify logical rows.
@@ -737,7 +738,19 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
      </para>
  
      <para>
-     Transaction and command identifiers are 32-bit quantities.
+     Transaction identifiers are 32-bit quantities.  In a long-lived
+     database it is possible for transaction IDs to wrap around.  This
+     is not a fatal problem given appropriate maintenance procedures;
+     see the Administrator's Guide for details.  However, it is unwise
+     to depend on uniqueness of transaction IDs over the long term
+     (more than one billion transactions).
+    </para>
+
+    <para>
+     Command identifiers are also 32-bit quantities.  This creates a hard
+     limit of 2^32 (4 billion) SQL commands within a single transaction.
+     In practice this limit is not a problem --- note that the limit is on
+     number of SQL queries, not number of tuples processed.
      </para>
  
      <para>
author	Tom Lane <tgl@sss.pgh.pa.us>
	Sun, 26 Aug 2001 21:17:12 +0000 (21:17 +0000)
committer	Tom Lane <tgl@sss.pgh.pa.us>
	Sun, 26 Aug 2001 21:17:12 +0000 (21:17 +0000)
doc/src/sgml/admin.sgml		patch \| blob \| blame \| history
doc/src/sgml/filelist.sgml		patch \| blob \| blame \| history
doc/src/sgml/maintenance.sgml	[new file with mode: 0644]	patch \| blob
doc/src/sgml/syntax.sgml		patch \| blob \| blame \| history