* Run-Time Partition Pruning Support.
*
* The following series of functions exist to support the removal of unneeded
- * subnodes for queries against partitioned tables. The supporting functions
- * here are designed to work with any node type which supports an arbitrary
- * number of subnodes, e.g. Append, MergeAppend.
+ * subplans for queries against partitioned tables. The supporting functions
+ * here are designed to work with any plan type which supports an arbitrary
+ * number of subplans, e.g. Append, MergeAppend.
*
* When pruning involves comparison of a partition key to a constant, it's
* done by the planner. However, if we have a comparison to a non-constant
*
* We must distinguish expressions containing PARAM_EXEC Params from
* expressions that don't contain those. Even though a PARAM_EXEC Param is
- * considered to be a stable expression, it can change value from one node
- * scan to the next during query execution. Stable comparison expressions
- * that don't involve such Params allow partition pruning to be done once
- * during executor startup. Expressions that do involve such Params require
- * us to prune separately for each scan of the parent plan node.
+ * considered to be a stable expression, it can change value from one plan
+ * node scan to the next during query execution. Stable comparison
+ * expressions that don't involve such Params allow partition pruning to be
+ * done once during executor startup. Expressions that do involve such Params
+ * require us to prune separately for each scan of the parent plan node.
*
- * Note that pruning away unneeded subnodes during executor startup has the
- * added benefit of not having to initialize the unneeded subnodes at all.
+ * Note that pruning away unneeded subplans during executor startup has the
+ * added benefit of not having to initialize the unneeded subplans at all.
*
*
* Functions:
*
* ExecSetupPartitionPruneState:
- * This must be called by nodes before any partition pruning is
- * attempted. Normally executor startup is a good time. This function
- * creates the PartitionPruneState details which are required by each
- * of the two pruning functions, details include information about
- * how to map the partition index details which are returned by the
- * planner's partition prune function into subnode indexes.
+ * Creates the PartitionPruneState required by each of the two pruning
+ * functions. Details stored include how to map the partition index
+ * returned by the partition pruning code into subplan indexes.
*
* ExecFindInitialMatchingSubPlans:
- * Returns indexes of matching subnodes. Partition pruning is attempted
+ * Returns indexes of matching subplans. Partition pruning is attempted
* without any evaluation of expressions containing PARAM_EXEC Params.
- * This function must be called during executor startup for the given
- * node before the subnodes themselves are initialized. Subnodes which
- * are found not to match by this function must not be included in the
- * node's list of subnodes as this function performs a remap of the
- * partition index to subplan index map and the newly created map
- * provides indexes only for subnodes which remain after calling this
- * function.
+ * This function must be called during executor startup for the parent
+ * plan before the subplans themselves are initialized. Subplans which
+ * are found not to match by this function must be removed from the
+ * plan's list of subplans during execution, as this function performs a
+ * remap of the partition index to subplan index map and the newly
+ * created map provides indexes only for subplans which remain after
+ * calling this function.
*
* ExecFindMatchingSubPlans:
- * Returns indexes of matching subnodes after evaluating all available
- * expressions. This function can only be called while the executor is
- * running.
+ * Returns indexes of matching subplans after evaluating all available
+ * expressions. This function can only be called during execution and
+ * must be called again each time the value of a Param listed in
+ * PartitionPruneState's 'execparamids' changes.
*-------------------------------------------------------------------------
*/
/*
* ExecSetupPartitionPruneState
- * Setup the required data structure which is required for calling
+ * Set up the data structure required for calling
* ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans.
*
+ * 'planstate' is the parent plan node's execution state.
+ *
* 'partitionpruneinfo' is a List of PartitionPruneInfos as generated by
- * make_partition_pruneinfo. Here we build a PartitionPruneContext for each
- * item in the List. These contexts can be re-used each time we re-evaulate
- * which partitions match the pruning steps provided in each
- * PartitionPruneInfo.
+ * make_partition_pruneinfo. Here we build a PartitionPruneState containing a
+ * PartitionPruningData for each item in that List. This data can be re-used
+ * each time we re-evaluate which partitions match the pruning steps provided
+ * in each PartitionPruneInfo.
*/
PartitionPruneState *
ExecSetupPartitionPruneState(PlanState *planstate, List *partitionpruneinfo)
{
- PartitionPruningData *prunedata;
PartitionPruneState *prunestate;
+ PartitionPruningData *prunedata;
ListCell *lc;
int i;
Assert(partitionpruneinfo != NIL);
+ /*
+ * Allocate the data structure
+ */
prunestate = (PartitionPruneState *) palloc(sizeof(PartitionPruneState));
prunedata = (PartitionPruningData *)
palloc(sizeof(PartitionPruningData) * list_length(partitionpruneinfo));
- /*
- * The first item in the array contains the details for the query's target
- * partition, so record that as the root of the partition hierarchy.
- */
prunestate->partprunedata = prunedata;
prunestate->num_partprunedata = list_length(partitionpruneinfo);
prunestate->do_initial_prune = false; /* may be set below */
prunestate->execparamids = NULL;
/*
- * Create a sub memory context which we'll use when making calls to the
- * query planner's function to determine which partitions will match. The
- * planner is not too careful about freeing memory, so we'll ensure we
- * call the function in this context to avoid any memory leaking in the
- * executor's memory context.
+ * Create a short-term memory context which we'll use when making calls to
+ * the partition pruning functions. This avoids possible memory leaks,
+ * since the pruning functions call comparison functions that aren't under
+ * our control.
*/
prunestate->prune_context =
AllocSetContextCreate(CurrentMemoryContext,
* We must make a copy of this rather than pointing directly to the
* plan's version as we may end up making modifications to it later.
*/
- pprune->subnode_map = palloc(sizeof(int) * pinfo->nparts);
- memcpy(pprune->subnode_map, pinfo->subnode_map,
+ pprune->subplan_map = palloc(sizeof(int) * pinfo->nparts);
+ memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
/* We can use the subpart_map verbatim, since we never modify it */
/*
* Accumulate the IDs of all PARAM_EXEC Params affecting the
- * partitioning decisions at this node.
+ * partitioning decisions at this plan node.
*/
prunestate->execparamids = bms_add_members(prunestate->execparamids,
pinfo->execparamids);
/*
* ExecFindInitialMatchingSubPlans
- * Determine which subset of subplan nodes we need to initialize based
- * on the details stored in 'prunestate'. Here we only determine the
- * matching partitions using values known during plan startup, which
- * excludes any expressions containing PARAM_EXEC Params.
+ * Identify the set of subplans that cannot be eliminated by initial
+ * pruning (disregarding any pruning constraints involving PARAM_EXEC
+ * Params). Also re-map the translation matrix which allows conversion
+ * of partition indexes into subplan indexes to account for the unneeded
+ * subplans having been removed.
*
- * It is expected that callers of this function do so only once during their
- * init plan. The caller must only initialize the subnodes which are returned
- * by this function. The remaining subnodes should be discarded. Once this
- * function has been called, future calls to ExecFindMatchingSubPlans will
- * return its matching subnode indexes assuming that the caller discarded
- * the original non-matching subnodes.
+ * Must only be called once per 'prunestate', and only if initial pruning
+ * is required.
*
- * 'nsubnodes' must be passed as the total number of unpruned subnodes.
+ * 'nsubplans' must be passed as the total number of unpruned subplans.
*/
Bitmapset *
-ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubnodes)
+ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate, int nsubplans)
{
PartitionPruningData *pprune;
MemoryContext oldcontext;
ResetExprContext(pprune->context.planstate->ps_ExprContext);
/*
- * If any subnodes were pruned, we must re-sequence the subnode indexes so
+ * If any subplans were pruned, we must re-sequence the subplan indexes so
* that ExecFindMatchingSubPlans properly returns the indexes from the
- * subnodes which will remain after execution of this function.
+ * subplans which will remain after execution of this function.
*/
- if (bms_num_members(result) < nsubnodes)
+ if (bms_num_members(result) < nsubplans)
{
- int *new_subnode_indexes;
+ int *new_subplan_indexes;
int i;
int newidx;
/*
* First we must build an array which we can use to adjust the
- * existing subnode_map so that it contains the new subnode indexes.
+ * existing subplan_map so that it contains the new subplan indexes.
*/
- new_subnode_indexes = (int *) palloc(sizeof(int) * nsubnodes);
+ new_subplan_indexes = (int *) palloc(sizeof(int) * nsubplans);
newidx = 0;
- for (i = 0; i < nsubnodes; i++)
+ for (i = 0; i < nsubplans; i++)
{
if (bms_is_member(i, result))
- new_subnode_indexes[i] = newidx++;
+ new_subplan_indexes[i] = newidx++;
else
- new_subnode_indexes[i] = -1; /* Newly pruned */
+ new_subplan_indexes[i] = -1; /* Newly pruned */
}
/*
- * Now we can re-sequence each PartitionPruneInfo's subnode_map so
- * that they point to the new index of the subnode.
+ * Now we can re-sequence each PartitionPruneInfo's subplan_map so
+ * that they point to the new index of the subplan.
*/
for (i = 0; i < prunestate->num_partprunedata; i++)
{
/*
* We also need to reset the present_parts field so that it only
- * contains partition indexes that we actually still have subnodes
+ * contains partition indexes that we actually still have subplans
* for. It seems easier to build a fresh one, rather than trying
* to update the existing one.
*/
for (j = 0; j < nparts; j++)
{
- int oldidx = pprune->subnode_map[j];
+ int oldidx = pprune->subplan_map[j];
/*
- * If this partition existed as a subnode then change the old
- * subnode index to the new subnode index. The new index may
+ * If this partition existed as a subplan then change the old
+ * subplan index to the new subplan index. The new index may
* become -1 if the partition was pruned above, or it may just
- * come earlier in the subnode list due to some subnodes being
+ * come earlier in the subplan list due to some subplans being
* removed earlier in the list.
*/
if (oldidx >= 0)
{
- pprune->subnode_map[j] = new_subnode_indexes[oldidx];
+ pprune->subplan_map[j] = new_subplan_indexes[oldidx];
- if (new_subnode_indexes[oldidx] >= 0)
+ if (new_subplan_indexes[oldidx] >= 0)
pprune->present_parts =
bms_add_member(pprune->present_parts, j);
}
}
}
- pfree(new_subnode_indexes);
+ pfree(new_subplan_indexes);
}
return result;
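Reviewer note: the re-sequencing done in the hunk above can be sketched in isolation as follows. This is a minimal standalone illustration, not the executor code: it replaces the Bitmapset 'result' with a plain bool array, uses malloc instead of palloc, and the names build_new_subplan_indexes and remap_subplan_map are hypothetical helpers introduced only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Build the old-index -> new-index lookup table.  keep[i] says whether
 * subplan i survived initial pruning (a stand-in for bms_is_member(i,
 * result) in the patch).  Surviving subplans are numbered consecutively
 * from 0; pruned subplans map to -1.  Caller frees the returned array.
 */
static int *
build_new_subplan_indexes(const bool *keep, int nsubplans)
{
	int		   *new_idx;
	int			newidx = 0;

	assert(keep != NULL && nsubplans > 0);

	new_idx = malloc(sizeof(int) * (size_t) nsubplans);
	if (new_idx == NULL)
		return NULL;

	for (int i = 0; i < nsubplans; i++)
		new_idx[i] = keep[i] ? newidx++ : -1;

	return new_idx;
}

/*
 * Rewrite one partition-index -> subplan-index map in place.  Entries that
 * were already -1 (no subplan for that partition) are left alone.  Returns
 * the number of partitions that still have a subplan afterwards.
 */
static int
remap_subplan_map(int *subplan_map, int nparts, const int *new_idx)
{
	int			still_present = 0;

	for (int j = 0; j < nparts; j++)
	{
		int			oldidx = subplan_map[j];

		if (oldidx >= 0)
		{
			subplan_map[j] = new_idx[oldidx];
			if (new_idx[oldidx] >= 0)
				still_present++;
		}
	}
	return still_present;
}
```

With four subplans of which only 1 and 3 survive, a map of {0, 1, -1, 2, 3} becomes {-1, 0, -1, -1, 1}: survivors shift down to fill the gaps, which is why the caller must also drop the pruned subplans from its own list.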
/*
* ExecFindMatchingSubPlans
* Determine which subplans match the pruning steps detailed in
- * 'pprune' for the current comparison expression values.
+ * 'prunestate' for the current comparison expression values.
*
* Here we assume we may evaluate PARAM_EXEC Params.
*/
partset = pprune->present_parts;
}
- /* Translate partset into subnode indexes */
+ /* Translate partset into subplan indexes */
i = -1;
while ((i = bms_next_member(partset, i)) >= 0)
{
- if (pprune->subnode_map[i] >= 0)
+ if (pprune->subplan_map[i] >= 0)
*validsubplans = bms_add_member(*validsubplans,
- pprune->subnode_map[i]);
+ pprune->subplan_map[i]);
else
{
int partidx = pprune->subpart_map[i];
- if (partidx != -1)
+ if (partidx >= 0)
find_matching_subplans_recurse(prunestate,
&prunestate->partprunedata[partidx],
initial_prune, validsubplans);
else
{
- /*
- * This could only happen if clauses used in planning where
- * more restrictive than those used here, or if the maps are
- * somehow corrupt.
- */
+ /* Shouldn't happen */
elog(ERROR, "partition missing from subplans");
}
}
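Reviewer note: the translation loop in the hunk above, including the descent through subpart_map, can be sketched standalone as below. This is an illustration under stated assumptions, not the patch's code: Bitmapsets are replaced by 32-bit masks (so at most 32 partitions and 32 subplans), present_parts stands in for the computed 'partset', and DemoPruneData and collect_subplans are hypothetical names. The error path returns -1 where the real code raises elog(ERROR).

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for PartitionPruningData (hypothetical type). */
typedef struct DemoPruneData
{
	int			nparts;			/* length of both maps */
	const int  *subplan_map;	/* subplan index per partition, or -1 */
	const int  *subpart_map;	/* index into the DemoPruneData array for a
								 * sub-partitioned partition, or -1 */
	uint32_t	present_parts;	/* bitmask of partitions to visit */
} DemoPruneData;

/*
 * Translate the partitions selected at level 'idx' into subplan indexes,
 * descending into sub-partitioned tables via subpart_map.  Matching subplan
 * indexes are OR'ed into *validsubplans.  Returns 0 on success, or -1 if a
 * selected partition has neither a subplan nor a sub-partition entry.
 */
static int
collect_subplans(const DemoPruneData *all, int idx, uint32_t *validsubplans)
{
	const DemoPruneData *pd = &all[idx];

	assert(pd->nparts <= 32);

	for (int i = 0; i < pd->nparts; i++)
	{
		if ((pd->present_parts & (UINT32_C(1) << i)) == 0)
			continue;			/* partition not selected */

		if (pd->subplan_map[i] >= 0)
			*validsubplans |= UINT32_C(1) << pd->subplan_map[i];
		else if (pd->subpart_map[i] >= 0)
		{
			/* partition is itself partitioned: recurse one level down */
			if (collect_subplans(all, pd->subpart_map[i], validsubplans) != 0)
				return -1;
		}
		else
			return -1;			/* "partition missing from subplans" */
	}
	return 0;
}
```

Testing subpart_map with ">= 0" rather than "!= -1", as the patch does, means any negative value is treated as "no sub-partition" instead of being used as an array index.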
COPY_BITMAPSET_FIELD(present_parts);
COPY_SCALAR_FIELD(nparts);
COPY_SCALAR_FIELD(nexprs);
- COPY_POINTER_FIELD(subnode_map, from->nparts * sizeof(int));
+ COPY_POINTER_FIELD(subplan_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(subpart_map, from->nparts * sizeof(int));
COPY_POINTER_FIELD(hasexecparam, from->nexprs * sizeof(bool));
COPY_SCALAR_FIELD(do_initial_prune);
WRITE_INT_FIELD(nparts);
WRITE_INT_FIELD(nexprs);
- appendStringInfoString(str, " :subnode_map");
+ appendStringInfoString(str, " :subplan_map");
for (i = 0; i < node->nparts; i++)
- appendStringInfo(str, " %d", node->subnode_map[i]);
+ appendStringInfo(str, " %d", node->subplan_map[i]);
appendStringInfoString(str, " :subpart_map");
for (i = 0; i < node->nparts; i++)
READ_BITMAPSET_FIELD(present_parts);
READ_INT_FIELD(nparts);
READ_INT_FIELD(nexprs);
- READ_INT_ARRAY(subnode_map, local_node->nparts);
+ READ_INT_ARRAY(subplan_map, local_node->nparts);
READ_INT_ARRAY(subpart_map, local_node->nparts);
READ_BOOL_ARRAY(hasexecparam, local_node->nexprs);
READ_BOOL_FIELD(do_initial_prune);
* Support for partition pruning during query planning and execution
*
* This module implements partition pruning using the information contained in
- * table's partition descriptor, query clauses, and run-time parameters.
+ * a table's partition descriptor, query clauses, and run-time parameters.
*
* During planning, clauses that can be matched to the table's partition key
* are turned into a set of "pruning steps", which are then executed to
- * produce a set of partitions (as indexes of the RelOptInfo->part_rels array)
- * that satisfy the constraints in the step. Partitions not in the set are said
- * to have been pruned.
+ * identify a set of partitions (as indexes in the RelOptInfo->part_rels
+ * array) that satisfy the constraints in the step. Partitions not in the set
+ * are said to have been pruned.
*
- * A base pruning step may also consist of expressions whose values are only
- * known during execution, such as Params, in which case pruning cannot occur
+ * A base pruning step may involve expressions whose values are only known
+ * during execution, such as Params, in which case pruning cannot occur
* entirely during planning. In that case, such steps are included alongside
* the plan, so that they can be used by the executor for further pruning.
*
- * There are two kinds of pruning steps: a "base" pruning step, which contains
- * information extracted from one or more clauses that are matched to the
- * (possibly multi-column) partition key, such as the expressions whose values
- * to match against partition bounds and operator strategy to associate to
- * each expression. The other kind is a "combine" pruning step, which combines
- * the outputs of some other steps using the appropriate combination method.
- * All steps that are constructed are executed in succession such that for any
- * "combine" step, all of the steps whose output it depends on are executed
- * first and their ouput preserved.
+ * There are two kinds of pruning steps. A "base" pruning step represents
+ * tests on partition key column(s), typically comparisons to expressions.
+ * A "combine" pruning step represents a Boolean connector (AND/OR), and
+ * combines the outputs of some previous steps using the appropriate
+ * combination method.
*
* See gen_partprune_steps_internal() for more details on step generation.
*
*/
typedef struct PartClauseInfo
{
- int keyno; /* Partition key number (0 to partnatts - 1) */
- Oid opno; /* operator used to compare partkey to 'expr' */
+ int keyno; /* Partition key number (0 to partnatts - 1) */
+ Oid opno; /* operator used to compare partkey to expr */
bool op_is_ne; /* is clause's original operator <> ? */
Expr *expr; /* expr the partition key is compared to */
Oid cmpfn; /* Oid of function to compare 'expr' to the
* partition key */
- int op_strategy; /* cached info. */
+ int op_strategy; /* btree strategy identifying the operator */
} PartClauseInfo;
/*
* PartClauseMatchStatus
- * Describes the result match_clause_to_partition_key produces for a
- * given clause and the partition key to match with that are passed to it
+ * Describes the result of match_clause_to_partition_key()
*/
typedef enum PartClauseMatchStatus
{
static bool partkey_datum_from_expr(PartitionPruneContext *context,
Expr *expr, int stateidx, Datum *value);
+
/*
* make_partition_pruneinfo
* Build List of PartitionPruneInfos, one for each 'partitioned_rels'.
List *subpaths, List *prunequal)
{
RelOptInfo *targetpart = NULL;
- ListCell *lc;
List *pinfolist = NIL;
- int *relid_subnode_map;
+ bool doruntimeprune = false;
+ int *relid_subplan_map;
int *relid_subpart_map;
+ ListCell *lc;
int i;
- bool doruntimeprune = false;
/*
* Allocate two arrays to store the 1-based indexes of the 'subpaths' and
* 'partitioned_rels' by relid.
*/
- relid_subnode_map = palloc0(sizeof(int) * root->simple_rel_array_size);
+ relid_subplan_map = palloc0(sizeof(int) * root->simple_rel_array_size);
relid_subpart_map = palloc0(sizeof(int) * root->simple_rel_array_size);
i = 1;
Assert(IS_SIMPLE_REL(pathrel));
Assert(pathrel->relid < root->simple_rel_array_size);
- relid_subnode_map[pathrel->relid] = i++;
+ relid_subplan_map[pathrel->relid] = i++;
}
/* Likewise for the partition_rels */
Bitmapset *present_parts;
int nparts = subpart->nparts;
int partnatts = subpart->part_scheme->partnatts;
- int *subnode_map;
+ int *subplan_map;
int *subpart_map;
List *partprunequal;
List *pruning_steps;
return NIL;
}
- subnode_map = (int *) palloc(nparts * sizeof(int));
+ subplan_map = (int *) palloc(nparts * sizeof(int));
subpart_map = (int *) palloc(nparts * sizeof(int));
present_parts = NULL;
for (i = 0; i < nparts; i++)
{
RelOptInfo *partrel = subpart->part_rels[i];
- int subnodeidx = relid_subnode_map[partrel->relid] - 1;
+ int subplanidx = relid_subplan_map[partrel->relid] - 1;
int subpartidx = relid_subpart_map[partrel->relid] - 1;
- subnode_map[i] = subnodeidx;
+ subplan_map[i] = subplanidx;
subpart_map[i] = subpartidx;
/*
* Record the indexes of all the partition indexes that we have
- * subnodes or subparts for. This allows an optimization to skip
- * attempting any run-time pruning when no Params are found
- * matching the partition key at this level.
+ * subplans or subparts for. This allows an optimization to skip
+ * attempting any run-time pruning when it's irrelevant.
*/
- if (subnodeidx >= 0 || subpartidx >= 0)
+ if (subplanidx >= 0 || subpartidx >= 0)
present_parts = bms_add_member(present_parts, i);
}
pinfo->pruning_steps = pruning_steps;
pinfo->present_parts = present_parts;
pinfo->nparts = nparts;
- pinfo->subnode_map = subnode_map;
+ pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
/* Determine which pruning types should be enabled at this level */
- doruntimeprune |= analyze_partkey_exprs(pinfo, pruning_steps, partnatts);
+ doruntimeprune |= analyze_partkey_exprs(pinfo, pruning_steps,
+ partnatts);
pinfolist = lappend(pinfolist, pinfo);
}
- pfree(relid_subnode_map);
+ pfree(relid_subplan_map);
pfree(relid_subpart_map);
if (doruntimeprune)
/*-----------------------
* PartitionPruningData - Encapsulates all information required to support
- * elimination of partitions in node types which support arbitrary Lists of
- * subplans. Information stored here allows the planner's partition pruning
- * functions to be called and the return value of partition indexes translated
- * into the subpath indexes of node types such as Append, thus allowing us to
- * bypass certain subnodes when we have proofs that indicate that no tuple
- * matching the 'pruning_steps' will be found within.
+ * elimination of partitions in plan types which support arbitrary Lists of
+ * subplans. Information stored here allows the partition pruning functions
+ * to be called and the return value of partition indexes translated into the
+ * subpath indexes of plan types such as Append, thus allowing us to bypass a
+ * subplan when we can prove that no tuple matching the 'pruning_steps' will
+ * be found within.
*
- * subnode_map An array containing the subnode index which
+ * subplan_map An array containing the subplan index which
* matches this partition index, or -1 if the
- * subnode has been pruned already.
- * subpart_map An array containing the offset into the
- * 'partprunedata' array in PartitionPruning, or
+ * subplan has been pruned already.
+ * subpart_map An array containing the index into the
+ * partprunedata array in PartitionPruneState, or
* -1 if there is no such element in that array.
* present_parts A Bitmapset of the partition indexes that we
- * have subnodes mapped for.
+ * have subplans mapped for.
* context Contains the context details required to call
* the partition pruning code.
* pruning_steps List of PartitionPruneSteps used to
*/
typedef struct PartitionPruningData
{
- int *subnode_map;
+ int *subplan_map;
int *subpart_map;
Bitmapset *present_parts;
PartitionPruneContext context;
} PartitionPruningData;
/*-----------------------
- * PartitionPruneState - State object required for executor nodes to perform
- * partition pruning elimination of their subnodes. This encapsulates a
+ * PartitionPruneState - State object required for plan nodes to perform
+ * partition pruning elimination of their subplans. This encapsulates a
* flattened hierarchy of PartitionPruningData structs.
- * This struct can be attached to node types which support arbitrary Lists of
- * subnodes containing partitions to allow subnodes to be eliminated due to
- * the clauses being unable to match to any tuple that the subnode could
+ * This struct can be attached to plan types which support arbitrary Lists of
+ * subplans containing partitions to allow subplans to be eliminated due to
+ * the clauses being unable to match to any tuple that the subplan could
* possibly produce.
*
- * partprunedata Array of PartitionPruningData for the node's target
+ * partprunedata Array of PartitionPruningData for the plan's target
* partitioned relation. First element contains the
* details for the target partitioned table.
* num_partprunedata Number of items in 'partprunedata' array.
* startup (at any hierarchy level).
* do_exec_prune true if pruning should be performed during
* executor run (at any hierarchy level).
- * prune_context A memory context which can be used to call the query
- * planner's partition prune functions.
* execparamids Contains paramids of PARAM_EXEC Params found within
- * any of the partprunedata structs.
+ * any of the partprunedata structs. Pruning must be
+ * done again each time the value of one of these
+ * parameters changes.
+ * prune_context A short-lived memory context in which to execute the
+ * partition pruning functions.
*-----------------------
*/
typedef struct PartitionPruneState
int num_partprunedata;
bool do_initial_prune;
bool do_exec_prune;
- MemoryContext prune_context;
Bitmapset *execparamids;
+ MemoryContext prune_context;
} PartitionPruneState;
extern PartitionTupleRouting *ExecSetupPartitionTupleRouting(ModifyTableState *mtstate,
List *partitionpruneinfo);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate);
extern Bitmapset *ExecFindInitialMatchingSubPlans(PartitionPruneState *prunestate,
- int nsubnodes);
+ int nsubplans);
#endif /* EXECPARTITION_H */
* partitions.
*
* Here we store mapping details to allow translation of a partitioned table's
- * index into subnode indexes for node types which support arbitrary numbers
- * of sub nodes, such as Append.
+ * index as returned by the partition pruning code into subplan indexes for
+ * plan types which support arbitrary numbers of subplans, such as Append.
+ * We also store various details to tell the executor when it should be
+ * performing partition pruning.
*/
typedef struct PartitionPruneInfo
{
NodeTag type;
Oid reloid; /* Oid of partition rel */
List *pruning_steps; /* List of PartitionPruneStep, see below */
- Bitmapset *present_parts; /* Indexes of all partitions which subnodes
+ Bitmapset *present_parts; /* Indexes of all partitions which subplans
* are present for. */
- int nparts; /* Length of subnode_map[] and subpart_map[] */
+ int nparts; /* Length of subplan_map[] and subpart_map[] */
int nexprs; /* Length of hasexecparam[] */
- int *subnode_map; /* subnode index by partition id, or -1 */
+ int *subplan_map; /* subplan index by partition id, or -1 */
int *subpart_map; /* subpart index by partition id, or -1 */
bool *hasexecparam; /* true if corresponding pruning_step contains
* any PARAM_EXEC Params. */
* strategy of the operator in the clause matched to the last partition key.
* 'exprs' contains expressions which comprise the lookup key to be passed to
* the partition bound search function. 'cmpfns' contains the OIDs of
- * comparison function used to compare aforementioned expressions with
+ * comparison functions used to compare aforementioned expressions with
* partition bounds. Both 'exprs' and 'cmpfns' contain the same number of
- * items up to partnatts items.
+ * items, up to partnatts items.
*
* Once we find the offset of a partition bound using the lookup key, we
* determine which partitions to include in the result based on the value of