
Commit 6a4d1cf

leonardoce, armru, phisco, mnencia, gbartolini authored
feat: parallel WAL archiving
This patch adds a parallel WAL archiving feature, similar to the one that
Barman has for its archive_command. This new feature can be enabled via the
`maxParallel` option of the WAL archiving configuration. Every time the
archive_command is invoked by PostgreSQL, we pre-archive a maximum of
`maxParallel` WAL files chosen from the ready ones. The archival status is
kept inside a spool folder.

Co-authored-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Co-authored-by: Philippe Scorsolini <philippe.scorsolini@enterprisedb.com>
Co-authored-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Co-authored-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
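The spool bookkeeping mentioned above is implemented in the new `pkg/management/barman/archiver` package, whose body is not shown in the diffs below. A minimal sketch of the idea, assuming the spool is simply a directory of empty marker files named after the pre-archived WAL segments (the function name and file layout here are illustrative assumptions, not the actual implementation):

```go
package archiver

import (
    "os"
    "path/filepath"
)

// deleteFromSpool sketches the spool lookup: the spool directory is assumed
// to hold one empty marker file per WAL segment that was already
// pre-archived. If the marker exists it is removed, and the caller can
// acknowledge PostgreSQL without invoking barman-cloud-wal-archive again.
func deleteFromSpool(spoolDirectory, walName string) (bool, error) {
    marker := filepath.Join(spoolDirectory, filepath.Base(walName))
    err := os.Remove(marker)
    switch {
    case err == nil:
        return true, nil // already archived by a previous, parallel invocation
    case os.IsNotExist(err):
        return false, nil // never pre-archived: archive it now
    default:
        return false, err
    }
}
```

Under that assumption, a later archive_command call for an already uploaded segment can be dismissed immediately with a positive status, which is the short-circuit performed by the new `wal-archive` command further down.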
1 parent 3360715 commit 6a4d1cf

9 files changed: 499 additions, 44 deletions

api/v1/cluster_types.go

Lines changed: 9 additions & 0 deletions
@@ -808,6 +808,15 @@ type WalBackupConfiguration struct {
     // `AES256` and `aws:kms`
     // +kubebuilder:validation:Enum=AES256;"aws:kms"
     Encryption EncryptionType `json:"encryption,omitempty"`
+
+    // Number of WAL files to be either archived in parallel (when the
+    // PostgreSQL instance is archiving to a backup object store) or
+    // restored in parallel (when a PostgreSQL standby is fetching WAL
+    // files from a recovery object store). If not specified, WAL files
+    // will be processed one at a time. It accepts a positive integer as a
+    // value - with 1 being the minimum accepted value.
+    // +kubebuilder:validation:Minimum=1
+    MaxParallel int `json:"maxParallel,omitempty"`
 }
 
 // DataBackupConfiguration is the configuration of the backup of
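For illustration only (this snippet is not part of the commit), the new field is used alongside the existing WAL options of `WalBackupConfiguration`; the literal below is hypothetical, while the field names match the struct above:

```go
package main

import (
    "fmt"

    apiv1 "github.com/EnterpriseDB/cloud-native-postgresql/api/v1"
)

func main() {
    // Up to 8 WAL files handled per archive_command (or restore) invocation.
    // Leaving MaxParallel unset keeps the one-file-at-a-time behaviour.
    wal := apiv1.WalBackupConfiguration{
        Encryption:  "AES256",
        MaxParallel: 8,
    }
    fmt.Println("maxParallel:", wal.MaxParallel)
}
```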

config/crd/bases/postgresql.k8s.enterprisedb.io_clusters.yaml

Lines changed: 21 additions & 0 deletions
@@ -935,6 +935,16 @@ spec:
                     - AES256
                     - aws:kms
                     type: string
+                  maxParallel:
+                    description: Number of WAL files to be either archived
+                      in parallel (when the PostgreSQL instance is archiving
+                      to a backup object store) or restored in parallel (when
+                      a PostgreSQL standby is fetching WAL files from a recovery
+                      object store). If not specified, WAL files will be processed
+                      one at a time. It accepts a positive integer as a value
+                      - with 1 being the minimum accepted value.
+                    minimum: 1
+                    type: integer
                 type: object
               required:
               - destinationPath
@@ -1359,6 +1369,17 @@ spec:
                       - AES256
                       - aws:kms
                       type: string
+                    maxParallel:
+                      description: Number of WAL files to be either archived
+                        in parallel (when the PostgreSQL instance is archiving
+                        to a backup object store) or restored in parallel
+                        (when a PostgreSQL standby is fetching WAL files from
+                        a recovery object store). If not specified, WAL files
+                        will be processed one at a time. It accepts a positive
+                        integer as a value - with 1 being the minimum accepted
+                        value.
+                      minimum: 1
+                      type: integer
                   type: object
                 required:
                 - destinationPath

docs/src/api_reference.md

Lines changed: 5 additions & 4 deletions
@@ -767,8 +767,9 @@ Name | Description
 
 WalBackupConfiguration is the configuration of the backup of the WAL stream
 
-Name | Description | Type
------------- | ------------- | ---------------
-`compression` | Compress a WAL file before sending it to the object store. Available options are empty string (no compression, default), `gzip` or `bzip2`. | CompressionType
-`encryption ` | Whenever to force the encryption of files (if the bucket is not already configured for that). Allowed options are empty string (use the bucket policy, default), `AES256` and `aws:kms` | EncryptionType
+Name | Description | Type
+------------- | --------------- | ---------------
+`compression` | Compress a WAL file before sending it to the object store. Available options are empty string (no compression, default), `gzip` or `bzip2`. | CompressionType
+`encryption ` | Whenever to force the encryption of files (if the bucket is not already configured for that). Allowed options are empty string (use the bucket policy, default), `AES256` and `aws:kms` | EncryptionType
+`maxParallel` | Number of WAL files to be either archived in parallel (when the PostgreSQL instance is archiving to a backup object store) or restored in parallel (when a PostgreSQL standby is fetching WAL files from a recovery object store). If not specified, WAL files will be processed one at a time. It accepts a positive integer as a value - with 1 being the minimum accepted value. | int
 

docs/src/backup_recovery.md

Lines changed: 31 additions & 0 deletions
@@ -458,6 +458,37 @@ spec:
 You can configure the encryption directly in your bucket, and the operator
 will use it unless you override it in the cluster configuration.
 
+PostgreSQL implements a sequential archiving scheme, where the
+`archive_command` will be executed sequentially for every WAL
+segment to be archived.
+
+When the bandwidth between the PostgreSQL instance and the object
+store allows archiving more than one WAL file in parallel, you
+can use the parallel WAL archiving feature of the instance manager
+like in the following example:
+
+```yaml
+apiVersion: postgresql.k8s.enterprisedb.io/v1
+kind: Cluster
+[...]
+spec:
+  backup:
+    barmanObjectStore:
+      [...]
+      wal:
+        compression: gzip
+        maxParallel: 8
+        encryption: AES256
+```
+
+In the previous example, the instance manager optimizes the WAL
+archiving process by archiving in parallel at most eight ready
+WALs, including the one requested by PostgreSQL.
+
+When PostgreSQL will request the archiving of a WAL that has
+already been archived by the instance manager as an optimization,
+that archival request will be just dismissed with a positive status.
+
 ## Recovery
 
 Cluster restores are not performed "in-place" on an existing cluster.

internal/cmd/manager/walarchive/cmd.go

Lines changed: 135 additions & 40 deletions
@@ -8,18 +8,30 @@ Copyright (C) 2019-2021 EnterpriseDB Corporation.
 package walarchive
 
 import (
+    "context"
+    "errors"
     "fmt"
-    "os/exec"
+    "os"
+    "path"
+    "path/filepath"
+    "strings"
+    "time"
 
     "github.com/spf13/cobra"
 
     apiv1 "github.com/EnterpriseDB/cloud-native-postgresql/api/v1"
     "github.com/EnterpriseDB/cloud-native-postgresql/internal/management/cache"
     cacheClient "github.com/EnterpriseDB/cloud-native-postgresql/internal/management/cache/client"
     "github.com/EnterpriseDB/cloud-native-postgresql/pkg/management/barman"
-    barmanCapabilities "github.com/EnterpriseDB/cloud-native-postgresql/pkg/management/barman/capabilities"
-    "github.com/EnterpriseDB/cloud-native-postgresql/pkg/management/execlog"
+    "github.com/EnterpriseDB/cloud-native-postgresql/pkg/management/barman/archiver"
    "github.com/EnterpriseDB/cloud-native-postgresql/pkg/management/log"
+    "github.com/EnterpriseDB/cloud-native-postgresql/pkg/postgres"
+)
+
+const (
+    // SpoolDirectory is the directory where we spool the WAL files that
+    // were pre-archived in parallel
+    SpoolDirectory = postgres.ScratchDataDirectory + "/wal-archive-spool"
 )
 
 // NewCmd creates the new cobra command
@@ -30,7 +42,9 @@ func NewCmd() *cobra.Command {
         Args: cobra.ExactArgs(1),
         RunE: func(cobraCmd *cobra.Command, args []string) error {
             contextLog := log.WithName("wal-archive")
-            err := run(contextLog, args)
+            ctx := log.IntoContext(cobraCmd.Context(), contextLog)
+
+            err := run(ctx, args)
             if err != nil {
                 contextLog.Error(err, "failed to run wal-archive command")
                 return err
@@ -42,15 +56,15 @@
     return &cmd
 }
 
-func run(contextLog log.Logger, args []string) error {
+func run(ctx context.Context, args []string) error {
+    startTime := time.Now()
+    contextLog := log.FromContext(ctx)
     walName := args[0]
 
     var cluster *apiv1.Cluster
     var err error
 
-    cluster, err = cacheClient.GetCluster()
-    if err != nil {
-        contextLog.Error(err, "Error while getting cluster from cache")
+    if cluster, err = cacheClient.GetCluster(); err != nil {
         return fmt.Errorf("failed to get cluster: %w", err)
     }
 
@@ -64,53 +78,135 @@ func run(contextLog log.Logger, args []string) error {
         return nil
     }
 
-    options, err := barmanCloudWalArchiveOptions(*cluster, cluster.Name, walName)
-    if err != nil {
-        contextLog.Error(err, "while getting barman-cloud-wal-archive options")
-        return err
+    maxParallel := 1
+    if cluster.Spec.Backup.BarmanObjectStore.Wal != nil {
+        maxParallel = cluster.Spec.Backup.BarmanObjectStore.Wal.MaxParallel
     }
 
-    env, err := cacheClient.GetEnv(cache.WALArchiveKey)
+    // Get environment from cache
+    var env []string
+    env, err = cacheClient.GetEnv(cache.WALArchiveKey)
     if err != nil {
-        contextLog.Error(err, "Error while getting environment from cache")
         return fmt.Errorf("failed to get envs: %w", err)
     }
 
-    contextLog.Trace("Executing "+barmanCapabilities.BarmanCloudWalArchive,
-        "walName", walName,
-        "currentPrimary", cluster.Status.CurrentPrimary,
-        "targetPrimary", cluster.Status.TargetPrimary,
-        "options", options,
-    )
-
-    barmanCloudWalArchiveCmd := exec.Command(barmanCapabilities.BarmanCloudWalArchive, options...) // #nosec G204
-    barmanCloudWalArchiveCmd.Env = env
+    // Create the archiver
+    var walArchiver *archiver.WALArchiver
+    if walArchiver, err = archiver.New(cluster, env, SpoolDirectory); err != nil {
+        return fmt.Errorf("while creating the archiver: %w", err)
+    }
 
-    err = execlog.RunStreaming(barmanCloudWalArchiveCmd, barmanCapabilities.BarmanCloudWalArchive)
+    // Step 1: check if this WAL file has not been already archived
+    var isDeletedFromSpool bool
+    isDeletedFromSpool, err = walArchiver.DeleteFromSpool(walName)
     if err != nil {
-        contextLog.Error(err, "Error invoking "+barmanCapabilities.BarmanCloudWalArchive,
+        return fmt.Errorf("while testing the existence of the WAL file in the spool directory: %w", err)
+    }
+    if isDeletedFromSpool {
+        contextLog.Info("Archived WAL file (parallel)",
             "walName", walName,
             "currentPrimary", cluster.Status.CurrentPrimary,
-            "targetPrimary", cluster.Status.TargetPrimary,
-            "options", options,
-            "exitCode", barmanCloudWalArchiveCmd.ProcessState.ExitCode(),
-        )
-        return fmt.Errorf("unexpected failure invoking %s: %w", barmanCapabilities.BarmanCloudWalArchive, err)
+            "targetPrimary", cluster.Status.TargetPrimary)
+        return nil
+    }
+
+    // Step 3: gather the WAL files names to archive
+    walFilesList := gatherWALFilesToArchive(ctx, walName, maxParallel)
+
+    options, err := barmanCloudWalArchiveOptions(cluster, cluster.Name)
+    if err != nil {
+        log.Error(err, "while getting barman-cloud-wal-archive options")
+        return err
     }
 
-    contextLog.Info("Archived WAL file",
-        "walName", walName,
-        "currentPrimary", cluster.Status.CurrentPrimary,
-        "targetPrimary", cluster.Status.TargetPrimary,
-    )
+    // Step 4: archive the WAL files in parallel
+    uploadStartTime := time.Now()
+    walStatus := walArchiver.ArchiveList(walFilesList, options)
+    if len(walStatus) > 1 {
+        contextLog.Info("Completed archive command (parallel)",
+            "walsCount", len(walStatus),
+            "startTime", startTime,
+            "uploadStartTime", uploadStartTime,
+            "uploadTotalTime", time.Since(uploadStartTime),
+            "totalTime", time.Since(startTime))
+    }
+
+    // We return only the first error to PostgreSQL, because the first error
+    // is the one raised by the file that PostgreSQL has requested to archive.
+    // The other errors are related to WAL files that were pre-archived as
+    // a performance optimization and are just logged
+    return walStatus[0].Err
+}
+
+// gatherWALFilesToArchive reads from the archived status the list of WAL files
+// that can be archived in parallel way.
+// `requestedWALFile` is the name of the file whose archiving was requested by
+// PostgreSQL, and that file is always the first of the list and is always included.
+// `parallel` is the maximum number of WALs that we can archive in parallel
+func gatherWALFilesToArchive(ctx context.Context, requestedWALFile string, parallel int) (walList []string) {
+    contextLog := log.FromContext(ctx)
+    pgWalDirectory := path.Join(os.Getenv("PGDATA"), "pg_wal")
+    archiveStatusPath := path.Join(pgWalDirectory, "archive_status")
+    noMoreWALFilesNeeded := errors.New("no more files needed")
+
+    // slightly more optimized, but equivalent to:
+    // walList = []string{requestedWALFile}
+    walList = make([]string, 1, 1+parallel)
+    walList[0] = requestedWALFile
+
+    err := filepath.WalkDir(archiveStatusPath, func(path string, d os.DirEntry, err error) error {
+        // If err is set, it means the current path is a directory and the readdir raised an error
+        // The only available option here is to skip the path and log the error.
+        if err != nil {
+            contextLog.Error(err, "failed reading path", "path", path)
+            return filepath.SkipDir
+        }
+
+        if len(walList) >= parallel {
+            return noMoreWALFilesNeeded
+        }
+
+        // We don't process directories beside the archive status path
+        if d.IsDir() {
+            // We want to proceed exploring the archive status folder
+            if path == archiveStatusPath {
+                return nil
+            }
+
+            return filepath.SkipDir
+        }
+
+        // We only process ready files
+        if !strings.HasSuffix(path, ".ready") {
+            return nil
+        }
+
+        walFileName := strings.TrimSuffix(filepath.Base(path), ".ready")
+
+        // We are already archiving the requested WAL file,
+        // and we need to avoid archiving it twice.
+        // requestedWALFile is usually "pg_wal/wal_file_name" and
+        // we compare it with the path we read
+        if strings.HasSuffix(requestedWALFile, walFileName) {
+            return nil
+        }
+
+        walList = append(walList, filepath.Join("pg_wal", walFileName))
+        return nil
+    })
+
+    // In this point err must be nil or noMoreWALFilesNeeded, if it is something different
+    // there is a programming error
+    if err != nil && err != noMoreWALFilesNeeded {
+        contextLog.Error(err, "unexpected error while reading the list of WAL files to archive")
+    }
 
-    return nil
+    return walList
 }
 
 func barmanCloudWalArchiveOptions(
-    cluster apiv1.Cluster,
+    cluster *apiv1.Cluster,
     clusterName string,
-    walName string,
 ) ([]string, error) {
     configuration := cluster.Spec.Backup.BarmanObjectStore
 
@@ -147,7 +243,6 @@ func barmanCloudWalArchiveOptions(
     options = append(
         options,
         configuration.DestinationPath,
-        serverName,
-        walName)
+        serverName)
     return options, nil
 }
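The `archiver.WALArchiver` used by the new command (`archiver.New`, `DeleteFromSpool`, `ArchiveList`) lives in the `pkg/management/barman/archiver` package, which is not part of the hunks shown on this page. A rough sketch of what the parallel step has to do, assuming one `barman-cloud-wal-archive` process per WAL file and one spool marker for every segment other than the one PostgreSQL asked for (names, signatures and error policy here are assumptions, not the real API):

```go
package archiver

import (
    "os"
    "os/exec"
    "path/filepath"
    "sync"
)

// walStatus is an illustrative result type; the real one is defined by the
// archiver package introduced in this commit.
type walStatus struct {
    WalName string
    Err     error
}

// archiveList sketches the parallel step: every WAL file is uploaded by its
// own barman-cloud-wal-archive process, and every file except the first one
// (the one PostgreSQL asked to archive) leaves a marker in the spool
// directory so that a later archive_command invocation can be acknowledged
// without uploading it again.
func archiveList(walNames, options, env []string, spoolDirectory string) []walStatus {
    result := make([]walStatus, len(walNames))
    var wg sync.WaitGroup
    for i, wal := range walNames {
        // Copy the options so concurrent goroutines never share a backing array
        args := append(append([]string{}, options...), wal)
        wg.Add(1)
        go func(i int, wal string, args []string) {
            defer wg.Done()
            cmd := exec.Command("barman-cloud-wal-archive", args...) // #nosec G204
            cmd.Env = env
            err := cmd.Run()
            if err == nil && i > 0 {
                // Pre-archived as an optimization: remember it in the spool
                marker := filepath.Join(spoolDirectory, filepath.Base(wal))
                if f, createErr := os.Create(marker); createErr == nil {
                    _ = f.Close()
                }
            }
            result[i] = walStatus{WalName: wal, Err: err}
        }(i, wal, args)
    }
    wg.Wait()
    return result
}
```

An actual implementation would also bound the number of concurrent uploads and preserve the distinction made in `run` above: only the first entry's error is returned to PostgreSQL, while errors on pre-archived files are merely logged.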
