Archive data format

Akira Kurogane edited this page Oct 21, 2019 · 1 revision

Remote Storage

When a backup or restore operation runs, one pbm-agent for each replicaset does the work. The connection details and credentials needed to reach the remote storage are saved in the PBM config (admin.pbmConfig).

When the remote storage is a filesystem location there is no hostname, username, password, etc. There is only a directory path, which must exist (identically) on all the mongod-hosting servers that the pbm-agent processes run on. To serve any purpose as a disaster recovery system, that path should be a network mount to a separate backup storage server. It is assumed that the same backup server is used by all hosts in the same MongoDB cluster or non-sharded replicaset.

Whether it is S3-compatible object storage or a filesystem location, the concept is the same: the storage location is a directory.

File layout

A complete backup consists of files that all start with the same ISO 8601 UTC timestamp. This timestamp is the backup's start time, not its completion time (the consistent data time). There is one dump file and one oplog slice for each replicaset, with the replicaset name as part of the filename. There is also one backup metadata file (<timestamp>.pbm.json).

[2019-10-02 17:25:39 JST]  1.1KiB 2019-10-02T08:25:36Z.pbm.json
[2019-10-02 17:25:37 JST]  108KiB 2019-10-02T08:25:36Z_configrs.dump.gz
[2019-10-02 17:25:39 JST]    742B 2019-10-02T08:25:36Z_configrs.oplog.gz
[2019-10-02 17:25:37 JST]  343KiB 2019-10-02T08:25:36Z_s2rs.dump.gz
[2019-10-02 17:25:39 JST]     23B 2019-10-02T08:25:36Z_s2rs.oplog.gz
[2019-10-02 17:25:39 JST]  651KiB 2019-10-02T08:25:36Z_testrs.dump.gz
[2019-10-02 17:25:40 JST]     23B 2019-10-02T08:25:36Z_testrs.oplog.gz
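
The naming scheme in the listing above can be used to group the files of a storage directory into backups. The following is a minimal sketch, not PBM code: the regular expression is inferred only from the sample filenames shown here, and the function name group_backup_files and the shape of the returned dictionary are hypothetical.

```python
import re

# Pattern inferred from the sample listing above (an assumption, not a PBM spec):
#   <timestamp>.pbm.json
#   <timestamp>_<replicaset>.dump.gz
#   <timestamp>_<replicaset>.oplog.gz
BACKUP_FILE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)"
    r"(?:\.pbm\.json|_(?P<rs>.+)\.(?P<kind>dump|oplog)\.gz)$"
)


def group_backup_files(names):
    """Group filenames by backup timestamp, then by replicaset name.

    Returns {timestamp: {"metadata": name or None,
                         "replsets": {rs_name: {"dump": name, "oplog": name}}}}.
    Filenames that do not match the pattern are ignored.
    """
    backups = {}
    for name in names:
        m = BACKUP_FILE.match(name)
        if m is None:
            continue
        entry = backups.setdefault(m["ts"], {"metadata": None, "replsets": {}})
        if m["rs"] is None:
            # No replicaset part, so this is the <timestamp>.pbm.json file.
            entry["metadata"] = name
        else:
            entry["replsets"].setdefault(m["rs"], {})[m["kind"]] = name
    return backups
```

Under this sketch, a backup could be treated as complete on storage when its entry has a metadata file plus both a dump and an oplog file for every replicaset. Whether that check matches PBM's own notion of completeness is an assumption.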

File format

The <timestamp>.pbm.json metadata file contains the start and end time (with end time = the consistent time), and pointers to the other files in the backup.

The dump files are in the same format that mongodump creates when using --archive and --gzip. They do not include an oplog.

The oplog files are captured by PBM's own code. They are simply the local.oplog.rs documents for the backup's timespan, dumped as serial BSON documents and gzipped. The result is as if mongodump -d local -c oplog.rs --query <that timespan> | gzip > xxxxx.oplog.gz had been run.
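
Because an oplog file is a gzipped run of back-to-back BSON documents, it can be split into individual documents using only the BSON framing rule: each document begins with a little-endian int32 giving its total length (including those four bytes) and ends with a zero byte. The sketch below uses only the Python standard library and does not decode the documents. The function names iter_bson_documents and count_oplog_entries are hypothetical and not part of PBM.

```python
import gzip
import struct


def iter_bson_documents(stream):
    """Yield each raw BSON document (as bytes) from a binary stream.

    Relies only on BSON framing: a little-endian int32 total length,
    followed by the rest of the document, which ends with a 0x00 byte.
    """
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of stream
        if len(header) < 4:
            raise ValueError("truncated BSON length prefix")
        (size,) = struct.unpack("<i", header)
        if size < 5:  # smallest valid document: 4-byte length + terminator
            raise ValueError(f"invalid BSON document length: {size}")
        body = stream.read(size - 4)
        if len(body) != size - 4 or body[-1] != 0:
            raise ValueError("truncated or malformed BSON document")
        yield header + body


def count_oplog_entries(path):
    """Count the documents in a gzipped file of concatenated BSON documents."""
    with gzip.open(path, "rb") as f:
        return sum(1 for _ in iter_bson_documents(f))
```

To inspect the contents of each entry, the raw bytes yielded here could be handed to a BSON decoder such as the one shipped with PyMongo. That step is left out so the sketch has no third-party dependency.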
