forked from percona/percona-backup-mongodb
PBM-815: physical restore + logical PITR (percona#844)
Add an oplog replay stage to the post-processing of restored data. All these steps run while the PSMDB cluster is down and not accessible from the outside world. We cannot start a full replica set at this stage without exposing it to external users, so the oplog replay is done only on the "primary" node, running as a single-node replica set. The data is propagated to the rest of the nodes during cluster start.

This leads to:

1. PITR for a physical backup is applied only to the primary node and is later copied to the other nodes during cluster start (initial sync).
2. PITR for sharded collections works only for writes, not for the creation of sharded collections. Therefore **whenever a sharded collection is created, a full physical backup is needed**. With a logical backup this is covered, since a full replica set is restored rather than a single node as in the physical case.

The `--base-snapshot` flag must always be used with a physical base. For example: `pbm restore --time=2023-07-03T16:23:56 -w --base-snapshot=2023-07-03T16:18:09Z`. Without `--base-snapshot`, the restore always looks for a logical backup, even if there is no logical backup or there is a more recent physical one.

Distributed transactions
=========

The old way of syncing all distributed transactions between the shards won't work, because no DB is available during the physical restore and doing such a sync over the remote storage would be far too slow.

The new algorithm (also changed in logical restores, for the sake of code consistency):

By looking at just the transactions in the oplog, we can't tell which shards participated in them. But we can assume that if there is a commitTransaction on at least one shard, then the transaction is committed everywhere. Otherwise, the transaction either won't be in the oplog at all or will have an abortTransaction everywhere. So we simply treat distributed transactions as non-distributed: apply the ops once a commit message for the txn is encountered.

It might happen that by the end of the oplog there are some distributed txns without commit messages.
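The "apply ops once a commit message is encountered" rule can be sketched as follows. This is a minimal Python illustration under simplifying assumptions, not PBM's actual Go code: the entry layout (`kind`, `txn`, `op`, `ops`) and the `replay` function are hypothetical stand-ins for real oplog entries and the real applier.

```python
from collections import OrderedDict


def replay(oplog):
    """Apply transaction ops only when that transaction's commit is seen.

    Returns (applied, pending): the ops applied in order, and the buffered
    ops of transactions that reached the end of the oplog without a commit.
    """
    applied = []
    pending = OrderedDict()  # txn id -> ops buffered from prepare entries
    for entry in oplog:
        kind = entry["kind"]
        txn = entry.get("txn")
        if kind == "op":
            # Plain, non-transactional write: apply immediately.
            applied.append(entry["op"])
        elif kind == "prepare":
            # A large transaction may arrive as several prepare entries.
            pending.setdefault(txn, []).extend(entry["ops"])
        elif kind == "commit":
            # Treat the txn as non-distributed: apply everything buffered.
            applied.extend(pending.pop(txn, []))
        elif kind == "abort":
            pending.pop(txn, None)
    return applied, dict(pending)


if __name__ == "__main__":
    sample = [
        {"kind": "prepare", "txn": "t1", "ops": ["a", "b"]},
        {"kind": "op", "op": "x"},
        {"kind": "commit", "txn": "t1"},
        {"kind": "prepare", "txn": "t2", "ops": ["c"]},
    ]
    print(replay(sample))
```

In this sketch, `t2` stays in `pending` because its commit never appears in this shard's oplog. That is the leftover case a shard has to settle by checking what other shards committed.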
Distributed transactions left without a commit message at the end of the oplog should be committed only if the data is complete (all prepared messages were observed) and the txn was committed by at least one other shard. For that, each shard saves the last 100 distributed transactions it committed, so other shards can check whether they should commit their leftovers. We store the last 100 because prepared messages and commits might be separated by other oplog events, so several commit messages may be cut off on some shards but present on others. Given that the oplog events of dist txns are more or less aligned in [cluster]time, checking the last 100 should be more than enough.

If a transaction is larger than 16 MB, it is split into several prepared messages. So it might happen that one shard committed the txn while another has not observed all prepared messages by the end of the oplog. In such a case we should report it in the logs and in describe-restore.

This commit also adds restore stats for dist transactions (to the restore metadata):

- `partial` - the number of transactions that were applied on other shards but can't be applied on this one, since not all prepared messages got into the oplog (shouldn't happen).
- `shard_uncommitted` - the number of uncommitted transactions before the sync. The transaction is complete, but there is no commit message in the oplog of this shard.
- `left_uncommitted` - the number of transactions that remain uncommitted after the sync. The transaction is complete, but there is no commit message in the oplog of any shard.
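The leftover check and the three counters can be illustrated with a short Python sketch. It is not PBM's Go implementation: `new_history`, `resolve_leftovers`, and the `complete` flag (whether all prepared messages were observed) are hypothetical names, and how the per-shard lists are exchanged between shards is left out.

```python
from collections import deque

MAX_COMMITTED = 100  # each shard keeps its last 100 committed dist txns


def new_history():
    """Bounded history: older commits fall off once 100 are stored."""
    return deque(maxlen=MAX_COMMITTED)


def resolve_leftovers(leftovers, committed_elsewhere):
    """Decide which leftover txns to commit and compute the restore stats.

    leftovers: dict of txn id -> complete (True if all prepared messages
               were observed on this shard).
    committed_elsewhere: set of txn ids committed by any other shard.
    """
    stats = {
        "partial": 0,
        # Complete txns with no commit message in this shard's oplog.
        "shard_uncommitted": sum(1 for done in leftovers.values() if done),
        "left_uncommitted": 0,
    }
    to_commit = []
    for txn_id, complete in leftovers.items():
        if txn_id in committed_elsewhere:
            if complete:
                to_commit.append(txn_id)
            else:
                # Committed elsewhere, but data is incomplete here.
                stats["partial"] += 1
        elif complete:
            # Complete, but no shard has a commit message for it.
            stats["left_uncommitted"] += 1
    return to_commit, stats


if __name__ == "__main__":
    other_shard = new_history()
    other_shard.extend(["t1", "t2"])
    leftovers = {"t1": True, "t2": False, "t3": True}
    print(resolve_leftovers(leftovers, set(other_shard)))
```

A transaction counted as `partial` is the one worth surfacing in the logs and in describe-restore, since another shard has already committed it.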
Showing 17 changed files with 1,038 additions and 857 deletions.