Crash during update of snapshot causes loss of data #51

Open
palmskog opened this issue May 21, 2017 · 0 comments

palmskog commented May 21, 2017

From @pfons on April 13, 2016 0:29

A crash of the server while it executes the function that writes a snapshot to disk (updating the existing snapshot) can cause loss of data and prevent the server from recovering correctly afterwards.

This bug is more serious than issue #50 because it can lead to loss of data. The save function deletes/truncates the existing on-disk snapshot before the new snapshot has been safely written to disk, so a crash during save can leave the server with neither the old snapshot nor the new one.

This problem can be reproduced by simulating a crash immediately after the snapshot file is opened with O_TRUNC (save function in Shim.ml) and before the write is actually made, for example, by adding the statement assert(env.saves < 10000);.
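The failure window can be illustrated outside the server with a small standalone sketch. It is not the code from Shim.ml: `unsafe_save`, `Simulated_crash`, and the file names are hypothetical, and an exception stands in for the process dying. It relies only on the fact that `open_out_bin` truncates an existing file as soon as it is opened.

```ocaml
(* Hypothetical stand-in for a process crash. *)
exception Simulated_crash

(* Read the whole file at [path] into a string. *)
let read_file path =
  let ic = open_in_bin path in
  let n = in_channel_length ic in
  let s = really_input_string ic n in
  close_in ic;
  s

(* Truncate-then-write save, as described in the issue.
   [open_out_bin] empties the file immediately, so a crash before
   [output_string] leaves an empty snapshot behind. *)
let unsafe_save ~crash path contents =
  let oc = open_out_bin path in
  if crash then begin
    close_out_noerr oc;
    raise Simulated_crash
  end;
  output_string oc contents;
  close_out oc

(* Write an old snapshot, crash while saving a new one, and return
   whatever is left on disk. *)
let demo_unsafe () =
  let path = Filename.temp_file "snapshot" ".bin" in
  unsafe_save ~crash:false path "old snapshot";
  (try unsafe_save ~crash:true path "new snapshot"
   with Simulated_crash -> ());
  let survived = read_file path in
  Sys.remove path;
  survived
```

Here `demo_unsafe ()` returns the empty string: the old snapshot is already gone and the new one was never written.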

It is probably harder to fix this bug than issue #50 because a correct implementation needs to ensure that several steps (i.e., replacing the old snapshot with the new snapshot and truncating the log) are atomic despite crashes.
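One common way to make the snapshot replacement itself atomic is to write the new snapshot to a temporary file and then rename it over the old one. The sketch below shows only that part, under stated assumptions: `atomic_save`, `Simulated_crash`, and the `.tmp` suffix are hypothetical names, an exception stands in for a crash, and the code is not taken from Shim.ml. It does not address truncating the log in the same atomic step, which is the harder half of the problem.

```ocaml
(* Hypothetical stand-in for a process crash. *)
exception Simulated_crash

(* Read the whole file at [path] into a string. *)
let read_file path =
  let ic = open_in_bin path in
  let n = in_channel_length ic in
  let s = really_input_string ic n in
  close_in ic;
  s

(* Write the new snapshot to a temporary file, then rename it over the
   old one. A crash before the rename only damages the temporary file.
   Assumption: a durable version would also fsync the temporary file
   (for example with Unix.fsync) before renaming; that is omitted here
   to keep the sketch free of external libraries. *)
let atomic_save ?(crash = false) path contents =
  let tmp = path ^ ".tmp" in
  let oc = open_out_bin tmp in
  if crash then begin
    close_out_noerr oc;
    raise Simulated_crash
  end;
  output_string oc contents;
  close_out oc;
  Sys.rename tmp path

(* Crash during a save, then retry; return what was on disk after the
   crash and after the successful retry. *)
let demo_atomic () =
  let path = Filename.temp_file "snapshot" ".bin" in
  atomic_save path "old snapshot";
  (try atomic_save ~crash:true path "new snapshot"
   with Simulated_crash -> ());
  let after_crash = read_file path in
  atomic_save path "new snapshot";
  let after_retry = read_file path in
  Sys.remove path;
  (try Sys.remove (path ^ ".tmp") with Sys_error _ -> ());
  (after_crash, after_retry)
```

In this sketch the old snapshot survives the simulated crash, and the retry installs the new one. A complete fix would still need a recovery rule for a crash between replacing the snapshot and truncating the log.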

Copied from original issue: uwplse/verdi#39
