Skip to content

Commit

Permalink
Clone: fix race condition that may skip some SST files (#1274)
Browse files Browse the repository at this point in the history
Summary:
A donor thread in the main copy loop took a file to send, released the donor state mutex, and then opened the file. If an ENOENT was received at this point, it was assumed that this was a stale SST file from an older checkpoint that was rolled since.

Because the donor state mutex was released between taking of the file and opening it, the following race was possible:
1) Thread 1 takes the file
2) Thread 2 decides to roll the checkpoint, the old checkpoint is deleted 3) Thread 1 tries to open the file, gets ENOENT
4) Thread 2 creates the new checkpoint, the file re-appears, but it's too late.

Rolling the checkpoint in a donor state mutex critical section is a possible fix, but such section would do a lot of I/O, serializing the parallel threads. Instead, fix by taking the file and opening it in the same critical section.

Pull Request resolved: #1274

Reviewed By: sunshine-Chun

Differential Revision: D43629546

Pulled By: hermanlee

fbshipit-source-id: 6e7f315
  • Loading branch information
laurynas-biveinis authored and facebook-github-bot committed Feb 27, 2023
1 parent 1854db1 commit 526e737
Showing 1 changed file with 9 additions and 8 deletions.
17 changes: 9 additions & 8 deletions storage/rocksdb/clone/donor.cc
Original file line number Diff line number Diff line change
Expand Up @@ -525,7 +525,7 @@ class [[nodiscard]] donor final : public myrocks::clone::session {
}

[[nodiscard]] std::string path(const std::string &fn) const {
assert(get_state() != donor_state::INITIAL);
assert(m_state != donor_state::INITIAL);

return myrocks::has_file_extension(fn, ".log")
? myrocks::rdb_concat_paths(myrocks::get_wal_dir(), fn)
Expand Down Expand Up @@ -780,11 +780,6 @@ int donor::copy(const THD *thd, uint task_id, Ha_clone_cbk &cbk) {
m_state == donor_state::FINAL_CHECKPOINT ||
m_state == donor_state::FINAL_CHECKPOINT_WITH_LOGS);

mysql_mutex_unlock(&m_donor_mutex);

auto err = handle_any_error(thd);
if (err != 0) return err;

const auto &name = metadata.get_name();
const auto donor_file_path = path(name);
auto open_flags = O_RDONLY;
Expand All @@ -804,18 +799,24 @@ int donor::copy(const THD *thd, uint task_id, Ha_clone_cbk &cbk) {
"Not found, assuming old checkpoint: %s",
donor_file_path.c_str());
assert(myrocks::has_file_extension(donor_file_path, ".sst"));
mysql_mutex_lock(&m_donor_mutex);
const auto erased_count [[maybe_unused]] =
m_in_progress_files.erase(metadata.get_id());
mysql_mutex_unlock(&m_donor_mutex);
assert(erased_count == 1);
mysql_mutex_unlock(&m_donor_mutex);
continue;
}
mysql_mutex_unlock(&m_donor_mutex);

MyOsError(errn, EE_FILENOTFOUND, MYF(0), donor_file_path.c_str());
save_error(ER_CANT_OPEN_FILE, donor_file_path, errn);
return ER_CANT_OPEN_FILE;
}

mysql_mutex_unlock(&m_donor_mutex);

auto err = handle_any_error(thd);
if (err != 0) return err;

#ifdef __APPLE__
if (use_direct_io && fcntl(fd, F_NOCACHE, 1) == -1) {
const auto errn = my_errno();
Expand Down

0 comments on commit 526e737

Please sign in to comment.