Skip to content

Commit

Permalink
Fix some secondary/read-only DB logic (#13441)
Browse files Browse the repository at this point in the history
Summary:
Primarily, fix an issue from #13316 with opening secondary DB with preserve/preclude option (crash test disable in #13439). The issue comes down to mixed-up interpretations of "read_only" which should now be resolved. I've introduced the stronger notion of "unchanging" which means the VersionSet never sees any changes to the LSM tree, and the weaker notion of "read_only" which means LSM tree changes are not written through this VersionSet/etc. but can pick up externally written changes. In particular, ManifestTailer should use read_only=true (along with unchanging=false) for proper handling of preserve/preclude options.

A new assertion in VersionSet::CreateColumnFamily to help ensure sane usage of the two boolean flags is incompatible with the known wart of allowing CreateColumnFamily on a read-only DB. So to keep that assertion, I have fixed that issue by disallowing it. And this in turn required downstream clean-up in ldb, where I cleaned up some call sites as well.

Also, rename SanitizeOptions for ColumnFamilyOptions to SanitizeCfOptions, for ease of search etc.

Pull Request resolved: #13441

Test Plan:
* Added preserve option to a test in db_secondary_test, which reproduced the failure seen in the crash test.
* Revert #13439 to re-enable crash test functionality
* Update some tests to deal with disallowing CF creation on read-only DB
* Add some testing around read-only DBs and CreateColumnFamily(ies)
* Resurrect a nearby test for read-only DB to be sure it doesn't write to the DB dir. New EnforcedReadOnlyReopen should probably be used in more places but didn't want to attempt a big migration here and now. (Suggested follow-up.)

Reviewed By: jowlyzhang

Differential Revision: D70808033

Pulled By: pdillinger

fbshipit-source-id: 486b4e9f9c9045150a0ebb9cb302753d03932a3f
  • Loading branch information
pdillinger authored and facebook-github-bot committed Mar 7, 2025
1 parent 5d1c0a8 commit b9c7481
Show file tree
Hide file tree
Showing 19 changed files with 161 additions and 84 deletions.
8 changes: 4 additions & 4 deletions db/column_family.cc
Original file line number Diff line number Diff line change
Expand Up @@ -200,9 +200,9 @@ const uint64_t kDefaultTtl = 0xfffffffffffffffe;
const uint64_t kDefaultPeriodicCompSecs = 0xfffffffffffffffe;
} // anonymous namespace

ColumnFamilyOptions SanitizeOptions(const ImmutableDBOptions& db_options,
bool read_only,
const ColumnFamilyOptions& src) {
ColumnFamilyOptions SanitizeCfOptions(const ImmutableDBOptions& db_options,
bool read_only,
const ColumnFamilyOptions& src) {
ColumnFamilyOptions result = src;
size_t clamp_max = std::conditional<
sizeof(size_t) == 4, std::integral_constant<size_t, 0xffffffff>,
Expand Down Expand Up @@ -569,7 +569,7 @@ ColumnFamilyData::ColumnFamilyData(
dropped_(false),
flush_skip_reschedule_(false),
internal_comparator_(cf_options.comparator),
initial_cf_options_(SanitizeOptions(db_options, read_only, cf_options)),
initial_cf_options_(SanitizeCfOptions(db_options, read_only, cf_options)),
ioptions_(db_options, initial_cf_options_),
mutable_cf_options_(initial_cf_options_),
is_delete_range_supported_(
Expand Down
6 changes: 3 additions & 3 deletions db/column_family.h
Original file line number Diff line number Diff line change
Expand Up @@ -281,9 +281,9 @@ Status CheckConcurrentWritesSupported(const ColumnFamilyOptions& cf_options);
Status CheckCFPathsSupported(const DBOptions& db_options,
const ColumnFamilyOptions& cf_options);

ColumnFamilyOptions SanitizeOptions(const ImmutableDBOptions& db_options,
bool read_only,
const ColumnFamilyOptions& src);
ColumnFamilyOptions SanitizeCfOptions(const ImmutableDBOptions& db_options,
bool read_only,
const ColumnFamilyOptions& src);
// Wrap user defined table properties collector factories `from cf_options`
// into internal ones in internal_tbl_prop_coll_factories. Add a system internal
// one too.
Expand Down
8 changes: 4 additions & 4 deletions db/column_family_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -271,8 +271,8 @@ class ColumnFamilyTestBase : public testing::Test {
// them.
ASSERT_OK(RocksDBOptionsParser::VerifyCFOptions(
ConfigOptions(), desc.options,
SanitizeOptions(dbfull()->immutable_db_options(), /*read_only*/ false,
current_cf_opt)));
SanitizeCfOptions(dbfull()->immutable_db_options(),
/*read_only*/ false, current_cf_opt)));
cfi++;
}
}
Expand Down Expand Up @@ -2233,7 +2233,7 @@ TEST_P(ColumnFamilyTest, CreateMissingColumnFamilies) {
ASSERT_EQ(my_fs->options_files_created.load(), 2);
}

TEST_P(ColumnFamilyTest, SanitizeOptions) {
TEST_P(ColumnFamilyTest, SanitizeCfOptions) {
DBOptions db_options;
for (int s = kCompactionStyleLevel; s <= kCompactionStyleUniversal; ++s) {
for (int l = 0; l <= 2; l++) {
Expand All @@ -2249,7 +2249,7 @@ TEST_P(ColumnFamilyTest, SanitizeOptions) {
original.write_buffer_size =
l * 4 * 1024 * 1024 + i * 1024 * 1024 + j * 1024 + k;

ColumnFamilyOptions result = SanitizeOptions(
ColumnFamilyOptions result = SanitizeCfOptions(
ImmutableDBOptions(db_options), /*read_only*/ false, original);
ASSERT_TRUE(result.level0_stop_writes_trigger >=
result.level0_slowdown_writes_trigger);
Expand Down
35 changes: 25 additions & 10 deletions db/db_basic_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -161,6 +161,7 @@ TEST_F(DBBasicTest, UniqueSession) {

ASSERT_EQ(sid2, sid3);

DestroyAndReopen(options);
CreateAndReopenWithCF({"goku"}, options);
ASSERT_OK(db_->GetDbSessionId(sid1));
ASSERT_OK(Put("bar", "e1"));
Expand All @@ -179,6 +180,7 @@ TEST_F(DBBasicTest, UniqueSession) {
TEST_F(DBBasicTest, ReadOnlyDB) {
ASSERT_OK(Put("foo", "v1"));
ASSERT_OK(Put("bar", "v2"));
ASSERT_OK(Flush());
ASSERT_OK(Put("foo", "v3"));
Close();

Expand Down Expand Up @@ -208,37 +210,50 @@ TEST_F(DBBasicTest, ReadOnlyDB) {

auto options = CurrentOptions();
assert(options.env == env_);
ASSERT_OK(ReadOnlyReopen(options));
ASSERT_OK(EnforcedReadOnlyReopen(options));
ASSERT_EQ("v3", Get("foo"));
ASSERT_EQ("v2", Get("bar"));
verify_all_iters();
ASSERT_EQ(Flush().code(), Status::Code::kNotSupported);
Close();

// Reopen and flush memtable.
Reopen(options);
ASSERT_OK(Flush());
Close();
// Now check keys in read only mode.
ASSERT_OK(ReadOnlyReopen(options));
ASSERT_OK(EnforcedReadOnlyReopen(options));
ASSERT_EQ("v3", Get("foo"));
ASSERT_EQ("v2", Get("bar"));
verify_all_iters();
ASSERT_TRUE(db_->SyncWAL().IsNotSupported());
ASSERT_EQ(db_->SyncWAL().code(), Status::Code::kNotSupported);

// More ops that should fail
std::vector<ColumnFamilyHandle*> cfhs{{}};
ASSERT_EQ(db_->CreateColumnFamily(options, "blah", &cfhs[0]).code(),
Status::Code::kNotSupported);

ASSERT_EQ(db_->CreateColumnFamilies(options, {"blah"}, &cfhs).code(),
Status::Code::kNotSupported);

std::vector<ColumnFamilyDescriptor> cfds;
cfds.push_back({"blah", options});
ASSERT_EQ(db_->CreateColumnFamilies(cfds, &cfhs).code(),
Status::Code::kNotSupported);
}

// TODO akanksha: Update the test to check that combination
// does not actually write to FS (use open read-only with
// CompositeEnvWrapper+ReadOnlyFileSystem).
TEST_F(DBBasicTest, DISABLED_ReadOnlyDBWithWriteDBIdToManifestSet) {
TEST_F(DBBasicTest, ReadOnlyDBWithWriteDBIdToManifestSet) {
auto options = CurrentOptions();
options.write_dbid_to_manifest = false;
DestroyAndReopen(options);
ASSERT_OK(Put("foo", "v1"));
ASSERT_OK(Put("bar", "v2"));
ASSERT_OK(Put("foo", "v3"));
Close();

auto options = CurrentOptions();
options.write_dbid_to_manifest = true;
assert(options.env == env_);
ASSERT_OK(ReadOnlyReopen(options));
ASSERT_OK(EnforcedReadOnlyReopen(options));
std::string db_id1;
ASSERT_OK(db_->GetDbIdentity(db_id1));
ASSERT_EQ("v3", Get("foo"));
Expand All @@ -258,7 +273,7 @@ TEST_F(DBBasicTest, DISABLED_ReadOnlyDBWithWriteDBIdToManifestSet) {
ASSERT_OK(Flush());
Close();
// Now check keys in read only mode.
ASSERT_OK(ReadOnlyReopen(options));
ASSERT_OK(EnforcedReadOnlyReopen(options));
ASSERT_EQ("v3", Get("foo"));
ASSERT_EQ("v2", Get("bar"));
ASSERT_TRUE(db_->SyncWAL().IsNotSupported());
Expand Down
4 changes: 2 additions & 2 deletions db/db_impl/db_impl_open.cc
Original file line number Diff line number Diff line change
Expand Up @@ -35,8 +35,8 @@ Options SanitizeOptions(const std::string& dbname, const Options& src,
auto db_options =
SanitizeOptions(dbname, DBOptions(src), read_only, logger_creation_s);
ImmutableDBOptions immutable_db_options(db_options);
auto cf_options = SanitizeOptions(immutable_db_options, read_only,
ColumnFamilyOptions(src));
auto cf_options = SanitizeCfOptions(immutable_db_options, read_only,
ColumnFamilyOptions(src));
return Options(db_options, cf_options);
}

Expand Down
23 changes: 23 additions & 0 deletions db/db_impl/db_impl_readonly.h
Original file line number Diff line number Diff line change
Expand Up @@ -155,6 +155,29 @@ class DBImplReadOnly : public DBImpl {
return Status::NotSupported("Not supported operation in read only mode.");
}

using DB::CreateColumnFamily;
using DBImpl::CreateColumnFamily;
Status CreateColumnFamily(const ColumnFamilyOptions& /*cf_options*/,
const std::string& /*column_family*/,
ColumnFamilyHandle** /*handle*/) override {
return Status::NotSupported("Not supported operation in read only mode.");
}

using DB::CreateColumnFamilies;
using DBImpl::CreateColumnFamilies;
Status CreateColumnFamilies(
const ColumnFamilyOptions& /*cf_options*/,
const std::vector<std::string>& /*column_family_names*/,
std::vector<ColumnFamilyHandle*>* /*handles*/) override {
return Status::NotSupported("Not supported operation in read only mode.");
}

Status CreateColumnFamilies(
const std::vector<ColumnFamilyDescriptor>& /*column_families*/,
std::vector<ColumnFamilyHandle*>* /*handles*/) override {
return Status::NotSupported("Not supported operation in read only mode.");
}

// FIXME: some missing overrides for more "write" functions

protected:
Expand Down
2 changes: 1 addition & 1 deletion db/db_options_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,7 @@ class DBOptionsTest : public DBTestBase {
ImmutableDBOptions db_options(options);
test::RandomInitCFOptions(&options, options, rnd);
auto sanitized_options =
SanitizeOptions(db_options, /*read_only*/ false, options);
SanitizeCfOptions(db_options, /*read_only*/ false, options);
auto opt_map = GetMutableCFOptionsMap(sanitized_options);
delete options.compaction_filter;
return opt_map;
Expand Down
1 change: 1 addition & 0 deletions db/db_secondary_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -160,6 +160,7 @@ TEST_F(DBSecondaryTest, NonExistingDb) {
TEST_F(DBSecondaryTest, ReopenAsSecondary) {
Options options;
options.env = env_;
options.preserve_internal_time_seconds = 300;
Reopen(options);
ASSERT_OK(Put("foo", "foo_value"));
ASSERT_OK(Put("bar", "bar_value"));
Expand Down
12 changes: 12 additions & 0 deletions db/db_test_util.cc
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@

#include "cache/cache_reservation_manager.h"
#include "db/forward_iterator.h"
#include "env/fs_readonly.h"
#include "env/mock_env.h"
#include "port/lang.h"
#include "rocksdb/cache.h"
Expand Down Expand Up @@ -716,6 +717,17 @@ Status DBTestBase::ReadOnlyReopen(const Options& options) {
return DB::OpenForReadOnly(options, dbname_, &db_);
}

Status DBTestBase::EnforcedReadOnlyReopen(const Options& options) {
Close();
Options options_copy = options;
MaybeInstallTimeElapseOnlySleep(options_copy);
auto fs_read_only =
std::make_shared<ReadOnlyFileSystem>(env_->GetFileSystem());
env_read_only_ = std::make_shared<CompositeEnvWrapper>(env_, fs_read_only);
options_copy.env = env_read_only_.get();
return DB::OpenForReadOnly(options_copy, dbname_, &db_);
}

Status DBTestBase::TryReopen(const Options& options) {
Close();
last_options_.table_factory.reset();
Expand Down
4 changes: 4 additions & 0 deletions db/db_test_util.h
Original file line number Diff line number Diff line change
Expand Up @@ -1062,6 +1062,7 @@ class DBTestBase : public testing::Test {
MockEnv* mem_env_;
Env* encrypted_env_;
SpecialEnv* env_;
std::shared_ptr<Env> env_read_only_;
std::shared_ptr<Env> env_guard_;
DB* db_;
std::vector<ColumnFamilyHandle*> handles_;
Expand Down Expand Up @@ -1178,6 +1179,9 @@ class DBTestBase : public testing::Test {

Status ReadOnlyReopen(const Options& options);

// With a filesystem wrapper that fails on attempted write
Status EnforcedReadOnlyReopen(const Options& options);

Status TryReopen(const Options& options);

bool IsDirectIOSupported();
Expand Down
10 changes: 6 additions & 4 deletions db/repair.cc
Original file line number Diff line number Diff line change
Expand Up @@ -100,13 +100,15 @@ class Repairer {
db_options_(SanitizeOptions(dbname_, db_options)),
immutable_db_options_(ImmutableDBOptions(db_options_)),
icmp_(default_cf_opts.comparator),
default_cf_opts_(SanitizeOptions(immutable_db_options_,
/*read_only*/ false, default_cf_opts)),
default_cf_opts_(SanitizeCfOptions(immutable_db_options_,
/*read_only*/ false,
default_cf_opts)),
default_iopts_(
ImmutableOptions(immutable_db_options_, default_cf_opts_)),
default_mopts_(MutableCFOptions(default_cf_opts_)),
unknown_cf_opts_(SanitizeOptions(immutable_db_options_,
/*read_only*/ false, unknown_cf_opts)),
unknown_cf_opts_(SanitizeCfOptions(immutable_db_options_,
/*read_only*/ false,
unknown_cf_opts)),
create_unknown_cfs_(create_unknown_cfs),
raw_table_cache_(
// TableCache can be small since we expect each table to be opened
Expand Down
2 changes: 1 addition & 1 deletion db/version_edit_handler.cc
Original file line number Diff line number Diff line change
Expand Up @@ -408,7 +408,7 @@ void VersionEditHandler::CheckIterationResult(const log::Reader& reader,
if (cfd->IsDropped()) {
continue;
}
if (read_only_) {
if (version_set_->unchanging()) {
cfd->table_cache()->SetTablesAreImmortal();
}
*s = LoadTables(cfd, /*prefetch_index_and_filter_in_cache=*/false,
Expand Down
12 changes: 7 additions & 5 deletions db/version_edit_handler.h
Original file line number Diff line number Diff line change
Expand Up @@ -198,7 +198,9 @@ class VersionEditHandler : public VersionEditHandlerBase {
bool prefetch_index_and_filter_in_cache,
bool is_initial_load);

virtual bool MustOpenAllColumnFamilies() const { return !read_only_; }
virtual bool MustOpenAllColumnFamilies() const {
return !version_set_->unchanging();
}

const bool read_only_;
std::vector<ColumnFamilyDescriptor> column_families_;
Expand Down Expand Up @@ -334,10 +336,10 @@ class ManifestTailer : public VersionEditHandlerPointInTime {
const ReadOptions& read_options,
EpochNumberRequirement epoch_number_requirement =
EpochNumberRequirement::kMustPresent)
: VersionEditHandlerPointInTime(/*read_only=*/false, column_families,
version_set, io_tracer, read_options,
/*allow_incomplete_valid_version=*/false,
epoch_number_requirement),
: VersionEditHandlerPointInTime(
/*read_only=*/true, column_families, version_set, io_tracer,
read_options,
/*allow_incomplete_valid_version=*/false, epoch_number_requirement),
mode_(Mode::kRecovery) {}

Status VerifyFile(ColumnFamilyData* cfd, const std::string& fpath, int level,
Expand Down
10 changes: 6 additions & 4 deletions db/version_set.cc
Original file line number Diff line number Diff line change
Expand Up @@ -5114,7 +5114,7 @@ VersionSet::VersionSet(
BlockCacheTracer* const block_cache_tracer,
const std::shared_ptr<IOTracer>& io_tracer, const std::string& db_id,
const std::string& db_session_id, const std::string& daily_offpeak_time_utc,
ErrorHandler* const error_handler, const bool read_only)
ErrorHandler* error_handler, bool unchanging)
: column_family_set_(new ColumnFamilySet(
dbname, _db_options, storage_options, table_cache,
write_buffer_manager, write_controller, block_cache_tracer, io_tracer,
Expand Down Expand Up @@ -5143,12 +5143,12 @@ VersionSet::VersionSet(
db_session_id_(db_session_id),
offpeak_time_option_(OffpeakTimeOption(daily_offpeak_time_utc)),
error_handler_(error_handler),
read_only_(read_only),
unchanging_(unchanging),
closed_(false) {}

Status VersionSet::Close(FSDirectory* db_dir, InstrumentedMutex* mu) {
Status s;
if (closed_ || read_only_ || !manifest_file_number_ || !descriptor_log_) {
if (closed_ || unchanging_ || !manifest_file_number_ || !descriptor_log_) {
return s;
}

Expand Down Expand Up @@ -7297,6 +7297,8 @@ ColumnFamilyData* VersionSet::CreateColumnFamily(
const ColumnFamilyOptions& cf_options, const ReadOptions& read_options,
const VersionEdit* edit, bool read_only) {
assert(edit->IsColumnFamilyAdd());
// Unchanging LSM tree implies no writes to the CF
assert(!unchanging_ || read_only);

MutableCFOptions dummy_cf_options;
Version* dummy_versions =
Expand Down Expand Up @@ -7430,7 +7432,7 @@ ReactiveVersionSet::ReactiveVersionSet(
write_buffer_manager, write_controller,
/*block_cache_tracer=*/nullptr, io_tracer, /*db_id*/ "",
/*db_session_id*/ "", /*daily_offpeak_time_utc*/ "",
/*error_handler=*/nullptr, /*read_only=*/true) {}
/*error_handler=*/nullptr, /*unchanging=*/false) {}

ReactiveVersionSet::~ReactiveVersionSet() = default;

Expand Down
Loading

0 comments on commit b9c7481

Please sign in to comment.