-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix some secondary/read-only DB logic #13441
Conversation
Summary: Primarily, fix an issue from facebook#13316 with opening secondary DB with preserve/preclude option (crash test disable in facebook#13439). The issue comes down to mixed-up interpretations of "read_only" which should now be resolved. I've introduced the stronger notion of "unchanging" which means the VersionSet never sees any changes to the LSM tree, and the weaker notion of "read_only" which means LSM tree changes are not written through this VersionSet/etc. but can pick up externally written changes. In particular, ManifestTailer should use read_only=true (along with unchanging=false) for proper handling of preserve/preclude options. A new assertion in VersionSet::CreateColumnFamily to help ensure sane usage of the two boolean flags is incompatible with the known wart of allowing CreateColumnFamily on a read-only DB. So to keep that assertion, I have fixed that issue by disallowing it. And this in turn required downstream clean-up in ldb, where I cleaned up some call sites as well. Also, rename SanitizeOptions for ColumnFamilyOptions to SanitizeCfOptions, for ease of search etc. Test Plan: * Added preserve option to a test in db_secondary_test, which reproduced the failure seen in the crash test. * Revert facebook#13439 to re-enable crash test functionality * Update some tests to deal with disallowing CF creation on read-only DB * Add some testing around read-only DBs and CreateColumnFamily(ies) * Resurrect a nearby test for read-only DB to be sure it doesn't write to the DB dir. New EnforcedReadOnlyReopen should probably be used in more places but didn't want to attempt a big migration here and now. (Suggested follow-up.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really hard to detangle, thank you for the improvement! I have some general questions.
ColumnFamilyOptions SanitizeOptions(const ImmutableDBOptions& db_options, | ||
bool read_only, | ||
const ColumnFamilyOptions& src) { | ||
ColumnFamilyOptions SanitizeCfOptions(const ImmutableDBOptions& db_options, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
tools/ldb_cmd.cc
Outdated
@@ -1776,7 +1776,7 @@ FileChecksumDumpCommand::FileChecksumDumpCommand( | |||
const std::vector<std::string>& /*params*/, | |||
const std::map<std::string, std::string>& options, | |||
const std::vector<std::string>& flags) | |||
: LDBCommand(options, flags, false, | |||
: LDBCommand(options, flags, false /* is_read_only */, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this command be read only too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I'll see if I can safely sneak that into this PR. (I'll double-check that there are sufficient existing unit tests.)
@@ -155,6 +155,29 @@ class DBImplReadOnly : public DBImpl { | |||
return Status::NotSupported("Not supported operation in read only mode."); | |||
} | |||
|
|||
using DB::CreateColumnFamily; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess drop column family doesn't affect the options sanitization, but if what we are going with unchanging
is no LSM trees mutation across the DB, no manifest writes(usage of ReadOnlyFileSystem in test) , should we consider disabling user initiated dropping column family too? Are we considering as a follow up to disable these for DBImplSecondary
too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, drop column family should probably be disallowed also, but I didn't want to keep adding to the scope of this PR. :) I left the FIXME in place in db_impl_readonly.h which likely includes more remaining than just drop CF.
The story with secondary instance is more complex than I care to attempt to resolve at the moment.
Lines 316 to 322 in 04b53a2
// Column families created by the primary after the secondary instance starts | |
// are currently ignored by the secondary instance. Column families opened | |
// by secondary and dropped by the primary will be dropped by secondary as | |
// well (on next invocation of TryCatchUpWithPrimary()). However the user | |
// of the secondary instance can still access the data of such dropped column | |
// family as long as they do not destroy the corresponding column family | |
// handle. |
tools/ldb_cmd.cc
Outdated
@@ -1962,7 +1964,7 @@ DropColumnFamilyCommand::DropColumnFamilyCommand( | |||
const std::vector<std::string>& params, | |||
const std::map<std::string, std::string>& options, | |||
const std::vector<std::string>& flags) | |||
: LDBCommand(options, flags, true, {ARG_DB}) { | |||
: LDBCommand(options, flags, true /* is_read_only */, {ARG_DB}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: If we decide to not allow read only DB drop column family, this command will need to open in non read only mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. It shouldn't hurt to preemptively make this right, so I'll try that.
@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@pdillinger merged this pull request in b9c7481. |
Summary:
Primarily, fix an issue from #13316 with opening secondary DB with preserve/preclude option (crash test disable in #13439). The issue comes down to mixed-up interpretations of "read_only" which should now be resolved. I've introduced the stronger notion of "unchanging" which means the VersionSet never sees any changes to the LSM tree, and the weaker notion of "read_only" which means LSM tree changes are not written through this VersionSet/etc. but can pick up externally written changes. In particular, ManifestTailer should use read_only=true (along with unchanging=false) for proper handling of preserve/preclude options.
A new assertion in VersionSet::CreateColumnFamily to help ensure sane usage of the two boolean flags is incompatible with the known wart of allowing CreateColumnFamily on a read-only DB. So to keep that assertion, I have fixed that issue by disallowing it. And this in turn required downstream clean-up in ldb, where I cleaned up some call sites as well.
Also, rename SanitizeOptions for ColumnFamilyOptions to SanitizeCfOptions, for ease of search etc.
Test Plan: