HDDS-11746. Support arbitrary schemas as an option in ozone debug ldb. #7652

Draft
wants to merge 6 commits into
base: master
from

Conversation

Tejaskriya
Contributor

What changes were proposed in this pull request?

ozone debug ldb automatically detects which schema to use for the provided DB based on its name. However, the tool may not yet have been updated to handle a new DB or format. To cover these cases before full support is added, this patch adds a --schema= option that accepts a Java class on the classpath that implements DBDefinition and uses it as the schema for displaying the provided DB.
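
As a rough illustration of the idea described above, the following minimal sketch loads a class by name, checks that it implements a definition interface, and instantiates it reflectively. It is not the code from this patch: the nested DBDefinition interface is a simplified stand-in for Ozone's real interface, ExampleDefinition and load are hypothetical names, and the use of a no-argument constructor is an assumption.

```java
import java.lang.reflect.Constructor;

public class SchemaLoaderSketch {

  /** Simplified stand-in for Ozone's DBDefinition (assumption: the real interface has more methods). */
  public interface DBDefinition {
    String getName();
  }

  /** Hypothetical definition used only to exercise the loader. */
  public static class ExampleDefinition implements DBDefinition {
    public ExampleDefinition() { }

    @Override
    public String getName() {
      return "example.db";
    }
  }

  /** Loads the named class from the classpath and instantiates it as a DBDefinition. */
  public static DBDefinition load(String className) {
    try {
      Class<?> clazz = Class.forName(className);
      if (!DBDefinition.class.isAssignableFrom(clazz)) {
        throw new IllegalArgumentException(
            className + " does not implement DBDefinition");
      }
      // Assumption: the definition exposes a no-argument constructor.
      Constructor<?> ctor = clazz.getDeclaredConstructor();
      return (DBDefinition) ctor.newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException(
          "Cannot load schema class " + className, e);
    }
  }

  public static void main(String[] args) {
    DBDefinition def = load(ExampleDefinition.class.getName());
    System.out.println(def.getName());
  }
}
```

Rejecting classes that do not implement the interface up front gives the user a clear error instead of a ClassCastException later in the command.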

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11746

How was this patch tested?

Added unit test in TestDBDefinitionFactory and integration test in TestLDBCli.
Also added a test in TestDBDefinitionFactory to ensure that any DBDefinitions added in the future will be usable through the newly introduced option (i.e., that the required constructor is present).
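
A constructor check of this kind could look roughly like the sketch below. It is only an illustration, not the test added in this patch: the class names are hypothetical, and it assumes the required constructor is a no-argument one, which the description does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ConstructorCheckSketch {

  /** Hypothetical class that has the assumed no-argument constructor. */
  public static class WithNoArg {
    public WithNoArg() { }
  }

  /** Hypothetical class that only has a one-argument constructor. */
  public static class WithoutNoArg {
    public WithoutNoArg(String path) { }
  }

  /** Returns the simple names of classes that lack a no-argument constructor. */
  public static List<String> missingNoArgConstructor(List<Class<?>> classes) {
    List<String> missing = new ArrayList<>();
    for (Class<?> c : classes) {
      try {
        c.getDeclaredConstructor();
      } catch (NoSuchMethodException e) {
        missing.add(c.getSimpleName());
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    List<Class<?>> classes = Arrays.asList(WithNoArg.class, WithoutNoArg.class);
    System.out.println(missingNoArgConstructor(classes)); // prints [WithoutNoArg]
  }
}
```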

@kerneltime
Contributor

As with any CLI command, please include robot tests.

@Tejaskriya Tejaskriya marked this pull request as ready for review January 7, 2025 04:39
Contributor

@adoroszlai adoroszlai left a comment

Thanks @Tejaskriya for the patch.

@errose28 errose28 self-requested a review January 7, 2025 15:23
@errose28 errose28 added the tools Tools that helps with debugging label Jan 7, 2025
Contributor

@adoroszlai adoroszlai left a comment

Thanks @Tejaskriya for updating the patch, LGTM. A few minor improvements suggested on second look.

@Tejaskriya
Contributor Author

Thanks for the review @adoroszlai, I have improved the code following your suggestions!

@Tejaskriya
Contributor Author

@errose28 @adoroszlai I think renaming the option to --dbdefinition would be better. We have a value-schema command which shows a column family's value schema. The "schema" word might be confusing here as we are trying to pass a definition for the whole db, not just a column family.
What do you think?

@adoroszlai
Contributor

adoroszlai commented Jan 9, 2025

I think renaming the option to --dbdefinition would be better.

If you would like to change it, I prefer --definition instead of --dbdefinition (or instead of --db-definition, which is a bit more readable, but still clumsy):

  • The command is for working with DB, so I find db unnecessary.
  • Avoid clash with option --db as a prefix.

@errose28
Contributor

errose28 commented Jan 9, 2025

@swamirishi this is an implementation of your earlier suggestion in case you want to check it out

Contributor

@swamirishi swamirishi left a comment

@Tejaskriya Thanks for working on the patch, but I believe this kind of generic tooling would be more useful if it could open arbitrary RocksDB tables, as long as a way to decode the values is available on the classpath.

@@ -43,11 +43,19 @@ public class RDBParser implements DebugSubcommand {
description = "Database File Path")
private String dbPath;

@CommandLine.Option(names = {"--schema"},
Contributor

Instead of taking a schema, I would rather take a codec as input to decode the values of a column family. For instance, if I want to read the snapshot diff DB, that DB has no DB definition, since we create new column families on the fly for each snapdiff job.

Contributor

We should open the RocksDB in read-only mode and decode values using the ValueCodec.

Contributor

This seems like a good approach. Currently we don't have any operations that work on multiple column families, so a single key codec and a single value codec will suffice for each operation. The Codec interface should also be lighter weight than DBDefinition if we need to create new ones on the fly for some reason.
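
To make the codec-based alternative from this thread more concrete, here is a minimal sketch of decoding raw key/value bytes with one key codec and one value codec. It is an illustration under assumptions, not a proposed implementation: the nested Codec interface is a simplified stand-in for Ozone's codec abstraction, decodeAll is a hypothetical helper, and an in-memory map stands in for the entries that would be read from a column family of a RocksDB opened in read-only mode.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CodecDecodeSketch {

  /** Simplified stand-in for a key or value codec (assumption: decode-only). */
  public interface Codec<T> {
    T fromPersistedFormat(byte[] raw);
  }

  /** Decodes UTF-8 bytes into a String. */
  public static final Codec<String> STRING =
      raw -> new String(raw, StandardCharsets.UTF_8);

  /** Decodes 8 big-endian bytes into a Long. */
  public static final Codec<Long> LONG =
      raw -> ByteBuffer.wrap(raw).getLong();

  /** Decodes every raw entry with the given key and value codecs. */
  public static <K, V> List<String> decodeAll(Map<byte[], byte[]> rawEntries,
      Codec<K> keyCodec, Codec<V> valueCodec) {
    List<String> out = new ArrayList<>();
    for (Map.Entry<byte[], byte[]> e : rawEntries.entrySet()) {
      out.add(keyCodec.fromPersistedFormat(e.getKey())
          + " -> " + valueCodec.fromPersistedFormat(e.getValue()));
    }
    return out;
  }

  public static void main(String[] args) {
    // Stand-in for entries iterated from one column family.
    Map<byte[], byte[]> raw = new LinkedHashMap<>();
    raw.put("key1".getBytes(StandardCharsets.UTF_8),
        ByteBuffer.allocate(Long.BYTES).putLong(42L).array());
    System.out.println(decodeAll(raw, STRING, LONG)); // prints [key1 -> 42]
  }
}
```

Because each operation touches a single column family, passing one key codec and one value codec is enough here; no definition of the whole DB is needed.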

@Tejaskriya Tejaskriya marked this pull request as draft January 16, 2025 06:52

5 participants