[BanyanDB] Optimizing Column Encoding #12445

Open
2 of 3 tasks
hanahmily opened this issue Jul 15, 2024 · 6 comments
Labels
database BanyanDB - SkyWalking native database feature New feature

Comments

@hanahmily
Contributor

Search before asking

  • I had searched in the issues and found no similar feature requirement.

Description

There are several optimizations we should apply to column encoding. Here, a column refers to a tag or a field.

  • Move constant values within a block to the metadata.
  • Encode each column based on its data type (string, int64, or float64).
  • Use low cardinality encoding when the column has a limited set of value options within a block.
  • Consider encoding array types using a columnar strategy. If the array size is consistent within a block, transform the arrays into a matrix and group and encode the values within the same column.
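The ideas above can be illustrated with a minimal sketch. It is written in Python for brevity and is not BanyanDB's implementation; the function names (`encode_column`, `decode_column`, `arrays_to_columns`), the dictionary-shaped block layout, and the `max_dict_size` threshold are all assumptions made for this example. It shows how a block writer might detect a constant column, fall back to dictionary (low cardinality) encoding, and turn fixed-size arrays into per-position columns.

```python
def column_type(values):
    """Classify a column as int64, float64, or string (illustrative only)."""
    if all(isinstance(v, int) for v in values):
        return "int64"
    if all(isinstance(v, (int, float)) for v in values):
        return "float64"
    return "string"


def encode_column(values, max_dict_size=256):
    """Pick a per-block encoding for one column (a tag or a field).

    The returned dict stands in for an on-disk layout; it is hypothetical.
    """
    if not values:
        return {"encoding": "plain", "values": []}
    distinct = list(dict.fromkeys(values))  # keeps first-seen order
    if len(distinct) == 1:
        # Constant column: keep the single value in the block metadata
        # and store no per-row payload at all.
        return {"encoding": "constant", "meta": distinct[0], "count": len(values)}
    if len(distinct) <= max_dict_size and len(distinct) < len(values):
        # Low cardinality: store each distinct value once, plus small codes.
        index = {v: i for i, v in enumerate(distinct)}
        return {
            "encoding": "dict",
            "dictionary": distinct,
            "codes": [index[v] for v in values],
        }
    # Otherwise keep the raw values, tagged with their type so that a
    # type-specific encoder could be chosen later.
    return {"encoding": "plain", "dtype": column_type(values), "values": list(values)}


def decode_column(block):
    """Invert encode_column."""
    kind = block["encoding"]
    if kind == "constant":
        return [block["meta"]] * block["count"]
    if kind == "dict":
        return [block["dictionary"][code] for code in block["codes"]]
    return list(block["values"])


def arrays_to_columns(arrays):
    """Transpose same-length arrays into per-position columns.

    Returns None when the array sizes differ within the block, in which
    case the columnar strategy does not apply.
    """
    if not arrays:
        return None
    width = len(arrays[0])
    if any(len(a) != width for a in arrays):
        return None
    return [list(col) for col in zip(*arrays)]
```

Each column returned by `arrays_to_columns` could then be passed to `encode_column` on its own, so a position that never changes within a block would collapse to a constant.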

Use case

No response

Related issues

No response

Are you willing to submit a pull request to implement this on your own?

  • Yes I am willing to submit a pull request on my own!

Code of Conduct

@hanahmily hanahmily added feature New feature database BanyanDB - SkyWalking native database labels Jul 15, 2024
@hanahmily hanahmily added this to the BanyanDB - 0.8.0 milestone Jul 15, 2024
@sollhui

sollhui commented Jul 16, 2024

Please assign this to me.

@wu-sheng
Member

Are you confident you can take on two issues at the same time? How well do you understand BanyanDB?

@sollhui

sollhui commented Jul 16, 2024

Are you confident you can take on two issues at the same time? How well do you understand BanyanDB?

I am familiar with the BanyanDB code and have contributed 10 PRs, but I don't think this is an easy task. Let me discuss it with @hanahmily first.

@wu-sheng
Member

This is planned for the next iteration only, unless you finish it in time for 0.7. So don't hurry; take your time.

@hanahmily Please note, as we are making the docs user-oriented, please make sure the file structure docs cover the encoding with proper explanations and clear examples.

@hanahmily
Contributor Author

@sollhui Let's discuss the details first.

@wu-sheng Sure, we will update the relevant documents according to the new structures. This change will not break the file system; therefore, we do not have to increase the file system version.

@wu-sheng
Member

Let's discuss details when you have the design.
I am not sure how changing the encoding would not affect the storage structure. Changing doesn't mean breaking: for example, a new encoding type also adds new structure to the file, but it is not a breaking change.
