Add the backup and restore API for metasrv #5085

Closed
zyy17 opened this issue Dec 3, 2024 · 12 comments
Labels
C-enhancement (Category Enhancements), good first issue (Good for newcomers)

Comments

@zyy17
Collaborator

zyy17 commented Dec 3, 2024

What type of enhancement is this?

Refactor

What does the enhancement do?

Background

The metadata that metasrv manages is vital to managing the whole cluster. As with other data, we need backup and restore APIs for it, for the following reasons:

  1. Add an abstraction for metadata backup and restore: although we can back up and restore metadata through the backend storage of metasrv, that is not the best operational practice. It would be better to handle this in metasrv itself;
  2. Disaster recovery: if we back up the metadata periodically, it is easy to recover from a disaster. In practice, we already back up the etcd data on a schedule;
  3. Easier troubleshooting: dumping the raw metadata makes it easy to debug problems.

Implementation

We can add backup and restore APIs to the metasrv admin API.

Implementation challenges

No response

@zyy17 zyy17 added C-enhancement Category Enhancements good first issue Good for newcomers labels Dec 3, 2024
@ozewr
Contributor

ozewr commented Dec 13, 2024

Hello, I want to give this a try. Could you tell me which part of the code I should read, or provide some other relevant information?

@zyy17
Collaborator Author

zyy17 commented Dec 13, 2024

> Hello, I want to give this a try. Could you tell me which part of the code I should read, or provide some other relevant information?

👍 Maybe @fengjiachun and @WenyXu can give you some tips.

@WenyXu
Member

WenyXu commented Dec 13, 2024

Hi @ozewr, thank you for your interest in this good first issue! To get started, you can take a look at this file, which provides some context on how metadata is structured in our system.

The first task involves dumping and restoring the metadata. Additionally, we aim to design a format to store the exported metadata, ensuring its compatibility and flexibility for future use.

Let us know if you have any questions or need further guidance. We're here to help!

@ozewr
Contributor

ozewr commented Dec 16, 2024

Hi, I have already read the relevant code. I think we can use a query to dump all the metadata and use put to restore it. Can't we use JSON or TOML to store the metadata? @WenyXu @zyy17
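The "query to dump, put to restore" idea can be sketched roughly as follows. This is only an illustration under stated assumptions: `KvStore`, `MemStore`, `dump`, and `restore` are hypothetical names invented for this example, and the store is a synchronous in-memory map rather than the real metasrv backend, whose API is async and richer.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for a key-value metadata backend.
/// It is NOT the real metasrv trait; it only models "range scan" and "put".
trait KvStore {
    /// Return all key-value pairs whose key starts with `prefix`.
    fn range(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)>;
    /// Insert or overwrite a single key.
    fn put(&mut self, key: Vec<u8>, value: Vec<u8>);
}

/// In-memory store used only to make the sketch runnable.
#[derive(Default)]
struct MemStore {
    data: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl KvStore for MemStore {
    fn range(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
        self.data
            .iter()
            .filter(|(k, _)| k.starts_with(prefix))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }

    fn put(&mut self, key: Vec<u8>, value: Vec<u8>) {
        self.data.insert(key, value);
    }
}

/// Dump: scan everything (the empty prefix matches every key).
fn dump(store: &impl KvStore) -> Vec<(Vec<u8>, Vec<u8>)> {
    store.range(b"")
}

/// Restore: replay every dumped pair with `put`.
fn restore(store: &mut impl KvStore, entries: Vec<(Vec<u8>, Vec<u8>)>) {
    for (key, value) in entries {
        store.put(key, value);
    }
}

fn main() {
    let mut source = MemStore::default();
    // The key names below are made up for the demo.
    source.put(b"demo/table/1".to_vec(), b"first".to_vec());
    source.put(b"demo/table/2".to_vec(), b"second".to_vec());

    let snapshot = dump(&source);
    let mut target = MemStore::default();
    restore(&mut target, snapshot);

    println!("restored {} keys", target.data.len());
}
```

A real implementation would also have to decide whether the store may change during the scan and whether restore should clear existing keys first; the sketch ignores both questions.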

@WenyXu
Member

WenyXu commented Dec 17, 2024

> Hi, I have already read the relevant code. I think we can use a query to dump all the metadata and use put to restore it. Can't we use JSON or TOML to store the metadata? @WenyXu @zyy17

I believe storing metadata in a JSON file is not ideal. Instead, we should handle metadata as binary data rather than JSON.
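To make the "binary rather than JSON" suggestion concrete, here is one possible shape for a dump file: a magic tag, a format version, and length-prefixed raw key/value records. Everything in it is an assumption made for illustration (the `MDMP` magic, the version number, the field widths, and the function names); the thread never settles on a format.

```rust
/// Hypothetical file header: 4-byte magic followed by a little-endian u16 version.
const MAGIC: &[u8; 4] = b"MDMP";
const VERSION: u16 = 1;

/// Append one length-prefixed byte field (u32 little-endian length, then bytes).
/// The sketch assumes each field is smaller than 4 GiB.
fn write_field(out: &mut Vec<u8>, bytes: &[u8]) {
    out.extend_from_slice(&(bytes.len() as u32).to_le_bytes());
    out.extend_from_slice(bytes);
}

/// Encode entries as: MAGIC | VERSION | count (u32) | (key, value) records.
fn encode(entries: &[(Vec<u8>, Vec<u8>)]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(MAGIC);
    out.extend_from_slice(&VERSION.to_le_bytes());
    out.extend_from_slice(&(entries.len() as u32).to_le_bytes());
    for (key, value) in entries {
        write_field(&mut out, key);
        write_field(&mut out, value);
    }
    out
}

/// Split `n` bytes off the front of `buf`, or fail if the input is too short.
fn take<'a>(buf: &mut &'a [u8], n: usize) -> Result<&'a [u8], String> {
    let data: &'a [u8] = *buf;
    if data.len() < n {
        return Err("truncated dump".to_string());
    }
    let (head, tail) = data.split_at(n);
    *buf = tail;
    Ok(head)
}

fn read_u32(buf: &mut &[u8]) -> Result<u32, String> {
    let b = take(buf, 4)?;
    Ok(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

fn read_field(buf: &mut &[u8]) -> Result<Vec<u8>, String> {
    let len = read_u32(buf)? as usize;
    Ok(take(buf, len)?.to_vec())
}

/// Decode a dump, rejecting unknown magic, unknown versions, and short input.
fn decode(mut buf: &[u8]) -> Result<Vec<(Vec<u8>, Vec<u8>)>, String> {
    if take(&mut buf, 4)? != &MAGIC[..] {
        return Err("bad magic".to_string());
    }
    let v = take(&mut buf, 2)?;
    let version = u16::from_le_bytes([v[0], v[1]]);
    if version != VERSION {
        return Err(format!("unsupported version {}", version));
    }
    let count = read_u32(&mut buf)?;
    let mut entries = Vec::new();
    for _ in 0..count {
        let key = read_field(&mut buf)?;
        let value = read_field(&mut buf)?;
        entries.push((key, value));
    }
    Ok(entries)
}

fn main() {
    let entries = vec![(b"demo/key".to_vec(), vec![0u8, 159, 255])];
    let bytes = encode(&entries);
    let decoded = decode(&bytes).expect("round trip should succeed");
    println!("round trip ok: {}", decoded == entries);
}
```

The explicit version field is what gives such a format room to evolve: a newer reader can branch on it, and an older reader can refuse a file it does not understand instead of misparsing it.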

@ozewr
Contributor

ozewr commented Dec 17, 2024

> I believe storing metadata in a JSON file is not ideal. Instead, we should handle metadata as binary data rather than JSON.

I want to use the bincode crate so that I can easily serialize metadata into binary data. However, I couldn't find it in Cargo.toml. Can I add it as a dependency?

@ozewr
Contributor

ozewr commented Dec 19, 2024

> Implementation
>
> We can add backup and restore API in metasrv admin API.

I'm trying to accomplish this, but I noticed that the Admin HTTP implementation seems quite simple. It doesn't appear to support receiving a binary stream, which makes me unsure how to extract data from a backup binary file during the restore process. Does this require an extension? @WenyXu @zyy17
src/meta-srv/src/service/admin.rs:69

```rust
#[async_trait::async_trait]
pub trait HttpHandler: Send + Sync {
    async fn handle(
        &self,
        path: &str,
        method: http::Method,
        params: &HashMap<String, String>,
    ) -> crate::Result<http::Response<String>>;
}
```
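For illustration, the kind of extension being asked about might look like the sketch below: a handler trait that also receives a raw request body and returns raw bytes, instead of only query parameters and a `String` response. This is a simplified, synchronous, std-only model. `BinaryHttpHandler`, `SnapshotHandler`, `Method`, `Response`, and the `/backup` and `/restore` paths are all hypothetical and do not exist in metasrv.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Minimal stand-in for an HTTP method type (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Method {
    Get,
    Post,
}

/// Minimal response carrying raw bytes instead of a `String`.
struct Response {
    status: u16,
    body: Vec<u8>,
}

/// Hypothetical variant of the handler trait that can see a binary body.
trait BinaryHttpHandler: Send + Sync {
    fn handle(
        &self,
        path: &str,
        method: Method,
        params: &HashMap<String, String>,
        body: &[u8],
    ) -> Result<Response, String>;
}

/// Toy handler that keeps the latest uploaded snapshot in memory.
struct SnapshotHandler {
    snapshot: Mutex<Vec<u8>>,
}

impl BinaryHttpHandler for SnapshotHandler {
    fn handle(
        &self,
        path: &str,
        method: Method,
        _params: &HashMap<String, String>,
        body: &[u8],
    ) -> Result<Response, String> {
        let mut snapshot = self.snapshot.lock().map_err(|e| e.to_string())?;
        match (method, path) {
            // Backup: return the stored bytes unchanged.
            (Method::Get, "/backup") => Ok(Response {
                status: 200,
                body: snapshot.clone(),
            }),
            // Restore: replace the stored bytes with the uploaded body.
            (Method::Post, "/restore") => {
                *snapshot = body.to_vec();
                Ok(Response {
                    status: 200,
                    body: Vec::new(),
                })
            }
            _ => Ok(Response {
                status: 404,
                body: b"not found".to_vec(),
            }),
        }
    }
}

fn main() {
    let handler = SnapshotHandler {
        snapshot: Mutex::new(Vec::new()),
    };
    let params = HashMap::new();
    let upload = [0u8, 1, 2, 255];

    let restored = handler
        .handle("/restore", Method::Post, &params, &upload)
        .expect("restore should succeed");
    let backup = handler
        .handle("/backup", Method::Get, &params, &[])
        .expect("backup should succeed");

    println!("restore status: {}", restored.status);
    println!("backup bytes: {:?}", backup.body);
}
```

The point of the sketch is only the signature change: once the body and the response are byte buffers, non-UTF-8 backup data can pass through the handler without an intermediate text encoding.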

@WenyXu
Member

WenyXu commented Dec 19, 2024

We can use the CLI to create a kvbackend for direct connection to the store backend.

@ozewr
Contributor

ozewr commented Dec 19, 2024

> We can use the CLI to create a kvbackend for direct connection to the store backend.

You mean not using APIs during backup and restore, directly connecting via CLI instead?

@WenyXu
Member

WenyXu commented Dec 19, 2024

> You mean not using APIs during backup and restore, directly connecting via CLI instead?

Yes. But adding backup and restore APIs to the admin endpoint would also bring some benefits. It would allow PaaS-level services to invoke these APIs directly. cc @zyy17, any thoughts?

@ozewr
Contributor

ozewr commented Dec 19, 2024

> Yes. But adding backup and restore APIs to the admin endpoint would also bring some benefits. It would allow PaaS-level services to invoke these APIs directly. cc @zyy17, any thoughts

If we are going to use APIs, I think we should extend the current HTTP implementation of the Admin. If we are using the CLI, would it require creating a new project?

@zyy17
Collaborator Author

zyy17 commented Dec 19, 2024

@ozewr @WenyXu

After careful consideration, I think this issue is not a reasonable request, for the following reasons (actually, my initial requirement might have just been a tool to dump metadata, which led to this issue):

  1. Backup and recovery of GreptimeDB data do not require metadata backup. Users can use the greptime cli tool for data backup and recovery. Therefore, there are not many practical scenarios for the backup and recovery of metadata;

  2. We can import or export the metadata using the ecosystem tools of the backend storage used by metasrv, such as etcdctl, pg_dump, etc. If we did implement this feature, it might just involve adding a layer of logic to the greptime cli tool, and it would not necessarily do better than the native tools.

Therefore, I have decided to close this issue. Thank you all for the discussion.

@zyy17 zyy17 closed this as completed Dec 19, 2024