Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: manage the sql migration and schema changes in the main repository #685

Closed
cre8ivejp opened this issue Jan 18, 2024 · 15 comments · Fixed by #1060
Closed

feat: manage the sql migration and schema changes in the main repository #685

cre8ivejp opened this issue Jan 18, 2024 · 15 comments · Fixed by #1060
Assignees
Labels

Comments

@cre8ivejp
Copy link
Member

cre8ivejp commented Jan 18, 2024

Currently, we manage the SQL migration and schema changes in another repository. This must be done in the main repository. Otherwise, people hosting the project can't get the updates.

In addition to that, we will modify the migration-mysql service implementation not to access the GitHub repo so that we don't have to put an extra secret (GitHub personal access token) on the Kubernetes cluster.

@Ubisoft-potato
Copy link
Collaborator

Ubisoft-potato commented Feb 6, 2024

We will manage the database migration following the flowchart below:

image

Explanation:

  • Create up and down migration file for each release
  • Embed those up and down migration file to batch service bin
  • Only system admin can make request to batch service to do migration

How we do upgrade and downgrade?

Suppose we have 2 version migration files:

  • 1_create_feature_table for create a table
  • 2_add_name_column for add a column
1_create_feature_table.up.sql            next ->  2_add_name_column.up.sql      ...
1_create_feature_table.down.sql   <- previous     2_add_name_column.down.sql    ...

upgrade: we will apply the up migration file from first to last.

downgrade: we will apply the down migration file from last to first.


About the up and down file

Up file is the change that we made to the database, for example, create a table:

CREATE TABLE IF NOT EXISTS users(
   user_id serial PRIMARY KEY,
   username VARCHAR (50) UNIQUE NOT NULL,
   password VARCHAR (50) NOT NULL,
   email VARCHAR (300) UNIQUE NOT NULL
);

Down file is the change that revert the Up file changes, for example, to revert the Up file changes, we need to delete the table:

DROP TABLE IF EXISTS users;

@kentakozuka
Copy link
Contributor

How do we run mutiple up files?
For example, we have version 1 -> 2 -> 3 of Bucketeer. There is a Bucketeer cluster of version 1 running and users want to upgrade this cluster to version 3. From version 1 to 3, we have two up/down files.

flowchart TD
    version-1 -- up/down files --> version-2
    version-2 -- up/down files --> version-3
    version-1 -- Some users want to upgrade Bucketeer from 1 to 3 --> version-3
Loading

@Ubisoft-potato
Copy link
Collaborator

Ubisoft-potato commented Feb 8, 2024

How do we run mutiple up files?

@kentakozuka The migrate lib that we use has 2 functions can apply those up/down files.

  • Up(): apply all up files from current version at once. Addtionally, there is a Down() function, too
  • Step(int): apply up or down files from current version with specific num of versions.
    For example: if we want to upgrade from version-1 to version-3, we can invoke Step(2), vice versa, we can do down grade from version-3 to version-1 by invoking Step(-2)

@kentakozuka
Copy link
Contributor

kentakozuka commented Feb 8, 2024

@Ubisoft-potato Thanks for the replay!

What is the real-world steps when users upgrade Bucketeer?
Is the following what you are planing?

  1. Upgrade helm/image and apply it to the cluster.
  2. Send a request to the batch server to migrate DB schema from 1 to 2
  3. Send a request to the batch server to migrate DB schema from 3 to 3

@Ubisoft-potato
Copy link
Collaborator

Ubisoft-potato commented Feb 8, 2024

@kentakozuka Yes, the first step is right which is used to update the latest batch service.
The 2, 3 can step can be done with one request, I mean from version-1 to version-3.

Here is one thing that I want to point out: the up/down file represents the changes that we added to the database, not all of the table schemas.

For example:
For the first time we will create all tables in the first up file, next time will going to add some columns to specific table.

  1. 1_create_user_table.up.sql
CREATE TABLE IF NOT EXISTS users(
   user_id serial PRIMARY KEY,
   password VARCHAR (50) NOT NULL,
   email VARCHAR (300) UNIQUE NOT NULL
);
  1. 2_add_name_column.up.sql
ALTER TABLE users
ADD COLUMN name VARCHAR(100);

So in this case, when we upgrade from version-1 to version-2, we just need to add the name column.

Maybe we will add or change some new columns in the third file upwards. Anyway, we just need to apply those up files.

The "down" file corresponds to the SQL revert of the "up" file. If we add some column in up file, then the down file will remove those column.

@kentakozuka
Copy link
Contributor

Is the following scenario may happen?

If the new version contains 2_add_name_column.up.sql and the backend service uses name column, upgrading helm charts causes errors because the backend service reference name column that does not exist until we send a request to the batch service.

@Ubisoft-potato
Copy link
Collaborator

Ubisoft-potato commented Feb 8, 2024

Yes, it could be happened.

We should ensure that the database is up to date before upgrading the service, in general.
So if we can make sure batch service update first then make migration request to it (using helm charts), this scenario can be avoid.

@kentakozuka
Copy link
Contributor

It would be great that we could avoid this kind of errors in some way.(if possible 😄 )

@cre8ivejp
Copy link
Member Author

Yes, it could be happened.

We should ensure that the database is up to date before upgrading the service, in general. So if we can make sure batch service update first then make migration request to it (using helm charts), this scenario can be avoid.

Do you know how the Mattermost handles this case?

@Ubisoft-potato
Copy link
Collaborator

Ubisoft-potato commented Feb 8, 2024

Do you know how the Mattermost handles this case?

Mattermost make their migration as a CLI tool for user, so user can do the migration before they update Mattermost.

You can find it here: https://docs.mattermost.com/manage/command-line-tools.html#mattermost-db

@Ubisoft-potato
Copy link
Collaborator

I found that Helm Charts can define charts dependency relation, maybe we can make batch service as the first service to be installed.

@cre8ivejp
Copy link
Member Author

I found that Helm Charts can define charts dependency relation, maybe we can make batch service as the first service to be installed.

That would be nice!

One thing to keep in mind is that it is hard to cover all possible scenarios.
For cases like this, we should always try to plan the schema changes so it doesn't affect the services in case they deploy before applying the changes.
For any other cases, we should announce maintenance in advance while upgrading the helm chart in my opinion, like any other service does.

Also, because the api-gateway (end user endpoint) uses Redis, we would only maintain the admin console without affecting the end users (in most cases), which is acceptable to me.

Note that the api-gateway only uses MySQL if the Redis fails. In a normal situation, it will only use Redis.

@kentakozuka
Copy link
Contributor

CLI solution looks nice!
It is not clear to me how helm dependencies helps 🤔

@Ubisoft-potato
Copy link
Collaborator

It is not clear to me how helm dependencies helps 🤔

@kentakozuka Sorry! I forgot to update this issue.

After conducting further research on Helm, I have discovered a better method for performing migration after installing or upgrading Bucketeer Helm Charts.

We can use Helm Hooks to do migration after we install or upgrade batch service successfully.

There is a blog that show use case of Helm Hooks.

@kentakozuka
Copy link
Contributor

@Ubisoft-potato Helm hooks sounds good 😄
Thanks!

@cre8ivejp cre8ivejp moved this from In Progress Q1-2024 (Jan-Mar) to Planned Q2-2024 (Apr-Jun) in Bucketeer Roadmap Apr 15, 2024
@cre8ivejp cre8ivejp moved this from Planned Q2-2024 (Apr-Jun) to In Progress Q2-2024 (Apr-Jun) in Bucketeer Roadmap Apr 15, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Weekly done in Bucketeer Tasks Jun 26, 2024
@cre8ivejp cre8ivejp moved this from Weekly done to Done 2024 in Bucketeer Tasks Jul 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: In Progress Q2-2024 (Apr-Jun)
Status: Done 2024
4 participants