Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Syncstorage inital sqlite support #1520

Draft
wants to merge 45 commits into
base: master
Choose a base branch
from

Conversation

Eragonfr
Copy link

@Eragonfr Eragonfr commented Feb 11, 2024

Description

Re-implementation of #935 on the newer codebase.
Thanks for @mqus for the first implementation.

Issue(s)

Partially fixes #498

Todo

  • Code Cleanup
  • Commit history cleanup (I'm not really working in a clean way)
  • Add Sqlite clippy checks to CircleCI
  • docker container (Thanks @bcspragu for the help)
  • e2e tests (Thanks @bcspragu for the help)

Current state is that it runs on my computer, but it's probably not clean at all, there's still a lot of duplicate code, and some SQL queries are still in MySQL-flavored SQL but could be used by the SQLite backend.

@Eragonfr
Copy link
Author

To pass the checks I have the choice between upgrading diesel or ignoring the issue with libsqlite-sys (which tbh is probably a non-issue because we should not be using SQLite printf anyway).

@Eragonfr
Copy link
Author

Eragonfr commented May 9, 2024

I choose to ignore the libsqlite3-sys CVE because it's not impacting this project. That should let me run the other tests in the CI.

clippy_sqlite:
# Matches what's run in circleci
cargo clippy --workspace --all-targets --no-default-features --features=syncstorage-db/sqlite,tokenserver-db/sqlite -- -D warnings

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if you saw #1513, but one of the things we're trying to do is to pull more of tokenserver into syncstorage-rs.

(Weird history lesson: Tokenserver was supposed to support more than just sync, but that never really happened. Having it distinct turned out to be a headache for a lot of reasons, so we're folding as much of Tokenserver as we can into the main syncstorage)

As a side note, this also means that the tokenserver storage stuff can probably live along-side the syncstorage data fairly easily. Possibly as just a bunch of other tables.

It's a bit more complicated for us since we have yet to migrate tokenserver over to spanner, but that's on our internal roadmap. When we get to doing that, however, is unknown.

We also have a few crate refactoring tickets open. I'll update those to indicate that we should really put the data handlers for both syncstorage and tokenserver into their own distinct crate.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not saw that PR.

You want to remove tokenserver completely. Is that right ?

I don't understand what it means for my PR.
Should I change how I handle tokenserver ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe that it impacts much of your PR, since you're only focused on the Storage part. There may be a few opportunities for folk running their own "Stand Alone" version to have just one database running instead of two, and depending on how we decide to do this for Spanner, there may be some additional functions we introduce to the database layer. We'll try to isolate and highlight those so that we don't cause more work.

@@ -1,6 +1,7 @@
use diesel::{
mysql::MysqlConnection,
r2d2::{CustomizeConnection, Error as PoolError},
sqlite::SqliteConnection,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we might want to feature flag some of this? I know spanner is far from light-weight, and I suspect that MySQL isn't a lot better. Feels silly to compile everything in.

I'm fine with you want to file a ticket for that and keep everything together for this PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that I can easily add a feature flag, at least for SQLite and MySQL.
I don't want to touch the code related to spanner because I can't test it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may be able to feature flag a few things (like this use:

#[cfg(feature = "sqlite")]
use diesel::sqlite::SqliteConnection;

(and then flag the parts that include SqliteConnection)

For now, don't worry about it. As you get closer to the final version, we can work together to try and isolate things so that folk only include what they need. (This would include all the Spanner stuff, too.)

.cargo/audit.toml Outdated Show resolved Hide resolved
@bcspragu
Copy link

bcspragu commented Aug 8, 2024

I'm really jazzed about this PR, thanks for all the work on it! I noticed this in the PR desc:

  • docker container (Help needed)

If this is a question of building a Docker container from this codebase that bundles all the SQLite deps and nothing more, I'm happy to take on that work.

@Eragonfr
Copy link
Author

Eragonfr commented Aug 9, 2024

I'm really jazzed about this PR, thanks for all the work on it! I noticed this in the PR desc:

  • docker container (Help needed)

If this is a question of building a Docker container from this codebase that bundles all the SQLite deps and nothing more, I'm happy to take on that work.

Something like that yes. There might be some docker-compose configuration files to make too.
I gave multiple tries to Docker but I never seems to get a "standard" docker environment. I always get something with some broken networking, nuked firewall or something else. At this point I've given up on that technology. (docker should release VMs that I can start and get a "standard" env with networking and stuff)

That would leave me to clean up my code

@bcspragu
Copy link

bcspragu commented Aug 9, 2024

Sounds good, I'll give it a shot!

@bcspragu
Copy link

bcspragu commented Aug 9, 2024

Okay, I think I've got something approximately working in Eragonfr/syncstorage-rs@syncstorage-sqlite...bcspragu:syncstorage-rs:syncstorage-sqlite

E.g.

docker build -t mozsync-sqlite --build-arg DATABASE_BACKEND=sqlite .

# The --userns line is a rootless Podman thing, can be ignored mostly
docker run --rm -it \
    --userns=keep-id:uid=10001,gid=10001 \
    -v $PWD/temp-data-do-not-submit:/data \
    -e "SYNC_HOST=0.0.0.0" \
    -e "SYNC_MASTER_SECRET=secret0" \
    -e "SYNC_SYNCSTORAGE__DATABASE_URL=sqlite:///data/syncdb.sqlite" \
    -e "SYNC_TOKENSERVER__DATABASE_URL=sqlite:///data/tokenserverdb.sqlite" \
    -e "SYNC_TOKENSERVER__RUN_MIGRATIONS=true" \
    mozsync-sqlite

Seems to start up successfully, but I'm not brave enough to connect it to my real Firefox sync yet

@Eragonfr
Copy link
Author

Thanks @bcspragu !

@bcspragu
Copy link

bcspragu commented Aug 12, 2024

For sure! Also, looking at the current ci/circleci: build-and-test failure, that's a result of missing --features flags here:

command: cargo build

Not sure what the team would prefer here, but there's two options:

  1. If you're just looking for a sanity check, test against MySQL (e.g --features=syncstorage-db/mysql,tokenserver-db/mysql)
  2. Run separate builds for each DB backend
  3. [Actually, this is probably what we want] Fix the default flags to use MySQL correctly

Unrelated, one thing I forgot to mention is that regardless of which features are enabled, MySQL client libs are required on the host (e.g. there are places where it isn't gated behind the feature flag, it gets linked in regardless). I started adding the relevant #cfg(feature(...)) in places where I noticed it, but again, not sure what the team would prefer there.

@Eragonfr
Copy link
Author

I thought of dropping the default flags because the user should choose their storage backend instead of relying on a default one.


We want to have the proper #cfg(feature(...)) when it's needed without breaking the spanner backend. (It uses MySQL for the token server database)

@Eragonfr
Copy link
Author

Eragonfr commented Sep 8, 2024

The ci fails on sqlite e2e tests. I think it is because the /data/ folder is missing, which means that the database can't be created and that makes the ci to fail.
I need some help to figure out how to either create this folder or to not need it in the container.

The checks are unrelated and are probably failing on the master branch too.

@Eragonfr
Copy link
Author

Eragonfr commented Sep 22, 2024

🤦‍♀️ I'm dumb the e2e tests can't work because they require a connection to the database from the python script and syncserver at the same time. Which is not possible with sqlite.
There was no need for my last height commits.
But the e2e tests are not usable with sqlite.

@Eragonfr Eragonfr force-pushed the syncstorage-sqlite branch 3 times, most recently from 7ca1578 to 1c08e86 Compare September 22, 2024 23:08
@JulianWgs
Copy link

Hi @Eragonfr , thanks for your hard work! I am trying to get this to run locally, but I can't execute this query against the token database, because the database file is empty. Am I doing something wrong? Could you try running the server with a missing token database and check if the server correctly sets it up? Thanks again! I am really looking forward to this!

@Eragonfr
Copy link
Author

Eragonfr commented Oct 13, 2024

@JulianWgs I also have the issue. I think that I broke the auto migration when doing some refactor.
The tokenserver part is giving me some trouble to get it to use SQLite queries, it doesn't pass its unit tests and there is quite a lot of its queries that doesn't run.

EDIT: Can't reproduce that bug anymore… wtf

@Eragonfr Eragonfr force-pushed the syncstorage-sqlite branch 4 times, most recently from ff5be44 to a8dda23 Compare October 19, 2024 23:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support sqlite/postgresql backend
5 participants