@beautifulentropy
Member

@beautifulentropy beautifulentropy commented Oct 29, 2025

The original plan for getting the Vitess infrastructure running was to use vttestserver as a starting point to reach a minimum viable setup. However, vttestserver didn’t work out because some of its defaults conflicted with how we clean up rows and the level of resources (threads) we need.

Fortunately, vttestserver is just a wrapper around vtcombo: it generates a vttest protobuf, encoded as JSON, that describes the configuration of the in-memory topology server vtcombo starts. By modifying vttestserver’s run.sh, we can invoke vtcombo directly, passing the JSON configuration along with the other vttestserver defaults reverse-engineered from run.sh and vtprocess.go.
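For readers unfamiliar with the vttest topology format, here is a minimal Go sketch of the kind of JSON document described above. The struct types are hand-written stand-ins rather than the generated Vitess protobuf types, only the `keyspaces`, `name`, and `shards` fields are modelled, and `example_db` is a hypothetical keyspace name. The exact JSON this PR passes to vtcombo, and the flag used to pass it, may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// These types mirror only the few vttest topology fields this sketch needs.
// They are hand-written stand-ins, not the generated Vitess protobuf types.
type shard struct {
	Name string `json:"name"`
}

type keyspace struct {
	Name   string  `json:"name"`
	Shards []shard `json:"shards"`
}

type topology struct {
	Keyspaces []keyspace `json:"keyspaces"`
}

// topologyJSON returns a JSON topology with one unsharded keyspace (a single
// shard named "0") per database name.
func topologyJSON(databases ...string) (string, error) {
	topo := topology{Keyspaces: []keyspace{}}
	for _, db := range databases {
		topo.Keyspaces = append(topo.Keyspaces, keyspace{
			Name:   db,
			Shards: []shard{{Name: "0"}},
		})
	}
	out, err := json.Marshal(topo)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// "example_db" is a hypothetical keyspace name, not one of Boulder's.
	out, err := topologyJSON("example_db")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because each keyspace listed in the topology becomes a database that vtcombo creates at startup, a document like this stands in for the database-creation step that would otherwise need its own DDL.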

Vitess doesn’t provide a vtcombo image, so we must build our own. Build and upload a boulder-vtcomboserver image on top of Oracle’s MySQL 8.0 image, which provides native arm64 support. The accompanying tag-and-upload shell script defaults to amd64 for CI.

As an aside, Vitess’s official Dockerfiles are only published for amd64, and modifying them to build for arm64 would prove difficult because Oracle doesn’t publish MySQL arm64 binaries in its Debian apt repository.

With boulder-vtcomboserver up and running I was able to find/validate the following issues and provide workarounds:

  • Problem: sql-migrate, the tool we use to apply database migrations, must be configured to talk both to MariaDB and to MySQL through Vitess (vtgate + vttablet).
    Solution: Create two new dbconfig YAML files (mariadb and vitess) and use test/entrypoint.sh to set the appropriate file for sql-migrate (test/create_db.sh) to use. Also, symlink each of these two new files from db to db-next just like the old dbconfig.yml file.

  • Problem: Vitess does not allow CREATE DATABASE statements, and any DDL containing them will be rejected by vtgate.
    Solution: These databases are already created by vtcombo since they’re defined as KEYSPACES. Skip database creation in test/create_db.sh.

  • Problem: Vitess does not allow user creation or grants (CREATE USER, GRANT), and any DDL containing these commands will be blocked by vtgate.
    Solution: Skip user creation and grant steps in test/create_db.sh. Set % for --vschema_ddl_authorized_users as vttestserver does, and revisit this later for a more complete approach.

  • Problem: The vttablet default for the maximum number of rows returned by a non-streaming query (10,000) is too low for Boulder’s needs, so vttablet rejects some of our queries.
    Solution: Increase --queryserver-config-max-result-size to 1,000,000 and --queryserver-config-warn-result-size to 1,000,000.

  • Problem: The vttablet defaults for connection pool size (16) and maximum number of concurrent transactions (20) are too low for Boulder’s needs, so queries fail when vttablet is overloaded.
    Solution: Increase --queryserver-config-pool-size to 64 and --queryserver-config-transaction-cap to 80.

  • Problem: Vitess does not allow TRIGGER statements, and any DDL containing them will be rejected by vtgate. Without these TRIGGER statements, the integration test TestIssuanceCertStorageFailed will fail.
    Solution: Run these TRIGGER statements in an entrypoint script, test/vtcomboserver/install_trigger.sh, bypassing vtgate entirely.

Depends on #8479
Depends on #8489
Depends on #8490
Depends on #8494
Fixes #7736

@beautifulentropy beautifulentropy force-pushed the add-vitess branch 11 times, most recently from 1df6b37 to f6b45ac Compare October 31, 2025 16:15
@beautifulentropy beautifulentropy changed the title WIP database: Add vitess + mysql 8.0 to our development environment Oct 31, 2025
@beautifulentropy beautifulentropy force-pushed the add-vitess branch 8 times, most recently from e3b5db5 to 55dcd24 Compare November 4, 2025 17:06
@beautifulentropy beautifulentropy marked this pull request as ready for review November 12, 2025 17:33
@beautifulentropy beautifulentropy requested a review from a team as a code owner November 12, 2025 17:33
@github-actions
Contributor

@beautifulentropy, this PR appears to contain configuration and/or SQL schema changes. Please ensure that a corresponding deployment ticket has been filed with the new values.

Contributor

@aarongable aarongable left a comment


Overarching comment: I think a bunch of small pieces of this (the sa.go change, the schema file changes, the "put max_statement_time in every dburl explicitly" change, etc) could and maybe should be broken out as smaller changes that precede actually adding Vitess+MySQL8. I don't feel super strongly on this point -- this PR is actually fairly readable despite its size as-is -- but since some pieces have already been broken out, it would make sense to continue down that path.

beautifulentropy added a commit that referenced this pull request Nov 13, 2025
Ahead of the move from ProxySQL + MariaDB to Vitess + MySQL 8 in #8468.
Vitess blocks partition-related DDL, so partitions need to be removed
from all schemas under `sa/db*`. The team has agreed that this drift
from Production is acceptable because it lets us begin testing on Vitess
and MySQL sooner.

Separately, `thisUpdate` and `nextUpdate` were relying on an implicit
`DEFAULT NULL`. We now make that explicit, matching how we define other
DATETIME columns. Also, add a missing `DROP TABLE `incidents`;` to our
combined schema migration.

Part of #7736
aarongable

This comment was marked as duplicate.

@beautifulentropy beautifulentropy marked this pull request as draft November 18, 2025 19:58
beautifulentropy added a commit that referenced this pull request Nov 19, 2025
…erter (#8494)

Today, timestamp truncation happens for queries using `*borp.DbMap` but
not `*borp.Transaction`. That means comparisons still see sub-seconds,
but inserts into MariaDB `DATETIME` columns silently truncate them to
whole seconds.

On MySQL 8, the same queries will still include sub-seconds, but inserts
into `DATETIME` columns will round to the nearest second instead of
truncating. This leads to issues for queries like the one in
`*StorageAuthority.UpdateCRLShard()`. When two CRL updaters write within
the same second, one may be rounded up to the next second. When the other
updater attempts its own `UPDATE .. WHERE thisUpdate <= ?`, the
condition fails because the stored timestamp now appears to be in the
future.

Ahead of the transition from ProxySQL + MariaDB to Vitess + MySQL 8 in
#8468, update borp (letsencrypt/borp#12) to
expose Transaction arguments to the BoulderTypeConverter, allowing it to
truncate all timestamps passed through Transactions and keep behavior
consistent across `*borp.DbMap` and `*borp.Transaction`, as well as
MariaDB and MySQL 8.

Part of #7736
@beautifulentropy beautifulentropy force-pushed the add-vitess branch 6 times, most recently from f76694f to e61294b Compare November 24, 2025 21:51
@beautifulentropy beautifulentropy marked this pull request as ready for review November 24, 2025 21:52
npurtova pushed a commit to plesk/boulder that referenced this pull request Nov 25, 2025
npurtova pushed a commit to plesk/boulder that referenced this pull request Nov 25, 2025
…erter (letsencrypt#8494)

Development

Successfully merging this pull request may close these issues.

Use Vitess in Boulder CI

3 participants