Skip to content

Chapter 3.8.2: DB Choice - async vs connection pooling #65

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
bbros-dev opened this issue Mar 8, 2021 · 7 comments
Closed

Chapter 3.8.2: DB Choice - async vs connection pooling #65

bbros-dev opened this issue Mar 8, 2021 · 7 comments

Comments

@bbros-dev
Copy link

bbros-dev commented Mar 8, 2021

Obviously I'm a Rust noob, so please bear that in mind. Specifically I have not used any Rust DB library.
Many readers will be too and will be relying on choices made here.

After a cursory review of Diesel issue: diesel-rs/diesel/issues/399 I'm nervous about sqlx.

I don't think you addressed their substantive reasons for not offering an async API.

Are you indicating the sqlx has resolved those issues in their crate?
If so could you say so explicitly? It would also be informative to know if it is your opinion that the Diesel reservations are unfounded? Unreasonably conservative? You didn't hold back on the use of executable business documents ;)
Frank and forthright is generally valuable.

In fact the Diesel reasons for their choice of sync and your reasons for recommending a relational DB are pretty much the same. Repurposing you relational DB statement, one could say this is reasonable description of the state-of-play in Diesel:

Syncronous database APIs are reasonably good as jack-of-all-trades: they will often be a good choice when
building the first version of your application, supporting you along the way while you explore the
constraints of your domain.

Yet when it comes to a DB library you guide to a choice, sqlx, that IMO is high risk, for no return that I'm aware of. The risk is some less experienced user relying on the async API while using transactions. I could be wrong but there is no reasonable assurance that your app will only hit issues with transaction issues that leave incomplete transactions at very high 'Google' scale volumes.

The risk-return trade-off gets worse for sqlx when you consider that there are Rust crates r2d2 and bb8, helping you handle increased workloads. If your app scales beyond that capacity, (+sharding) then you should be looking at the toys the big kids are playing with.

While I haven't read the whole book, I wonder if there isn't a lost opportunity to add r2d2, bb8, or some such crate, as a step to production? Perhaps this comes up later?

@bbros-dev bbros-dev changed the title Chapter 3.8.2: DB Choice - async reservations Chapter 3.8.2: DB Choice - async vs connection pooling Mar 8, 2021
@bbros-dev
Copy link
Author

bbros-dev commented Mar 8, 2021

For anyone following at home... it would appear that an async DB API, which rustaceans view as elegant robust, is blocking on this Rust issue: rust-lang/rust/issues/59337

@LukeMathWalker
Copy link
Owner

LukeMathWalker commented Mar 8, 2021

The issue you linked explains what sqlx is doing:

We mark that the connection has an open transaction and perform the rollback before the next operation performed on that connection. This is the approach chosen by sqlx. My problem with this approach is that we have a potential dangling transaction now if we do not perform any other operation in a timely manner. This can result in a degraded performance at database side, because the database needs to assume the transaction is ongoing till the abort command is really received.

Which boils down to "a transaction will always be committed/rolled back as long the connection is used again".
This is safe to assume in an application that is serving any traffic - if you haven't any traffic then it makes little sense to discuss database performance degradation related to a transaction left open.
I understand the concerns of diesel's developers, but I don't think they are a blocker as of today.

A synchronous API, on the other side, is much more prone to misuse: you need to go and inspect all functions you are calling from an asynchronous context to make sure none of those is performing a query because, if they are, they have to be wrapped in a spawn_blocking to avoid blocking the main thread of your executor leading to performance degradation and/or downtime for the application.
It's the good old blue/red functions problem.

The latter case worries me much more than the former. It's not a conversation about performance or throughput.

I haven't added an extensive discussion about the topic because it's a fast changing subject and it's probably mostly of interest to advanced Rust users, but I am happy to add a footnote if you think that would be beneficial.

@bbros-dev
Copy link
Author

bbros-dev commented Mar 8, 2021

... I am happy to add a footnote if you think that would be beneficial

Sure, but this isn't my baby. Maybe something like:

Asynchronous DB access is not trivial, and robust implementations are blocked by Rust compiler issue 59337. A particular issue is the case of transactions, which are often required in production. The current async DB crates have different vulnerabilities - it would be prudent to review Diesel issue 399 to see if your use case would be affected. An alternative way to workaround this for a production use case ( i.e. have transactions), is to use connection pools to 'simulate' asynchronous DB access. Here your least-vulnerable option is the DeadPool crate, and its transaction examples - in which case you DB crate choice will be tokio-postgresql.

One could also argure the last sentence could be:

Here your least-vulnerable option is the r2d2 feature in Diesel, and its Actix example - in which case your DB crate choice will be Diesel.

@LukeMathWalker
Copy link
Owner

Asynchronous DB access is not trivial, and robust implementations are blocked by Rust compiler issue 59337. A particular issue is the case of transactions, which are often required in production. The current async DB crates have different vulnerabilities - it would be prudent to review Diesel issue 399 to see if your use case would be affected. An alternative way to workaround this for a production use case ( i.e. have transactions), is to use connection pools to 'simulate' asynchronous DB access. Here your least-vulnerable option is the DeadPool crate, and its transaction examples - in which case you DB crate choice will be tokio-postgresql.

Why are you suggesting that deadpool + tokio-postgresql is a more robust solution than sqlx's PgPool?

Sure, but this isn't my baby.

Oh, it's definitely mine 😁
I am asking for your opinion, as a reader, because, as the author, I actually do not consider the async solutions to be significantly worse than the synchronous ones. Or, to phrase it differently, not worse enough to warrant a "here be dragons" warning sign.

@bbros-dev
Copy link
Author

Why are you suggesting that deadpool + tokio-postgresql is a more robust solution than sqlx's PgPool?

Ignorance? More seriously, I didn't revise that because I took the deadpool claim of 100 line implementation at psuedo-face value - i.e. on inspection of managed code it seemed about right. The sqlx PgPool implementation (ignoring comments) looked even more elegant. From experience that code usually hides/locates the implementation detail elsewhere and since they don't make any claims about their implementation and don't point to the relevant functions I didn't investigate further. With my Rust skill-level that is about all that could be done.
Happy to stand corrected.

On a related note you have repeatedly put words in my mouth, I've tended to let it slide, but it is not common courtesy. "more robust" is not what I said. "least vulnerable" is not the same thing, and note I do refer to not just the code but working examples of a specific use case. Incase you take that too far - I note that I'm providing a suggestion aligned with your recommendation to use async DB crates. Hopefully it is clear I disagree with that assessment.

Unfortunately I cut my async teeth on MPI in the 90's. I still have nightmares about the dragons ;)
Seriously I'm hoping that Rust has solved all those problems, and maybe the compiler is now that smart - having poured over the MPI spec's and the debates about whether it's requirements are even possible on paper I am skeptical, but hopeful (those days were pre-CAP conjecture/proofs).

@LukeMathWalker
Copy link
Owner

On a related note you have repeatedly put words in my mouth, I've tended to let it slide, but it is not common courtesy. "more robust" is not what I said. "least vulnerable" is not the same thing, and note I do refer to not just the code but working examples of a specific use case. Incase you take that too far - I note that I'm providing a suggestion aligned with your recommendation to use async DB crates. Hopefully it is clear I disagree with that assessment.

My bad indeed, I am sorry 🙇🏼

@bbros-dev
Copy link
Author

Closing as resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants