All of the details asked for in the template below are specified in my reproducer https://github.com/davepacheco/diesel-oid. I'll also fill it in as requested below but I think it'll be easier to just look at the reproducer.
Setup
Versions
Rust: 1.77.0
Diesel: 2.1.5
Database: PostgreSQL 16.2
Operating System: macOS Ventura 13.6.1
Feature Flags
diesel: default + "postgres"
Problem Description
We have a program that:
runs some queries that insert rows containing user-defined types
some other program runs a schema change that's compatible with the Diesel schema but changes the type's OID (i.e., by dropping the type and re-adding it)
runs more of those queries
After the schema change, such queries start failing with errors like "type with ID 218 does not exist" (CockroachDB) or "cache lookup failed for type 16455" (PostgreSQL).
You could argue that this is not supposed to work. But restarting the application or executing the queries from new connections fixes the problem. It's the stale cache in the Diesel connection that's broken: if Diesel didn't cache the OIDs, I think the application would work.
I don't know if PostgreSQL provides any way for Diesel to handle this without the application's help. If not, how should an application deal with this? Should the application have a way to disable caching? Or maybe a way to communicate to Diesel that the cache needs to be flushed?
To make things more complicated, the application where we actually hit this proactively populates the Diesel cache with all of our user-defined types to avoid lots of tiny queries for very short-lived connections (our test suite).
Relatedly: why is the cache per-connection? We're using a connection pool. So even if Diesel had a way to flush the cache and re-fill it, it's hard for us to go do that on every connection. It'd be really nice if the cache were distributed via a watch channel or something where we could update a canonical copy and then propagate that out to all the consumers that want it. I guess that might be hard if consumers need to modify it themselves as they run into new types that they need?
What are you trying to accomplish?
We're trying to manage an application where the database schema can change over time. We're not yet doing online schema updates, in that we're not expecting the application to work seamlessly as the schema is updated. We do expect that we can start the program, have it wait for the schema to match the version that it understands, and then proceed.
First, we store an integer version number for our schema in a database table.
To update our system to a new version of the software + schema, we do the following:
shut the application down ("old" application version, "old" schema version)
replace the application binaries on disk
start the application again ("new" application version)
on startup, it checks the schema version. It'll still be the old one. It blocks, re-checking every few seconds until the version matches what's expected.
an operator executes a separate program to perform the schema update.
this unblocks the application above
Our problem is that Diesel may have cached some types' OIDs before the schema change. If that happens, then the application is totally broken (lots of queries fail) after the schema change.
In diesel-oid, run cargo run -- postgresql://127.0.0.1:5432/dap.
I see:
$ cargo run -- postgresql://127.0.0.1:5432/dap
Compiling diesel-oid v0.1.0 (/Users/dap/diesel-oid)
Finished dev [unoptimized + debuginfo] target(s) in 0.44s
Running `target/debug/diesel-oid 'postgresql://127.0.0.1:5432/dap'`
connecting to: postgresql://127.0.0.1:5432/dap
connected!
initialized schema
inserted rows: 1
found rows: [MyTable { a: 0, b: One }]
updated schema
ERROR: second insert failed: cache lookup failed for type 16455
establishing second connection
connected!
insert on second connection worked: 1
contents of table after second insert (second conn):
rows: [MyTable { a: 0, b: One }, MyTable { a: 0, b: One }]
contents of table after second insert (first conn):
Error: listing contents of table
Caused by:
cached plan must not change result type
I'm not worried about "cached plan must not change result type" -- I get that one. The problem is the "cache lookup failed for type 16455". I think what's going on is:
From one connection, we successfully INSERT a row into table my_table, which has a column of user-defined type my_type. This populates Diesel’s OID cache for the type my_type.
Then we change the schema in a way that’s compatible with the Diesel definition of that schema, but with a different OID (by dropping and re-creating the user-defined type).
Then we try the INSERT again. It fails unexpectedly. It appears that what’s happening is Diesel is sending the same OID as before, from its cache, but that’s wrong now.
To prove it’s just a caching issue: we create a second connection and do the same INSERT again. It works.
Thanks for filing this bug report. It's definitely helpful to have a good minimal example for this.
That said: I unfortunately see no good way to change the current behavior without running into massive performance problems. I think this essentially boils down to: changing the schema is only supported as long as the various internal caches in the connection are not populated yet. After that, it's not safe to reuse the connection and expect it to continue to work. At that point you need to establish new connections, by marking your connection pool as stale and re-establishing its connections. After all, at the point where you run the schema change you know that something might go stale, which is more than Diesel knows about those particular queries.
That said: I'm open to accepting contributions to improve the current behavior, but I would strongly prefer to have a suggested design first before someone starts implementing something. For me that's currently not a priority (well, maybe beyond just adding a documentation note about this specific behavior).
Relatedly: why is the cache per-connection? We're using a connection pool. So even if Diesel had a way to flush the cache and re-fill it, it's hard for us to go do that on every connection. It'd be really nice if the cache were distributed via a watch channel or something where we could update a canonical copy and then propagate that out to all the consumers that want it. I guess that might be hard if consumers need to modify it themselves as they run into new types that they need?
Simply because there is no pool in Diesel; we just provide support for r2d2, and that does not allow storing this information in the pool itself. That leaves only storing it in the connection. I cannot really comment on other third-party pool implementations here. Again, that's something where I would accept contributions to change it.
@fbecart Please note that this issue is not about the prepared statement cache, but about the OID type cache, which is something different. So your post here is off topic. Please create a new discussion if you feel that this needs to be discussed.
For much more detail, see oxidecomputer/omicron#5561.
What is the expected output?
For my reproducer, I expect no "cache lookup failed" error.
What is the actual output?
For my reproducer, I see the "cache lookup failed" error.
Are you seeing any additional errors?
See reproducer README or output below.
Steps to reproduce
Create a database named dap. In diesel-oid, run cargo run -- postgresql://127.0.0.1:5432/dap. I see the output shown above.
I am not sure how to reproduce this without diesel_derive_enum, but I do not believe it's related to the problem.