Hey @ucjonathan, this issue isn't MySQL-specific; it applies to PostgreSQL too. I filled in the issue incorrectly: I specified only the aurora-mysql plugin in the description, but both are affected.
Either way, your suggested fix seems appropriate. As soon as the driver is connected to a writer, it should go ahead and serve requests; there is no reason to wait for other instances.
I expect this applies to MAZ clusters too, not just Aurora. Whatever the context, as soon as you have a writer there's no need to wait for another instance to come up if you're looking for a writer endpoint.
A new version of the failover plugin was merged recently. It is a reworked and re-architected plugin for cluster failover. In general, the new failover2 plugin shows better stability, and we hope it resolves the issue you reported.
The new plugin is available in the latest snapshot build. Could you kindly check out our snapshot build and let us know if the issue still persists with the new failover2 plugin?
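For anyone trying the suggestion above, a minimal sketch of selecting the new plugin through connection properties might look like the following. It assumes the wrapper's plugin list is set through the `wrapperPlugins` property and that the new plugin's code is `failover2`; the class name, method name, and credentials are placeholders made up for illustration, so check the snapshot build's documentation before relying on it.

```java
import java.util.Properties;

public class Failover2Config {

    // Builds connection properties that request the failover2 plugin.
    // Assumption: "wrapperPlugins" is the property the wrapper reads for its
    // plugin list, and "failover2" is the code of the reworked failover plugin.
    static Properties failover2Properties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("wrapperPlugins", "failover2");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder credentials, for illustration only.
        Properties props = failover2Properties("admin", "secret");
        System.out.println("wrapperPlugins=" + props.getProperty("wrapperPlugins"));
    }
}
```

The resulting `Properties` object would then be passed to `DriverManager.getConnection(url, props)` together with the wrapper's JDBC URL.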
Describe the bug
The bug lies in the code here: aws-advanced-jdbc-wrapper/wrapper/src/main/java/software/amazon/jdbc/plugin/failover/ClusterAwareWriterFailoverHandler.java, line 408 at commit d9a563b.
I don't know the history of this check, but it's problematic in a few situations.
Take a two-instance cluster with instances Foo and Bar, and let's say Foo is the writer. Foo crashes and Bar is promoted to writer. When Bar becomes available, the driver stays stuck in this loop until Foo comes back up as a reader and brings the topology size to two (which, depending on other problems, may never happen in bounded time). However, as soon as the driver is connected to Bar it has a writer connection and can complete the failover, so all the additional downtime is unnecessary.
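The difference between the two completion conditions can be sketched as follows. This is not the driver's code: the class, the `Host` type, and both method names are hypothetical, and the "topology must be full" check is only a model of the behavior described above, not a copy of the check at line 408.

```java
import java.util.Collections;
import java.util.List;

public class WriterFailoverCheckSketch {

    enum Role { WRITER, READER }

    // Hypothetical stand-in for a topology entry; not a driver class.
    static final class Host {
        final String name;
        final Role role;

        Host(String name, Role role) {
            this.name = name;
            this.role = role;
        }
    }

    // Models the reported behavior: failover is treated as complete only
    // once the topology lists every expected instance.
    static boolean completesWhenTopologyFull(List<Host> topology, int expectedSize) {
        return topology.size() >= expectedSize;
    }

    // Models the expected behavior: failover is complete as soon as the
    // host we are connected to appears in the topology as the writer.
    static boolean completesWhenWriterConnected(List<Host> topology, String connectedHost) {
        for (Host h : topology) {
            if (h.name.equals(connectedHost) && h.role == Role.WRITER) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Foo has crashed; only Bar is visible and it is now the writer.
        List<Host> topology = Collections.singletonList(new Host("Bar", Role.WRITER));

        System.out.println("topology-size check: " + completesWhenTopologyFull(topology, 2));
        System.out.println("writer-connected check: " + completesWhenWriterConnected(topology, "Bar"));
    }
}
```

In this scenario the size-based check keeps waiting for Foo, while the writer-based check reports completion immediately.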
Expected Behavior
I expect the driver to restore availability to clients looking for a writer as soon as it is connected to the new writer, regardless of the rest of the topology (the number of readers and their health).
What plugins are used? What other connection properties were set?
aurora-mysql
Current Behavior
When connected to a two-instance Aurora MySQL cluster, calling the failover-db-cluster API triggers a failover that the driver does not complete until both instances have restarted (the reader is promoted and restarts as the writer, and the old writer restarts as a reader). It should complete as soon as the new writer is up.
Reproduction Steps
Create a two-instance Aurora MySQL cluster. Connect and send queries with the driver. Trigger a failover with the API. Wait for the FailoverSuccessSQLException. Note that it arrives later than the time at which the new writer comes up, which you can get from the DB CloudWatch logs, for example.
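A rough probe for the steps above could look like the sketch below. It avoids a compile-time dependency on the wrapper by checking the SQLState instead of catching FailoverSuccessSQLException directly; the assumption that a completed failover is reported with SQLState 08S02 should be verified against the driver version in use. The class name, polling interval, and command-line arguments are illustrative only.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.time.Instant;

public class FailoverTimingProbe {

    // Assumption: the wrapper signals a completed failover with this SQLState.
    static final String FAILOVER_SUCCESS_SQLSTATE = "08S02";

    static boolean isFailoverSuccess(SQLException e) {
        return FAILOVER_SUCCESS_SQLSTATE.equals(e.getSQLState());
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("usage: FailoverTimingProbe <jdbc-url> <user> <password>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            while (true) {
                try (Statement st = conn.createStatement();
                     ResultSet rs = st.executeQuery("SELECT 1")) {
                    rs.next();
                } catch (SQLException e) {
                    if (isFailoverSuccess(e)) {
                        // Compare this timestamp with the new writer's start time in the DB logs.
                        System.out.println(Instant.now() + " driver reported failover complete");
                        return;
                    }
                    throw e; // any other error ends the probe
                }
                Thread.sleep(500); // poll twice per second
            }
        }
    }
}
```

Run it against the cluster endpoint, trigger the failover, and compare the printed timestamp with the time the new writer became available.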
Possible Solution
No response
Additional Information/Context
No response
The AWS Advanced JDBC Driver version used
latest
JDK version used
11
Operating System and version
osx