Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guidance on handling closed connections #315

Open
ruslandoga opened this issue Nov 9, 2024 · 7 comments
Open

Guidance on handling closed connections #315

ruslandoga opened this issue Nov 9, 2024 · 7 comments

Comments

@ruslandoga
Copy link

ruslandoga commented Nov 9, 2024

👋

I wonder if it's possible to gracefully reconnect once the underlying adapter's connection is detected to be closed to avoid :closed errors on user queries (DBConnection.query / DBConnection.execute)? For example, maybe we could use :inet.monitor/1 from the adapters? Maybe it could be possible to let the adapters handle_info the DOWN messages?

Related Ch issues / PRs: plausible/ch#210, plausible/ch#211

@josevalim
Copy link
Member

If I understand correctly, the issue is that DBConnection is not very kind to changing the connection once a query starts, right? So we would need to add a mechanism that allows the adapter to say: "try again with another connection, I am going away"?

@ruslandoga
Copy link
Author

ruslandoga commented Nov 9, 2024

the issue is that DBConnection is not very kind to changing the connection once a query starts, right?

I think we can already "swizzle" the connection without "telling" DBConnection like in plausible/ch#211. So that part seems to be OK.

So we would need to add a mechanism that allows the adapter to say: "try again with another connection, I am going away"?

Do you mean a new TCP connection entirely, or another DBConnection connection from the pool? The approach I linked above is about a new TCP connection entirely, and that seems to work but kind of feels like the TCP reconnection can happen earlier, not on ping or query attempt. If you mean another DBConnection connection from the existing pool, they might be disconnected as well.

I was thinking more in terms of:

  • connection is idle, it's some time away from the next ping and there is no user query incoming
  • underlying TCP connection gets closed
  • connection gets DOWN message from the monitor and asks DBConnection to reconnect (via {:disconnect, conn})
  • error is logged (similar to ping callback), making it clear that there might be some instability in the user's system
  • the next arriving query gets a "connected" connection, or at least a best effort at it

What seems to be happening right now in Ch is the following:

  • connection is idle, it's some time away from the next ping and there is no user query incoming
  • underlying TCP connection gets closed
  • DBConnection query arrives, uses a closed connection, gets :closed error back, DBConnection.Connection disconnects
  • caller gets an error (needs to retry)
  • DBConnection starts a new DBConnection.Connection (=reconnects)
  • the next query works

So that it's all correct, but could've been handled earlier without making the caller retry the query.

For some additional context, I think it might be similar to sneako/finch#62, sneako/finch#272

@josevalim
Copy link
Member

I was thinking more in terms of

Makes sense! I think this can all be implemented in the connection then, no? Without a need to change DBConnection? You should be able to receive the :inet.monitor messages in your connection process.

@ruslandoga
Copy link
Author

ruslandoga commented Nov 9, 2024

I'll try but I think info messages are not "propagated" to the adapter. But maybe one of the existing {:DOWN,...} handlers would work out of the box. First thought is that they wouldn't work since this monitor ref would be different.

@ruslandoga
Copy link
Author

I tried it out in plausible/ch#212 and the unmatched info messages seem to be ignored:

12:29:27.338 [info] Ch.Connection (#PID<0.1827.0>) missed message: {:DOWN, #Reference<0.854798976.1477443587.123761>, :port, #Port<0.4>, :normal}

12:29:27.338 [info] Ch.Connection (#PID<0.1827.0>) missed message: {:DOWN, #Reference<0.854798976.1477443587.123863>, :port, #Port<0.5>, :normal}

@josevalim
Copy link
Member

Good catch, so let's proxy those if the connection has defined a handle_info callback (which will be optional). Can you please send a PR? Otherwise let me know and I will tackle it.

@ruslandoga
Copy link
Author

Can you please send a PR?

Sure!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants