-
Notifications
You must be signed in to change notification settings - Fork 110
fix(2117): if si_wq is full, reset connection in case of flooding #2150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
In general I don't have any corrections, however I have few suggestions:
@krizhanovsky Please see this comment, we need to know your opinion. |
It's not so easy to do this. Maybe do not close that issue for later investigation. |
const-t
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Only one question, maybe we should do the same thing for websockets as well?
Yes, other places should be fixed, too. I'll try to cover them later. |
krizhanovsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a lot of questions about this PR. Also from #2117 :
The root cause of the tls error is that si_wq (which has a default budget of 10, but even 1,000,000 is not enough to flood a single connection) cannot tolerate the high rate of ping acks being sent and returns -EBUSY
Why? si_wq is supposed to be a very fast lock-free RB and we wake up a target processor on insertions. What's the reason for the slowness on the read (processing) side? There is something fundamentally broken if a Python script can flood the lock-free in-kernel network processing.
Probably it's OK to reset TCP connections, which we can't handle, but we should not involve security events handling for this.
Having #1940 (comment) in mind, I'd propose to postpone the fix until #1940
| tls_state_to_str(tls->state), r, | ||
| r == -EBADMSG ? "(bad ciphertext)" : ""); | ||
| return r; | ||
| return T_BLOCK_WITH_RST; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we're going to block the client, not just reset it's connection. As noted in lib/log.h the return code for a security event, not for OOM, which might have different reasons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's pure misleading in naming, all I want is sending RST, but we have only T_BLOCK_WITH_RST constant. Maybe we should bring in a dedicated constant for normal RST.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, if we need to reset connections, then we do need designated RST constant and appropriate workflow handling the return codes.
| } | ||
|
|
||
| if (unlikely(SS_CONN_TYPE(sk) & Conn_Reset)) | ||
| return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Introducing Conn_Reset state, used only in one function seems like a workaround. I'd prefer a more clear solution. Seems this state is only needed to not to execute the skb processing while the socket is in the queue for closing, so don't we already have enough information about the socket state at the moment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not a workaround, but a bugfix: In case of errors, we should exit the call chain and stop handling the involved TLS records, unrolled skb, sk->sk_receive_queue, and future sk_data_ready callbacks from the kernel (that's why we need to reset but not close the socket). Otherwise, it's an undefined behavior. And yes, we didn't cover the RST case (not close) in that function, that's exactly why I made changes here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new socket state used only in one function looks awkward.
Can we handle this with a real socket state? Maybe we can even avoid calling sk_data_ready by setting some socket state and/or doing partial close/reset?
Co-authored-by: Alexander Krizhanovsky <[email protected]>
The bottleneck is not the locking, but the efficiency of work processing in the NET_TX softirq. Obviously, sending is much slower than enqueuing, maybe due to the TLS encryption. Another suspicious point is, that when we handle the sending of the same socket in another CPU, it's most likely we will be blocked at the locking of the |
|
Closed in favor of #2257 |
close #2117
The leak only occurs when
si_wqis full and continues to process the current skb (which may contain the remaining SSL record),sk->sk_receive_queue, and possibly skbs that come in later. The leak is never triggered when we reset the connection and stop processing data immediately. Such a fix would be reasonable even without the leak since it is unlikely thatsi_wqwill become full without flooding.