-
Notifications
You must be signed in to change notification settings - Fork 1
[BE] SSE connection 부하 문제 개선 #265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- 서버가 최대 1703개의 connection을 동시에 연결 가능하다는 점을 고려 - 최대 연결 가능한 SSE 통신 개수를 1500개로 지정
WalkthroughThe changes add a new connection limit to the SSE service by introducing a constant ( Changes
Sequence Diagram(s)sequenceDiagram
participant Subscriber
participant SseService
Subscriber->>SseService: Request subscription for course/student messages
SseService->>SseService: call checkIfConnectionFull()
alt Connection limit reached
SseService-->>Subscriber: Return empty Flux
else
SseService-->>Subscriber: Proceed with subscription
end
Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
back-end/reacton-classroom/src/main/java/com/softeer/reacton_classroom/domain/sse/service/SseService.java (1)
148-154: Well-implemented connection limit check.The
checkIfConnectionFull()method effectively determines if the connection limit has been reached. However, there's a small discrepancy in the log message.The log message says "exceeded" (초과했습니다) while the condition checks if the connection count has "reached or exceeded" the limit. Consider updating the message for accuracy:
- log.debug("SSE 연결 제한을 초과했습니다. : 현재 연결 개수 = {}", sinks.size()); + log.debug("SSE 연결 제한에 도달했습니다. : 현재 연결 개수 = {}", sinks.size());
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
back-end/reacton-classroom/src/main/java/com/softeer/reacton_classroom/domain/sse/service/SseService.java(5 hunks)
🔇 Additional comments (8)
back-end/reacton-classroom/src/main/java/com/softeer/reacton_classroom/domain/sse/service/SseService.java (8)
25-25: Good implementation of connection limit constant.The
MAX_SINKSconstant is appropriately defined with a value of 1500, addressing the OutOfMemoryError issue mentioned in the PR objectives. This aligns with the findings from issue #221 which determined the maximum safe number of SSE connections.
31-33: Effective connection limit check for course messages.This connection limit check is correctly implemented to prevent new course connections when the limit is reached. The empty Flux return ensures graceful rejection of new connections.
47-49: Effective connection limit check for student messages.The connection limit check is properly implemented for student messages as well, ensuring consistent behavior across both types of connections.
88-88: Good monitoring addition for course connections.Adding the
showCurrentUsers()call improves observability by logging the current connection count when a new course connection is established.
95-95: Improved log formatting.Minor improvement to the log formatting for better readability.
106-106: Good monitoring addition for student connections.The
showCurrentUsers()call provides consistent connection monitoring for student connections.
113-113: Improved log formatting.Minor improvement to the log formatting for better readability.
156-158: Good monitoring implementation.The
showCurrentUsers()method provides helpful visibility into the current connection count, which is valuable for monitoring and debugging.
| private boolean checkIfConnectionFull() { | ||
| if (sinks.size() >= MAX_SINKS) { | ||
| log.debug("SSE 연결 제한을 초과했습니다. : 현재 연결 개수 = {}", sinks.size()); | ||
| return true; | ||
| } | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
만약 1499개의 연결이 있을 때 여러개의 연결 요청이 동시에 들어와서 1500개를 넘어가는 경우도 있지 않을까요? 물론 >= 연산이기 때문에 적절하게 연결 제한 초과했다고 알리겠지만 연결 요청을 처리하는 스레드가 2~300개일 경우도 있지 않을까 하는 생각이 들었습니다. 저희 스레드풀의 스레드 수를 확인해보시면 좋을 것 같아요. 개인적으로 제가 말씀드린 내용으로 인해 큰 문제가 생기진 않을 것 같긴 합니다.😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
그러네요. 동시성 이슈가 생길 수 있다고 생각합니다. 그래서 새로운 방법으로 AtomicInteger를 도입하고, 실제 Connection에 추가하는 시점에 개수를 검사해 만약 1500개가 넘기게 되면 예외 처리를 하도록 수정하는 방법은 어떨까요? AtomicInteger를 사용하면 race condition 문제도 해결 가능하고, 실제 connection을 추가하는 시점에서 개수 검사를 진행해 유리님께서 말씀하신 문제를 개선할 수 있지 않을까 하는 생각이 들었습니다.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AutomicInteger를 쓰게 되면 race condition 문제는 해결이 되겠지만 그럼 AutomicInteger의 값을 증감할 때 synchronized 처리가 되는 거 아닌가요? 이전처럼 size() 함수를 호출하여 검사하는 방법보다 AutomicInteger를 사용했을 때만의 장점이 크게 와닿진 않는 것 같아요.
혹시 AutomicInteger를 썼을 때의 장점을 좀 더 설명해주실 수 있나요??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
음 ConcurrentHashMap도 공유 자원이기 때문에 충분히 ConcurrentHashMap.size()를 사용해도 AtomicInteger와 같은 효과겠네요. 연결 개수 관리 또한 공유 자원이라는 생각으로 AtomicInteger를 도입했는데, ConcurrentHashMap.size()를 사용하면 굳이 사용하지 않아도 될 것 같습니다.
- AtomicInteger로 멀티스레드 환경에서 race condition 방지 - 실제 connection을 추가하는 시점에서 개수 체크를 수행함으로써, 동시성 이슈 개선
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
back-end/reacton-classroom/src/main/java/com/softeer/reacton_classroom/domain/sse/service/SseService.java (1)
32-42: Consider returning an error response instead of empty Flux.When connections are full, the current implementation returns
Flux.empty(), which doesn't inform the client why the connection was rejected. Consider returning a proper error response with theCONNECTION_LIMIT_EXCEEDEDerror code.The approach of checking both before trying to create a connection and after incrementing the counter effectively addresses the race condition concern mentioned in previous comments.
public Flux<ServerSentEvent<MessageResponse<?>>> subscribeCourseMessages(String courseId) { if (checkIfConnectionFull()) { - return Flux.empty(); + MessageResponse<?> errorMessage = new MessageResponse<>("CONNECTION_LIMIT_EXCEEDED", null); + ServerSentEvent<MessageResponse<?>> errorEvent = ServerSentEvent.<MessageResponse<?>>builder() + .retry(Duration.ofSeconds(RETRY_INTERVAL_SECONDS)) + .data(errorMessage) + .build(); + return Flux.just(errorEvent); } // Rest of the method unchanged }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
back-end/reacton-classroom/src/main/java/com/softeer/reacton_classroom/domain/sse/service/SseService.java(5 hunks)back-end/reacton-classroom/src/main/java/com/softeer/reacton_classroom/global/exception/code/SseErrorCode.java(1 hunks)
🔇 Additional comments (8)
back-end/reacton-classroom/src/main/java/com/softeer/reacton_classroom/global/exception/code/SseErrorCode.java (1)
11-11: Appropriate error code for connection limiting.The newly added error code uses the semantically correct HTTP status code
TOO_MANY_REQUESTS(429) which is the standard response for rate limiting scenarios. The error message clearly communicates that there are too many users connected.back-end/reacton-classroom/src/main/java/com/softeer/reacton_classroom/domain/sse/service/SseService.java (7)
26-27: Good implementation of connection limiting.Using a constant to define the maximum number of connections (1500) along with an
AtomicIntegeris appropriate for concurrent environments. TheAtomicIntegerprevents race conditions when multiple threads try to increment or decrement the counter simultaneously.
54-68: Consider consistent error handling for subscribeStudentMessages.Similar to the previous comment, consider returning a meaningful error response instead of
Flux.empty()when the connection limit is exceeded.The implementation correctly checks at both decision points - before attempting to create a connection and after incrementing the counter - which effectively prevents exceeding the connection limit.
101-102: Helpful diagnostic logging.Adding the user count logging with
showCurrentUsers()is a good practice for monitoring connection state.
119-120: Consistent logging across different connection types.The same diagnostic logging is appropriately applied to student connections as well.
157-163: Properly releasing resources when connections close.The updated
closeConnectionmethod correctly decrements the counter when connections are closed, which is essential for accurate connection tracking.
165-172: Effective connection limit check implementation.The
checkIfConnectionFullmethod provides a clean abstraction for checking the connection limit status. The logging statement when connections are full is helpful for debugging and monitoring.
174-176: Useful monitoring utility.The
showCurrentUsersmethod provides valuable runtime information about the current connection count, which will help with monitoring the service under load.
📝 추가 작업 내용기존 구현 방식은 동시성 이슈가 발생할 위험이 높아, 아래와 같은 방식으로 connection 개수를 관리하도록 수정
|
ConcurrentHashMap.size() vs AutomicInteger 비교
AutomicInteger가 더 좋아보임size()를 매번 확인하는 것보다 AutomicInteger의 장점이 더 좋아 보입니다. AutomicInteger는 CPU 아주 가까이서 연산되는 CAS 연산이라고 합니다. 찾아보는 해당 방법을 적용하는 게 좋아 보입니다!! 근데 decrementAndGet()이 제대로 호출되지 않을 수도 있으니 그 부분을 고려해봐야 할 것 같아요. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
back-end/reacton-classroom/src/main/java/com/softeer/reacton_classroom/domain/sse/service/SseService.java (2)
55-57: Consider providing feedback to client when connection limit is reachedWhen the connection limit is reached, the current implementation returns an empty Flux, which doesn't communicate to the client why the connection was rejected. Consider adding an error response that explains the connection was refused due to server load.
- if (checkIfConnectionFull()) { - return Flux.empty(); - } + if (checkIfConnectionFull()) { + return Flux.error(new BaseException(SseErrorCode.CONNECTION_LIMIT_EXCEEDED)); + }Also applies to: 62-68
165-172: Potential improvement for checking connection limitThe current implementation can potentially have a race condition because it reads the connection count and then performs a check. Consider using a direct comparison with the AtomicInteger's compareAndSet method for more atomic operations.
private boolean checkIfConnectionFull() { - int connectionCount = currentConnections.get(); - if (connectionCount >= MAX_SINKS) { + if (currentConnections.get() >= MAX_SINKS) { log.debug("SSE 연결 제한을 초과했습니다. : 현재 연결 개수 = {}", connectionCount); return true; } return false; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
back-end/reacton-classroom/src/main/java/com/softeer/reacton_classroom/domain/sse/service/SseService.java(5 hunks)
🔇 Additional comments (7)
back-end/reacton-classroom/src/main/java/com/softeer/reacton_classroom/domain/sse/service/SseService.java (7)
18-18: Good implementation of connection limit with AtomicInteger!The use of AtomicInteger for tracking connections is a robust solution for managing concurrent connections in a thread-safe manner. Setting the MAX_SINKS constant to 1500 will help prevent OutOfMemoryError issues as mentioned in the PR objectives.
Also applies to: 26-27
32-35: Early check for connection limit is efficientChecking the connection count before attempting to create a new sink is a good optimization to avoid unnecessary processing when connections are already at maximum capacity.
36-42: Effective race condition prevention with atomic increment and checkThe use of
incrementAndGet()followed by an immediate check and possibledecrementAndGet()is an excellent approach to handle the race condition scenario mentioned in the PR comments. This ensures that even when multiple threads try to create connections simultaneously near the limit, you won't exceed the 1500 connection threshold.
102-102: Improved monitoring with connection count loggingThe addition of
showCurrentUsers()calls provides valuable visibility into the current system load, which will be helpful for monitoring connection trends and potential bottlenecks.Also applies to: 120-120
157-163: Robust connection closure logic with connection counter managementThe improved
closeConnectionmethod shows good attention to detail:
- Renamed parameter from
courseIdto more genericidfor better reusability- Added null check to prevent NullPointerException
- Properly decrements the connection counter to maintain accurate tracking
This ensures clean resource management and accurate connection tracking.
174-176: Clear logging for connection monitoringThe
showCurrentUsersmethod provides a simple yet effective way to monitor the current load on the SSE service. This will be valuable for operational monitoring and troubleshooting.
96-99:Details
❓ Verification inconclusive
Verify disconnect handling for message broadcast
The current implementation sends messages to all students in a course, but there's no handling for the case where a student might have disconnected unexpectedly. This could lead to exceptions when trying to send messages to disconnected clients.
🏁 Script executed:
#!/bin/bash # Check if there's any error handling for sending messages to potentially disconnected students # Looking for try-catch blocks or error handling around the message sending logic rg -A 5 -B 5 "sendMessage.*studentId.*message" --include="*.java" # Check if the sendMessage method handles the case where a sink might be disconnected rg -A 10 "sendMessage.*sink.*tryEmitNext" --include="*.java"Length of output: 1014
Review Update: Ensure Explicit Disconnect Handling in
SseServiceThe current loop at lines 96–99 iterates over students and calls
sendMessage(studentId, message)without any apparent error handling. This raises concerns about what happens if a client has disconnected (e.g., if the underlying sink is no longer available), as exceptions might then disrupt the broadcast.It’s unclear from the existing code whether the
sendMessagemethod internally manages disconnection issues (for example, by wrapping its call to something like atryEmitNextin a try-catch block or by checking the connection status). Please verify one of the following:
If disconnect handling is implemented inside
sendMessageor elsewhere:
Ensure that this behavior is documented so that the implicit handling is clear to future maintainers.Otherwise:
Consider adding explicit error handling (for example, wrapping thesendMessagecall in a try-catch block) to gracefully handle scenarios where a student’s connection is lost. This will help prevent unhandled exceptions during message broadcast.
음 성능 면에서 차이가 있었네요. 매 번 연결할 때마다 크기를 확인해야 한다는 점을 고려하면 크기 값을 가져오는 성능이 더 뛰어나고, 정확하게 1500개의 connection으로 제한할 수 있는 AtomicInteger가 적합해 보입니다. |
uri010
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
이랬다 저랬다 해서 죄송해요 ㅠㅠ 수고하셨습니다!
[BE] SSE connection 부하 문제 개선
#️⃣ 연관된 이슈
#251
📝 작업 내용
동시에 너무 많은 SSE 통신 연결 시 OutOfMemoryError가 발생하는 문제 해결
💬 리뷰 요구사항(선택)
Summary by CodeRabbit