misc: Make event loop THE implementation for httpRemoteTask and related code #26697
Reviewer's Guide
Makes the event loop based HttpRemoteTask implementation the default by removing the legacy executor-based implementation and its configuration flag, wiring HttpRemoteTaskFactory to always use SafeEventLoopGroup, and aligning tests and config mappings with the new default.
Sequence diagram for creating a RemoteTask using the event loop based HttpRemoteTask implementation
sequenceDiagram
participant CoordinatorTaskManager
participant HttpRemoteTaskFactory
participant SafeEventLoopGroup
participant SafeEventLoop
participant HttpRemoteTaskWithEventLoop
CoordinatorTaskManager->>HttpRemoteTaskFactory: createRemoteTask(session, taskId, node, fragment, initialSplits, outputBuffers, httpClient, config,...)
activate HttpRemoteTaskFactory
HttpRemoteTaskFactory->>SafeEventLoopGroup: next()
activate SafeEventLoopGroup
SafeEventLoopGroup-->>HttpRemoteTaskFactory: SafeEventLoop
deactivate SafeEventLoopGroup
HttpRemoteTaskFactory->>HttpRemoteTaskWithEventLoop: new HttpRemoteTaskWithEventLoop(..., eventLoop)
activate HttpRemoteTaskWithEventLoop
HttpRemoteTaskWithEventLoop-->>HttpRemoteTaskFactory: RemoteTask instance
deactivate HttpRemoteTaskWithEventLoop
HttpRemoteTaskFactory-->>CoordinatorTaskManager: RemoteTask
deactivate HttpRemoteTaskFactory
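The flow in the sequence diagram above can be sketched in plain Java. This is a minimal illustration, not the code from this PR: `Loop`, `LoopGroup`, `TaskWithEventLoop`, and `TaskFactory` are hypothetical, heavily simplified stand-ins for SafeEventLoop, SafeEventLoopGroup, HttpRemoteTaskWithEventLoop, and HttpRemoteTaskFactory, and the round-robin behavior of `next()` is an assumption, since the diagram only shows that `next()` returns a loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class EventLoopFactorySketch {

    /** Hypothetical stand-in for SafeEventLoop: one thread that runs callbacks in order. */
    static final class Loop {
        final int id;
        final ExecutorService executor = Executors.newSingleThreadExecutor();

        Loop(int id) {
            this.id = id;
        }
    }

    /** Hypothetical stand-in for SafeEventLoopGroup. Round-robin selection is an assumption. */
    static final class LoopGroup {
        private final List<Loop> loops = new ArrayList<>();
        private final AtomicInteger counter = new AtomicInteger();

        LoopGroup(int size) {
            for (int i = 0; i < size; i++) {
                loops.add(new Loop(i));
            }
        }

        Loop next() {
            return loops.get(Math.floorMod(counter.getAndIncrement(), loops.size()));
        }

        void shutdown() throws InterruptedException {
            for (Loop loop : loops) {
                loop.executor.shutdown();
                loop.executor.awaitTermination(5, TimeUnit.SECONDS);
            }
        }
    }

    /** Simplified RemoteTask: only what is needed to show the loop binding. */
    interface RemoteTask {
        String taskId();

        int loopId();

        Future<?> execute(Runnable callback);
    }

    /** Hypothetical stand-in for HttpRemoteTaskWithEventLoop: bound to one loop for its lifetime. */
    static final class TaskWithEventLoop implements RemoteTask {
        private final String taskId;
        private final Loop loop;

        TaskWithEventLoop(String taskId, Loop loop) {
            this.taskId = taskId;
            this.loop = loop;
        }

        @Override
        public String taskId() {
            return taskId;
        }

        @Override
        public int loopId() {
            return loop.id;
        }

        @Override
        public Future<?> execute(Runnable callback) {
            // Every callback of this task is serialized on the same loop thread.
            return loop.executor.submit(callback);
        }
    }

    /** Hypothetical stand-in for HttpRemoteTaskFactory: no flag, always picks a loop. */
    static final class TaskFactory {
        private final LoopGroup group;

        TaskFactory(LoopGroup group) {
            this.group = group;
        }

        RemoteTask createRemoteTask(String taskId) {
            return new TaskWithEventLoop(taskId, group.next());
        }
    }

    public static void main(String[] args) throws Exception {
        LoopGroup group = new LoopGroup(2);
        TaskFactory factory = new TaskFactory(group);
        for (String id : new String[] {"task-a", "task-b", "task-c"}) {
            RemoteTask task = factory.createRemoteTask(id);
            task.execute(() -> System.out.println(task.taskId() + " runs on loop " + task.loopId())).get();
        }
        group.shutdown();
    }
}
```

In this sketch each task keeps the loop it was given at construction, so all of its callbacks run on one thread and need no extra locking between them. Whether the real implementation relies on the same property is not stated in this PR page.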
Class diagram for event loop based HttpRemoteTask implementation and factory
classDiagram
class TaskManagerConfig {
-double highMemoryTaskKillerGCReclaimMemoryThreshold
-Duration highMemoryTaskKillerFrequentFullGCDurationThreshold
-double highMemoryTaskKillerHeapMemoryThreshold
-Duration slowMethodThresholdOnEventLoop
+getSlowMethodThresholdOnEventLoop() Duration
+setSlowMethodThresholdOnEventLoop(Duration slowMethodThresholdOnEventLoop) TaskManagerConfig
}
class SafeEventLoopGroup {
+SafeEventLoopGroup(int maxCallbackThreads, ThreadFactory threadFactory, Duration slowMethodThresholdOnEventLoop)
+next() SafeEventLoop
}
class SafeEventLoop {
+SafeEventLoop(SafeEventLoopGroup parent, Executor executor)
}
class RemoteTask {
<<interface>>
}
class HttpRemoteTaskWithEventLoop {
+HttpRemoteTaskWithEventLoop(Session session, TaskId taskId, String nodeId, URI legacyTaskLocation, URI taskLocation, PlanFragment fragment, List_SplitAssignment_ initialSplits, OutputBuffers outputBuffers, HttpClient httpClient, Duration maxErrorDuration, Duration taskStatusRefreshMaxWait, Duration taskInfoRefreshMaxWait, Duration taskInfoUpdateInterval, boolean summarizeTaskInfo, Codec_TaskStatus_ taskStatusCodec, Codec_TaskInfo_ taskInfoCodec, JsonCodec_TaskInfo_ taskInfoJsonCodec, Codec_TaskUpdateRequest_ taskUpdateRequestCodec, Codec_TaskInfoResponse_ taskInfoResponseCodec, Codec_PlanFragment_ planFragmentCodec, NodeStatusTracker nodeStatsTracker, TaskStatsTracker stats, boolean binaryTransportEnabled, boolean thriftTransportEnabled, boolean taskInfoThriftTransportEnabled, boolean taskUpdateRequestThriftSerdeEnabled, boolean taskInfoResponseThriftSerdeEnabled, ThriftProtocol thriftProtocol, TableWriteInfo tableWriteInfo, DataSize maxTaskUpdateSizeInBytes, MetadataManager metadataManager, QueryManager queryManager, DecayCounter taskUpdateRequestSize, boolean taskUpdateSizeTrackingEnabled, HandleResolver handleResolver, SchedulerStatsTracker schedulerStatsTracker, SafeEventLoop eventLoop)
}
class HttpRemoteTaskFactory {
-Optional_SafeEventLoopGroup_ eventLoopGroup
+HttpRemoteTaskFactory(TaskManagerConfig taskConfig, RemoteTaskConfig config, HttpClient httpClient)
+createRemoteTask(Session session, TaskId taskId, RemoteNode node, PlanFragment fragment, List_SplitAssignment_ initialSplits, OutputBuffers outputBuffers, HttpClient httpClient, Duration maxErrorDuration, Duration taskStatusRefreshMaxWait, Duration taskInfoRefreshMaxWait, Duration taskInfoUpdateInterval, boolean summarizeTaskInfo, Codec_TaskStatus_ taskStatusCodec, Codec_TaskInfo_ taskInfoCodec, JsonCodec_TaskInfo_ taskInfoJsonCodec, Codec_TaskUpdateRequest_ taskUpdateRequestCodec, Codec_TaskInfoResponse_ taskInfoResponseCodec, Codec_PlanFragment_ planFragmentCodec, NodeStatusTracker nodeStatsTracker, TaskStatsTracker stats, boolean binaryTransportEnabled, boolean thriftTransportEnabled, boolean taskInfoThriftTransportEnabled, boolean taskUpdateRequestThriftSerdeEnabled, boolean taskInfoResponseThriftSerdeEnabled, ThriftProtocol thriftProtocol, TableWriteInfo tableWriteInfo, DataSize maxTaskUpdateSizeInBytes, MetadataManager metadataManager, QueryManager queryManager, DecayCounter taskUpdateRequestSize, boolean taskUpdateSizeTrackingEnabled, HandleResolver handleResolver, SchedulerStatsTracker schedulerStatsTracker) RemoteTask
}
HttpRemoteTaskWithEventLoop ..|> RemoteTask
HttpRemoteTaskFactory o-- SafeEventLoopGroup
HttpRemoteTaskFactory ..> HttpRemoteTaskWithEventLoop
HttpRemoteTaskFactory ..> TaskManagerConfig
SafeEventLoopGroup o-- SafeEventLoop
TaskManagerConfig ..> SafeEventLoopGroup
File-Level Changes
Can we instead make the default value of task.enable-event-loop true, let it bake for a couple of releases, and then delete the code?
I'm not sure whether other open source users have ever enabled and tested this event loop feature. Making it true by default might surface some issues in non-Meta deployments, which we can fix before deleting the fallback code.
4b71700 to 3cfb487
Description
Motivation and Context
Clean up the duplicate code that was kept for rollout purposes.
Impact
Test Plan
Contributor checklist
Release Notes