Skip to content

Conversation

@shangm2
Copy link
Contributor

@shangm2 shangm2 commented Nov 25, 2025

Description

  1. event loop based http remote task, task status fetcher and task info fetcher has been reliably running in production for many months now. This pr makes event loop as the default implementation for those three classes and remove the old code.
  2. I also tried to replay production queries with only 5 threads for the event loop attempting to catch any potential issues but no issue has been found.
  3. heapdump analysis also showed that the failTask call is heavy containing a recursive toFailure call. This is something we can optimize.

Motivation and Context

clean up the duplicate code back then for roll out purpose.

Impact

Test Plan

  1. passed verifiers

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.
  • If adding new dependencies, verified they have an OpenSSF Scorecard score of 5.0 or higher (or obtained explicit TSC approval for lower scores).

Release Notes

== NO RELEASE NOTE ==

@shangm2 shangm2 requested a review from a team as a code owner November 25, 2025 17:41
@prestodb-ci prestodb-ci added the from:Meta PR from Meta label Nov 25, 2025
@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Nov 25, 2025

Reviewer's Guide

Makes the event-loop based HttpRemoteTask implementation the default by removing the legacy executor-based implementation and its configuration flag, wiring HttpRemoteTaskFactory to always use SafeEventLoopGroup, and aligning tests and config mappings with the new default.

Sequence diagram for creating a RemoteTask using the event loop based HttpRemoteTask implementation

sequenceDiagram
    participant CoordinatorTaskManager
    participant HttpRemoteTaskFactory
    participant SafeEventLoopGroup
    participant SafeEventLoop
    participant HttpRemoteTaskWithEventLoop

    CoordinatorTaskManager->>HttpRemoteTaskFactory: createRemoteTask(session, taskId, node, fragment, initialSplits, outputBuffers, httpClient, config,...)
    activate HttpRemoteTaskFactory

    HttpRemoteTaskFactory->>SafeEventLoopGroup: next()
    activate SafeEventLoopGroup
    SafeEventLoopGroup-->>HttpRemoteTaskFactory: SafeEventLoop
    deactivate SafeEventLoopGroup

    HttpRemoteTaskFactory->>HttpRemoteTaskWithEventLoop: new HttpRemoteTaskWithEventLoop(..., eventLoop)
    activate HttpRemoteTaskWithEventLoop
    HttpRemoteTaskWithEventLoop-->>HttpRemoteTaskFactory: RemoteTask instance
    deactivate HttpRemoteTaskWithEventLoop

    HttpRemoteTaskFactory-->>CoordinatorTaskManager: RemoteTask
    deactivate HttpRemoteTaskFactory
Loading

Class diagram for event loop based HttpRemoteTask implementation and factory

classDiagram
    class TaskManagerConfig {
        -double highMemoryTaskKillerGCReclaimMemoryThreshold
        -Duration highMemoryTaskKillerFrequentFullGCDurationThreshold
        -double highMemoryTaskKillerHeapMemoryThreshold
        -Duration slowMethodThresholdOnEventLoop
        +getSlowMethodThresholdOnEventLoop() Duration
        +setSlowMethodThresholdOnEventLoop(Duration slowMethodThresholdOnEventLoop) TaskManagerConfig
    }

    class SafeEventLoopGroup {
        +SafeEventLoopGroup(int maxCallbackThreads, ThreadFactory threadFactory, Duration slowMethodThresholdOnEventLoop)
        +next() SafeEventLoop
    }

    class SafeEventLoop {
        +SafeEventLoop(SafeEventLoopGroup parent, Executor executor)
    }

    class RemoteTask {
        <<interface>>
    }

    class HttpRemoteTaskWithEventLoop {
        +HttpRemoteTaskWithEventLoop(Session session, TaskId taskId, String nodeId, URI legacyTaskLocation, URI taskLocation, PlanFragment fragment, List_SplitAssignment_ initialSplits, OutputBuffers outputBuffers, HttpClient httpClient, Duration maxErrorDuration, Duration taskStatusRefreshMaxWait, Duration taskInfoRefreshMaxWait, Duration taskInfoUpdateInterval, boolean summarizeTaskInfo, Codec_TaskStatus_ taskStatusCodec, Codec_TaskInfo_ taskInfoCodec, JsonCodec_TaskInfo_ taskInfoJsonCodec, Codec_TaskUpdateRequest_ taskUpdateRequestCodec, Codec_TaskInfoResponse_ taskInfoResponseCodec, Codec_PlanFragment_ planFragmentCodec, NodeStatusTracker nodeStatsTracker, TaskStatsTracker stats, boolean binaryTransportEnabled, boolean thriftTransportEnabled, boolean taskInfoThriftTransportEnabled, boolean taskUpdateRequestThriftSerdeEnabled, boolean taskInfoResponseThriftSerdeEnabled, ThriftProtocol thriftProtocol, TableWriteInfo tableWriteInfo, DataSize maxTaskUpdateSizeInBytes, MetadataManager metadataManager, QueryManager queryManager, DecayCounter taskUpdateRequestSize, boolean taskUpdateSizeTrackingEnabled, HandleResolver handleResolver, SchedulerStatsTracker schedulerStatsTracker, SafeEventLoop eventLoop)
    }

    class HttpRemoteTaskFactory {
        -Optional_SafeEventLoopGroup_ eventLoopGroup
        +HttpRemoteTaskFactory(TaskManagerConfig taskConfig, RemoteTaskConfig config, HttpClient httpClient)
        +createRemoteTask(Session session, TaskId taskId, RemoteNode node, PlanFragment fragment, List_SplitAssignment_ initialSplits, OutputBuffers outputBuffers, HttpClient httpClient, Duration maxErrorDuration, Duration taskStatusRefreshMaxWait, Duration taskInfoRefreshMaxWait, Duration taskInfoUpdateInterval, boolean summarizeTaskInfo, Codec_TaskStatus_ taskStatusCodec, Codec_TaskInfo_ taskInfoCodec, JsonCodec_TaskInfo_ taskInfoJsonCodec, Codec_TaskUpdateRequest_ taskUpdateRequestCodec, Codec_TaskInfoResponse_ taskInfoResponseCodec, Codec_PlanFragment_ planFragmentCodec, NodeStatusTracker nodeStatsTracker, TaskStatsTracker stats, boolean binaryTransportEnabled, boolean thriftTransportEnabled, boolean taskInfoThriftTransportEnabled, boolean taskUpdateRequestThriftSerdeEnabled, boolean taskInfoResponseThriftSerdeEnabled, ThriftProtocol thriftProtocol, TableWriteInfo tableWriteInfo, DataSize maxTaskUpdateSizeInBytes, MetadataManager metadataManager, QueryManager queryManager, DecayCounter taskUpdateRequestSize, boolean taskUpdateSizeTrackingEnabled, HandleResolver handleResolver, SchedulerStatsTracker schedulerStatsTracker) RemoteTask
    }

    HttpRemoteTaskWithEventLoop ..|> RemoteTask
    HttpRemoteTaskFactory o-- SafeEventLoopGroup
    HttpRemoteTaskFactory ..> HttpRemoteTaskWithEventLoop
    HttpRemoteTaskFactory ..> TaskManagerConfig
    SafeEventLoopGroup o-- SafeEventLoop
    TaskManagerConfig ..> SafeEventLoopGroup
Loading

File-Level Changes

Change Details Files
Make event-loop based HttpRemoteTask the only implementation and always construct tasks via the event loop factory method.
  • Always initialize SafeEventLoopGroup in HttpRemoteTaskFactory instead of gating it on a config flag
  • Remove conditional branch that instantiated the legacy executor-based HttpRemoteTask and always call createHttpRemoteTaskWithEventLoop instead
  • Pass a SafeEventLoop instance from the SafeEventLoopGroup into createHttpRemoteTaskWithEventLoop and drop executor-based arguments
presto-main/src/main/java/com/facebook/presto/server/remotetask/HttpRemoteTaskFactory.java
Remove the event-loop enablement flag from TaskManagerConfig and its tests, making event-loop usage unconditional from a config perspective.
  • Drop enableEventLoop field and its getter/setter from TaskManagerConfig along with the 'task.enable-event-loop' @config mapping
  • Update TaskManagerConfig default-value test expectations to no longer reference enableEventLoop
  • Update explicit property-mapping tests to remove any use of 'task.enable-event-loop' and the corresponding config assertion
presto-main-base/src/main/java/com/facebook/presto/execution/TaskManagerConfig.java
presto-main-base/src/test/java/com/facebook/presto/execution/TestTaskManagerConfig.java
Consolidate tests onto the event-loop based HttpRemoteTask and expose helper APIs needed by other tests.
  • Switch TestHttpRemoteTaskConnectorCodec to use TestHttpRemoteTaskWithEventLoop for FailureScenario and polling utilities
  • Relax visibility of helper methods and FailureScenario enum in TestHttpRemoteTaskWithEventLoop so they can be reused from other test classes
  • Extend TestHttpRemoteTaskWithEventLoop.TestingTaskResource to track and expose the last TaskUpdateRequest for assertions in other tests
presto-main/src/test/java/com/facebook/presto/server/remotetask/TestHttpRemoteTaskConnectorCodec.java
presto-main/src/test/java/com/facebook/presto/server/remotetask/TestHttpRemoteTaskWithEventLoop.java
Delete the legacy executor-based HttpRemoteTask implementation and associated fetchers and tests, relying solely on the event-loop variants.
  • Remove ContinuousTaskStatusFetcher, HttpRemoteTask, and TaskInfoFetcher classes from the remotetask package
  • Delete the legacy TestHttpRemoteTask test class that targeted the removed implementation
presto-main/src/main/java/com/facebook/presto/server/remotetask/ContinuousTaskStatusFetcher.java
presto-main/src/main/java/com/facebook/presto/server/remotetask/HttpRemoteTask.java
presto-main/src/main/java/com/facebook/presto/server/remotetask/TaskInfoFetcher.java
presto-main/src/test/java/com/facebook/presto/server/remotetask/TestHttpRemoteTask.java

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@shangm2 shangm2 changed the title misc: Make event loop as the implementation for httpRemoteTask and related code misc: Make event loop THE implementation for httpRemoteTask and related code Nov 25, 2025
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Contributor

@NikhilCollooru NikhilCollooru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we instead make the default value for task.enable-event-loop = true , let it bake for couple of releases and then delete the code ?

Not sure if other open source users ever enabled this eventloop feature and tested. Making it true by default might bring up some issues in non-meta deployments, which we can fix and then delete the fallback code.

@shangm2 shangm2 merged commit c526fa0 into prestodb:master Dec 2, 2025
86 of 87 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:Meta PR from Meta

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants