@xuang7 xuang7 commented Oct 27, 2025

What changes were proposed in this PR?

This PR introduces on-demand (batch) presigning for multipart uploads to reduce failures caused by expired presigned URLs. Previously, all part URLs were presigned at the start of the upload using an experimental LakeFS API. For long uploads, the URLs for later parts could expire (after 15 minutes locally or 30 minutes on the server) before those parts were sent, causing the upload to fail midway. The revised implementation still uses the LakeFS function for the initial setup, then presigns URLs in batches on demand, directly with S3Presigner.

Changes (Backend)

  • Add a new method, presignUploadParts, to S3StorageClient. It uses s3Presigner to sign URLs for a given list of partNumbers.
  • The /multipart-upload endpoint coordinates the new signing flow:
    • type=init: Initiates the upload with LakeFS (numParts=0) and returns only uploadId and physicalAddress.
    • type=presign (new operation): Receives a pendingParts list and physicalAddress from the client, calls the new S3StorageClient.presignUploadParts to sign the requested batch, and returns the new URLs.
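
The init/presign exchange described above can be sketched from the client's point of view. This is a minimal illustration, not the PR's code: the field names uploadId, physicalAddress, and pendingParts come from the description, while the type names, the presignedUrls response field, the presence of uploadId in the presign request, and fakePresignPart (a stand-in that does not call S3Presigner) are assumptions.

```typescript
// Hypothetical request/response shapes for the /multipart-upload endpoint.
// Only uploadId, physicalAddress and pendingParts are named in the PR text;
// every other name below is an illustrative assumption.
type InitRequest = { type: "init" };
type PresignRequest = {
  type: "presign";
  uploadId: string; // assumed to be sent back by the client
  physicalAddress: string;
  pendingParts: number[];
};
type MultipartRequest = InitRequest | PresignRequest;

type InitResponse = { uploadId: string; physicalAddress: string };
type PresignResponse = { presignedUrls: Record<number, string> };

// Stand-in for the real signer: builds a fake URL instead of signing with S3.
function fakePresignPart(physicalAddress: string, uploadId: string, partNumber: number): string {
  return `${physicalAddress}?uploadId=${uploadId}&partNumber=${partNumber}&sig=FAKE`;
}

// In-memory model of the two operations, useful for reasoning about the flow.
function handleMultipartUpload(req: MultipartRequest): InitResponse | PresignResponse {
  if (req.type === "init") {
    // init returns no URLs, only the identifiers that later presign calls need.
    return { uploadId: "upload-123", physicalAddress: "s3://bucket/key" };
  }
  // presign signs exactly the parts the client asked for, nothing more.
  const presignedUrls: Record<number, string> = {};
  for (const part of req.pendingParts) {
    presignedUrls[part] = fakePresignPart(req.physicalAddress, req.uploadId, part);
  }
  return { presignedUrls };
}
```

Because each presign call names its own part numbers, the server keeps no per-upload signing state between calls.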

Changes (Frontend)

  • Refactored multipart upload to use RxJS concatMap for sequential batch processing:
    • Initiates with type=init (no pre-signed URLs)
    • Processes uploads in batches, calling type=presign for each batch just before uploading
  • Introduce a urlBatchSize variable (default: 50) to control how many URLs are requested in each init and sign call.
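
The batch-by-batch flow above can be sketched as follows. The PR uses RxJS concatMap; this sketch swaps in a plain async loop to stay self-contained, and it uploads the parts of a batch one at a time for simplicity, which may differ from the PR. The names batchPartNumbers, uploadInBatches, PresignFn, and UploadFn are hypothetical.

```typescript
// Split part numbers 1..totalParts into consecutive batches of at most batchSize.
function batchPartNumbers(totalParts: number, batchSize: number): number[][] {
  if (batchSize < 1) throw new Error("batchSize must be at least 1");
  const batches: number[][] = [];
  for (let start = 1; start <= totalParts; start += batchSize) {
    const end = Math.min(start + batchSize - 1, totalParts);
    const batch: number[] = [];
    for (let p = start; p <= end; p++) batch.push(p);
    batches.push(batch);
  }
  return batches;
}

// Hypothetical callbacks: one asks the server to presign a batch,
// the other uploads a single part to its presigned URL.
type PresignFn = (parts: number[]) => Promise<Record<number, string>>;
type UploadFn = (partNumber: number, url: string) => Promise<void>;

// Process batches strictly in order, requesting URLs just before they are used
// so that no URL sits idle for longer than one batch takes to upload.
async function uploadInBatches(
  totalParts: number,
  batchSize: number,
  presign: PresignFn,
  uploadPart: UploadFn
): Promise<void> {
  for (const batch of batchPartNumbers(totalParts, batchSize)) {
    const urls = await presign(batch);
    for (const part of batch) {
      const url = urls[part];
      if (url === undefined) throw new Error(`no presigned URL for part ${part}`);
      await uploadPart(part, url);
    }
  }
}
```

With a urlBatchSize of 50, a 120-part upload would issue three presign calls covering 50, 50, and 20 parts.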

Changes (Config)

  • Added s3MultipartPresignExpiryMinutes configuration variable to control presigned URL expiration time (default: 30 minutes)
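
The expiry setting and the batch size interact: a batch's URLs need to stay valid for as long as that batch takes to upload. A rough back-of-the-envelope check, assuming sequential uploads and a part size and throughput that are purely illustrative (neither is stated in this PR), might look like this. The function names are hypothetical.

```typescript
// Estimated minutes to upload one batch sequentially.
// partSizeMiB and throughputMiBPerSec are illustrative assumptions.
function batchUploadMinutes(
  batchSize: number,
  partSizeMiB: number,
  throughputMiBPerSec: number
): number {
  return (batchSize * partSizeMiB) / throughputMiBPerSec / 60;
}

// True when the whole batch is expected to finish before its URLs expire.
function batchFitsExpiry(
  batchSize: number,
  partSizeMiB: number,
  throughputMiBPerSec: number,
  expiryMinutes: number
): boolean {
  return batchUploadMinutes(batchSize, partSizeMiB, throughputMiBPerSec) <= expiryMinutes;
}

// 50 parts of 50 MiB at 2 MiB/s: 2500 MiB / 2 MiB/s = 1250 s, about 20.8 minutes.
const fitsAtTwoMiB = batchFitsExpiry(50, 50, 2, 30); // true
// The same batch at 1 MiB/s needs about 41.7 minutes and would outlive a 30-minute URL.
const fitsAtOneMiB = batchFitsExpiry(50, 50, 1, 30); // false
```

On slow connections, lowering urlBatchSize or raising s3MultipartPresignExpiryMinutes keeps each batch inside its expiry window.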

Presigned URL Comparison

[Comparison images: LakeFS initiatePresignedMultipartUploads | S3 presignUploadParts]

Any related issues, documentation, discussions?

Fixes #3837
Resolves URL expiration for pending parts. Fully handling interruptions during part uploads requires resumable uploads.

How was this PR tested?

Tested with existing automated test cases and local manual tests.

Was this PR authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the feature, frontend, and service labels Oct 27, 2025
@xuang7 xuang7 marked this pull request as ready for review October 27, 2025 01:25
@xuang7 xuang7 marked this pull request as draft October 27, 2025 05:28
@aglinxinyuan aglinxinyuan requested a review from Copilot October 27, 2025 05:33

Copilot AI left a comment

Pull Request Overview

This PR introduces on-demand batch presigning for multipart uploads to prevent failures from expired presigned URLs during long-running uploads. Previously, all part URLs were presigned upfront, so the URLs for later parts could expire (after 15-30 minutes) before they were used. The new implementation presigns URLs in batches as needed.

Key Changes:

  • Backend adds presignUploadParts method using S3Presigner to sign specific part batches on-demand
  • API endpoint now supports type=init (first batch) and new type=sign operation (subsequent batches)
  • Frontend switches to RxJS expand operator for recursive, stateless batch fetching with configurable batch size (default: 100 parts)

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File: Description
  • frontend/src/app/dashboard/service/user/dataset/dataset.service.ts: Implements RxJS expand-based recursive batch fetching; adds signPendingParts method and urlBatchSize configuration
  • file-service/src/main/scala/org/apache/texera/service/util/S3StorageClient.scala: Adds S3Presigner client and presignUploadParts method with URI extraction helper
  • file-service/src/main/scala/org/apache/texera/service/resource/DatasetResource.scala: Adds "sign" operation handler; converts init response to Map format

aglinxinyuan and others added 2 commits October 29, 2025 20:41
@github-actions github-actions bot added the common label Nov 1, 2025
@xuang7 xuang7 marked this pull request as ready for review November 1, 2025 23:53
@chenlica chenlica requested a review from aicam November 15, 2025 07:33
@chenlica

@aicam please review it before @aglinxinyuan does his review.

Labels

common, feature, frontend, service

Development

Successfully merging this pull request may close these issues.

Presigned URL expiration causes upload failure after 30 minutes

3 participants