Add error handling improvements for upload-media#74917
Add error handling improvements for upload-media#74917adamsilverstein wants to merge 30 commits intotrunkfrom
Conversation
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message. To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
|
Size Change: +446 B (+0.01%) Total Size: 7.73 MB
ℹ️ View Unchanged
|
|
Flaky tests detected in eef41e8. 🔍 Workflow run URL: https://github.com/WordPress/gutenberg/actions/runs/23777054911
|
22b6f13 to
34b509a
Compare
c5d67ea to
7d17226
Compare
There was a problem hiding this comment.
Pull request overview
This PR implements comprehensive error handling and retry logic for the upload-media package to address GitHub Issue #74366. The implementation adds an error classification system with 16 error types, automatic retry with exponential backoff for transient failures, user-friendly localized error messages, and retry state management with a new ItemStatus.PendingRetry status.
Changes:
- Added
ErrorCodeenum with 16 error types categorized as retryable (network, timeout, server, VIPS worker errors) and non-retryable (validation, permission, file size, MIME type errors) - Implemented automatic retry with exponential backoff and jitter for failed uploads, with configurable retry settings (default: 3 attempts, 1s-30s delays, 2x multiplier)
- Created comprehensive error message system with localized, actionable user guidance for all error types
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/upload-media/src/upload-error.ts | Added ErrorCode enum and isRetryable getter to classify errors |
| packages/upload-media/src/error-messages.ts | New file providing user-friendly i18n error messages for all error codes |
| packages/upload-media/src/store/utils/retry.ts | New file with exponential backoff calculation and error classification logic |
| packages/upload-media/src/store/utils/debug-logger.ts | Added logging functions for retry scheduling, execution, and max retries exceeded |
| packages/upload-media/src/store/types.ts | Added RetrySettings, ScheduleRetryAction, ItemStatus.PendingRetry, and retry-related QueueItem fields |
| packages/upload-media/src/store/constants.ts | Added retry configuration constants (max attempts, delays, multiplier, jitter) |
| packages/upload-media/src/store/reducer.ts | Added ScheduleRetry case and default retry settings initialization |
| packages/upload-media/src/store/actions.ts | Modified cancelItem to check retry eligibility, added scheduleRetry and executeRetry actions |
| packages/upload-media/src/store/private-selectors.ts | Added selectors for retry state (pending items, retry timestamp, retry count, max retries check) |
| packages/upload-media/src/store/test/retry.ts | New test file with 28 unit tests for retry utilities |
| packages/upload-media/src/index.ts | Exported ErrorCode, error message functions, and retry-related types |
| packages/upload-media/README.md | Updated documentation for cancelItem, scheduleRetry, and executeRetry actions |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
1fb78d7 to
89e2a02
Compare
9e4d2bc to
05918ca
Compare
Implements automatic retry with exponential backoff for failed uploads, error classification system, and user-friendly error messages. Changes: - Add ErrorCode enum with 16 error types (network, validation, processing) - Add isRetryable getter to UploadError class for automatic retry classification - Add retry configuration constants (max attempts, delays, backoff multiplier) - Add RetrySettings interface and ItemStatus.PendingRetry status - Add scheduleRetry/executeRetry actions for automatic retry scheduling - Add retry state selectors (getPendingRetryItems, getItemRetryCount, etc.) - Add retry logging functions for debugging - Add calculateRetryDelay utility with exponential backoff and jitter - Add shouldRetryError utility for error classification - Add user-friendly i18n error messages for all error codes - Add comprehensive unit tests for retry utilities (28 tests) The retry system automatically retries failed uploads for transient errors (network failures, timeouts, server errors) while immediately failing for non-retryable errors (validation, permissions, file size limits). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Moves the retry timer Map and cleanup helper out of actions.ts (public actions) into utils/retry.ts where other retry utilities live. This keeps clearRetryTimer out of the public store API. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
andrewserong
left a comment
There was a problem hiding this comment.
Neat ideas here! Unfortunately I haven't been able to get it working for me locally for the case of a dropped network connection. When I reactivate the connection, the loaders still appear to be stuck:
I left a note about the logging, I'm wondering if that can be simplified / re-use warning rather than having lots of granular functions?
Without being able to test manually, the main question I had is what happens if you lose internet connection while uploading a large number of images to a gallery block at once. E.g. a 20-image gallery. Will it attempt to resume all the same time, or is it batched like for regular uploads? I'm also curious what the user expectation here is — how many retries / how long would we expect for it to attempt to upload versus cancelling out the operation 🤔
Separately, while attempting to test the case of an invalid file type (i.e. renaming a JPEG locally to use the .invalid extension) and then dragging and dropping to the editor, it appears to get stuck in a loading state in the canvas, with an error thrown:
I think in this case it's erroring here because there's no .url available on attachment (in this case attachment is undefined, I assume because the server returned a 500 error as the user was not allowed to upload the file type?):
I mightn't get a chance to re-review again before I wrap up for the week (I'm AFK tomorrow), but happy to test more next week if this is still open!
Ok let me work on testing this a bit more manually and try to get it working correctly. thanks for testing.
These should still get batched to avoid trying to do too much all at once.
Once the original image is uploaded we have everything we need to create all of the sub sizes. The expectation should be "when I upload my image to WordPress it will process that image to create all of the sub sizes used for my site. If my connection is interrupted, it will resume processing when the connection returns or the editor is reloaded" that woulkd also work if the user restarted their browser and reopened the post with the partially processed image. For genuine failures (for example lets say the sideload endpoint is failing), the system should fall back to using the existing server processing approach. so ideally users never notice the failure other than a slower than expected upload.
thanks for the debugging, i'll take a look. |
- Gut debug-logger: remove all log* functions, keep only measure() - Replace log* calls with @wordpress/warning in actions.ts - Fix attachment null crash in uploadItem onFileChange callback - Add missing network error retry patterns (fetch_error, Failed to fetch, Load failed) - Update tests for warning expectations and new retry patterns Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The upload-media package imports @wordpress/warning but was missing the corresponding tsconfig project reference. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous commit added @wordpress/warning to upload-media's package.json but did not regenerate the lockfile, causing CI to fail. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Testing this I noticed a couple of things are still off:
|
Guard uploadItem's finishOperation against double-fire from both onFileChange and onSuccess to prevent premature queue emptying and early save lock release. Add queueMissingSizeGeneration action and useMissingSizesCheck hook so the editor detects images with incomplete sub-sizes on load and queues client-side thumbnail generation.
|
Ready for more testing... |
It turns out this is happening on trunk, too. I think I have a fix, I'll just test it a little more locally and then throw up a PR. |
|
Fix in: #76124 |
Resolve conflict in upload-media/src/store/types.ts by keeping both the new imageQuality/mediaFinalize settings from trunk and the retry settings from this branch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolve conflict in private-actions.ts: use trunk's improved condition (attachment && !isBlobURL) while preserving finishUpload guard. Update package-lock.json with @wordpress/warning dependency. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
andrewserong
left a comment
There was a problem hiding this comment.
Just coming back to test this PR again, and I'm wondering if it should be split out into separate features / PRs?
I understand the value or the retry and backoff logic, but I'm a bit concerned about the useMissingSizesCheck and what could go wrong when firing off lots of API requests and attempting to upload and generate sub-sizes when the editor loads. I've added a comment to that effect, but my general concern is more that I don't know what could go wrong, which makes it hard to tell if this is the right path forward.
Sorry that my comment here is a little vague! I'm going AFK after tomorrow, and wanted to at least leave some comments here, but I'm not sure how helpful it is in advancing the PR.
As always, thank you for the continued efforts to polish this feature! The ideas here are really good, I think we'll just need to be quite careful about when and how we fire off requests, as depending on the circumstances it could be quite unexpected or cause unintended performance issues.
test-results/.last-run.json
Outdated
| @@ -0,0 +1,10 @@ | |||
| { | |||
There was a problem hiding this comment.
Looks like this was accidentally committed?
| * | ||
| * Only active when client-side media processing is enabled. | ||
| */ | ||
| export default function useMissingSizesCheck() { |
There was a problem hiding this comment.
I'm a bit unsure of this hook in general and of the performance implications of both useMissingSizesCheck and the retry logic.
If I open a post that contains a lot of images, then loading the post itself results in firing a lot of separate API calls:
2026-03-18.16.46.40.mp4
Also, it iterates over the entire block hierarchy. Typically in the editor, there's some performance optimisations so that only visible blocks are displayed, and the logic here kind of breaks that I think.
It could also be unexpected for a user who's intending to simply open a page or post to perform an edit, for an update action like firing off generating sub-sizes to immediately occur. We kind of don't know the user's intent when the editor loads.
Split per reviewer feedback: keep error classification (ErrorCode enum), user-friendly error messages, warning() logging, and debug instrumentation. Retry logic (automatic retry, exponential backoff, missing sizes check) will be a separate PR. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds automatic retry with exponential backoff for transient upload failures (network errors, timeouts, server errors). Also adds a hook to detect and regenerate missing image sub-sizes when the editor loads. Key features: - ErrorCode enum for error classification and retry decisions - Exponential backoff with jitter to prevent thundering herd - PendingRetry status and retry selectors for state tracking - useMissingSizesCheck hook for missing image sub-size detection - queueMissingSizeGeneration action for client-side sub-size generation Depends on #74917 for error handling infrastructure. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@andrewserong I am going to make this PR about error handling improvements only, then work on retry capability, offline resilience in this PR: #76765 |
Wire UploadError.isRetryable into cancelItem to automatically retry transient failures (network errors, timeouts, server 5xx, VIPS worker errors) with exponential backoff up to 3 attempts. Delay doubles each retry (1s, 2s, 4s) capped at 10s. On retry, operations are reset to Prepare so the item re-enters the pipeline from scratch, since the operation list is consumed as operations complete. Manual retryItem also resets operations. Remove noisy warning() calls that fired on every cancellation and batch completion. Remove the now-unused @wordpress/warning dependency. Remove accidentally committed test-results artifact.

Fixes #74366
Summary
This PR adds error handling improvements to the upload-media package, split from the original combined PR per reviewer feedback. Retry logic will be handled in a separate PR.
Key Features:
ErrorCodeenum with error types categorizing network errors, validation errors, processing errors, and user actions@wordpress/warningfor structured logging of upload cancellations, errors, and batch completionsFiles Changed:
Modified:
packages/upload-media/src/upload-error.ts- Added ErrorCode enum and isRetryable getterpackages/upload-media/src/store/actions.ts- Added warning() logging to cancelItempackages/upload-media/src/store/private-actions.ts- Added measure() instrumentation for upload operationspackages/upload-media/package.json- Added @wordpress/warning dependencypackages/upload-media/tsconfig.json- Added warning path referencepackages/upload-media/README.md- Updated documentationCreated:
packages/upload-media/src/error-messages.ts- User-friendly i18n error messagespackages/upload-media/src/store/utils/debug-logger.ts- Performance measurement utility using User Timings APITest plan