Refine MP4 metadata writes and add resilient XMP cleanup #37

Merged: ChrisAdamsdevelopment merged 1 commit into main from codex/remove-xmp-metadata-from-mp4/m4a-output on May 20, 2026
ChrisAdamsdevelopment (Owner) commented May 20, 2026

Motivation

  • Prevent unqualified ExifTool tag writes from creating unwanted XMP metadata while preserving MP4/QuickTime ItemList and Keys descriptive fields.
  • Ensure ItemList/Keys metadata (e.g., Sobelo / Triple7 data) survives a safe XMP cleanup and maintain existing forensic diagnostics.

Description

  • Reworked buildMetaToWrite in server/processor.js to remove unqualified/generic writes and switch to explicit ItemList:*, QuickTime:* (kept), and Keys:* writes for title/artist/producer/copyright/genre/keywords/description/comment.
  • Removed bare assignments for generic tags such as Title, Artist, Author, AlbumArtist, Producer, Copyright, Genre, Keywords, Description, and Comment and added targeted Keys:/ItemList: equivalents (for example Keys:Title, Keys:DisplayName, Keys:Artist, ItemList:Author, ItemList:AlbumArtist, Keys:Producer, Keys:Keywords, Keys:Description, etc.).
  • Added a safer XMP cleanup sequence after the descriptive writes: it first runs -XMP:all=, then verifies that the descriptive metadata persisted. If the fields were stripped, it re-applies the metadata and runs a targeted fallback cleanup (-XMP-dc:all=, -XMP-pdf:all=, -XMP-tiff:all=, -XMP-xmpDM:all=, -XMP-x:XMPToolkit=) so that ItemList/Keys fields are not removed.
  • Kept all diagnostics and verification machinery intact including runId, per-stage SHA-256 hashes, deepSnapshotsByStage, and metadataPersistenceStage logic.

Testing

  • Ran syntax check with node --check server/processor.js, which completed successfully.
  • Performed a local static inspection of buildMetaToWrite to confirm no assignments remain for unqualified generic tags (no metaToWrite.Title, metaToWrite.Artist, metaToWrite.Producer, metaToWrite.Copyright, metaToWrite.Genre, metaToWrite.Keywords, metaToWrite.Description, or metaToWrite.Comment).
  • Attempted a runtime helper invocation to exercise buildMetaToWrite, but the environment lacks the exiftool-vendored module, so the dynamic import test could not run. The static inspection and syntax check above are unaffected, but the change was not exercised at runtime.
  • Observed that the ItemList/Keys mappings for the Sobelo/Triple7 example remain present in buildMetaToWrite and the XMP cleanup branch will preserve or reapply them as needed during processing.
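
The static inspection described above could also be expressed as a small check. This is a minimal sketch, not code from the PR: `findUnqualifiedTags` and `ALLOWED_GROUPS` are hypothetical names, and it assumes `metaToWrite` is a plain object whose keys are the tag names passed to ExifTool.

```javascript
// Groups the PR description says buildMetaToWrite should emit.
const ALLOWED_GROUPS = ["ItemList", "QuickTime", "Keys"];

// Hypothetical helper: returns the keys that lack an allowed "Group:" prefix.
function findUnqualifiedTags(metaToWrite) {
  return Object.keys(metaToWrite).filter((key) => {
    const sep = key.indexOf(":");
    if (sep === -1) return true; // bare tag such as "Title"
    return !ALLOWED_GROUPS.includes(key.slice(0, sep));
  });
}

const sample = {
  "ItemList:Title": "Example",
  "Keys:Title": "Example",
  Title: "Example", // the kind of bare write this PR removes
};
console.log(findUnqualifiedTags(sample)); // [ 'Title' ]
```

An empty result would indicate that every write is group-qualified. Unlike a runtime test against ExifTool, this needs no exiftool-vendored install.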

Summary by Sourcery

Refine MP4/QuickTime metadata writes to target ItemList/Keys tags explicitly and introduce a safer XMP cleanup sequence that preserves or restores descriptive metadata while maintaining existing diagnostics.

Bug Fixes:

  • Prevent unqualified ExifTool tag writes from creating unintended XMP metadata for MP4/QuickTime files.
  • Ensure descriptive ItemList/Keys metadata survives XMP cleanup by reapplying it if a full XMP wipe strips the fields.

Enhancements:

  • Expand descriptive metadata mapping to include explicit Keys-based tags (e.g., title, artist, author, copyright, keywords, genre, description, comment) alongside ItemList and QuickTime tags, with album artist written as ItemList:AlbumArtist.
  • Adjust the XMP cleanup process to first clear all XMP, then fall back to targeted namespace cleanup only when needed, while keeping forensic snapshotting and hashing intact.

sourcery-ai Bot commented May 20, 2026

Reviewer's Guide

Adjusts MP4/QuickTime metadata writing to use explicit ItemList/QuickTime/Keys tags instead of generic tags and introduces a safer, two-phase XMP cleanup that preserves or reapplies descriptive metadata while keeping existing diagnostics intact.

Sequence diagram for the new two-phase XMP cleanup in processMediaFile

sequenceDiagram
  participant P as processMediaFile
  participant E as exiftool

  P->>E: write(outputPath, metaToWrite, -overwrite_original)
  P->>E: read(outputPath)
  P->>P: buildMetadataSnapshot(tags_after_write)
  P->>E: write(outputPath, {}, -XMP:all= -XMP:XMPToolkit= -overwrite_original)
  P->>E: read(outputPath)
  P->>P: buildMetadataSnapshot(afterXmpCleanupTags)
  P->>P: hasDescriptiveMetadata(snapshot_after_cleanup)

  alt preservedAfterXmpCleanup
    P->>P: afterXmpCleanupSnapshot = snapshot_after_cleanup
  else strippedAfterXmpCleanup
    P->>E: write(outputPath, metaToWrite, -overwrite_original)
    P->>E: write(outputPath, {}, -XMP-dc:all= -XMP-pdf:all= -XMP-tiff:all= -XMP-xmpDM:all= -XMP-x:XMPToolkit= -overwrite_original)
    P->>E: read(outputPath)
    P->>P: buildMetadataSnapshot(finalXmpCleanupTags)
    P->>P: afterXmpCleanupSnapshot = snapshot_final
  end
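
The sequence above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in server/processor.js: `cleanupXmp` is a hypothetical name, `tool` is a hypothetical adapter exposing `read(path)` and `write(path, tags, args)` (the real exiftool-vendored call signature may differ by version), and `hasDescriptive` stands in for the PR's hasDescriptiveMetadata check.

```javascript
// Argument lists taken from the sequence diagram.
const FULL_XMP_CLEANUP = ["-XMP:all=", "-XMP:XMPToolkit=", "-overwrite_original"];
const TARGETED_XMP_CLEANUP = [
  "-XMP-dc:all=", "-XMP-pdf:all=", "-XMP-tiff:all=", "-XMP-xmpDM:all=",
  "-XMP-x:XMPToolkit=", "-overwrite_original",
];

// Hypothetical two-phase cleanup. `tool` and `hasDescriptive` are injected
// so the flow can be exercised without ExifTool installed.
async function cleanupXmp(tool, outputPath, metaToWrite, hasDescriptive) {
  // Phase 1: wipe all XMP, then check whether descriptive fields survived.
  await tool.write(outputPath, {}, FULL_XMP_CLEANUP);
  const afterCleanup = await tool.read(outputPath);
  if (hasDescriptive(afterCleanup)) {
    return { tags: afterCleanup, usedFallback: false };
  }
  // Phase 2: re-apply the metadata, then purge only the listed namespaces.
  await tool.write(outputPath, metaToWrite, ["-overwrite_original"]);
  await tool.write(outputPath, {}, TARGETED_XMP_CLEANUP);
  return { tags: await tool.read(outputPath), usedFallback: true };
}
```

Returning `usedFallback` alongside the final tags is a design choice of this sketch; it would let a caller record which branch produced the post-cleanup snapshot.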

Flow diagram for updated MP4 metadata tag mapping in buildMetaToWrite

flowchart TD
  A[buildMetaToWrite<br/>inputs: platform, metadata<br/>derives: safeTitle, safeArtist,<br/>safeProducer, copyright,<br/>tagsArray, safeGenre,<br/>safeDescription, safeComment]

  A --> B[Set title tags<br/>ItemList:Title = safeTitle<br/>QuickTime:Title = safeTitle<br/>Keys:Title = safeTitle<br/>Keys:DisplayName = safeTitle]

  A --> C{safeArtist?}
  C -->|yes| D[Set artist tags<br/>ItemList:Artist = safeArtist<br/>QuickTime:Artist = safeArtist<br/>ItemList:Author = safeArtist<br/>ItemList:AlbumArtist = safeArtist<br/>Keys:Artist = safeArtist<br/>Keys:Author = safeArtist]

  A --> E{safeProducer?}
  E -->|yes| F[Set producer tags<br/>ItemList:Producer = safeProducer<br/>Keys:Producer = safeProducer]

  A --> G{copyright?}
  G -->|yes| H[Set copyright tags<br/>ItemList:Copyright = copyright<br/>QuickTime:Copyright = copyright<br/>Keys:Copyright = copyright]

  A --> I{tagsArray length > 0?}
  I -->|yes| J[Set keyword tags<br/>ItemList:Keyword = tagsArray<br/>Keys:Keywords = tagsArray]

  A --> K{safeGenre?}
  K -->|yes| L[Set genre tags<br/>ItemList:Genre = safeGenre<br/>QuickTime:Genre = safeGenre<br/>Keys:Genre = safeGenre]

  A --> M{platform-specific description/comment?}
  M -->|YouTube/General safeDescription| N[Set description/comment<br/>ItemList:Description or Comment<br/>QuickTime:Description/Comment<br/>Keys:Description/Comment]
  M -->|Spotify safeDescription| O[Set description<br/>ItemList:Description<br/>QuickTime:Description<br/>Keys:Description]

  N --> P[Return metaToWrite<br/>with only ItemList/QuickTime/Keys tags]
  O --> P
  L --> P
  J --> P
  H --> P
  F --> P
  D --> P
  B --> P
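
A simplified sketch of the mapping in the flowchart, for orientation only. The tag names come from the diagram; the function name `buildMetaToWriteSketch`, the shape of the `fields` argument, and the rule that Spotify output carries a description but no comment are assumptions, and the real buildMetaToWrite derives these values from platform and metadata itself.

```javascript
// Simplified sketch of the flowchart's mapping; not the PR's implementation.
function buildMetaToWriteSketch(platform, fields) {
  const {
    safeTitle, safeArtist, safeProducer, copyright,
    tagsArray = [], safeGenre, safeDescription, safeComment,
  } = fields;
  const meta = {};
  // Assign one value to several group-qualified tags.
  const setAll = (value, tags) => {
    for (const tag of tags) meta[tag] = value;
  };

  setAll(safeTitle, ["ItemList:Title", "QuickTime:Title", "Keys:Title", "Keys:DisplayName"]);
  if (safeArtist) {
    setAll(safeArtist, ["ItemList:Artist", "QuickTime:Artist", "ItemList:Author",
      "ItemList:AlbumArtist", "Keys:Artist", "Keys:Author"]);
  }
  if (safeProducer) setAll(safeProducer, ["ItemList:Producer", "Keys:Producer"]);
  if (copyright) {
    setAll(copyright, ["ItemList:Copyright", "QuickTime:Copyright", "Keys:Copyright"]);
  }
  if (tagsArray.length > 0) {
    meta["ItemList:Keyword"] = tagsArray;
    meta["Keys:Keywords"] = tagsArray;
  }
  if (safeGenre) setAll(safeGenre, ["ItemList:Genre", "QuickTime:Genre", "Keys:Genre"]);
  if (safeDescription) {
    setAll(safeDescription, ["ItemList:Description", "QuickTime:Description", "Keys:Description"]);
  }
  // Assumption: only non-Spotify platforms also write a comment.
  if (safeComment && platform !== "Spotify") {
    setAll(safeComment, ["ItemList:Comment", "QuickTime:Comment", "Keys:Comment"]);
  }
  return meta; // contains only ItemList:/QuickTime:/Keys: keys
}
```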

File-Level Changes

Change: Switch descriptive metadata writes from generic ExifTool tags to explicit MP4/QuickTime ItemList and Keys tags for titles, artists, producers, copyright, genres, keywords, descriptions, and comments.
Details:
  • Remove assignments to unqualified generic tags like Title, Artist, Author, AlbumArtist, Producer, Copyright, Genre, Keywords, Description, and Comment in the metadata builder.
  • Add explicit Keys:* writes (e.g., Keys:Title, Keys:DisplayName, Keys:Artist, Keys:Author, Keys:Producer, Keys:Copyright, Keys:Keywords, Keys:Genre, Keys:Description, Keys:Comment) alongside existing ItemList:* and QuickTime:* fields.
  • Ensure platform-specific branches (YouTube, Spotify, General) only emit ItemList:/QuickTime:/Keys-tagged metadata and preserve existing Sobelo/Triple7 ItemList/Keys mappings.
Files: server/processor.js

Change: Introduce a resilient XMP cleanup flow that first clears all XMP, then verifies descriptive metadata persistence and conditionally reapplies metadata with a narrower XMP purge while keeping forensic diagnostics unchanged.
Details:
  • Change the XMP cleanup write step to use -XMP:all= plus -XMP:XMPToolkit= before overwriting the original file.
  • After initial XMP cleanup, read tags, check for presence of descriptive metadata via hasDescriptiveMetadata, and if missing, re-write descriptive metadata and perform a narrower XMP cleanup (-XMP-dc:all=, -XMP-pdf:all=, -XMP-tiff:all=, -XMP-xmpDM:all=, -XMP-x:XMPToolkit=).
  • Set the final post-cleanup snapshot from either the first cleanup read or a second read after fallback, while keeping existing runId, per-stage hashes, deepSnapshotsByStage, and metadataPersistenceStage logic intact.
Files: server/processor.js
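
The guide does not show how hasDescriptiveMetadata decides that fields were stripped. One plausible shape is sketched below, purely as an assumption: it treats the snapshot as a plain object keyed by tag name, and both the field list and the function name `hasDescriptiveMetadataSketch` are hypothetical.

```javascript
// Assumed field list; the real check may look at different keys.
const DESCRIPTIVE_FIELDS = [
  "Title", "Artist", "Producer", "Copyright",
  "Genre", "Keywords", "Description", "Comment",
];

// Hypothetical check: true if any descriptive field holds a non-empty value.
function hasDescriptiveMetadataSketch(snapshot, fields = DESCRIPTIVE_FIELDS) {
  return fields.some((name) => {
    const value = snapshot[name];
    if (Array.isArray(value)) return value.length > 0; // e.g. keyword lists
    if (typeof value === "string") return value.trim() !== "";
    return value != null; // any other defined value counts as present
  });
}
```

With a check of this kind, an empty or whitespace-only snapshot after the full XMP wipe would trigger the fallback branch, while any surviving title, artist, or keyword list would keep the first cleanup result.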

@ChrisAdamsdevelopment ChrisAdamsdevelopment merged commit 2655a1e into main May 20, 2026
4 of 5 checks passed
@sourcery-ai sourcery-ai Bot left a comment

Hey - I've reviewed your changes and they look great!

