This plan outlines the steps to modify the application to store image metadata (tags, descriptions, text content) directly in the image files using XMP (Extensible Metadata Platform) as the primary method, with format-specific approaches (EXIF for JPEG/TIFF, text chunks for PNG) as secondary options. This will make the metadata portable with the images themselves and provide a more reliable and standard approach to metadata storage that is widely supported by image editing software.
- Metadata is stored in `image_metadata.json` files in each directory instead of in the images themselves
- This makes metadata non-portable when images are moved outside the application
- This doesn't follow the stated requirement of embedding metadata in the images
- Current approach lacks standardization and widespread software support
START REQUIREMENTS:
- Review AI_ASSISTANT_GUIDE.md and README.md thoroughly before beginning
- Understand the current metadata storage implementation
- Research Python libraries for image metadata manipulation:
  - Evaluate `Pillow` (PIL) for EXIF manipulation and PNG text chunks
  - Evaluate `python-xmp-toolkit` for XMP implementation
  - Evaluate `pyexiv2` for comprehensive metadata support
  - Evaluate `exiftool` wrapper libraries for Python
  - Determine which library best suits our needs
- Identify metadata standards to use:
  - Determine if EXIF, XMP, IPTC, or a combination is most appropriate
  - Research best practices for storing text data in image metadata
  - Identify existing standard fields vs. custom fields needed
  - Research PNG text chunk capabilities for metadata storage
  - Evaluate XMP support across different image editors/viewers
- Analyze compatibility issues:
  - Check compatibility across different image formats (JPEG, PNG, etc.)
  - Assess metadata size limitations
  - Check for platform-specific issues (Windows, macOS, Linux)
  - Research XMP implementation requirements
COMPLETION SANITY CHECK:
- Selected libraries are mature, well-maintained, and have good documentation
- Chosen metadata standards are widely supported
- Documented potential limitations and compatibility issues
- Created a clear technical approach document
Checkpoint 0: Technical assessment complete, libraries and approach selected. ✅
Commit Point 0: Research and assessment complete ✓
PHASE COMPLETED: Created image_metadata_embedding_technical_assessment.md with a detailed analysis of the available libraries. It recommends python-xmp-toolkit as the primary method, supplemented by piexif for EXIF and Pillow for PNG text chunks, with a JSON fallback mechanism.
START REQUIREMENTS:
- Review AI_ASSISTANT_GUIDE.md and README.md thoroughly before beginning
- Understand the installation requirements for python-xmp-toolkit
- Set up development environment with required dependencies:
  - Add python-xmp-toolkit to requirements.txt
  - Add piexif to requirements.txt
  - Create fallback detection for Exempi library availability
  - Test dependency installation and fallback mechanisms
- Handle platform-specific installation requirements:
  - Identify macOS installation path for Exempi (/opt/homebrew/lib)
  - Implement robust error handling for missing dependencies
  - Create fallback mechanisms for when native dependencies are unavailable
  - NEW: Add startup check for Exempi library and exit if not available
- Document dependency management:
  - Add automated dependency detection in code
  - Enable graceful degradation when specific libraries aren't available
  - Configure proper environment detection
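The fallback detection and startup check listed above could be sketched as follows. This is a minimal sketch under stated assumptions, not the application's actual code: `find_exempi`, `check_xmp_support`, and the `strict` flag are hypothetical names, and the shared library file name pattern `libexempi*` is assumed. It relies only on python-xmp-toolkit being importable as `libxmp` and on the macOS path named in this plan.

```python
import ctypes.util
import importlib.util
import logging
from pathlib import Path
from typing import Iterable, Optional

logger = logging.getLogger(__name__)

# Extra directories to search; /opt/homebrew/lib is the macOS path named in this plan.
EXTRA_LIB_DIRS = ("/opt/homebrew/lib",)


def find_exempi(extra_dirs: Iterable[str] = EXTRA_LIB_DIRS) -> Optional[str]:
    """Return a name or path for the native Exempi library, or None if not found."""
    found = ctypes.util.find_library("exempi")
    if found:
        return found
    for directory in extra_dirs:
        base = Path(directory)
        if base.is_dir():
            # Assumption: the shared library file name starts with "libexempi".
            for candidate in sorted(base.glob("libexempi*")):
                return str(candidate)
    return None


def check_xmp_support(strict: bool = True) -> bool:
    """Check for libxmp and Exempi; exit when strict, otherwise log and degrade."""
    has_binding = importlib.util.find_spec("libxmp") is not None
    has_native = find_exempi() is not None
    if has_binding and has_native:
        return True
    message = (
        "XMP support unavailable "
        f"(libxmp importable: {has_binding}, Exempi found: {has_native})"
    )
    if strict:
        # Startup check: refuse to run without XMP support.
        raise SystemExit(message)
    logger.warning(message)
    return False
```

Calling `check_xmp_support(strict=True)` once at startup corresponds to the NEW requirement to exit when Exempi is missing, while `strict=False` corresponds to the graceful-degradation path.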
COMPLETION SANITY CHECK:
- All dependencies can be installed and used
- Fallback mechanisms properly implemented when dependencies are missing
- Robust error handling for dependency issues
- Environment detection is reliable
- NEW: Application refuses to start if Exempi is not properly installed
Checkpoint 1: Environment setup and dependency management partially complete.
Commit Point 1: Dependencies added to requirements.txt ✓
PHASE PARTIALLY COMPLETED: Added dependencies to requirements.txt, but integration with the application is incomplete. The fallback mechanisms are not being used in the actual application. The application currently logs "Exempi library not found" but continues to run without XMP support.
START REQUIREMENTS:
- Review AI_ASSISTANT_GUIDE.md and README.md thoroughly before beginning
- Review the image processor service and storage service implementation
- Create a new `ImageMetadataService` class:
  - Implement methods for reading metadata from images using XMP
  - Implement methods for writing metadata to images using XMP
  - Define custom namespace for application-specific metadata
  - Add format-specific handlers for different image types:
    - XMP handler for all formats using python-xmp-toolkit
    - JPEG/TIFF secondary handler using piexif for EXIF data
    - PNG secondary handler using Pillow for text chunks
  - Include error handling for corrupted files or unsupported formats
- Create schema for metadata storage in images:
  - Define standard XMP fields for description, tags, etc.
  - Create a mapping between our internal model and XMP fields
  - Define custom XMP properties for app-specific metadata
  - Create format-specific mappings for secondary storage options
  - Document the schema with format-specific details
- Implement fallback mechanisms:
  - Handle XMP read failures gracefully
  - Fall back to format-specific methods (EXIF, text chunks)
  - Add comprehensive logging for metadata operations
  - Implement format detection and appropriate handling
  - Fix platform-specific issues with PNG text chunks handling
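To make the schema items above concrete, here is one possible mapping expressed as an XMP RDF/XML fragment built with the standard library. It is a sketch under assumptions rather than the schema the service actually uses: the choice of `dc:description` and `dc:subject` for descriptions and tags, the `imgapp` prefix, its namespace URI, and the `textContent` property are all hypothetical. In practice python-xmp-toolkit would handle serialization and the packet wrapper.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    # Hypothetical custom namespace for app-specific properties.
    "imgapp": "http://example.com/ns/imgapp/1.0/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def q(prefix: str, name: str) -> str:
    """Return the ElementTree qualified name for prefix:name."""
    return f"{{{NS[prefix]}}}{name}"


def build_xmp(description: str, tags: List[str], text_content: str) -> str:
    """Serialize the internal model to an XMP RDF/XML fragment."""
    for prefix, uri in NS.items():
        ET.register_namespace(prefix, uri)
    root = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(root, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})
    # dc:description is a language alternative: rdf:Alt with an x-default item.
    alt = ET.SubElement(ET.SubElement(desc, q("dc", "description")), q("rdf", "Alt"))
    ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"}).text = description
    # dc:subject is an unordered bag of keywords.
    bag = ET.SubElement(ET.SubElement(desc, q("dc", "subject")), q("rdf", "Bag"))
    for tag in tags:
        ET.SubElement(bag, q("rdf", "li")).text = tag
    # App-specific text content stored as a simple custom property.
    ET.SubElement(desc, q("imgapp", "textContent")).text = text_content
    return ET.tostring(root, encoding="unicode")


def parse_xmp(packet: str) -> Dict[str, object]:
    """Read the same three fields back from a fragment produced by build_xmp."""
    desc = ET.fromstring(packet).find(".//rdf:Description", NS)
    return {
        "description": desc.findtext("dc:description/rdf:Alt/rdf:li", "", NS),
        "tags": [li.text or "" for li in desc.findall("dc:subject/rdf:Bag/rdf:li", NS)],
        "text_content": desc.findtext("imgapp:textContent", "", NS),
    }
```

Keeping descriptions and tags in Dublin Core fields, as assumed here, would leave only the text content dependent on the custom namespace.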
COMPLETION SANITY CHECK:
- All metadata operations are properly encapsulated in the service
- XMP operations work when XMP support is available
- Format-specific handlers work correctly as fallbacks
- Service follows best practices for error handling
- Unit tests cover main functionality for each format
- Service is ready for use in the application
Checkpoint 2: ImageMetadataService implementation complete. ✅
Commit Point 2: Metadata service implementation complete ✓
PHASE COMPLETED: Created and tested the ImageMetadataService class. The service works properly in standalone tests and is ready for integration with the application. Enhanced EXIF support has been added to handle ExifTool-modified images, and we've added format detection to correctly handle MPO files (which are JPEG-based and support EXIF).
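The format detection mentioned above can be illustrated with a small signature check. This is only a sketch: the plan lists Pillow for format detection, whereas this version inspects the leading bytes directly so that it has no dependencies, and `sniff_image_format` and `metadata_family` are hypothetical names. MPO files begin with the same signature as JPEG, which is why they can be routed to the JPEG/EXIF handler.

```python
from typing import Optional

# Format names that should be handled by the JPEG/EXIF code path.
JPEG_BASED = {"JPEG", "JPG", "MPO"}


def sniff_image_format(data: bytes) -> Optional[str]:
    """Guess the container format from the first bytes of a file."""
    if data.startswith(b"\xff\xd8\xff"):
        return "JPEG"  # MPO files share this signature
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    if data[:4] in (b"II*\x00", b"MM\x00*"):
        return "TIFF"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "GIF"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "WEBP"
    return None


def metadata_family(format_name: str) -> str:
    """Map a reported format name to the handler family that should process it."""
    name = format_name.upper()
    return "JPEG" if name in JPEG_BASED else name
```

A format string of `MPO`, as reported by an image library, would then be normalized with `metadata_family("MPO")` before a handler is chosen.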
START REQUIREMENTS:
- Review AI_ASSISTANT_GUIDE.md and README.md thoroughly before beginning
- Review current storage integration points
- Update storage service to use the new metadata service:
  - Modify `load_or_create_metadata` to read from images using XMP when possible
  - Update metadata update operations to write to images using XMP
  - Maintain JSON files as backup/cache but treat image metadata as source of truth
  - Add format-specific optimizations for different image types
- Adapt `ImageProcessor` service:
  - Update methods to store processing results directly in the image using XMP
  - Ensure compatibility with existing processing workflow
  - Add validation to verify metadata was properly stored
  - Handle format-specific limitations gracefully
- Update vector store synchronization:
  - Ensure vector store uses metadata from images
  - Update synchronization to check both XMP and JSON metadata
  - Prioritize image metadata when available
  - Add conflict resolution for cases where both sources exist
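One way to express the "image metadata is the source of truth" rule with conflict reporting is sketched below. The field names and the `merge_metadata` function are hypothetical, and the policy shown (embedded value wins, JSON only fills gaps, disagreements are reported) is an assumption about how conflicts might be resolved, not a description of the existing code.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical field names for the internal metadata model.
FIELDS = ("description", "tags", "text_content")


def merge_metadata(
    image_meta: Optional[Dict[str, object]],
    json_meta: Optional[Dict[str, object]],
) -> Tuple[Dict[str, object], List[str]]:
    """Merge two metadata sources, preferring values embedded in the image.

    Returns the merged metadata and the list of fields where both sources
    had a non-empty value and disagreed.
    """
    image_meta = image_meta or {}
    json_meta = json_meta or {}
    merged: Dict[str, object] = {}
    conflicts: List[str] = []
    for field in FIELDS:
        from_image = image_meta.get(field)
        from_json = json_meta.get(field)
        if from_image and from_json and from_image != from_json:
            conflicts.append(field)
        # Embedded metadata wins; JSON only fills fields the image lacks.
        value = from_image if from_image else from_json
        if value:
            merged[field] = value
    return merged, conflicts
```

The returned conflict list could be logged or surfaced to the user, so that a stale JSON backup is noticed rather than silently ignored.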
COMPLETION SANITY CHECK:
- All services properly integrated with the new metadata service
- XMP metadata handling works correctly across the application
- Format-specific handling is correct across the application
- Backward compatibility is maintained
- Error handling is comprehensive
- README.md is updated if necessary
Checkpoint 3: Integration with existing services in progress.
Commit Point 3: Image update endpoint now using metadata service to store metadata directly in images ✓
PHASE IN PROGRESS: We've started integrating the ImageMetadataService with the application. The /update endpoint in images.py now uses the service to read and write metadata directly to image files. However, we've discovered that the JSON files are still being created, and we need to continue the integration to ensure metadata is read directly from images throughout the application.
START REQUIREMENTS:
- Review AI_ASSISTANT_GUIDE.md and README.md thoroughly before beginning
- Understand the current metadata storage format
- Create migration tool for existing data:
  - Implement function to read existing JSON metadata
  - Add function to write JSON metadata to image files using XMP
  - Add format-specific migration options as fallbacks
  - Include validation to verify migration success
  - Add progress reporting with format-specific details
- Implement automatic migration:
  - Add migration when loading a directory with existing JSON metadata
  - Add option to skip migration if needed
  - Ensure migration is non-destructive to existing data
  - Add format-specific handling for different image types
- Add command-line migration tool:
  - Create script for batch migration of directories
  - Add options for backup and verification
  - Include detailed logging
  - Add format statistics reporting
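Since the migration has not been started, the following is only a sketch of the shape it could take: read the JSON file, write each entry into its image, verify by reading back, and never touch the JSON file. The `migrate_directory` function, its `write` and `read` callables (stand-ins for the metadata service), and the assumed JSON layout (file name mapped to a metadata dict) are all hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List, Optional

Metadata = Dict[str, object]


def migrate_directory(
    directory: Path,
    write: Callable[[Path, Metadata], None],
    read: Callable[[Path], Optional[Metadata]],
    json_name: str = "image_metadata.json",
) -> Dict[str, List[str]]:
    """Copy JSON metadata into image files and verify each write by reading it back."""
    report: Dict[str, List[str]] = {"migrated": [], "skipped": [], "failed": []}
    json_path = Path(directory) / json_name
    if not json_path.exists():
        return report
    # Assumption: the JSON file maps image file names to metadata dicts.
    entries = json.loads(json_path.read_text(encoding="utf-8"))
    for name, metadata in sorted(entries.items()):
        image_path = Path(directory) / name
        if not image_path.is_file():
            report["skipped"].append(name)
            continue
        try:
            write(image_path, metadata)
            # Validation step: the embedded metadata must match what was written.
            verified = read(image_path) == metadata
        except Exception:
            verified = False
        report["migrated" if verified else "failed"].append(name)
    # The JSON file is left in place, so the migration is non-destructive.
    return report
```

A command-line wrapper could call this once per directory and print the three lists as its progress report.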
COMPLETION SANITY CHECK:
- Migration tool successfully transfers metadata to images using XMP
- Format-specific migrations work correctly as fallbacks
- Validation confirms data integrity
- Error handling is robust
- Documentation includes migration instructions
Checkpoint 4: Migration implementation not started. ❌
Commit Point 4: Migration tool implementation not started ✗
START REQUIREMENTS:
- Review AI_ASSISTANT_GUIDE.md and README.md thoroughly before beginning
- Review testing best practices
- Update test suite:
  - Add unit tests for the metadata service with XMP focus
  - Add format-specific tests for each image type (JPEG, PNG, TIFF, etc.)
  - Create a comprehensive testing framework
  - Update existing tests to verify image metadata
  - Test both with and without XMP support
  - Test with various image formats
- Perform real-world testing:
  - Create standalone test scripts for both JPEG and PNG formats
  - Test platform-specific handling of text chunks
  - Test with actual application to verify metadata is embedded in images
  - Test error handling and recovery in production environment
  - Verify metadata consistency between writes and reads in the application
- Edge case testing:
  - Test read/write with fresh images
  - Test with read-only files
  - Test with nonexistent files
  - Test updating existing metadata
  - Test format detection functionality
  - Test ExifTool-modified images
  - Test MPO format files (multi-picture objects)
  - Test in real application environment with large image collections
COMPLETION SANITY CHECK:
- All tests pass for standalone service
- Tests verify integration with application
- Format-specific tests validate correct behavior
- Edge cases are properly handled
- Standalone test scripts work correctly
- Tests verify metadata is being embedded in images in standalone tests
- Tests verify metadata is being embedded in images in the application
MANUAL TESTING STEPS:
- Basic Metadata Verification:
  - Use the application to add tags and descriptions to test images
  - Verify metadata is saved directly to the image files
  - Use external tools (ExifTool, macOS Preview, etc.) to confirm metadata is visible
  - Restart app and verify metadata is loaded directly from the images
  - Check different image formats (JPEG, PNG, TIFF)
- ExifTool Interoperability Testing:
  - Create test images with metadata using ExifTool
  - Load these images in the application
  - Verify the application correctly reads ExifTool-added metadata
  - Make modifications and verify they're preserved
  - Check that all ExifTool-added fields are readable
- Edge Case Testing:
  - Test with corrupted image files
  - Test with read-only files and directories
  - Test with unusual image formats
  - Test with large metadata (long descriptions, many tags)
  - Test with files that have no metadata
  - Test with images that have metadata from other applications
- Performance Testing:
  - Test with large image collections (1000+ images)
  - Verify acceptable loading times
  - Check memory usage during large batch operations
Checkpoint 5: Testing of service complete, integration testing in progress.
Commit Point 5: Service testing complete, integration testing in progress ✓
PHASE PARTIALLY COMPLETED: Created and tested the ImageMetadataService class in isolation with comprehensive tests. We've also begun testing integration with the application. The service now properly handles ExifTool-modified images with descriptions, and MPO format detection has been added. Additional manual testing has been done with ExifTool-modified images to verify compatibility.
START REQUIREMENTS:
- Review AI_ASSISTANT_GUIDE.md and README.md thoroughly before beginning
- Review current UI implementation
- Update UI components:
  - Add indicators for embedded XMP metadata
  - Update metadata display to show source (XMP, EXIF, JSON)
  - Add controls for metadata migration if needed
  - Add format-specific indicators and controls
- Update user messaging:
  - Add notifications for successful metadata embedding
  - Include clear error messages for metadata failures
  - Add migration progress indicators
  - Add format-specific hints and warnings
  - Add info about XMP compatibility with other software
- Add settings if needed:
  - Option to control metadata embedding behavior
  - Format-specific options (e.g., PNG compression level)
  - Option to manage backup JSON files
  - XMP-specific preferences
COMPLETION SANITY CHECK:
- UI changes are consistent with application style
- XMP-related UI elements are intuitive
- Format-specific elements are intuitive
- User feedback is clear and informative
- Settings are properly documented
- README.md is updated with UI changes
Checkpoint 6: UI updates complete.
Commit Point 6: UI updates complete ✓
START REQUIREMENTS:
- Review AI_ASSISTANT_GUIDE.md and README.md thoroughly before beginning
- Review documentation best practices
- Update API documentation:
  - Document the new metadata service with XMP focus
  - Document custom XMP namespace and properties
  - Include format-specific details
  - Update storage service documentation
  - Document migration process
  - Add examples for common operations
- Update user documentation:
  - Explain XMP metadata embedding benefits
  - Explain software compatibility advantages
  - Add instructions for migration
  - Include format-specific information
  - Include troubleshooting section
  - Document any new settings
- Code cleanup:
  - Remove deprecated code
  - Standardize naming and comments
  - Optimize performance for XMP and format-specific operations
  - Address any technical debt
COMPLETION SANITY CHECK:
- Documentation is complete and accurate with XMP and format-specific details
- Code is clean and follows standards
- README.md is fully updated
- No deprecated or unused code remains
Commit Point 7: Documentation and cleanup complete ✓

| Standard | Pros | Cons | Use Case |
|---|---|---|---|
| XMP | Cross-format consistency, extensible, rich structures | Requires additional library | Primary method for all formats |
| EXIF | Widely supported, good for basic data | Limited field types, size constraints | Secondary for JPEG/TIFF |
| PNG Text Chunks | Native PNG support, compression options | PNG-specific, not as standardized | Secondary for PNG |
| IPTC | Standard for media industry | Less supported in consumer software | Alternative option |

| Library | Pros | Cons | Notes |
|---|---|---|---|
| python-xmp-toolkit | Comprehensive XMP support, all formats | Requires exempi | Primary metadata handler |
| Pillow (PIL) | Built-in to our app, PNG text chunks | Limited EXIF support | Format detection, PNG fallback |
| piexif | Pure Python, good EXIF support | Limited to EXIF, no PNG support | JPEG/TIFF fallback |
| pyexiv2 | Comprehensive metadata support | Requires C++ libs | Not compatible with Python 3.11 |

| Format | XMP Support | Fallback Method | Notes |
|---|---|---|---|
| JPEG | Excellent | EXIF via piexif | Best format for rich metadata |
| PNG | Good | Text chunks via Pillow | Well-supported with XMP |
| TIFF | Excellent | EXIF via piexif | Well-supported format |
| WebP | Limited | EXIF via piexif | Newer format with evolving support |
| GIF | Limited | JSON fallback | Limited metadata support |
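The per-format fallback order in the table above could be encoded as data and walked by a small dispatcher. This is a sketch, not the service's implementation: `FALLBACK_ORDER`, `read_with_fallback`, and the reader callables are hypothetical, and real readers would wrap python-xmp-toolkit, piexif, Pillow, and the JSON store.

```python
import logging
from typing import Callable, Dict, Optional, Tuple

logger = logging.getLogger(__name__)

# Primary method first, then the fallback from the format table.
FALLBACK_ORDER = {
    "JPEG": ("xmp", "exif"),
    "PNG": ("xmp", "png_text"),
    "TIFF": ("xmp", "exif"),
    "WEBP": ("xmp", "exif"),
    "GIF": ("xmp", "json"),
}

Reader = Callable[[str], Optional[dict]]


def read_with_fallback(
    path: str, fmt: str, readers: Dict[str, Reader]
) -> Tuple[Optional[str], dict]:
    """Try each method for the format in order; return (method used, metadata)."""
    for method in FALLBACK_ORDER.get(fmt.upper(), ("json",)):
        reader = readers.get(method)
        if reader is None:
            continue  # method not available in this environment
        try:
            metadata = reader(path)
        except Exception as exc:
            # Log and move on so one failing backend does not block the others.
            logger.warning("%s read failed for %s: %s", method, path, exc)
            continue
        if metadata:
            return method, metadata
    return None, {}
```

Returning the method name alongside the metadata would also give the UI what it needs to show the metadata source (XMP, EXIF, JSON).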
| Risk | Probability | Impact | Mitigation |
|---|---|---|---|
| Metadata corruption | Low | High | Implement validation and backup system |
| Format-specific limitations | Medium | Medium | Add format-specific handlers and fallbacks |
| Performance impact on large collections | Medium | Medium | Implement caching and batch processing |
| User data loss | Low | Critical | Maintain JSON backups of all metadata |
| External library issues | Low | High | Thorough testing and error handling |
| XMP library installation issues | Medium | High | Provide detailed installation instructions, fallbacks |
| Exempi dependency issues | Medium | Medium | Document installation process, provide fallback options |

- Use standard XMP fields and namespaces where possible
- Follow XMP best practices for custom namespaces
- Implement proper error handling and recovery
- Maintain backward compatibility with JSON storage
- Add comprehensive logging for debugging
- Validate metadata after writing
- Create backups before modifying existing metadata
- Follow library-specific best practices
- Add appropriate documentation
- Implement format detection and appropriate handling
- Test with real-world image editing software
- Enhanced integration with external metadata tools
- Batch metadata editing features
- Advanced metadata search capabilities using XMP
- Custom metadata schema definition
- Metadata templates for different use cases
- Support for additional image formats
- Extended XMP for very large metadata