@amrabdelmotteleb
Upgrade Ditto

Summary

This PR modernizes the Ditto codebase to work with newer PyTorch versions and improves cross-platform compatibility. The changes eliminate the dependency on Apex (whose Windows build requires Visual Studio 2019 build tools, which are no longer freely available), support newer Python versions (tested with Python 3.12.11 on Windows), and leverage native PyTorch features for automatic mixed precision training.

Motivation

When working with the original Ditto implementation, I encountered several blockers:

  • Apex dependency: Building Apex on Windows required Microsoft Visual Studio 2019 build tools, which are no longer freely available, effectively blocking automatic mixed precision training
  • Limited model compatibility: Could not load newer pre-trained language models like microsoft/deberta-v3-small
  • Outdated dependencies: Newer PyTorch versions natively support features like AdamW optimizer and automatic mixed precision, eliminating the need for external dependencies

Changes

Cross-Platform Compatibility

  • ✅ Encode loaded data using UTF-8 to handle diverse character sets
  • ✅ Normalize path separators for seamless operation across Windows, Linux, and macOS
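A minimal sketch of what these two fixes might look like in practice (the function name and file layout are illustrative, not the PR's actual code). Passing an explicit `encoding="utf-8"` avoids relying on the platform's default encoding (`cp1252` on many Windows setups), and `os.path.normpath` maps `/` and `\` separators onto whatever the host OS expects:

```python
import os

def read_lines(path):
    """Read a dataset file with an explicit UTF-8 encoding so that
    non-ASCII characters load identically on Windows, Linux, and macOS.
    (Illustrative helper; not the PR's actual function name.)"""
    path = os.path.normpath(path)  # normalize '/' vs '\\' separators
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]
```

Without the explicit encoding, the same file can decode cleanly on Linux and raise `UnicodeDecodeError` on a Windows machine whose locale encoding cannot represent the data.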

Modernized PyTorch Integration

  • ✅ Import AdamW from torch.optim instead of transformers
  • ✅ Replace Apex with native torch.amp for automatic mixed precision
  • ✅ Implement gradient scaling by default when AMP is enabled

Enhanced Mixed Precision Support

  • ✅ Add AMP support in model evaluation for faster inference on modern GPUs
  • ✅ Ensure AMP consistency between training and evaluation steps
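Keeping evaluation under the same autocast setting as training might look like the sketch below (illustrative only; not Ditto's exact evaluation loop). Running inference in the same reduced precision avoids a train/eval mismatch and speeds up inference on GPUs with tensor cores:

```python
import torch

@torch.no_grad()
def evaluate(model, batches, use_amp=False):
    """Run inference under the same autocast setting used in training.
    (Illustrative sketch; batches are plain input tensors here.)"""
    device_type = "cuda" if torch.cuda.is_available() else "cpu"
    preds = []
    for x in batches:
        with torch.autocast(device_type=device_type, enabled=use_amp):
            logits = model(x)
        preds.append(logits.argmax(dim=-1))
    return torch.cat(preds)
```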

Command-Line Interface Improvements

  • ✅ Rename --fp16 argument to --amp for clarity and accuracy
  • ✅ Add explicit --use_gpu argument for GPU training control
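The renamed flags could be declared roughly as follows (an `argparse` sketch, with help strings of my own wording). `--amp` is the more accurate name because native autocast may select bf16 or fp16 depending on the hardware, so the old `--fp16` name over-promised a specific dtype:

```python
import argparse

parser = argparse.ArgumentParser()
# renamed from --fp16: autocast may pick bf16 or fp16 per hardware
parser.add_argument("--amp", action="store_true",
                    help="enable automatic mixed precision")
parser.add_argument("--use_gpu", action="store_true",
                    help="train on GPU if one is available")

# example invocation: python train.py --amp
args = parser.parse_args(["--amp"])
```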

Environment Updates

  • ✅ Upgraded to Python 3.12.11
  • ✅ Created updated_requirements.txt with modern library versions
  • ✅ Added .gitignore file for better project hygiene
