macOS/M-Series Support for Qwen3-TTS by evuori · Pull Request #124 · QwenLM/Qwen3-TTS

evuori · 2026-01-28T20:21:34Z

Overview

This PR adds comprehensive macOS/Apple Silicon (M-series) support to Qwen3-TTS while maintaining full CUDA/GPU compatibility. The implementation includes intelligent device auto-detection, smart model path detection, and improved documentation.

Key Changes

1. Intelligent Device Auto-Detection

New File: qwen_tts/core/device_utils.py (256 lines)
get_optimal_device() - Auto-detects MPS > CUDA > CPU with intelligent fallback
get_attention_implementation() - Returns appropriate attention backend per device (auto-skips FlashAttention on non-CUDA)
device_synchronize() - Device-agnostic synchronization replacing torch.cuda.synchronize()
get_device_info() - Human-readable device descriptions
get_model_path() - Smart model path detection (local ./models/ first, then HuggingFace)

Benefits:

✅ macOS users can run examples without any code changes
✅ Automatic MPS detection on Apple Silicon
✅ FlashAttention gracefully skipped on non-CUDA devices
✅ Device-agnostic timing works everywhere

2. Updated Examples (All 4 example files)

examples/test_model_12hz_custom_voice.py
examples/test_model_12hz_voice_design.py
examples/test_model_12hz_base.py
examples/test_tokenizer_12hz.py

Changes:

✅ Use get_optimal_device() instead of hardcoded "cuda:0"
✅ Use get_attention_implementation() instead of hardcoded "flash_attention_2"
✅ Use device_synchronize() instead of torch.cuda.synchronize()
✅ Use get_model_path() for smart model detection
✅ Output audio files to ./example_output/ directory

3. Updated Scripts

qwen_tts/cli/demo.py - CLI demo with device auto-detection
finetuning/prepare_data.py - Fine-tuning prep with device auto-detection

4. Documentation Updates

README.md

New "Model Loading and Caching" section with smart path detection
New "macOS / Apple Silicon (M1/M2/M3/M4) Support" section
Updated model download instructions to use ./models/ directory
Added code examples and troubleshooting guide
CLAUDE.md

Updated "Model Loading Best Practices" with device utilities
Added "macOS / Apple Silicon Development" section with complete examples
Added directory structure diagram for model organization

5. Configuration Updates

.gitignore - Added example_output/ to prevent committing generated audio files

Commits

db29e79 feat: output example audio files to example_output directory
0f62671 docs: update model loading documentation for smart path detection
56b6c80 feat: add smart model path detection (local models or HuggingFace)
552709c fix: remove trailing slashes from model paths in examples
3514e0b feat: add intelligent device auto-detection for macOS/MPS support

Statistics

Files Modified: 11
New Files: 1 (device_utils.py)
Lines Added: ~600
Zero Breaking Changes: All existing code continues to work

How It Works

Device Auto-Detection

from qwen_tts.core.device_utils import get_optimal_device, get_attention_implementation

device = get_optimal_device() # MPS > CUDA > CPU
attn = get_attention_implementation(device) # Auto-skips FlashAttention on non-CUDA

Smart Model Path Detection

from qwen_tts.core.device_utils import get_model_path

Checks ./models/Qwen3-TTS-12Hz-1.7B-CustomVoice first
Falls back to HuggingFace if not found
model_path = get_model_path("Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice")

Backward Compatibility

✅ 100% Backward Compatible

Explicit device specs like device_map="cuda:0" still work
Explicit attention specs like attn_implementation="flash_attention_2" still work
Existing code paths unchanged
No breaking API changes

Testing

Users can test with:

Auto-detection (recommended)
python examples/test_model_12hz_custom_voice.py

CLI demo
qwen-tts-demo Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice

Fine-tuning prep with auto-detection
python finetuning/prepare_data.py --input_jsonl train.jsonl --output_jsonl train_with_codes.jsonl

Example Output

On macOS with MPS:

Using device: Apple Metal Performance Shaders (MPS) - Apple Silicon GPU
Found local model: ./models/Qwen3-TTS-12Hz-1.7B-CustomVoice
[CustomVoice Single] time: 2.456s

On NVIDIA GPU:

Using device: CUDA GPU: NVIDIA GeForce RTX 4090
Local model not found at ./models/Qwen3-TTS-12Hz-1.7B-CustomVoice, will download from HuggingFace...
[CustomVoice Single] time: 0.789s

Implement cross-platform device detection (MPS > CUDA > CPU) while maintaining full backward compatibility with existing CUDA workflows. Changes: - New module: qwen_tts/core/device_utils.py with device detection - Auto-detect optimal device (MPS > CUDA > CPU) - Auto-select attention implementation (skip FlashAttention on non-CUDA) - Device-agnostic synchronization for accurate timing measurements - Update all examples to use device auto-detection - Update CLI demo to support device auto-detection - Update fine-tuning prep script with device auto-detection - Add comprehensive macOS/Apple Silicon documentation to README - Update CLAUDE.md with macOS development guidelines Benefits: - macOS users can now run examples without code modifications - Automatic MPS detection and usage on Apple Silicon - FlashAttention gracefully skipped on non-CUDA devices - 100% backward compatible - existing device specs still work - Better support for diverse hardware environments Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

Model paths should not have trailing slashes when passed to HuggingFace model loaders. This fixes the HFValidationError when running examples. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

Add get_model_path() utility function that automatically checks for locally downloaded models in ./models/ directory and uses them if available, otherwise falls back to HuggingFace model IDs for auto-download. This allows users to: 1. Run examples with pre-downloaded models without network access 2. Automatically download models on first run 3. Avoid redownloading models if they're already cached locally Updated all examples and scripts to use get_model_path(): - test_model_12hz_custom_voice.py - test_model_12hz_voice_design.py - test_model_12hz_base.py - test_tokenizer_12hz.py - qwen_tts/cli/demo.py - finetuning/prepare_data.py Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

Update README.md and CLAUDE.md to document the smart model path detection feature that automatically checks for locally downloaded models in the ./models/ directory before downloading from HuggingFace. Key updates: - New 'Model Loading and Caching' section in README.md - Updated model download instructions to use ./models/ directory - Added directory structure example showing recommended layout - Added code examples showing how smart path detection works - Updated CLAUDE.md model loading best practices with get_model_path() - Added complete example showing all device utilities together This ensures users understand that examples will automatically work with locally downloaded models while maintaining backward compatibility with HuggingFace auto-download. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

Update all example scripts to write generated audio files to a dedicated ./example_output/ directory instead of the current working directory. This keeps outputs organized and separate from the source code. Changes: - test_model_12hz_custom_voice.py: writes to ./example_output/ - test_model_12hz_voice_design.py: writes to ./example_output/ - test_tokenizer_12hz.py: writes to ./example_output/ - Add example_output/ to .gitignore to prevent committing generated files All example scripts now: 1. Import os module 2. Create output_dir = "example_output" 3. Use os.makedirs(output_dir, exist_ok=True) to ensure directory exists 4. Write all audio files to os.path.join(output_dir, filename) Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

yerffejytnac · 2026-02-05T00:09:59Z

this is awesome 👏

israelandrewbrown · 2026-02-20T17:12:38Z

#abandoned-repository

evuori · 2026-02-20T17:29:43Z

#abandoned-repository

?

israelandrewbrown · 2026-02-20T18:53:59Z

@evuori

It's appears as though the developers have abandoned this repository.

Can't wait until they merge this, in the meantime we'll have to use "voicebox".

Great pr btw.

evuori and others added 10 commits January 28, 2026 20:42

chore: add model download script and update gitignore

cd8e8ca

chore: ignore .claude configuration

24b3240

chore: add claude instructions

2b8b683

fix: python version issues

a0c8bb3

fix: the training and inference pipelines are now consistent

51b8aeb

fix: remove trailing slashes from model paths in examples

552709c

Model paths should not have trailing slashes when passed to HuggingFace model loaders. This fixes the HFValidationError when running examples. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

macOS/M-Series Support for Qwen3-TTS#124

macOS/M-Series Support for Qwen3-TTS#124
evuori wants to merge 10 commits intoQwenLM:mainfrom
evuori:improve_macos_support

evuori commented Jan 28, 2026 •

edited

Loading

Uh oh!

yerffejytnac commented Feb 5, 2026

Uh oh!

israelandrewbrown commented Feb 20, 2026

Uh oh!

evuori commented Feb 20, 2026

Uh oh!

israelandrewbrown commented Feb 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

evuori commented Jan 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview

Key Changes

1. Intelligent Device Auto-Detection

Benefits:

2. Updated Examples (All 4 example files)

Changes:

3. Updated Scripts

4. Documentation Updates

5. Configuration Updates

Commits

Statistics

How It Works

Device Auto-Detection

Smart Model Path Detection

Backward Compatibility

Testing

Example Output

On macOS with MPS:

On NVIDIA GPU:

Uh oh!

yerffejytnac commented Feb 5, 2026

Uh oh!

israelandrewbrown commented Feb 20, 2026

Uh oh!

evuori commented Feb 20, 2026

Uh oh!

israelandrewbrown commented Feb 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

evuori commented Jan 28, 2026 •

edited

Loading