@TheApeMachine

  • Implement reference audio
  • Automatically download OpenMuQ/MuQ-MuLan-large (or other compatible pre-trained model)
  • Add deterministic selection within audio prompt, or random
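As a rough illustration of the last item, here is a minimal sketch of choosing which window of the reference audio to use. The function name `pick_segment_start`, the `mode` values, and the choice of the centered window as the "deterministic" option are all assumptions made for this example; they are not taken from the branch.

```python
import random


def pick_segment_start(total_samples, segment_samples, mode="center", seed=None):
    """Return the start index of a window inside a reference clip.

    Hypothetical helper: "center" is one possible deterministic policy,
    "random" draws a start uniformly (reproducible when a seed is given).
    """
    if segment_samples <= 0:
        raise ValueError("segment_samples must be positive")
    if segment_samples >= total_samples:
        # Clip is shorter than the window: use it from the beginning.
        return 0
    max_start = total_samples - segment_samples
    if mode == "center":
        return max_start // 2
    if mode == "random":
        rng = random.Random(seed)
        return rng.randint(0, max_start)  # inclusive on both ends
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # 30 s clip at 24 kHz, 10 s window (illustrative numbers only).
    print(pick_segment_start(30 * 24000, 10 * 24000, mode="center"))
    print(pick_segment_start(30 * 24000, 10 * 24000, mode="random", seed=0))
```

Passing a fixed `seed` makes the random mode repeatable across runs, which helps when comparing generations against the same reference.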

Notes: I'm not sure what your intentions are with the reference audio, but it seems to work pretty well as far as I can tell. I'm opening this as a pull request to see what you think, but feel free to ignore or close it if it's not interesting; I just wanted to experiment with it :) Also, there may be some small pieces of code in this branch that are related to an analysis harness I was working on to try to pinpoint the reason for the AI "shimmer" that seems to be common in music generation models, so apologies for that. Finally, I had to base this on my other pull request's branch, as I do not have a CUDA-compatible machine here at the moment, so I can only work on Metal.

…e selection. Update argument handling in `run_music_generation.py` and improve `HeartMuLaGenPipeline` class for better input processing and model execution.
…odec model. Update `run_lyrics_transcription.py` to dynamically select device based on availability, and modify `HeartCodec` to determine device from input tensor or model parameters. Improve `HeartMuLaGenPipeline` to support autocast on MPS for better performance.
…mize audio token padding. Introduce a context manager for autocast that gracefully handles unsupported cases, and preallocate buffers for audio tokens to enhance performance during generation.
…ce on MPS. Update `pyproject.toml` to include the optimizer package directory. Enhance `HeartMuLaGenPipeline` to optionally enable Metal optimizations during model execution, improving performance for Llama blocks.
…w Metal kernels and Python wrappers. Update `pyproject.toml` to remove the optimizer package directory. Enhance runtime detection for Metal support and build tools availability.
…add optional dependencies for MuQ-MuLan. Modify `README.md` to reflect new Python version recommendations and installation instructions for optional features. Enhance `run_music_generation.py` and `HeartMuLaGenPipeline` to support reference audio conditioning and auto-download of MuQ-MuLan, improving music generation capabilities.
@frink

frink commented Jan 25, 2026

Is this to try to figure out how to do style transfer from one song to a new one?

Are you getting shimmer in this model too?

The problem in Suno appeared to be the 10 Hz generation rate and the overfit on the highs, due to a lot of music having a rise and fall of pads in that band. The fix then is to move to a 32 Hz codec space and use an RNN-type network (think RWKV-X) instead of straight transformers or diffusion. But that means a NEW model architecture.

@TheApeMachine
Author

@frink Thanks for that information! This is something I didn't know about. And yes, that shimmer is still there. Though if you say it "appeared" to be, suggesting that Suno solved it, we may be talking about a different shimmer. I am talking about a metallic sound that makes hi-hats and cymbals sound very unnatural, though it can be present in vocals and other instruments too. I always figured it had something to do with bitrate encoding artifacts sneaking in?

@frink

frink commented Feb 8, 2026

@TheApeMachine that's a different issue. The way to solve that problem is to use something like AirWindows Average to sand off the highs, then use a harmonic exciter to bring them back. The problem is that these codecs put noise above about 8-12 kHz, which makes everything up there sound like ice picks. So like I said, you have to sand it off and then put it back yourself. I've used Amp Head plugins to excite harmonics, and various other things. You can also blur and smooth with reverb and delay. But it's just ugly artifacts of the model and the way the AI works. That's not shimmer; it's white noise with a low shelf set very high. It's also the biggest giveaway that the music is AI generated. So you have to polish it off if you want to pass things off as real.
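For anyone who wants to try the "sand it off, then put it back" idea outside a DAW, here is a very rough sketch in plain Python. It is not how AirWindows Average or any exciter plugin is actually implemented: the moving average stands in for the smoothing stage, a `tanh` saturator on a crudely high-passed copy stands in for the exciter, and every name and default value (`sand_and_excite`, `smooth_window`, `band_window`, `drive`, `mix`) is an assumption made up for this example.

```python
import math


def moving_average(samples, window):
    """Crude low-pass: centered moving average, shrinking the window at the edges."""
    if window < 1:
        raise ValueError("window must be >= 1")
    n = len(samples)
    half = window // 2
    out = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        out.append(sum(samples[lo:hi]) / (hi - lo))
    return out


def sand_and_excite(samples, smooth_window=5, band_window=31, drive=4.0, mix=0.2):
    """Smooth the signal, then mix back harmonics generated from the smoothed signal."""
    # 1. "Sand off" the noisy top end.
    smoothed = moving_average(samples, smooth_window)
    # 2. Crude high-pass of the smoothed signal: subtract a heavier smoothing.
    low = moving_average(smoothed, band_window)
    band = [s - l for s, l in zip(smoothed, low)]
    # 3. Saturate that band to generate new harmonics (the "exciter" stand-in).
    harmonics = [math.tanh(drive * b) for b in band]
    # 4. Blend the generated harmonics back on top of the smoothed signal.
    return [s + mix * h for s, h in zip(smoothed, harmonics)]


if __name__ == "__main__":
    # Toy input: a 440 Hz sine at 24 kHz with a harsh alternating component on top.
    sr = 24000
    noisy = [
        math.sin(2 * math.pi * 440 * t / sr) + (0.1 if t % 2 == 0 else -0.1)
        for t in range(sr // 10)
    ]
    cleaned = sand_and_excite(noisy)
    print(len(cleaned), min(cleaned), max(cleaned))
```

The window sizes control where the smoothing starts to bite relative to the sample rate, so they would need tuning by ear for real material.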
