A simple and clean implementation of audio source separation using Non-negative Matrix Factorization (NMF). This project separates two sources from a mixed audio signal.
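For readers new to the technique, here is a minimal NumPy sketch of dictionary-based NMF separation, using the multiplicative updates from Lee and Seung. It is an illustration under assumptions, not the code in `source_separation.py`: the function names `learn_dictionary`, `fit_activations`, and `separate` are hypothetical, the inputs are assumed to be non-negative magnitude spectrograms (for example `np.abs(librosa.stft(...))`), and the soft-mask reconstruction is one common choice among several.

```python
import numpy as np

EPS = 1e-10  # avoids division by zero in the updates


def learn_dictionary(V, n_components, n_iter=200, seed=0):
    """Learn a dictionary W (freq x components) so that V ~= W @ H."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_components)) + EPS
    H = rng.random((n_components, n_frames)) + EPS
    for _ in range(n_iter):
        # Multiplicative updates for the Euclidean cost; W and H stay non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W


def fit_activations(V, W, n_iter=200, seed=0):
    """Estimate activations H for a fixed dictionary W."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
    return H


def separate(V_mix, W1, W2, n_iter=200):
    """Split a mixture spectrogram using two pre-trained dictionaries."""
    W = np.hstack([W1, W2])
    H = fit_activations(V_mix, W, n_iter=n_iter)
    k = W1.shape[1]
    V1 = W1 @ H[:k]   # part of the mixture explained by source 1
    V2 = W2 @ H[k:]   # part of the mixture explained by source 2
    mask1 = V1 / (V1 + V2 + EPS)  # soft mask in [0, 1)
    return mask1 * V_mix, (1.0 - mask1) * V_mix


if __name__ == "__main__":
    # Synthetic stand-ins for the magnitude spectrograms of the training files.
    rng = np.random.default_rng(42)
    train_1 = rng.random((513, 100))
    train_2 = rng.random((513, 100))
    W1 = learn_dictionary(train_1, n_components=8, n_iter=50)
    W2 = learn_dictionary(train_2, n_components=8, n_iter=50)
    est_1, est_2 = separate(train_1 + train_2, W1, W2, n_iter=50)
    print(est_1.shape, est_2.shape)
```

To get audio back, each masked magnitude would typically be combined with the mixture phase and passed through an inverse STFT.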
.
├── main.py                 # Main script
├── source_separation.py    # NMF dictionary & separation
├── arguments.py            # CLI arguments
├── data_loader.py          # Audio loading
├── utils.py                # Utilities (SDR, spectrograms)
├── outputs/                # Output results
└── data/
    ├── train/
    │   ├── source_1.wav
    │   └── source_2.wav
    └── test/
        └── test.wav
pip install numpy librosa soundfile scikit-learn matplotlib
python main.py
| Argument | Description | Default |
|---|---|---|
| `--data_path` | Path to dataset | ./data |
| `--output_dir` | Directory to save outputs | ./outputs |
| `--sr` | Sampling rate | 48000 |
| `--n_components` | NMF components | 64 |
| `--n_fft` | FFT window size | 1024 |
| `--hop_length` | Hop size for STFT | 512 |
| `--eval_sdr` | Print SDR before/after | flag |
| `--play_audio` | Play result (IPython only) | flag |
| `--save_audio` | Save separated audio | flag |
| `--plot_spectrogram` | Save spectrogram image | flag |
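As a rough guide to how these options could be declared, the sketch below builds an `argparse` parser with the same names and defaults as the table. It is an assumption about the shape of `arguments.py`, not a copy of it: the helper name `build_parser` is hypothetical, and the four flags are assumed to be `store_true` switches that are off unless passed.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the argument table."""
    parser = argparse.ArgumentParser(description="NMF audio source separation")
    parser.add_argument("--data_path", type=str, default="./data",
                        help="Path to dataset")
    parser.add_argument("--output_dir", type=str, default="./outputs",
                        help="Directory to save outputs")
    parser.add_argument("--sr", type=int, default=48000, help="Sampling rate")
    parser.add_argument("--n_components", type=int, default=64,
                        help="NMF components")
    parser.add_argument("--n_fft", type=int, default=1024,
                        help="FFT window size")
    parser.add_argument("--hop_length", type=int, default=512,
                        help="Hop size for STFT")
    # Assumed to be boolean switches that default to False.
    flags = [
        ("--eval_sdr", "Print SDR before/after"),
        ("--play_audio", "Play result (IPython only)"),
        ("--save_audio", "Save separated audio"),
        ("--plot_spectrogram", "Save spectrogram image"),
    ]
    for name, text in flags:
        parser.add_argument(name, action="store_true", help=text)
    return parser


if __name__ == "__main__":
    # Equivalent to: python main.py --eval_sdr --save_audio --n_components 32
    args = build_parser().parse_args(
        ["--eval_sdr", "--save_audio", "--n_components", "32"]
    )
    print(args)
```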
With `--save_audio` and `--plot_spectrogram`, the script writes the separated audio and a spectrogram image to the output directory:
outputs/
├── estimated_source_1.wav
├── estimated_source_2.wav
└── spectrograms.png
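If reference signals are available, the estimates can be scored with a signal-to-distortion ratio (SDR), which is what `--eval_sdr` reports. The sketch below uses the basic definition, 10·log10(‖s‖² / ‖s − ŝ‖²); it is an assumption that this matches the project, since the SDR helper in `utils.py` may use a different variant (for example BSS Eval or scale-invariant SDR). The function name `sdr` is hypothetical.

```python
import numpy as np


def sdr(reference, estimate, eps=1e-10):
    """Basic SDR in dB between a reference signal and an estimate."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    # Trim to a common length, since an STFT round trip can change it slightly.
    n = min(len(reference), len(estimate))
    reference, estimate = reference[:n], estimate[:n]
    distortion = reference - estimate
    return 10.0 * np.log10(
        (np.sum(reference ** 2) + eps) / (np.sum(distortion ** 2) + eps)
    )


if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 48000, endpoint=False)
    clean = np.sin(2 * np.pi * 440 * t)
    noisy = clean + 0.1 * np.random.default_rng(0).standard_normal(t.size)
    print(f"SDR of noisy estimate: {sdr(clean, noisy):.2f} dB")
```

A "before/after" comparison would then score the unprocessed mixture and the separated estimate against the same reference; a higher value after separation indicates an improvement.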
Lee, Daniel D., and H. Sebastian Seung.
"Learning the parts of objects by non-negative matrix factorization."
Nature 401.6755 (1999): 788–791.
DOI: 10.1038/44565