Audio file (.wav file)
input.wav is taken from /Test/003 - Actions - One Minute Smile/mixture.wav
in the DSD100 dataset (can be downloaded from http://liutkus.net/DSD100.zip).
To reduce computation cost, input.wav is a clipped excerpt of the original file.
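If you want to prepare a similarly shortened input from another track, the sketch below shows one way to keep only the first few seconds of a WAV file with Python's standard `wave` module. It is an illustration only: the function name `clip_wav`, the file names, and the clip length are assumptions, and the README does not say how or to what length the bundled input.wav was actually clipped.

```python
import wave


def clip_wav(src_path, dst_path, seconds):
    """Copy the first `seconds` seconds of a PCM WAV file to dst_path.

    Hypothetical helper, not part of deep_music_enhancer.py.
    Returns the number of frames written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        # Never request more frames than the source actually contains.
        n_frames = min(src.getnframes(), int(seconds * src.getframerate()))
        frames = src.readframes(n_frames)

    with wave.open(dst_path, "wb") as dst:
        # Reuse channels, sample width and sample rate from the source;
        # the frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return n_frames
```

For example, `clip_wav("mixture.wav", "input.wav", 10.0)` would write the first ten seconds of mixture.wav to input.wav; the ten-second length is an arbitrary choice for illustration.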
Bandwidth-extended audio file (.wav file)
The onnx and prototxt files are downloaded automatically on the first run, so an Internet connection is required at that time.
For the sample wav,
$ python3 deep_music_enhancer.py
Supported model types are [resnet, resnet_bn, resnet_da, resnet_do, unet, unet_bn, unet_da, unet_do].
bn means batch normalization, do means dropout, and da means data augmentation.
The model type can be specified as follows.
$ python3 deep_music_enhancer.py --model [MODEL TYPE]
You can specify an input audio file by adding the --input option.
$ python3 deep_music_enhancer.py --input [INPUT WAV FILE]
To save the audio output under a specific name, add the --savepath option.
$ python3 deep_music_enhancer.py --savepath [OUTPUT NAME]
Additionally, you can use the --vis option to visualize the spectrograms of the input and output audio.
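The options above can be assembled programmatically, for instance to try several model types in a batch. The following sketch only builds the argument list and does not run anything. The helper name `build_command` is hypothetical, and two points are assumptions not confirmed by this README: that the options can be combined in a single invocation, and that --vis is a switch that takes no value.

```python
import shlex

# Model types listed in this README.
MODEL_TYPES = [
    "resnet", "resnet_bn", "resnet_da", "resnet_do",
    "unet", "unet_bn", "unet_da", "unet_do",
]


def build_command(model=None, input_path=None, savepath=None, vis=False):
    """Build the argument list for deep_music_enhancer.py (hypothetical helper)."""
    if model is not None and model not in MODEL_TYPES:
        raise ValueError(f"unsupported model type: {model!r}")

    cmd = ["python3", "deep_music_enhancer.py"]
    if model is not None:
        cmd += ["--model", model]
    if input_path is not None:
        cmd += ["--input", input_path]
    if savepath is not None:
        cmd += ["--savepath", savepath]
    if vis:
        # Assumption: --vis is a flag without an argument.
        cmd.append("--vis")
    return cmd


def format_command(cmd):
    """Return a shell-quoted string, suitable for printing or logging."""
    return shlex.join(cmd)
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains deep_music_enhancer.py. Passing a list rather than a single string avoids shell-quoting problems with file names that contain spaces.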
Spectrogram of output audio (butter filter)
Spectrogram of output audio (cheby1 filter)
PyTorch
ONNX opset=11