Info predict file #2

Issue opened by TomasSand on 27 May 2023:

Dear Artyom Ivasik, your project is very interesting. I tried to run your code and it works, but I can't find the predict file that you mention in your thesis. Is it possible to view this file?
Thanks for your support.

Comments
Hello Thomas,
I think I have the file that you mentioned. Is it "predict.py" or "prediction.py"?
I can send it to you, but I am not sure about the content: it was written 3+ years ago.
Kind regards,
Artem
|
Dear Artem, thanks for your answer.
I would like to try using the network after the training phase to separate the sources. I guess that is the prediction step :)
Thanks again for the kind reply, and congratulations again on your project.
Kind regards,
Tommaso
|
Hello Tommaso,
Sorry for the delay. I attached two files. I hope one of them is what you are looking for.
Kind regards,
Artem
|
Dear Artem, thanks for your answer :)
But I don't see the file. I apologize if I did something wrong :)
I look forward to your reply.
Thanks again,
Tommaso
|
prediction.py
from keras.models import load_model
import numpy as np
import musdb
import librosa
import librosa.display
import math
from time import time


def P2R(radii, angles):
    # Polar (magnitude, phase) -> complex number
    return radii * (math.cos(angles) + 1j * math.sin(angles))


def get_power_and_angle(audio, orig_sr, new_sr, win_len, hop_len):
    if orig_sr != new_sr:
        audio = librosa.core.resample(audio, orig_sr, new_sr)
    stft_ar = librosa.stft(audio, win_len, hop_len, center=False)
    return np.abs(stft_ar), np.angle(stft_ar)


def normalize_pow_spec(pow_spec):
    # Scale the spectrogram to [0; 1]
    pow_spec_norm = np.zeros((pow_spec.shape[0], pow_spec.shape[1]))
    max_val = np.max(pow_spec)
    if max_val != 0:
        pow_spec_norm = pow_spec / max_val
    return pow_spec_norm


def save_audio(power_spec, phases, hop_length, win_length, sampling_rate,
               path, mask=None):
    if mask is None:
        mask = np.ones(power_spec.shape)
    stft_array = np.empty(power_spec.shape, dtype=complex)
    for i in range(stft_array.shape[0]):
        for j in range(stft_array.shape[1]):
            stft_array[i, j] = P2R(power_spec[i][j] * mask[i][j], phases[i][j])
    y = librosa.core.istft(stft_array, hop_length, win_length)
    librosa.output.write_wav(path, y, sampling_rate)


ORIG_SR = 44100   # Original sampling rate
SR = 22050        # Sampling rate the network works with
WIN = 1024        # STFT window size
HOP = 256         # STFT hop size
sample_len = 27   # Width of the spectrogram excerpt the network was trained on
TRGT = 'vocals'   # Source to extract

# Load the model
t0 = time()
print("<----------[INFO] model loading...")
model = load_model('output/model final 4.hdf5')
t1 = time()
print("<----------[INFO] model was loaded in " + str(round((t1 - t0), 1)) + " seconds...")

# Load the test set
mus = musdb.DB(root='./musdb18', subsets='test', split=None)
for track in mus:
    print(track.name)
    mix = librosa.core.to_mono(track.audio.T)
    mix_power_spec, mix_phases = get_power_and_angle(mix, ORIG_SR, SR, WIN, HOP)
    target = librosa.core.to_mono(track.targets[TRGT].audio.T)
    target_power_spec, target_phases = get_power_and_angle(target, ORIG_SR, SR, WIN, HOP)

    # Normalize and pad the spectrogram
    mix_power_spec_norm = normalize_pow_spec(mix_power_spec)
    pad = np.zeros((mix_power_spec_norm.shape[0], sample_len // 2))
    mix_power_spec_norm = np.concatenate((pad, mix_power_spec_norm), axis=1)
    mix_power_spec_norm = np.concatenate((mix_power_spec_norm, pad), axis=1)

    # Predict one soft-mask column per original frame with a sliding window
    mask = np.zeros((mix_power_spec_norm.shape[0], 0))
    for i in range(mix_power_spec.shape[1]):
        mix = mix_power_spec_norm[:, i:i + sample_len]
        mix = np.expand_dims(mix, 0)
        mix = np.expand_dims(mix, len(mix.shape))
        pred = model.predict(mix)[0]
        mask = np.concatenate((mask, np.expand_dims(pred, len(pred.shape))), axis=1)

    # Derive a binary mask and a thresholded ("set to zero") mask from the soft mask
    bin_mask = np.zeros(mask.shape)
    set_to_zero_mask = np.zeros(mask.shape)
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            if mask[i][j] >= 0.5:
                bin_mask[i][j] = 1
                set_to_zero_mask[i][j] = mask[i][j]

    save_audio(mix_power_spec, mix_phases, HOP, WIN, SR,
               'output/estimation/soft/' + track.name + '.wav', mask)
    save_audio(mix_power_spec, mix_phases, HOP, WIN, SR,
               'output/estimation/bin/' + track.name + '.wav', bin_mask)
    save_audio(mix_power_spec, mix_phases, HOP, WIN, SR,
               'output/estimation/zero/' + track.name + '.wav', set_to_zero_mask)
    break  # only the first test track is processed
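A side note on speed: the loop above calls model.predict() once per STFT frame, which is slow for long tracks. The sliding-window inference can in principle be batched into a single call. The helper below is only a sketch and is not part of the attached files; the name predict_mask_batched is made up here, and it assumes the model accepts a batch of windows shaped (N, n_bins, sample_len, 1) and returns one mask column of length n_bins per window, which is how the loop above already uses it.

import numpy as np

def predict_mask_batched(model, spec_norm_padded, sample_len, batch_size=64):
    # spec_norm_padded: normalized spectrogram already padded with sample_len//2
    # zero columns on each side, as in the script above.
    n_frames = spec_norm_padded.shape[1] - sample_len + 1      # one window per original frame
    windows = np.stack([spec_norm_padded[:, i:i + sample_len] for i in range(n_frames)])
    windows = windows[..., np.newaxis]                          # add the channel axis
    preds = model.predict(windows, batch_size=batch_size)       # assumed shape (n_frames, n_bins)
    return preds.T                                              # shape (n_bins, n_frames), same layout as `mask`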
________________________________________
predict.py
from keras.models import load_model
import numpy as np
#import musdb
import librosa
import librosa.display
import math
from time import time
import matplotlib.pyplot as plt


def P2R(radii, angles):
    # Polar (magnitude, phase) -> complex number
    return radii * (math.cos(angles) + 1j * math.sin(angles))


def get_power_and_angle(audio, orig_sr, new_sr, win_len, hop_len):
    if orig_sr != new_sr:
        audio = librosa.core.resample(audio, orig_sr, new_sr)
    stft_ar = librosa.stft(audio, win_len, hop_len, center=False)
    return np.abs(stft_ar), np.angle(stft_ar)


def normalize_pow_spec(pow_spec):
    # Scale the spectrogram to [0; 1]
    pow_spec_norm = np.zeros((pow_spec.shape[0], pow_spec.shape[1]))
    max_val = np.max(pow_spec)
    if max_val != 0:
        pow_spec_norm = pow_spec / max_val
    return pow_spec_norm


def save_audio(power_spec, phases, hop_length, win_length, sampling_rate,
               path, mask=None):
    if mask is None:
        mask = np.ones(power_spec.shape)
    stft_array = np.empty(power_spec.shape, dtype=complex)
    for i in range(stft_array.shape[0]):
        for j in range(stft_array.shape[1]):
            stft_array[i, j] = P2R(power_spec[i][j] * mask[i][j], phases[i][j])
    y = librosa.core.istft(stft_array, hop_length, win_length)
    librosa.output.write_wav(path, y, sampling_rate)


def plot_spec(power):
    plt.subplots(figsize=(power.shape[1] / 120, power.shape[0] / 120), dpi=150)
    librosa.display.specshow(data=librosa.core.amplitude_to_db(power, ref=np.max),
                             y_axis='linear',  # log or linear
                             x_axis='time',
                             sr=SR,
                             hop_length=HOP)


ORIG_SR = 44100   # Original sampling rate
SR = 22050        # Sampling rate the network works with
WIN = 1024        # STFT window size
HOP = 256         # STFT hop size
sample_len = 27   # Width of the spectrogram excerpt the network was trained on
TRGT = 'vocals'   # Source to extract

## Loading the test set (musdb path, commented out)
#mus = musdb.DB(root='./musdb18', subsets='test', split=None)
#track = mus[1]
#print(track.name)
#ORIG_SR = track.rate
#track.chunk_duration = 15
#track.chunk_start = 0
#mix = librosa.core.to_mono(track.audio.T)
#target = librosa.core.to_mono(track.targets[TRGT].audio.T)
#
#mix_power_spec, mix_phases = get_power_and_angle(mix, ORIG_SR, SR, WIN, HOP)
#save_audio(mix_power_spec, mix_phases, HOP, WIN, SR, 'D:/mix.wav')
#
#target_power_spec, target_phases = get_power_and_angle(target, ORIG_SR, SR, WIN, HOP)
#save_audio(target_power_spec, target_phases, HOP, WIN, SR, 'D:/orig_source.wav')
#
#other_power_spec = np.zeros(mix_power_spec.shape)
#for i in range(mix_power_spec.shape[0]):
#    for j in range(mix_power_spec.shape[1]):
#        if mix_power_spec[i][j] >= target_power_spec[i][j]:
#            other_power_spec[i][j] = mix_power_spec[i][j] - target_power_spec[i][j]
#        else:
#            other_power_spec[i][j] = 0
#save_audio(other_power_spec, mix_phases, HOP, WIN, SR, 'D:/orig_other.wav')

# Load a 13-second excerpt of an arbitrary track from disk
name = 'Кино'
mix, ORIG_SR = librosa.core.load(path='D:\\tracks\\' + name + '.mp3',
                                 sr=44100,
                                 mono=True,
                                 offset=21,
                                 duration=34 - 21)
mix_power_spec, mix_phases = get_power_and_angle(mix, ORIG_SR, SR, WIN, HOP)
save_audio(mix_power_spec, mix_phases, HOP, WIN, SR,
           'D:\\tracks\\result\\' + name + ' mix.wav')
plot_spec(mix_power_spec)

# Normalize and pad the spectrogram
mix_power_spec_norm = normalize_pow_spec(mix_power_spec)
pad = np.zeros((mix_power_spec_norm.shape[0], sample_len // 2))
mix_power_spec_norm = np.concatenate((pad, mix_power_spec_norm), axis=1)
mix_power_spec_norm = np.concatenate((mix_power_spec_norm, pad), axis=1)

# Load the model
t0 = time()
print("<----------[INFO] model loading...")
model = load_model('output/model final 10.hdf5')
t1 = time()
print("<----------[INFO] model was loaded in " + str(round((t1 - t0), 1)) + " seconds...")

# Predict one soft-mask column per original frame with a sliding window
t0 = time()
print("<----------[INFO] mask forming...")
mask = np.zeros((mix_power_spec_norm.shape[0], 0))
for i in range(mix_power_spec.shape[1]):
    mix = mix_power_spec_norm[:, i:i + sample_len]
    mix = np.expand_dims(mix, 0)
    mix = np.expand_dims(mix, len(mix.shape))
    pred = model.predict(mix)[0]
    mask = np.concatenate((mask, np.expand_dims(pred, len(pred.shape))), axis=1)

# Process the mask: binarize with a 0.5 threshold
for i in range(mask.shape[0]):
    for j in range(mask.shape[1]):
        if mask[i][j] >= 0.5:
            mask[i][j] = 1
        else:
            mask[i][j] = 0
#        if mask[i][j] <= 0.5:
#            mask[i][j] = 0
plot_spec(mask * 1000)
t1 = time()
print("<----------[INFO] mask was formed in " + str(round((t1 - t0), 1)) + " seconds...")

# Save the estimated source and the complementary "other" signal
save_audio(mix_power_spec, mix_phases, HOP, WIN, SR,
           'D:\\tracks\\result\\' + name + ' source.wav', mask)
plot_spec(mix_power_spec * mask)

reverse_mask = np.zeros(mask.shape)
for i in range(mask.shape[0]):
    for j in range(mask.shape[1]):
        reverse_mask[i, j] = 1 - mask[i, j]
save_audio(mix_power_spec, mix_phases, HOP, WIN, SR,
           'D:\\tracks\\result\\' + name + ' other.wav', reverse_mask)
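One practical caveat for anyone running these files today: they were written against an older librosa release that still shipped librosa.output.write_wav and accepted positional arguments for resample and stft (librosa.output was removed in 0.8, and recent releases expect keyword arguments). Below is only a sketch, under those version assumptions, of the two I/O helpers adapted to current librosa plus soundfile; it mirrors get_power_and_angle and save_audio from the scripts above and replaces the element-wise P2R loop with an equivalent vectorized expression.

import numpy as np
import librosa
import soundfile as sf

def get_power_and_angle(audio, orig_sr, new_sr, win_len, hop_len):
    # Resample if needed, then return magnitude and phase of the STFT
    if orig_sr != new_sr:
        audio = librosa.resample(audio, orig_sr=orig_sr, target_sr=new_sr)
    stft_ar = librosa.stft(audio, n_fft=win_len, hop_length=hop_len, center=False)
    return np.abs(stft_ar), np.angle(stft_ar)

def save_audio(power_spec, phases, hop_length, win_length, sampling_rate, path, mask=None):
    # Apply the (optional) mask, rebuild the complex STFT, and write a wav file
    if mask is None:
        mask = np.ones(power_spec.shape)
    stft_array = power_spec * mask * np.exp(1j * phases)    # vectorized polar-to-complex
    y = librosa.istft(stft_array, hop_length=hop_length, win_length=win_length)
    sf.write(path, y, sampling_rate)                        # replaces librosa.output.write_wav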