memory usage #18

Open
andrei4002 opened this issue Jan 8, 2014 · 22 comments

Comments

@andrei4002

I'm having issues fingerprinting a 2-hour-long mp3. From what I can see, it fills up the memory (RAM) and then the script crashes. The file size is around 140 MB, and I was testing in a virtual machine with Ubuntu, 5 GB of RAM, and 3 GB of swap.
Any thoughts on this?

@pguridi
Contributor

pguridi commented Jan 8, 2014

Sounds like this is because now the files are converted to wav in memory. SHA: 7122e11
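
For a sense of scale, here is a back-of-the-envelope estimate of what decoding a whole file to raw PCM costs. It assumes 16-bit stereo audio at 44100 Hz, which the thread does not confirm for this particular mp3, and `pcm_bytes` is a hypothetical helper, not part of Dejavu.

```python
def pcm_bytes(duration_seconds, sample_rate=44100, channels=2, sample_width=2):
    """Size in bytes of uncompressed PCM audio.

    Defaults are assumptions: 44.1 kHz, stereo, 2 bytes (16 bits) per sample.
    """
    return int(duration_seconds * sample_rate) * channels * sample_width


two_hours = pcm_bytes(2 * 60 * 60)
print(two_hours)                  # 1270080000 bytes
print(round(two_hours / 1e9, 2))  # 1.27 (GB), before any intermediate arrays
```

Any copies made during processing (for example, converting the samples to a wider dtype) would multiply this figure, which may explain how a 140 MB mp3 can exhaust several gigabytes of RAM.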

@worldveil
Owner

What needs to happen is that Dejavu should convert and fingerprint piece by piece. The fingerprinting process for a single audio track is actually embarrassingly parallel, so there's no need to tax the RAM like we are doing now. You just have to be careful that you start fingerprinting each chunk of the audio such that it overlaps at least

DEFAULT_WINDOW_SIZE * DEFAULT_OVERLAP_RATIO

so that the windowing process doesn't miss fingerprints on window borders.

If this is something you care about, I would gladly accept a pull request fixing it.
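
The overlap rule above can be sketched as follows. This is only an illustration: `chunk_bounds` is a hypothetical helper, and the values 4096 and 0.5 are assumed stand-ins for DEFAULT_WINDOW_SIZE and DEFAULT_OVERLAP_RATIO. It aligns every chunk start to the window hop so that each chunk produces the same windows the full track would, and it addresses only the window-border issue described above.

```python
def chunk_bounds(total_samples, chunk_samples, window_size=4096, overlap_ratio=0.5):
    """Yield (start, end) sample ranges that overlap by window_size * overlap_ratio.

    window_size and overlap_ratio are assumed values, not read from Dejavu.
    """
    overlap = int(window_size * overlap_ratio)  # samples shared by adjacent chunks
    hop = window_size - overlap                 # distance between window starts
    # Advance by a whole number of hops so chunk starts line up with window starts.
    step = max(hop, (chunk_samples - overlap) // hop * hop)
    start = 0
    while start < total_samples:
        end = min(start + step + overlap, total_samples)
        yield start, end
        if end == total_samples:
            break
        start += step


print(list(chunk_bounds(20000, 10000)))
# [(0, 8192), (6144, 14336), (12288, 20000)]
```

Each chunk ends exactly where one window ends, and the next chunk begins where the following window begins, so no window straddles a chunk border unprocessed.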

@Wessie
Contributor

Wessie commented Jan 21, 2014

Piece-by-piece fingerprinting is slightly harder to do because multiple channels are supported. An implementation would have to interleave the fingerprinting per channel.

However, in the tests I ran while trying to gain speedups, there seemed to be no notable difference in accuracy between stereo and mono fingerprinting. Admittedly, I have not tested this with microphone input.

As for fingerprinting in chunks, what is the required size of each chunk?

@worldveil
Owner

That's just the DEFAULT_WINDOW_SIZE * DEFAULT_OVERLAP_RATIO that I mentioned. This is the amount of overlapping required to keep the same fingerprinting behaviour.

I'd welcome PRs fixing this.

@utilitarianexe

Sorry for the terrible formatting, but here is some code that will split an audio file into pieces that dejavu can handle. It would be much better to modify dejavu, but this should hopefully help.

def write_audio_file(file_path,audio_array):
    '''
    Need to do this just for testing
    want to be able to create some bizzarre test files for croma print
    '''
    pipe = sp.Popen([ FFMPEG_BIN,
        '-y', # (optional) means overwrite the output file if it already exists.
        '-r', "44100", # the input will have 44100 Hz
        '-ac','2', # the input will have 2 channels (stereo)
        "-f", 's16le', # means 16bit input
        '-i', '-', # means that the input will arrive from the pipe
        '-vn', # means "don't expect any video input"
        '-acodec', "aac","-strict" ,"-2",#"ac3_fixed", # output audio codec
        file_path],
        stdin=sp.PIPE,stdout=sp.PIPE, stderr=sys.stdin)

    pipe.stdin.write(audio_array)

def get_audio_array(pipe,minutes):
    number_of_audio_frames = 88200*30*minutes
    bytes_in_frame = 4
    bytes_to_read = number_of_audio_frames*bytes_in_frame
    raw_audio = pipe.stdout.read(bytes_to_read)

    raw_audio_array = numpy.fromstring(raw_audio, dtype="int16")
    if len(raw_audio_array) < 1:
        return None,None,'reached end of file'
    audio_array = raw_audio_array.reshape((len(raw_audio_array)/2,2))
    return audio_array,raw_audio,None

def chunk_file(file_path,chunk_folder_path,max_chunk_size,file_suffix = ''):
    pipe = read_audio_data(file_path)
    extention = '.m4a' #probably a bad choice but first I go to work
    i = 0
    while True:
        i += 1
        audio_data,raw_audio_data,error = get_audio_array(pipe,5)
        if error is not None:
            print error
            return

        chunk_file_path = chunk_folder_path + '/chunk_'  + str(i)+ '_' + file_suffix + extention
        print 'write file ' + chunk_file_path
        write_audio_file(chunk_file_path,raw_audio_data)

@utilitarianexe

Oh, it is the chunk_file function you need to call, with max_chunk_size being an integer number of minutes.

@worldveil
Owner

Sorry, I'm really having trouble following your thoughts/code. Code sections will help, but I think an end-to-end example would be better.

Can you format this all in a branch or PR?

@utilitarianexe

'''
This will split audio files that are too long into sections.
There are probably better libs to do this.
'''


FFMPEG_BIN = 'ffmpeg'
import subprocess as sp
import numpy
import sys

def read_audio_data(file_path,offset='00:00:00'):
    # Decode file_path with ffmpeg and stream the raw PCM through a pipe
    command = [ FFMPEG_BIN,
                '-ss',offset, # start decoding at this timestamp
                '-i', file_path,
                '-f', 's16le', # raw signed 16-bit little-endian samples
                '-acodec', 'pcm_s16le',
                '-ar', '44100', # output will have 44100 Hz
                '-ac', '2', # stereo (set to '1' for mono)
                '-']
    pipe = sp.Popen(command, stdout=sp.PIPE, bufsize=10**8)
    return pipe

def write_audio_file(file_path,audio_array):
    '''
    Encode raw 16-bit stereo PCM bytes to file_path with ffmpeg.
    Need to do this just for testing:
    want to be able to create some bizarre test files for chromaprint
    '''
    pipe = sp.Popen([ FFMPEG_BIN,
       '-y', # (optional) overwrite the output file if it already exists
       '-f', 's16le', # the input is raw signed 16-bit little-endian PCM
       '-ar', '44100', # the input has a 44100 Hz sample rate
       '-ac', '2', # the input has 2 channels (stereo)
       '-i', '-', # the input arrives from the pipe
       '-vn', # don't expect any video input
       '-acodec', 'aac', '-strict', '-2', # output audio codec
       file_path],
       stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.STDOUT)

    # communicate() writes the data, closes stdin so ffmpeg can finish
    # the file, drains ffmpeg's output, and waits for the process to exit
    pipe.communicate(audio_array)

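def get_audio_array(pipe,minutes):
    # 88200*30 == 44100*60, the number of audio frames in one minute
    number_of_audio_frames = 88200*30*minutes
    bytes_in_frame = 4 # 2 channels * 2 bytes per 16-bit sample
    bytes_to_read = number_of_audio_frames*bytes_in_frame
    raw_audio = pipe.stdout.read(bytes_to_read)

    # frombuffer replaces the deprecated numpy.fromstring for binary data
    raw_audio_array = numpy.frombuffer(raw_audio, dtype="int16")
    if len(raw_audio_array) < 1:
        return None,None,'reached end of file'
    # one row per frame, one column per channel; // keeps the shape an integer
    audio_array = raw_audio_array.reshape((len(raw_audio_array)//2,2))
    return audio_array,raw_audio,None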


def chunk_file(file_path,chunk_folder_path,max_chunk_size,file_suffix = '',extention='.m4a'):
    pipe = read_audio_data(file_path)
    i = 0
    while True:
        i += 1
        # max_chunk_size is the chunk length in minutes
        audio_data,raw_audio_data,error = get_audio_array(pipe,max_chunk_size)
        if error is not None:
            print(error)
            return

        chunk_file_path = chunk_folder_path + '/chunk_'  + str(i)+ '_' + file_suffix + extention
        print('write file ' + chunk_file_path)
        write_audio_file(chunk_file_path,raw_audio_data)

def chunk_folder(folder_path,chunked_folder_path,max_chunk_size):
    '''
    so basically go through all files in folder
    turn each file into a bunch of files of length max_chunk_size(given in minutes)
    '''
    from os import listdir
    from os.path import isfile, join
    file_names = [ file_name for file_name in listdir(folder_path) if isfile(join(folder_path,file_name)) ]
    for file_name in file_names:
        file_path = folder_path + '/' +file_name
        chunk_file(file_path,chunked_folder_path,max_chunk_size,file_suffix=file_name)

@utilitarianexe

Basically, call
chunk_folder(path_to_long_audio_files,path_to_output_chunked_files,minutes_for_each_chunk_like_5)

Now you can use dejavu on long files. It would be better if dejavu did this internally, though.
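
The core of the approach above is reading a fixed number of whole audio frames at a time from the decoder's output. A minimal sketch of just that step, using only the standard library: `iter_pcm_chunks` is a hypothetical name, and the default format (44100 Hz, stereo, 16-bit) is an assumption matching the ffmpeg options used in this thread. It works on any binary stream, such as the `stdout` of an ffmpeg subprocess.

```python
import io


def iter_pcm_chunks(stream, chunk_seconds, sample_rate=44100, channels=2, sample_width=2):
    """Yield successive byte chunks of raw PCM, each a whole number of frames."""
    frame_bytes = channels * sample_width
    chunk_bytes = int(chunk_seconds * sample_rate) * frame_bytes
    while True:
        data = stream.read(chunk_bytes)
        if not data:
            break  # end of stream
        # Drop a trailing partial frame, if the stream ends mid-frame.
        usable = len(data) - len(data) % frame_bytes
        if usable:
            yield data[:usable]


# Tiny demonstration with an in-memory stream instead of an ffmpeg pipe:
# 10 full stereo frames plus 3 stray bytes, at a toy rate of 4 frames per second.
fake_pcm = io.BytesIO(b"\x00" * (10 * 4 + 3))
print([len(chunk) for chunk in iter_pcm_chunks(fake_pcm, chunk_seconds=1, sample_rate=4)])
# [16, 16, 8]
```

Only one chunk is held in memory at a time, so peak usage depends on the chunk length rather than on the length of the file.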

@CommonLoon102

The problem is even worse when it tries to fingerprint several files in parallel.
I've hardcoded 1 into the script to avoid running out of memory:

pool = multiprocessing.Pool(1)

From the command line it isn't possible to pass the maximum number of parallel processes as a parameter, so it defaults to 4:

djv.fingerprint_directory(directory, ["." + extension], 4)

Besides this, the program does a very good job! It just can't really handle long audio files.
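
One possible way to expose the process count instead of hardcoding it is sketched below. This is not Dejavu's actual command-line interface: the `--workers` flag, `build_parser`, and `resolve_workers` are all hypothetical names, and the final call is left as a comment that mirrors the `fingerprint_directory` line quoted above.

```python
import argparse
import multiprocessing


def build_parser():
    """Hypothetical CLI: directory, extension, and an optional worker count."""
    parser = argparse.ArgumentParser(description="Fingerprint a directory (illustrative wrapper)")
    parser.add_argument("directory")
    parser.add_argument("extension")
    parser.add_argument("--workers", type=int, default=1,
                        help="number of parallel fingerprinting processes")
    return parser


def resolve_workers(requested):
    """Clamp the requested count to the range [1, number of CPUs]."""
    try:
        available = multiprocessing.cpu_count()
    except NotImplementedError:
        available = 1
    return max(1, min(requested, available))


# An explicit argument list keeps the example self-contained.
args = build_parser().parse_args(["./mp3", "mp3", "--workers", "2"])
nprocesses = resolve_workers(args.workers)
# The value would then replace the hardcoded 4, for example:
# djv.fingerprint_directory(args.directory, ["." + args.extension], nprocesses)
```

Defaulting to a single worker is a deliberately conservative choice here, since each worker holds its own decoded audio in memory.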

@thesunlover
Contributor

Created a pull request for that.
If you find some other improvement to the code, go for it.
#75

@thesunlover
Contributor

Hi, guys.

Can someone guide me on how to calculate the starting offset of the chunks?
It's obvious that I missed this when I created the PRs.
Please help.

@pimpmypixel

Hi guys,

Did you have any luck getting this running on low-memory setups like Raspberry Pis?

@arunganesan

I can run it on RPi 2 no problem. I didn't try timing it but it seems to run pretty fast.

@pimpmypixel

@arunganesan what commit are you using?

@arunganesan

Hm, I'm not sure. I just cloned the latest commit a few days ago. I am only fingerprinting short sounds, 30 seconds at most.

@thesunlover
Contributor

You can check my repository:
https://github.com/IskrenStanislavov/dejavu/tree/split-fingerprinting
It is tested to work with 4-hour audio files.
The only missing thing is the offset_seconds.detection from the original audio.

@thesunlover
Contributor

@worldveil
Hello, Will.
Is this the direction I should follow to calculate the proper offset_seconds?
3 min * 60 sec * 44100 samples per sec * 2 channels = 15,876,000 samples
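
One way to look at this calculation: time depends on the number of samples per channel, so the channel count drops out. The 15,876,000 figure counts the interleaved values in a 3-minute stereo chunk, whereas a chunk's start time follows from 3 * 60 * 44100 = 7,938,000 samples per channel. A minimal sketch, assuming fixed-length chunks that do not overlap and assumed values of 4096 and 0.5 for the window size and overlap ratio; `chunk_start` is a hypothetical helper, not Dejavu code.

```python
def chunk_start(chunk_index, chunk_seconds, sample_rate=44100,
                window_size=4096, overlap_ratio=0.5):
    """Return where a chunk begins in the original audio.

    Result: (per-channel sample index, spectrogram frame index, seconds).
    Assumes equal-length, non-overlapping chunks numbered from 0.
    """
    hop = int(window_size * (1 - overlap_ratio))  # samples between window starts
    start_sample = chunk_index * chunk_seconds * sample_rate
    start_frame = start_sample // hop
    start_seconds = start_sample / float(sample_rate)
    return start_sample, start_frame, start_seconds


print(chunk_start(1, 180))  # (7938000, 3875, 180.0)
```

Under these assumptions, an offset measured inside a chunk would be shifted by that chunk's start (in frames or in seconds) to refer to the original file.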

@thesunlover
Contributor

I would like to complete this pull request, but I definitely need your help with two things.

  1. I want to know how to properly calculate the offset as a plain value rather than in matrix form, because I don't have experience with the tools used in fingerprinting.
  2. When splitting into files, do I need to set the time limits of the files as described in
    memory usage #18 (comment)
    and how do I calculate that in seconds?

Help would be highly appreciated.

@shemul

shemul commented May 4, 2016

Can dejavu fingerprint and search a 2-hour-long mp3 correctly yet? What did I miss?

@sheffieldnikki

sheffieldnikki commented Aug 3, 2016

Any news on fixing this by merging split-fingerprinting? dejavu is almost unusable on low memory machines - even the example mp3 files give out of memory errors when trying to fingerprint on a 512MB machine :( (and relying on swap is a disaster on this machine - its only storage is a memory card). Thanks

Edit: got something working by fingerprinting my songs on a machine with lots more memory, and then simply copying the MySQL innodb database files over to the small machine. Seems to be running the recognition fine :)

@ajahongir

I had this issue on 2 Ubuntu droplets, both with 512 MB of memory. Then I upgraded one of them to 2 GB and the problem seems to have disappeared!
