Split big files #75
Warning: the config is very weak and slow; tuning still needs to be added.
So to be clear, this splits up large files over a certain length (e.g., 3 minutes), and fingerprints each as a separate "song", yes?
Updated the description of the PR.
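A minimal sketch of the splitting idea discussed above, assuming the file is cut into consecutive pieces no longer than a fixed limit. The function name `split_ranges` and its parameters are hypothetical and are not taken from the PR.

```python
def split_ranges(duration_s, limit_s):
    """Return (start, end) pairs in seconds that cover the whole file
    in consecutive pieces, each at most limit_s long."""
    if limit_s <= 0:
        raise ValueError("limit_s must be positive")
    ranges = []
    start = 0
    while start < duration_s:
        # The last piece may be shorter than the limit.
        end = min(start + limit_s, duration_s)
        ranges.append((start, end))
        start = end
    return ranges
```

Each pair would then be fingerprinted as its own "song"; a file shorter than the limit yields a single pair and is handled as before.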
@@ -16,6 +32,9 @@ class Dejavu(object):
OFFSET = 'offset'
OFFSET_SECS = 'offset_seconds'

SPLIT_DIR = "split_dir"
OVERWRITE_WHEN_SPLITING = 1
overwrites what?
It's about overwriting the temp files that were split with ffmpeg.
It probably wouldn't be needed if we deleted the temporary directory.
probably should have coded it as "always overwrite" with no conditions
renamed the constant to OVERWRITE_TEMP_FILES_WHEN_SPLITING
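For illustration, here is one way the overwrite flag could map onto an ffmpeg call. This is a sketch under assumptions, not the code in the PR: `ffmpeg_split_command` is a hypothetical helper that only builds the argument list. The ffmpeg options used (`-y`, `-n`, `-ss`, `-t`, `-i`, `-c copy`) are standard.

```python
def ffmpeg_split_command(src, dst, start_s, length_s, overwrite=True):
    """Build (but do not run) an ffmpeg command that copies length_s
    seconds of src, starting at start_s, into dst."""
    return [
        "ffmpeg",
        "-y" if overwrite else "-n",  # -y: overwrite output, -n: never overwrite
        "-ss", str(start_s),          # seek to the start of the part
        "-t", str(length_s),          # length of the part
        "-i", src,
        "-c", "copy",                 # stream copy, no re-encoding
        dst,
    ]
```

If the parts are written into a directory created with `tempfile.TemporaryDirectory()`, it is removed when the context exits, so stale parts never exist and the flag becomes unnecessary.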
Also, the binary files (large mp3s) are too large, and I'd rather not increase the size of the repo with those. Can you instead give a public link (could even download with
OK, will do it later in the evening.
Removed the non-copyright files and added auto-creation of the long file with "ffmpeg concat".
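A rough sketch of how a long test file could be generated from short ones, assuming "ffmpeg concat" refers to ffmpeg's concat demuxer, which reads a text file of `file '...'` lines. The PR may use a different concat mode, and both helper names here are hypothetical.

```python
def concat_list_text(paths):
    """Return the contents of a list file for ffmpeg's concat demuxer."""
    lines = []
    for path in paths:
        # Inside single quotes, a literal quote is written as '\''.
        escaped = path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    return "\n".join(lines) + "\n"


def ffmpeg_concat_command(list_path, dst):
    """Build (but do not run) the ffmpeg command that joins the listed files."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_path, "-c", "copy", dst]
```

Repeating the same short file in the list is enough to produce an arbitrarily long input without committing large binaries.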
Do we need to test with other formats like ogg & flv?
If pydub handles them, I think it should be fine. Will review this sometime this weekend, I hope.
OVERWRITE_TEMP_FILES_WHEN_SPLITING
song_name_for_the_split
I had not thought about using a 1-minute limit and splitting the existing files. If I set the maximum audio length for straight fingerprinting to 1 minute and use all the available CPU cores, it might be faster to fingerprint even short songs: full CPU usage and lower memory usage at the same time. What do you say, should I test that?
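The idea above could be sketched roughly as follows, assuming each part can be fingerprinted independently by some picklable `worker` function. `fingerprint_parts` and `worker` are hypothetical names, and this uses `concurrent.futures` rather than whatever Dejavu itself uses internally.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def fingerprint_parts(parts, worker, max_workers=None,
                      executor_cls=ProcessPoolExecutor):
    """Call worker(part) for every part in parallel and return the
    results in the same order as the input parts."""
    if max_workers is None:
        # Default to one worker per CPU core.
        max_workers = os.cpu_count() or 1
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(worker, parts))
```

Peak memory is then roughly bounded by the number of workers times the size of one decoded part, which is why shorter parts can lower memory usage while keeping every core busy. `ThreadPoolExecutor` can be passed as `executor_cls` for quick experiments where process startup is inconvenient.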
The slice limit is now a property in the Dejavu class.
In the last 2 commits I moved a few arguments to be properties of Dejavu.
Will review this week. Apologies for the delay!
OK, I'm pulling this for review right now. First things first, we need to remove the large binary files from the git history. You removed them from the current version, but they are still in history (
Second, yes, the arguments for fingerprinting and splitting large files should be in the config. And we should document those options in
Many thanks for your patience in getting back to you!
I think it would be easier for me to is that ok?
Sounds fine to me.
Were you able to create a new PR with this? Would gladly merge.
Hello, worldveil |
reposted the PR in here: |
@worldveil would you please review the new PR #87 and give advice on how to properly calculate offset_seconds for the parts that follow the first one?
@thesunlover I left a comment in PR #87 on how I got offset_seconds working. |
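The approach actually used is described in PR #87 and is not reproduced here. As a simple sketch of one plausible calculation, assuming every part except the last has exactly the same length, the offset within a later part can be shifted by the total length of the parts before it. The function name and parameters are hypothetical.

```python
def absolute_offset_seconds(part_index, limit_s, offset_in_part_s):
    """Convert an offset measured inside one part into an offset
    measured from the start of the original, unsplit file."""
    # Parts 0 .. part_index-1 each contribute limit_s seconds.
    return part_index * limit_s + offset_in_part_s
```

For the first part (`part_index == 0`) the offset is unchanged, which matches the behavior for files that were never split.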
Added support for huge files.
Check long_test.py and adjust the number of minutes and processes to fit in the available RAM.
Edit:
Procedure of the process: