mp3_extraction

This program extracts the mp3 files from the website "albalearning.com/audiolibros", which hosts a large collection of audiobooks (mainly in Spanish, although there are a few books in French and English).

The script uses the BeautifulSoup class from the bs4 package to download the mp3 files of the audiobooks whose authors are listed within the main() function. The user can build the authors list manually by entering the right keywords, or by first extracting the keywords for ALL the authors stored on the website and then filtering them as desired.
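
As a rough illustration of the link-extraction step, the sketch below parses a page with BeautifulSoup and collects the URLs that end in ".mp3". It is not taken from MP3_extract.py: the function name `find_mp3_links`, the sample HTML, and the example.com URL are hypothetical, and it assumes the mp3 files are exposed through ordinary `<a href="...">` tags, which may not match the real pages.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def find_mp3_links(html, page_url):
    """Return absolute URLs of all anchors on the page that point to .mp3 files.

    Hypothetical helper: assumes the audio files are linked via <a href> tags.
    """
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"]
        if href.lower().endswith(".mp3"):
            # Resolve relative links against the URL of the page itself.
            links.append(urljoin(page_url, href))
    return links


# Made-up page content, used only to exercise the helper.
sample_html = """
<html><body>
  <a href="mp3/part1.mp3">Part 1</a>
  <a href="/audio/part2.MP3">Part 2</a>
  <a href="index.html">Back</a>
</body></html>
"""
sample_url = "https://example.com/audiolibros/author/book.html"
mp3_links = find_mp3_links(sample_html, sample_url)
```

Each URL found this way could then be fetched and written to disk with any HTTP client. How the actual script performs the download is not described here.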

The script works correctly except for some books/authors whose webpage contains a defect. For instance, the page for the book "El elixir de larga vida" at "https://albalearning.com/audiolibros/balzac/elixir.html" gives the same link for both the first and the second part of the audiobook, a problem that can be easily identified and solved manually. However, the algorithm used in the script relies on a regular pattern in the pages, so particular mistakes like this one cannot be predicted and solved automatically because they deviate from the norm. This case is the consequence of a small misprint made by the web developer of the site.
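
Cases like the duplicated link cannot be repaired automatically, but they could at least be flagged for manual review. The following is a minimal sketch of such a check, not part of the current script: `find_duplicate_links` is a hypothetical helper, and the file names in the example are made up.

```python
from collections import Counter


def find_duplicate_links(links):
    """Return the links that appear more than once, in order of first appearance."""
    counts = Counter(links)
    return [url for url, count in counts.items() if count > 1]


# Made-up example: both parts of a book point to the same file.
book_links = ["part1.mp3", "part1.mp3", "part3.mp3"]
duplicates = find_duplicate_links(book_links)
if duplicates:
    print("Check these links manually:", duplicates)
```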

The script can be easily adapted to extract other files from different sites with a similar html tag structure. For the moment, however, the program has only been used on the above-mentioned website, where it has performed excellently.

