Waifu scrapper

A collection of scripts that scrape waifu data, normalize it, select the top waifus, and load the results into a relational database.

Scraping: Scrapes all the waifu data and images from MyWaifuList through the site's internal APIs, mimicking the calls the site itself normally makes (the API endpoints were not rate-limited at the time of writing).

Selecting waifus: Selection simply picks the top N waifus ranked by popularity, where popularity is defined as the number of upvotes plus the number of downvotes.
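As an illustration of the ranking rule above, here is a minimal sketch, not the code in waifuselect.py. The record keys `name`, `upvotes` and `downvotes` and the helper names `popularity` and `select_top` are assumptions made for this example; the actual fields in waifus.json may differ.

```python
from typing import Any, Dict, List


def popularity(waifu: Dict[str, Any]) -> int:
    # Popularity counts every vote, positive or negative.
    # "upvotes" and "downvotes" are hypothetical key names.
    return int(waifu.get("upvotes", 0)) + int(waifu.get("downvotes", 0))


def select_top(waifus: List[Dict[str, Any]], n: int) -> List[Dict[str, Any]]:
    # Sort by popularity, highest first, and keep the first n entries.
    return sorted(waifus, key=popularity, reverse=True)[:n]


sample = [
    {"name": "A", "upvotes": 10, "downvotes": 5},   # popularity 15
    {"name": "B", "upvotes": 3, "downvotes": 20},   # popularity 23
    {"name": "C", "upvotes": 12, "downvotes": 0},   # popularity 12
]

top_two = select_top(sample, 2)
print([w["name"] for w in top_two])
```

Note that under this definition a heavily downvoted entry such as B still ranks high, because the metric measures total attention rather than approval.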

Install

pip install -r requirements.txt

Use

python scrapper.py     # Obtain all waifu data
python normalize.py    # Normalize image filenames
python waifuselect.py  # Select the best waifus and prepare the data
python createdb.py     # Put all the data in an SQL database

All the scraped data is stored in waifus.json. If you're only interested in the dataset itself, see the detailed overview of the dataset and the full dataset on Kaggle.
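To give a rough idea of the final step, loading records into a relational database, here is a minimal sketch using Python's built-in sqlite3 module. It is not the code in createdb.py: the table layout, the column names and the `create_db` helper are all assumptions made for this example, and the project may target a different SQL engine and schema.

```python
import sqlite3
from typing import Any, Dict, List


def create_db(waifus: List[Dict[str, Any]], path: str = ":memory:") -> sqlite3.Connection:
    # Hypothetical schema: one row per waifu with its vote counts.
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS waifus (
            id        INTEGER PRIMARY KEY,
            name      TEXT    NOT NULL,
            upvotes   INTEGER NOT NULL,
            downvotes INTEGER NOT NULL
        )
        """
    )
    # Named placeholders are filled from the keys of each dict.
    conn.executemany(
        "INSERT INTO waifus (id, name, upvotes, downvotes) "
        "VALUES (:id, :name, :upvotes, :downvotes)",
        waifus,
    )
    conn.commit()
    return conn


records = [
    {"id": 1, "name": "A", "upvotes": 10, "downvotes": 5},
    {"id": 2, "name": "B", "upvotes": 3, "downvotes": 20},
]

conn = create_db(records)
row_count = conn.execute("SELECT COUNT(*) FROM waifus").fetchone()[0]
print(row_count)
```

Passing a file path instead of the default `:memory:` would persist the database to disk.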
