This repository contains a fork of the original, now archived, app-store-scraper package, updated by Futurice for compatibility with newer versions of Python and the requests package.
This version is not published on PyPI (yet), but you can install it as a source package directly from GitHub:
pip install https://github.com/futurice/app-store-scraper/archive/refs/tags/v0.3.6.zip
The README of the original package is reproduced below.
___ _____ _ _____
/ _ \ / ___| | / ___|
/ /_\ \_ __ _ __ \ `--.| |_ ___ _ __ ___ \ `--. ___ _ __ __ _ _ __ ___ _ __
| _ | '_ \| '_ \ `--. \ __/ _ \| '__/ _ \ `--. \/ __| '__/ _` | '_ \ / _ \ '__|
| | | | |_) | |_) | /\__/ / || (_) | | | __/ /\__/ / (__| | | (_| | |_) | __/ |
\_| |_/ .__/| .__/ \____/ \__\___/|_| \___| \____/ \___|_| \__,_| .__/ \___|_|
| | | | | |
|_| |_| |_|
Scrape reviews for an app:
from app_store_scraper import AppStore
from pprint import pprint
minecraft = AppStore(country="nz", app_name="minecraft")
minecraft.review(how_many=20)
pprint(minecraft.reviews)
pprint(minecraft.reviews_count)
Scrape reviews for a podcast:
from app_store_scraper import Podcast
from pprint import pprint
sysk = Podcast(country="nz", app_name="stuff you should know")
sysk.review(how_many=20)
pprint(sysk.reviews)
pprint(sysk.reviews_count)
Let's continue from the code example used in Quickstart.
There are two required parameters and one optional parameter:
country (required)
- two-letter country code of the ISO 3166-1 alpha-2 standard
app_name (required)
- name of an iOS application to fetch reviews for
- also used by the search_id() method to search for app_id internally
app_id (optional)
- can be passed directly
- or omitted, in which case it is obtained by the search_id() method internally
Once instantiated, the object can be examined:
>>> minecraft
AppStore(country='nz', app_name='minecraft', app_id=479516143)
>>> print(minecraft)
Country | nz
Name | minecraft
ID | 479516143
URL | https://apps.apple.com/nz/app/minecraft/id479516143
Review count | 0
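The URL line in the summary above appears to combine the three constructor parameters. A minimal sketch of that pattern follows; `landing_url` is a hypothetical helper written for illustration, not part of the package, and the way spaces in the name are handled is an assumption.

```python
def landing_url(country: str, app_name: str, app_id: int) -> str:
    """Build a landing-page URL in the shape shown in the printed summary."""
    # Assumption: spaces in the name become hyphens; the package itself
    # may normalise names differently.
    slug = "-".join(app_name.lower().split())
    return f"https://apps.apple.com/{country}/app/{slug}/id{app_id}"


print(landing_url("nz", "minecraft", 479516143))
# https://apps.apple.com/nz/app/minecraft/id479516143
```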
Other optional parameters are:
log_format
- passed directly to logging.basicConfig(format=log_format)
- default is "%(asctime)s [%(levelname)s] %(name)s - %(message)s"
log_level
- passed directly to logging.basicConfig(level=log_level)
- default is "INFO"
log_interval
- log is produced every 5 seconds (by default) as a "heartbeat" (useful for a long scraping session)
- default is 5
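To see what the default log format produces, it can be tried with the standard library alone. This is a sketch using only `logging`; the logger name "Base" and the `render` helper are hypothetical and chosen for illustration.

```python
import logging

# Default values quoted in the parameter list above.
LOG_FORMAT = "%(asctime)s [%(levelname)s] %(name)s - %(message)s"
LOG_LEVEL = "INFO"

# The same call the README says the parameters are forwarded to.
logging.basicConfig(format=LOG_FORMAT, level=LOG_LEVEL)


def render(name: str, level: int, message: str) -> str:
    """Format one record with LOG_FORMAT, without emitting it."""
    record = logging.LogRecord(
        name, level, pathname="", lineno=0, msg=message, args=None, exc_info=None
    )
    return logging.Formatter(LOG_FORMAT).format(record)


# Prints a timestamp followed by "[INFO] Base - heartbeat".
print(render("Base", logging.INFO, "heartbeat"))
```

Passing a different log_format or log_level to the constructor should change these two values in the same way.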
The maximum number of reviews fetched per request is 20, and this limit is hardcoded to minimise the number of calls. As a result, the review() method fetches reviews in batches of 20, so the number fetched is the how_many argument rounded up to a multiple of 20.
>>> minecraft.review(how_many=33)
>>> minecraft.reviews_count
40
If how_many is not provided, review() will terminate after all reviews are fetched.
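The batch arithmetic can be checked with a short calculation. This is a sketch: `expected_count` is a hypothetical helper, and it assumes that fetching stops as soon as the running total reaches how_many and that the app has at least that many reviews.

```python
import math

BATCH_SIZE = 20  # reviews per request, as stated above


def expected_count(how_many: int, batch_size: int = BATCH_SIZE) -> int:
    """Smallest multiple of batch_size that is at least how_many."""
    return math.ceil(how_many / batch_size) * batch_size


print(expected_count(33))  # 40, matching the session above
```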
NOTE: The review count seen on the landing page differs from the actual number of reviews fetched, because only some of the users who rate the app also leave a review.
Other optional parameters of review() are:
after
- a datetime object; reviews older than it are filtered out
sleep
- an int to specify seconds to sleep between each call
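The effect of the after parameter can be pictured as a date filter over the fetched reviews. The sketch below applies such a filter to hand-written sample data; `keep_recent` is a hypothetical helper, and treating a review dated exactly at the cutoff as kept is an assumption.

```python
from datetime import datetime


def keep_recent(reviews: list[dict], after: datetime) -> list[dict]:
    """Keep only the reviews dated at or after the cutoff."""
    return [r for r in reviews if r["date"] >= after]


# Made-up sample data, not real reviews.
sample = [
    {"userName": "a", "date": datetime(2019, 6, 1)},
    {"userName": "b", "date": datetime(2021, 3, 15)},
]
cutoff = datetime(2020, 1, 1)

print([r["userName"] for r in keep_recent(sample, cutoff)])  # ['b']

# With the package itself, the equivalent call would look like:
#   minecraft.review(how_many=20, after=cutoff, sleep=2)
```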
The fetched review data are held in memory in the reviews attribute, as a list of dicts.
>>> minecraft.reviews
[{'userName': 'someone', 'rating': 5, 'date': datetime.datetime(...
Each review dictionary has the following schema:
{
"date": datetime.datetime,
"isEdited": bool,
"rating": int,
"review": str,
"title": str,
"userName": str
}
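Because each review is a plain dict, the list can be summarised with the standard library. The sketch below works on made-up sample data that follows the schema above; `summarise` is a hypothetical helper, not part of the package.

```python
from collections import Counter
from datetime import datetime
from statistics import mean

# Made-up sample data following the schema above.
sample_reviews = [
    {"date": datetime(2021, 1, 5), "isEdited": False, "rating": 5,
     "review": "Great", "title": "Love it", "userName": "someone"},
    {"date": datetime(2021, 2, 9), "isEdited": True, "rating": 3,
     "review": "Fine", "title": "OK", "userName": "another"},
    {"date": datetime(2021, 3, 1), "isEdited": False, "rating": 4,
     "review": "Good", "title": "Nice", "userName": "third"},
]


def summarise(reviews: list[dict]) -> dict:
    """Return the count, mean rating, rating histogram and edited count."""
    return {
        "count": len(reviews),
        "mean_rating": mean(r["rating"] for r in reviews),
        "by_rating": Counter(r["rating"] for r in reviews),
        "edited": sum(1 for r in reviews if r["isEdited"]),
    }


summary = summarise(sample_reviews)
print(summary["count"], summary["mean_rating"], summary["edited"])  # 3 4 1
```

The same function should accept minecraft.reviews directly once review() has populated it.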