
Commit db925fc

Improve data saving (#1)
* add some config files
* update valid query fields from current docs
  https://developers.facebook.com/docs/graph-api/reference/archived-ad/
* timestamp csv files, be more forgiving on input
* add new params, overhaul URL construction
  - Adds ad_type and media_type so we can set those values
  - changes default API version from v14.0 to v20.0
  - renames some variables to match parameters sent to the API
  - only sends parameters in the URL if they're set
* add output folder, update gitignores
* add some output, including rate limiting details
* rename get_parser -> argument_parser
* better organization of saved files
  save files in a new timestamped folder in the output directory
* load access token from env if it's there
* use csv.writer to write csv, and write as we go
  Writing each ad to the csv file as we get it instead of waiting until we have all the ads.
* update README
* check for different types of rate limit headers
  https://developers.facebook.com/docs/graph-api/overview/rate-limiting/
* set batch size default to 250 and update help
* Facebook Ads Library API -> Meta Ad Library API
* improve help
1 parent ca6031f commit db925fc
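
The commit message mentions timestamped output folders and writing each ad to the CSV file as it arrives, but the code that does this is not part of the diff excerpt shown on this page. A minimal sketch of the idea, assuming ads arrive as dictionaries from a generator. The function name `save_ads_to_csv`, the folder-name format, and the `now` parameter are illustrative and not taken from the repository:

```python
import csv
from datetime import datetime
from pathlib import Path


def save_ads_to_csv(ads, fields, output_root="output", name="ads", now=None):
    """Stream ads into <output_root>/<timestamp>/<name>.csv, one row at a time.

    Hypothetical helper: the real script may name and lay out files differently.
    """
    now = now or datetime.now()
    folder = Path(output_root) / now.strftime("%Y-%m-%d_%H-%M-%S")
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{name}.csv"

    count = 0
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(fields)  # header row
        for ad in ads:  # `ads` can be a generator, so nothing is buffered
            # Missing fields become empty cells instead of raising KeyError
            writer.writerow([ad.get(field, "") for field in fields])
            count += 1
    return path, count
```

Because each row is written inside the loop, rows already fetched are handed to the file before a later request fails, which is the benefit the commit message describes.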

9 files changed (+343, -124 lines changed)

.env-sample

+1
@@ -0,0 +1 @@
+ACCESS_TOKEN=xxx

.gitignore

+6
@@ -0,0 +1,6 @@
+*.csv
+.DS_Store
+.env
+__pycache__
+pyrightconfig.json
+venv

README.md

+100-16
@@ -1,35 +1,119 @@
 # Ads-Library-API-Script-Repository
-Ads-Library-API-Script-Repository is a set of code examples to help user/researchers understand how the Facebook Ads Library API works. It also provides a simple command-line interface(CLI) for users to easily use the Facebook Ads Library API.
+Ads-Library-API-Script-Repository is a set of code examples to help user/researchers understand how the Meta Ad Library API works. It also provides a simple command-line interface(CLI) for users to easily use the Meta Ad Library API.
 
-## Examples
-Here's an example on how to use the CLI:
+## Setup
 
-$ python fb_ads_library_api_cli.py -t {access_token} -f 'page_id,ad_snapshot_url,funding_entity,ad_delivery_start_time' -c 'CA' -s '.' -v count
+### Make sure you have Python 3 installed
 
-It would count the number of all polictical ads in CA(Canada);
+This command should show you a path to the executable, like `/usr/bin/python3`
+```bash
+which python3
+```
 
-Note: please replace the '{access_token}' with your [Facebook Developer access token](https://developers.facebook.com/tools/accesstoken/).
+If Python isn't installed and you're on an Apple computer, [install homebrew](https://brew.sh/) and use it to install python3
+```bash
+brew install python
+```
 
-## Requirements
-Ads-Library-API-Script-Repository requires or works with
-* Mac OS X or Linux or Window
-* Python 3.0+
-* Python Requests Library ([installation](https://docs.python-requests.org/en/master/user/install/#install))
-* Python iso3166 Library ([installation](https://pypi.org/project/iso3166/))
+You can check [Python's downloads page](https://www.python.org/downloads/) for instructions on installing on other operating systems.
 
+### Start a virtual environment
+
+Create the environment
+```bash
+python3 -m venv venv
+```
+
+Activate it
+```bash
+source venv/bin/activate
+```
+
+### Install the required packages
+```bash
+python3 -m pip install -r requirements.txt
+```
+
+## Usage
+
+To use these scripts to access the [Meta Ad Library API](https://www.facebook.com/ads/library/api), you must have a Facebook developer account, which will require you to confirm your identity (by uploading identifying documents such as a drivers license or passport) and mailing address (by entering a code that Meta sends you in the physical mail.)
+
+Once those details are confirmed, you can create a new app (an app of type "Business" will work) which will allow you to generate an access token. That token is required by these scripts to authenticate with the API. The access token can be found on the [Graph API Explorer](https://developers.facebook.com/tools/explorer/) or the [Access Token Tool](https://developers.facebook.com/tools/accesstoken/), where it's described as the "User Token".
+
+The access token can be passed to the script using the `-t` flag, or saved in a `.env` file with the key `ACCESS_TOKEN`. You can copy the `.env-sample` file in this repository to `.env` and fill in your token there.
+
+```bash
+cp .env-sample .env
+```
+
+If you choose to save the results of your query to a file, they will be saved in the `output` directory, in a folder time-stamped with the time you started the query.
+
+Here are some examples on how to use the CLI:
+
+### Count the number of political ads in Canada (CA)
+replace `{access_token}` with your token
+```python
+python3 python/fb_ads_library_api_cli.py -t {access_token} -f 'page_id,ad_snapshot_url,funding_entity,ad_delivery_start_time' -c 'CA' -s '.' -v count
+```
+
+### Search US political ads delivered after 7/20 for "coconut" and save them to a CSV file
+Assuming you've put your access token in `.env`
+```python
+python3 python/fb_ads_library_api_cli.py -f 'id,ad_creation_time,ad_creative_bodies,ad_creative_link_captions,ad_creative_link_descriptions,ad_creative_link_titles,ad_delivery_start_time,ad_delivery_stop_time,ad_snapshot_url,age_country_gender_reach_breakdown,beneficiary_payers,bylines,currency,delivery_by_region,demographic_distribution,estimated_audience_size,eu_total_reach,impressions,languages,page_id,page_name,publisher_platforms,spend,target_ages,target_gender,target_locations' -c 'US' --ad-type 'POLITICAL_AND_ISSUE_ADS' -s 'coconut' --batch-size 250 --after-date 2024-07-20 -v save_to_csv coconut_after_07_20
+```
+
+### Options
+
+You can see the available arguments by passing `--help`
+
+```bash
+python3 python/fb_ads_library_api_cli.py --help
+```
+
+```
+The Meta Ad Library API CLI Utility
+
+positional arguments:
+action Action to take on the ads, possible values: count,save,save_to_csv,start_time_trending
+args The parameter for the specific action
+
+options:
+-h, --help show this help message and exit
+-t ACCESS_TOKEN, --access-token ACCESS_TOKEN
+The Facebook developer access token
+-f FIELDS, --fields FIELDS
+Fields to retrieve from the Ad Library API, comma-separated, no spaces
+-s SEARCH_TERMS, --search-terms SEARCH_TERMS
+The terms you want to search for, space-separated
+-c COUNTRY, --country COUNTRY
+Country code(s), comma-separated, no spaces
+--search-page-ids SEARCH_PAGE_IDS
+A specific Facebook Page you want to search
+--ad-active-status AD_ACTIVE_STATUS
+Filter by the current status of the ads at the moment the script runs, can be ALL (default), ACTIVE, INACTIVE
+--ad-type AD_TYPE Return this type of ad, can be ALL (default), CREDIT_ADS, EMPLOYMENT_ADS, HOUSING_ADS, POLITICAL_AND_ISSUE_ADS
+--media-type MEDIA_TYPE
+Return ads that contain this type of media, can be ALL (default), IMAGE, MEME, VIDEO, NONE
+--after-date AFTER_DATE
+Only return ads that started delivery after this date, in the format YYYY-MM-DD
+--batch-size BATCH_SIZE
+Request records in batches of this size, default is 250
+--retry-limit RETRY_LIMIT
+How many times to retry when an error occurs, default is 3
+-v, --verbose
+```
 
 ## How Ads-Library-API-Script-Repository works
-The script will query the [Facebook Ads library API](https://www.facebook.com/ads/library/api) to get all the Ads Library information on the Facebook platform;
+The script will query the [Meta Ad Library API](https://www.facebook.com/ads/library/api) to get all the Ad Library information on the Facebook platform;
 
-## Full documentation
-You can find the full documentation here: (--to-be-added--)
 
-## More about Facebook Ads Library
+## More about Meta Ad Library
 * Website: https://www.facebook.com/ads/library
 * Report: https://www.facebook.com/ads/library/report
 * API: https://www.facebook.com/ads/library/api
 
 See the [CONTRIBUTING](CONTRIBUTING.md) file for how to help out.
 
+
 ## License
 Ads-Library-API-Script-Repository is licensed under the Facebook Platform License, as found in the LICENSE file.
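
The README change above says the token can come from the `-t` flag or from an `ACCESS_TOKEN` entry in `.env`. The loading code itself is not in this excerpt, so the following is only a sketch of one way that lookup could work. The helper names, the hand-rolled `.env` parser, and the precedence order (flag, then process environment, then file) are all assumptions:

```python
import os


def read_env_file(path=".env"):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    if not os.path.exists(path):
        return values
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_access_token(cli_token=None, env_path=".env"):
    """Return the first token found; the precedence order is an assumption."""
    token = (
        cli_token
        or os.environ.get("ACCESS_TOKEN")
        or read_env_file(env_path).get("ACCESS_TOKEN")
    )
    if not token:
        raise SystemExit("No access token: pass -t or set ACCESS_TOKEN in .env")
    return token
```

Keeping the token in `.env` pairs with the `.gitignore` entry added in this commit, so the secret stays out of version control.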

output/.gitignore

+3
@@ -0,0 +1,3 @@
+# ignore everything in this directory except this file
+*
+!.gitignore

python/fb_ads_library_api.py

+54-22
@@ -9,6 +9,7 @@
 import json
 import re
 from datetime import datetime
+from json.decoder import JSONDecodeError
 
 import requests
 
@@ -21,52 +22,64 @@ def get_ad_archive_id(data):
 
 
 class FbAdsLibraryTraversal:
-    default_url_pattern = (
-        "https://graph.facebook.com/{}/ads_archive?access_token={}&"
-        + "fields={}&search_terms={}&ad_reached_countries={}&search_page_ids={}&"
-        + "ad_active_status={}&limit={}"
-    )
-    default_api_version = "v14.0"
+    default_url_parameters = [
+        "access_token",
+        "ad_active_status",
+        "ad_reached_countries",
+        "ad_type",
+        "fields",
+        "limit",
+        "media_type",
+        "search_page_ids",
+        "search_terms",
+    ]
+    default_url_pattern = "https://graph.facebook.com/{}/ads_archive?"
+    default_api_version = "v20.0"
 
     def __init__(
         self,
         access_token,
         fields,
-        search_term,
-        country,
+        search_terms,
+        ad_reached_countries,
+        ad_type="ALL",
+        media_type="ALL",
         search_page_ids="",
         ad_active_status="ALL",
         after_date="1970-01-01",
-        page_limit=500,
+        limit=250,
         api_version=None,
         retry_limit=3,
     ):
         self.page_count = 0
         self.access_token = access_token
         self.fields = fields
-        self.search_term = search_term
-        self.country = country
+        self.search_terms = search_terms
+        self.ad_reached_countries = ad_reached_countries
+        self.ad_type = ad_type
+        self.media_type = media_type
         self.after_date = after_date
         self.search_page_ids = search_page_ids
         self.ad_active_status = ad_active_status
-        self.page_limit = page_limit
+        self.limit = limit
         self.retry_limit = retry_limit
         if api_version is None:
             self.api_version = self.default_api_version
         else:
             self.api_version = api_version
 
     def generate_ad_archives(self):
-        next_page_url = self.default_url_pattern.format(
-            self.api_version,
-            self.access_token,
-            self.fields,
-            self.search_term,
-            self.country,
-            self.search_page_ids,
-            self.ad_active_status,
-            self.page_limit,
-        )
+        # construct the URL
+        next_page_url = self.default_url_pattern.format(self.api_version)
+        params_to_add = []
+
+        for param in self.default_url_parameters:
+            param_value = getattr(self, param)
+            if param_value:
+                params_to_add.append(f"{param}={param_value}")
+
+        next_page_url += "&".join(params_to_add)
+
         return self.__class__._get_ad_archives_from_url(
             next_page_url, after_date=self.after_date, retry_limit=self.retry_limit
         )
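
The new loop in `generate_ad_archives` skips unset parameters and joins the rest as raw `name=value` pairs, so values are not percent-encoded at this step. As a possible variation (not what the commit does), the standard library's `urllib.parse.urlencode` could build the same query string while escaping reserved characters such as `&` or spaces inside a search term. The function name `build_ads_archive_url` is hypothetical, and whether the API accepts encoded commas in `fields` is assumed rather than verified here:

```python
from urllib.parse import urlencode

# Same base pattern as default_url_pattern in the diff
BASE_URL = "https://graph.facebook.com/{}/ads_archive?"


def build_ads_archive_url(api_version, **params):
    """Build the request URL, leaving out parameters that are unset (falsy)."""
    # Sorting keeps the parameter order stable, like the alphabetical list in the diff
    set_params = {key: value for key, value in sorted(params.items()) if value}
    return BASE_URL.format(api_version) + urlencode(set_params)


url = build_ads_archive_url(
    "v20.0",
    search_terms="coconut water",
    limit=250,
    search_page_ids="",  # empty, so it is omitted from the URL
    ad_type="ALL",
)
print(url)
```

With these inputs the empty `search_page_ids` is dropped and the space in the search term is encoded as `+`.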
@@ -75,13 +88,32 @@ def generate_ad_archives(self):
     def _get_ad_archives_from_url(
         next_page_url, after_date="1970-01-01", retry_limit=3
     ):
+        rate_limit_headers = [
+            "x-ad-account-usage",
+            "x-app-usage",
+            "x-business-use-case-usage",
+        ]
         last_error_url = None
         last_retry_count = 0
         start_time_cutoff_after = datetime.strptime(after_date, "%Y-%m-%d").timestamp()
 
         while next_page_url is not None:
+            print(">> requesting page at " + next_page_url)
             response = requests.get(next_page_url)
             response_data = json.loads(response.text)
+            print(">> got response!")
+
+            # get rate limiting details from headers
+            for header in rate_limit_headers:
+                usage = response.headers.get(header)
+                if usage:
+                    try:
+                        print(f">> {header}")
+                        print(json.loads(usage))
+                    except JSONDecodeError as err:
+                        print(">> error trying to get rate limit details from headers!")
+                        print(err)
+
             if "error" in response_data:
                 if next_page_url == last_error_url:
                     # failed again
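
The hunk above prints whichever rate-limit headers are present on each response. The same check can be sketched as a standalone function that returns the decoded values instead of printing them, which makes it easier to test or to act on. The function name `parse_rate_limit_headers` and the sample header value are illustrative assumptions:

```python
import json

# Header names taken from the rate_limit_headers list in the diff
RATE_LIMIT_HEADERS = [
    "x-ad-account-usage",
    "x-app-usage",
    "x-business-use-case-usage",
]


def parse_rate_limit_headers(headers):
    """Return {header_name: decoded JSON} for each usage header that is present.

    A header that is present but not valid JSON maps to None.
    """
    usage = {}
    for name in RATE_LIMIT_HEADERS:
        raw = headers.get(name)
        if not raw:
            continue
        try:
            usage[name] = json.loads(raw)
        except json.JSONDecodeError:
            usage[name] = None
    return usage


# Made-up sample headers for illustration only
sample = {
    "x-app-usage": '{"call_count": 28, "total_time": 25, "total_cputime": 25}',
    "x-ad-account-usage": "not json",
}
print(parse_rate_limit_headers(sample))
```

The sample uses a plain dict, which is case-sensitive. The `headers` mapping on a `requests` response is case-insensitive, so the lowercase names would match there regardless of how the server capitalizes them.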
