Connect your Wordfeud to CDF
Before setting up this extractor, complete the following steps:
- Wordfeud API Package: This script depends on the forked Wordfeud API from https://github.com/mallpunk/Python-Wordfeud-API. Install the package:
pip install git+https://github.com/mallpunk/Python-Wordfeud-API.git
Or, if you have a local development version:
pip install -e ../Python-Wordfeud-API
- Wordfeud Account: You need a Wordfeud account and its login credentials.
- Register at https://wordfeud.com if you don't have an account
- Note down your email and password for the Wordfeud account
- Clone this repository and navigate to the wordfeud-cdf directory.
- Install the required dependencies:
pip install -r requirements.txt
- Create a credentials.py file with your Wordfeud login details:
cp credentials_template.py credentials.py
# Edit credentials.py with your Wordfeud email, password, and username
- In the Fusion UI, create a new data set and write down the data set ID.
- In the Fusion UI, create a new security category and write down its ID.
- Create an identity provider application for the extractor:
For Azure AD:
- Create a new Active Directory application in your Azure Active Directory
- Write down the Client ID of the app, then create a Client Secret for it and write that down as well.
- We will refer to these as the EXTRACTOR_CLIENT_ID and EXTRACTOR_CLIENT_SECRET.
- In addition, you should note down the Tenant ID of the Azure Active Directory itself.
- We will refer to that as CDF_AD_TENANT_ID.
For other IDPs:
- Create an application in your identity provider
- Write down the Client ID of the app, then create a Client Secret for it and write that down as well.
- We will refer to these as the EXTRACTOR_CLIENT_ID and EXTRACTOR_CLIENT_SECRET.
- Note down the token URL for your IDP (e.g., https://your-idp.com/oauth2/token).
- We will refer to that as CDF_TOKEN_URL.
- Create a group in your identity provider and add the new application to it.
- Write down the Object ID of this group.
- Create a group in CDF with the following capabilities:
- timeseries:read and timeseries:write, scoped to the data set you created in the Fusion UI.
- extractionpipelines:write, scoped to the same data set.
- extractionruns:write, scoped to the same data set.
- datasets:read, scoped to the same data set.
- Set the Source ID of the group to the Object ID of the identity provider group you created above.
- Initialize the extractor by running:
For Azure AD:
python3 handler.py -c EXTRACTOR_CLIENT_ID -k EXTRACTOR_CLIENT_SECRET -t CDF_AD_TENANT_ID -p CDF_PROJECT -b CDF_CLUSTER_BASE_URL -i True -d DATA_SET_ID -a SECURITY_CATEGORY_ID --board_type BoardNormal --rule_set RuleSetNorwegian
For other IDPs:
python3 handler.py -c EXTRACTOR_CLIENT_ID -k EXTRACTOR_CLIENT_SECRET --token_url CDF_TOKEN_URL -p CDF_PROJECT -b CDF_CLUSTER_BASE_URL -i True -d DATA_SET_ID -a SECURITY_CATEGORY_ID --board_type BoardNormal --rule_set RuleSetNorwegian
Note:
- Credentials are loaded from the credentials.py file.
- The extraction pipeline ID is generated automatically as extractors/wordfeud-{username} from your credentials.
- You can override the extraction pipeline with --extraction_pipeline custom-pipeline-id if needed.
Cleanup Option: To delete existing time series before creating new ones, add the --cleanup flag:
python3 handler.py -c EXTRACTOR_CLIENT_ID -k EXTRACTOR_CLIENT_SECRET --token_url CDF_TOKEN_URL -p CDF_PROJECT -b CDF_CLUSTER_BASE_URL -i True -d DATA_SET_ID -a SECURITY_CATEGORY_ID --board_type BoardNormal --rule_set RuleSetNorwegian --cleanup
Board Configuration Options:
- Board Types: BoardNormal (default), BoardRandom
- Rule Sets: RuleSetNorwegian (default), RuleSetAmerican, RuleSetDanish, RuleSetDutch, RuleSetEnglish, RuleSetFrench, RuleSetSpanish, RuleSetSwedish
Available Rule Sets:
- RuleSetAmerican - American English rules
- RuleSetDanish - Danish rules
- RuleSetDutch - Dutch rules
- RuleSetEnglish - English rules
- RuleSetFrench - French rules
- RuleSetNorwegian - Norwegian rules (default)
- RuleSetSpanish - Spanish rules
- RuleSetSwedish - Swedish rules
Board Types:
- BoardNormal - Standard board layout (default)
- BoardRandom - Randomized board layout
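The board and rule set options above can be checked before they are passed on. The following is a minimal sketch, assuming a hypothetical helper named validate_board_config; it is not taken from handler.py, and only the option values themselves come from the lists above.

```python
# Option values copied from the lists above.
BOARD_TYPES = ("BoardNormal", "BoardRandom")
RULE_SETS = (
    "RuleSetAmerican", "RuleSetDanish", "RuleSetDutch", "RuleSetEnglish",
    "RuleSetFrench", "RuleSetNorwegian", "RuleSetSpanish", "RuleSetSwedish",
)


def validate_board_config(board_type="BoardNormal", rule_set="RuleSetNorwegian"):
    """Hypothetical helper: return the pair unchanged, or raise ValueError."""
    if board_type not in BOARD_TYPES:
        raise ValueError(f"Unknown board type {board_type!r}; expected one of {BOARD_TYPES}")
    if rule_set not in RULE_SETS:
        raise ValueError(f"Unknown rule set {rule_set!r}; expected one of {RULE_SETS}")
    return board_type, rule_set


if __name__ == "__main__":
    # With no arguments, the documented defaults are returned.
    print(validate_board_config())
```

Failing early on a misspelled value is usually easier to debug than an error returned later by the Wordfeud API.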
- Create a zip file for CDF function deployment:
# The zip file includes the Wordfeud API code directly
# This is handled automatically by the project setup
# The handler.zip file is ready for deployment
- Create a new CDF function in the Functions section of the Fusion UI. Add the following secrets:
- wordfeud-email with value WORDFEUD_EMAIL from step 2.
- wordfeud-pass with value WORDFEUD_PASSWORD from step 2.
- wordfeud-user with value USERNAME from step 2.
- board-type with value BoardNormal or BoardRandom (optional, defaults to BoardNormal)
- rule-set with value from the rule set options above (optional, defaults to RuleSetNorwegian)
- Set up a schedule for the function.
- Set the cron expression to run every 20 minutes: */20 * * * *
- Set Client ID and Client Secret to the EXTRACTOR_CLIENT_ID and EXTRACTOR_CLIENT_SECRET of the identity provider application you created earlier.
- Data configuration is optional: the function auto-generates the extraction pipeline name from the wordfeud-user secret.
- If you want to specify a custom extraction pipeline, set data to:
{ "extraction-pipeline": "extractors/wordfeud-custom" }
Note: The function automatically uses extractors/wordfeud-{username} based on your wordfeud-user secret.
If you need to backfill data, you may call the Cognite function for the extractor with the data parameter set to:
{
"start-time": UNIX_TIMESTAMP_IN_MS
}
Note: The extraction pipeline is automatically generated from your wordfeud-user secret. If you need a custom pipeline, you can specify it explicitly:
{
"extraction-pipeline": "extractors/wordfeud-custom",
"start-time": UNIX_TIMESTAMP_IN_MS
}
The extractor supports the following command line arguments:
- -c, --client_id: CDF OIDC client ID
- -k, --key: CDF OIDC client secret
- -p, --project: CDF project name
- -t, --tenant_id: Azure AD tenant ID (for Azure AD)
- --token_url: Token URL (for other IDPs)
- -b, --base_url: CDF cluster base URL (default: https://api.cognitedata.com)
- -i, --init: Initialize time series and extraction pipeline (default: False)
- --cleanup: Delete existing time series before creating new ones (requires --init)
- -d, --dataset: Dataset ID from CDF
- -a, --admin_security_category: Security category ID
- --extraction_pipeline: External ID for extraction pipeline (auto-generated from username if not specified)
- --board_type: Wordfeud board type, BoardNormal or BoardRandom (default: BoardNormal)
- --rule_set: Wordfeud rule set/language (default: RuleSetNorwegian)
- -s, --start_time: Begin timestamp in milliseconds (default: one week ago)
- -e, --end_time: End timestamp in milliseconds (default: now)
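The -s and -e arguments, like the start-time value used for backfilling, are Unix timestamps in milliseconds. The following is a minimal sketch of how such values could be computed. The names default_window and backfill_data are hypothetical helpers for illustration, not functions from handler.py.

```python
import json
import time

WEEK_MS = 7 * 24 * 60 * 60 * 1000  # one week in milliseconds


def default_window(now_ms=None):
    """Return (start, end) in ms: one week ago until now, mirroring the -s/-e defaults."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - WEEK_MS, now_ms


def backfill_data(start_time_ms, extraction_pipeline=None):
    """Build the data payload for a backfill call (hypothetical helper)."""
    data = {"start-time": int(start_time_ms)}
    if extraction_pipeline:
        # Only needed when overriding the auto-generated pipeline ID.
        data["extraction-pipeline"] = extraction_pipeline
    return data


if __name__ == "__main__":
    start_ms, end_ms = default_window()
    print(json.dumps(backfill_data(start_ms), indent=2))
```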
The extractor will create the following time series in CDF (where USERNAME is your configured username):
- WORDFEUD/USERNAME/rating - Wordfeud Rating - USERNAME (your current rating over time)
- WORDFEUD/USERNAME/games_played - Wordfeud Games Played - USERNAME (number of games played)
- WORDFEUD/USERNAME/games_won - Wordfeud Games Won - USERNAME (number of games won)
- WORDFEUD/USERNAME/win_rate - Wordfeud Win Rate - USERNAME (win rate percentage)
- WORDFEUD/USERNAME/current_streak - Wordfeud Current Streak - USERNAME (current winning/losing streak)
- WORDFEUD/USERNAME/best_rating - Wordfeud Best Rating - USERNAME (best rating achieved)
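The naming scheme for these time series can be sketched as follows. The external ID and name patterns are taken from the list above; the function name time_series_specs and the dictionary layout are assumptions made for illustration and may differ from what handler.py does.

```python
# Suffix -> display label, as listed above.
SERIES = {
    "rating": "Wordfeud Rating",
    "games_played": "Wordfeud Games Played",
    "games_won": "Wordfeud Games Won",
    "win_rate": "Wordfeud Win Rate",
    "current_streak": "Wordfeud Current Streak",
    "best_rating": "Wordfeud Best Rating",
}


def time_series_specs(username):
    """Hypothetical helper: describe the six time series for one user."""
    return [
        {
            "external_id": f"WORDFEUD/{username}/{key}",
            "name": f"{label} - {username}",
            "is_step": True,  # all series are configured as step charts
        }
        for key, label in SERIES.items()
    ]


if __name__ == "__main__":
    for spec in time_series_specs("epistel"):
        print(spec["external_id"], "|", spec["name"])
```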
Features:
- Step Charts: All time series are configured as step charts (is_step=True) for better visualization of discrete changes
- Game-Based Datapoints: Datapoints are created only when games are completed, not on every extractor run
- Accurate Timestamps: Uses game finish timestamps instead of extractor run timestamps
- Metadata: Each datapoint includes metadata about the game (game ID, result, opponent, rating change)
- Rating Tracking: Uses the Wordfeud API's per-game rating information for accurate rating changes
How It Works:
- Extractor runs every 20 minutes but only creates datapoints when games are completed
- Checks for new completed games since the last datapoint was created
- Uses Wordfeud API's rating data to get the exact rating after each game
- Creates datapoints with game metadata including rating change, opponent, and game result
- Uses step chart visualization to show discrete changes rather than continuous lines
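The selection logic described above can be sketched roughly like this. The game dictionary keys (is_finished, finished_ms, rating_after) are hypothetical placeholders; the real field names returned by the Wordfeud API are not shown in this document and may differ.

```python
def new_rating_datapoints(games, last_datapoint_ms):
    """Return (timestamp_ms, rating) pairs for games finished after the last datapoint.

    Assumes each game is a dict with hypothetical keys:
    is_finished (bool), finished_ms (int or None), rating_after (float or None).
    """
    finished = [
        g for g in games
        if g["is_finished"] and g["finished_ms"] > last_datapoint_ms
    ]
    # Oldest first, so datapoints are written in chronological order.
    finished.sort(key=lambda g: g["finished_ms"])
    # The game's finish time is used as the timestamp, not the extractor run time.
    return [(g["finished_ms"], g["rating_after"]) for g in finished]


if __name__ == "__main__":
    sample = [
        {"is_finished": True, "finished_ms": 2000, "rating_after": 1510.0},
        {"is_finished": False, "finished_ms": None, "rating_after": None},
        {"is_finished": True, "finished_ms": 1000, "rating_after": 1500.0},
    ]
    print(new_rating_datapoints(sample, last_datapoint_ms=1000))
```

Because only games finished after the last datapoint are selected, a run with no newly completed games produces an empty list and writes nothing.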
Datapoint Metadata: Each datapoint includes metadata with the following information:
- game_id: The Wordfeud game ID
- result: Game result (won/lost)
- opponent: Opponent's username
- rating_delta: Rating change for this game
- ruleset: Game rule set used
- board: Board type used
Note: Wordfeud credentials are stored securely in the CDF function secrets, not in time series data.
The extractor includes a cleanup feature to delete existing time series before creating new ones:
- Lists existing time series for the specified username
- Shows time series details (name and external ID)
- Requests user confirmation before deletion
- Deletes only the time series that will be recreated
- Handles cases where no time series exist
Add the --cleanup flag to the initialization command:
python3 handler.py -c CLIENT_ID -k CLIENT_SECRET --token_url TOKEN_URL -p PROJECT -b BASE_URL -i True -d DATASET_ID --cleanup
Example output:
🧹 Cleanup mode: Checking for existing time series for user 'epistel'...
Found 6 existing time series for user 'epistel':
- Wordfeud Rating - epistel (WORDFEUD/epistel/rating)
- Wordfeud Games Played - epistel (WORDFEUD/epistel/games_played)
- Wordfeud Games Won - epistel (WORDFEUD/epistel/games_won)
- Wordfeud Win Rate - epistel (WORDFEUD/epistel/win_rate)
- Wordfeud Current Streak - epistel (WORDFEUD/epistel/current_streak)
- Wordfeud Best Rating - epistel (WORDFEUD/epistel/best_rating)
❓ Do you want to delete these time series? (yes/no): yes
✅ Successfully deleted existing time series
✅ Cleanup completed successfully
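The cleanup flow shown above (list, confirm, delete only what will be recreated) could be structured along these lines. This is a sketch under assumptions: select_for_cleanup and confirm_deletion are hypothetical names, and the actual deletion call against CDF is left out.

```python
SERIES_KEYS = (
    "rating", "games_played", "games_won",
    "win_rate", "current_streak", "best_rating",
)


def select_for_cleanup(existing_external_ids, username):
    """Keep only the external IDs that the extractor would recreate for this user."""
    expected = {f"WORDFEUD/{username}/{key}" for key in SERIES_KEYS}
    return [ext_id for ext_id in existing_external_ids if ext_id in expected]


def confirm_deletion(external_ids, ask=input):
    """Ask for confirmation; nothing to delete counts as 'no'."""
    if not external_ids:
        return False
    answer = ask("Do you want to delete these time series? (yes/no): ")
    return answer.strip().lower() == "yes"


if __name__ == "__main__":
    found = select_for_cleanup(
        ["WORDFEUD/epistel/rating", "OTHER/series"], "epistel"
    )
    # Pass a canned answer instead of reading from stdin.
    print(found, confirm_deletion(found, ask=lambda prompt: "yes"))
```

Passing the prompt function as a parameter keeps the confirmation step testable without a terminal.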
The extractor handles credentials and board configuration differently depending on the execution environment:
- Local Testing/Initialization: Credentials are loaded from the credentials.py file
- CDF Function Execution: Credentials are loaded from function secrets:
- wordfeud-email, wordfeud-pass, and wordfeud-user for credentials
- board-type and rule-set for board configuration (optional)
This approach ensures credentials are never stored in CDF time series while still allowing local testing and initialization.
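For the CDF function case, reading the secrets might look like the sketch below. The secret names and defaults come from this document; the function name config_from_secrets and the returned dictionary layout are assumptions. The local case, which reads credentials.py, is not sketched because the attribute names in that file are not shown here.

```python
DEFAULT_BOARD_TYPE = "BoardNormal"
DEFAULT_RULE_SET = "RuleSetNorwegian"
REQUIRED_SECRETS = ("wordfeud-email", "wordfeud-pass", "wordfeud-user")


def config_from_secrets(secrets):
    """Hypothetical helper: build the extractor config from a dict of function secrets."""
    missing = [key for key in REQUIRED_SECRETS if not secrets.get(key)]
    if missing:
        raise KeyError(f"Missing function secrets: {', '.join(missing)}")
    return {
        "email": secrets["wordfeud-email"],
        "password": secrets["wordfeud-pass"],
        "username": secrets["wordfeud-user"],
        # Optional secrets fall back to the documented defaults.
        "board_type": secrets.get("board-type") or DEFAULT_BOARD_TYPE,
        "rule_set": secrets.get("rule-set") or DEFAULT_RULE_SET,
    }


if __name__ == "__main__":
    cfg = config_from_secrets({
        "wordfeud-email": "me@example.com",
        "wordfeud-pass": "not-a-real-password",
        "wordfeud-user": "epistel",
    })
    print(cfg["username"], cfg["board_type"], cfg["rule_set"])
```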
The extractor automatically handles extraction pipeline naming:
- Local Development: Auto-generates extractors/wordfeud-{username} from credentials.py
- CDF Function: Auto-generates extractors/wordfeud-{username} from the wordfeud-user secret
- Override Option: You can specify a custom extraction pipeline with --extraction_pipeline or in function data
This ensures consistent naming across environments while allowing customization when needed.
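The naming rule amounts to "use the override if given, otherwise derive the ID from the username". A minimal sketch, with resolve_extraction_pipeline as a hypothetical helper name:

```python
def resolve_extraction_pipeline(username, override=None):
    """Return the override if set, else the auto-generated pipeline external ID."""
    if override:
        return override
    return f"extractors/wordfeud-{username}"


if __name__ == "__main__":
    print(resolve_extraction_pipeline("epistel"))
    print(resolve_extraction_pipeline("epistel", "extractors/wordfeud-custom"))
```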
The handler.zip file includes all necessary dependencies:
- handler.py - Main extractor script
- requirements.txt - CDF SDK and other dependencies
- wordfeud_api/ - Embedded Wordfeud API code (not published to PyPI)
Note: credentials.py is NOT included in the deployment package. Credentials are loaded from CDF function secrets in production.
The extractor supports different identity provider (IDP) configurations:
For Azure AD:
- Use the --tenant_id parameter with your Azure AD tenant ID
- The token URL is automatically constructed as https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token
For other IDPs:
- Use the --token_url parameter with your IDP's token endpoint
- Example: --token_url https://your-idp.com/oauth2/token
- Either --tenant_id or --token_url must be provided
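The choice between the two parameters can be sketched as below. The Azure AD URL pattern is the one given above; the function name resolve_token_url is hypothetical, and letting an explicit token URL take precedence when both values are supplied is an assumption, since this document does not say how handler.py handles that case.

```python
AZURE_TOKEN_URL = "https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"


def resolve_token_url(tenant_id=None, token_url=None):
    """Pick the OAuth token endpoint from --token_url or --tenant_id."""
    if token_url:
        # Assumption: an explicit token URL wins if both are given.
        return token_url
    if tenant_id:
        return AZURE_TOKEN_URL.format(tenant_id=tenant_id)
    raise ValueError("Either --tenant_id or --token_url must be provided")


if __name__ == "__main__":
    print(resolve_token_url(tenant_id="my-tenant-id"))
    print(resolve_token_url(token_url="https://your-idp.com/oauth2/token"))
```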
- If you encounter authentication issues, check that your Wordfeud credentials are correct
- If the extractor fails to connect to CDF, verify your identity provider application configuration
- For Azure AD: Check that your tenant ID is correct and the application has proper permissions
- For other IDPs: Verify that your token URL is correct and the application is properly configured
- For rate limiting issues, the extractor includes built-in retry logic with exponential backoff
- If the Wordfeud API package is not found, ensure it's properly installed from the fork (it is not published to PyPI): pip install git+https://github.com/mallpunk/Python-Wordfeud-API.git
- If datapoints are not being created, check that games have been completed since the last run
- If rating values seem incorrect, the extractor now uses per-game rating data from the Wordfeud API
- For cleanup issues, ensure you have proper permissions to delete time series in CDF
- If extraction pipeline reporting fails, check that the pipeline exists and the function has proper permissions
- For auto-generated pipeline naming issues, verify that the wordfeud-user secret is correctly configured
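The troubleshooting list mentions retry logic with exponential backoff for rate limiting. As a generic illustration of that technique, and not the extractor's actual implementation, a retry wrapper might look like this. The function name, parameters, and default delays are all assumptions.

```python
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt (capped) and try again."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(min(max_delay, base_delay * 2 ** attempt))


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky():
        # Fails twice, then succeeds, to exercise the retry path.
        calls["n"] += 1
        if calls["n"] < 3:
            raise RuntimeError("rate limited")
        return "ok"

    print(retry_with_backoff(flaky, base_delay=0.01))
```

A production version would typically catch only the specific rate-limit or connection errors rather than every exception, and might add random jitter to the delay.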