This is a Python package that integrates Superset and CommCare HQ.
The 'User configurable reports UI' feature flag must be enabled for the domain in CommCare HQ, even if the data sources to be imported were created by Report Builder, not a UCR.
While doing development on top of this integration, it's useful to
install this via pip -e
option so that any changes made get reflected
directly without another pip install
.
- Set up a virtual environment.
- Clone this repo and change into the directory of this repo.
- Run
cp superset_config.example.py superset_config.py
and override the config appropriately. - Run
pip install -e '.[dev]'
-
Create an OAuth application on CommCare HQ using Django Admin at the URL
<hq_host>/admin/oauth2_provider/application/
with the following settings:-
Redirect URIs:
<superset_host>/oauth-authorized/commcare
-
Leave "Post logout redirect URIs" empty.
-
Client type: Confidential
-
Authorization grant type: Authorization code
-
Give your OAuth application a name that users would recognize, like "CommCare Analytics" or "HQ Superset". This name will appear in CommCare HQ's dialog box requesting authorization from the user.
-
Leave "Skip authorization" unchecked
-
Algorithm: No OIDC support
-
-
Update
OAUTH_PROVIDERS
setting insuperset_config.py
with OAuth client credentials obtained from HQ.
Read through the initialization instructions at https://superset.apache.org/docs/installation/installing-superset-from-scratch/#installing-and-initializing-superset.
Create a database for Superset, and a database for storing data from CommCare HQ. Adapt the username and database names to suit your environment.
$ createdb -h localhost -p 5432 -U postgres superset
$ createdb -h localhost -p 5432 -U postgres superset_hq_data
Set the following environment variables:
$ export FLASK_APP=superset
$ export SUPERSET_CONFIG_PATH=/path/to/superset_config.py
Set this environment variable to allow OAuth 2.0 authentication with CommCare HQ over insecure HTTP. (DO NOT USE THIS IN PRODUCTION.)
$ export AUTHLIB_INSECURE_TRANSPORT=1
Initialize the databases. Create an administrator. Create default roles and permissions:
$ superset db upgrade
$ superset db upgrade --directory hq_superset/migrations/
$ superset fab create-admin
$ superset load_examples # (Optional)
$ superset init
You may skip superset load_examples
, although they could be useful.
You should now be able to run superset using the superset run
command:
$ superset run -p 8088 --with-threads --reload --debugger
You can now log in as a CommCare HQ web user.
In order for CommCare HQ to sync data source changes, you will need to allow OAuth 2.0 authentication over insecure HTTP. (DO NOT USE THIS IN PRODUCTION.) Set this environment variable in your CommCare HQ Django server. (Yes, it's "OAUTHLIB" this time, not "AUTHLIB" as before.)
$ export OAUTHLIB_INSECURE_TRANSPORT=1
There might be situations where you need to log into Superset as a local
admin user, for example, to add a database connection. To enable local
user authentication, in superset_config.py
, set
AUTH_TYPE = AUTH_DB
.
Doing this will prevent CommCare HQ users from logging in, so it should only be done in production environments when CommCare Analytics is not in use.
To return to allowing CommCare HQ users to log in, set it back to
AUTH_TYPE = AUTH_OAUTH
.
Celery is used to import UCRs that are larger than
hq_superset.views.ASYNC_DATASOURCE_IMPORT_LIMIT_IN_BYTES
. If you need
to import UCRs larger than this, you need to run Celery to import them.
Here is how celery can be run locally.
- Install and run Redis
- Add Redis and Celery config sections from
superset_config.example.py
to your localsuperset_config.py
. - Run
celery --app=superset.tasks.celery_app:app worker --pool=prefork -O fair -c 4
in the Superset virtualenv.
Superset provides a way to update HTML templates by adding a file called
tail_js_custom_extra.html
.
This file can be used to insert HTML or script for all pages.
This isn't documented in superset but can be seen in the superset's
basic template.
Install requirements for running tests:
$ pip install -e '.[test]'
$ pytest
The test runner can only run tests that do not import from Superset. The code you want to test will need to be in a module whose dependencies don't include Superset.
In order to test your feature branch on staging you need to
- Check out to
master
branch (make sure it's up to date) - Add your feature branch to the
branches
section inscripts/staging.yml
file - Make a PR to
master
- Once merged, run
scripts/rebuildstaging
(this will rebuild the staging branch on your machine) - Push the new
staging
branch to remote - Deploy
staging
(see Deploying Staging below)
- SSH to
commcare-analytics-staging.dimagi.com
- Virtual environment should already be activated
- Uninstall current superset version
$ pip uninstall hq-superset
- Install new superset version
$ pip install git+https://github.com/dimagi/commcare-analytics.git@staging
- Restart superset & celery as needed
$ sudo supervisorctl restart superset
$ sudo supervisorctl restart commcare_analytics_celery_default
You will need to create an Alembic migration for any new SQLAlchemy models that you add. The Superset CLI should allow you to do this:
$ superset db revision --autogenerate -m "Add table for Foo model"
However, problems with this approach have occurred in the past. You might have more success by using Alembic directly. You will need to modify the configuration a little to do this:
-
Copy the "HQ_DATA" database URI from
superset_config.py
. -
Paste it as the value of
sqlalchemy.url
inhq_superset/migrations/alembic.ini
. -
Edit
env.py
and comment out the following lines:hq_data_uri = current_app.config['SQLALCHEMY_BINDS'][HQ_DATA] decoded_uri = urllib.parse.unquote(hq_data_uri) config.set_main_option('sqlalchemy.url', decoded_uri)
Those changes will allow Alembic to connect to the "HD Data" database without the need to instantiate Superset's Flask app. You can now autogenerate your new table with:
$ cd hq_superset/migrations/
$ alembic revision --autogenerate -m "Add table for Foo model"
dimagi-superset
is a requirement of this commcare-analytics
package.
It is a fork of apache-superset
, and adds important features to it,
necessary for commcare-analytics
. For more information about how to
upgrade dimagi-superset
, see
Dimagi Superset Fork.