The learning analytics service ("Lanalytics") of the openHPI platform.
This service was developed to enhance the functionality of the core application and is intended to be operated alongside it. It processes tracked learner interaction events, enables reports, and provides course-specific statistics.
Copyright © Hasso Plattner Institute for Digital Engineering gGmbH
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see https://www.gnu.org/licenses/.
- PostgreSQL and RabbitMQ
- Elasticsearch 7
- MinIO, for storing reports.
- Clone this repository.
- Change into the directory.
- Install dependencies:
bundle install
- Prepare the database, Elasticsearch, and MinIO:
bundle exec rake db:drop db:prepare
bundle exec rake elastic:setup
bundle exec rake s3:setup # Ensure MinIO is running
- If MinIO uses the default port and credentials, no further setup is needed.
- For custom MinIO configurations, override the defaults in config/xikolo.development.yml. (Refer to app/xikolo.yml for guidance.)
- Start the Rails server:
bundle exec rails s -p 5900
To get a clean state during development, run:
bundle exec rake db:reset
bundle exec rake sidekiq:clear
bundle exec rake msgr:drain
Most of the code for processing events can be found in lib/lanalytics/processing/.
The starting point for the data processing is the Rails initializer config/initializers/01_lanalytics_processing_pipelines.rb.
This initializer sets up the data sources and processing pipelines.
- The data sources are defined in config/datasources/*.yml.
- The pipelines are defined in lib/lanalytics/processing/pipelines/*.prb.
Pipelines and data sources can be activated in config/lanalytics_pipeline_flipper.yml.
Each pipeline consists of extractor, transformer, and loader steps (ETL process), where each is responsible for a certain processing task (e.g., anonymization and data type processing).
The implementation of the different classes can be found in lib/lanalytics/processing/{extractor,transformer,loader}/*.rb.
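As a schematic illustration of how such a pipeline fits together (all class and method names below are hypothetical; the real step implementations live in the directories above):

require 'json'

# Hypothetical ETL steps; the real ones live in
# lib/lanalytics/processing/{extractor,transformer,loader}/.
class JsonExtractor
  def extract(raw_message)
    JSON.parse(raw_message, symbolize_names: true)
  end
end

class AnonymizerTransformer
  # Strip personal data before the event is persisted.
  def transform(event)
    event.reject { |key, _| key == :user_ip }
  end
end

class StdoutLoader
  def load(event)
    puts "Persisting event: #{event.inspect}"
  end
end

# A pipeline runs the extractor, then all transformers, then all loaders.
class Pipeline
  def initialize(extractor, transformers, loaders)
    @extractor = extractor
    @transformers = transformers
    @loaders = loaders
  end

  def process(raw_message)
    event = @extractor.extract(raw_message)
    event = @transformers.reduce(event) { |ev, step| step.transform(ev) }
    @loaders.each { |loader| loader.load(event) }
  end
end

pipeline = Pipeline.new(JsonExtractor.new, [AnonymizerTransformer.new], [StdoutLoader.new])
pipeline.process('{"verb": "visited_item", "user_ip": "203.0.113.7"}')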
- Implement the new pipeline in lib/lanalytics/processing/pipelines/{new_pipelines}.prb.
- Define all the desired steps as in lib/lanalytics/processing/pipelines/exp_events_pipeline.prb, or implement new ETL steps.
- Enable the new pipeline in config/lanalytics_pipeline_flipper.yml.
- If you consume new messages, register them in config/msgr.rb (see the sketch below).
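Routes in config/msgr.rb follow the msgr DSL, which maps an AMQP routing key to a consumer action. A minimal sketch (the routing key and consumer name here are hypothetical; check the existing routes in that file for real examples):

# config/msgr.rb
# Maps an AMQP routing key to a consumer action, i.e. ExpEventsConsumer#create.
route 'xikolo.web.exp_event.create', to: 'exp_events#create'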
Familiarize yourself with how mapping in Elasticsearch works:
Mapping is the process of defining how a document, and the fields it contains, are stored and indexed.
Our Elasticsearch mapping file and additional settings can be found in config/elasticsearch/exp_events.json.
It needs to be updated every time new fields are added.
In production, the strict setting is used for the mapping: if new fields are detected, an exception is thrown and the document is rejected. New fields must be explicitly added to the mapping.
To do this, update the mapping file in this repo and make the same changes to the Elasticsearch index template mapping while increasing the version number there.
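For illustration, this is how a new field could be added to a live index via Elasticsearch's put mapping API; a minimal sketch using the elasticsearch-ruby client (the index and field names are hypothetical):

require 'elasticsearch'

client = Elasticsearch::Client.new(url: 'http://localhost:9200')
# Adds a new keyword field to the existing mapping. Data types of
# fields that already exist cannot be changed this way; only new
# fields can be added.
client.indices.put_mapping(
  index: 'exp_events', # hypothetical index name
  body: { properties: { client_type: { type: 'keyword' } } }
)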
To update your local setup, run bundle exec rake elastic:reset. Be careful: this deletes the index first, and with it all the data.
Note: Using dynamic mapping would require a complete re-indexing if a field was added automatically and its data type needed to be changed afterwards.
Data types of already known fields of an index cannot be changed otherwise.
A simple update of the mapping is only possible if the field has not been added yet.
To avoid auto-updates of the mapping by new events, we recommend using the strict mapping in production.
The event schema used is close to xAPI: a user does verb for resource in context with result.
All events are stored redundantly in Elasticsearch and Postgres.
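As a rough illustration of that schema (the field values and nesting below are made up; the authoritative layout is the mapping in config/elasticsearch/exp_events.json):

# Hypothetical event document following the user/verb/resource/
# context/result schema; for illustration only.
event = {
  user: { resource_uuid: '7b4216f5-...' },
  verb: 'visited_item',
  resource: { resource_uuid: 'a1b2c3d4-...', type: 'item' },
  timestamp: '2024-01-01T12:00:00Z',
  context: { course_id: 'c0ffee00-...' },
  result: {}
}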
See app/assets/legacy/lanalytics/common.js#track in the core application for details.
Usage:
import track from './common';
// ...
track('my_verb', resource.uuid, resource.type, context);
Internal and external links of the application can be tracked. The data is stored in Elasticsearch.
The Elasticsearch data schema can be found in config/elasticsearch/link_tracking_events.json.
If updated, make sure to update it for your Elasticsearch instance as well and increase the version number.
A good starting point for this is app/controllers/concerns/tracks_referrers.rb in the core application.
All available metrics are self-documented and can be retrieved from the index endpoint, i.e., http://0.0.0.0:5900/metrics.
The code for metrics is placed in lib/lanalytics/metric/.
A majority of the metrics use Elasticsearch. Get familiar with the Elasticsearch API and an HTTP client, either CLI- or GUI-based (e.g., Postman).
Query all available verbs (a.k.a. event types): POST 0.0.0.0:9200/_search
{
"size": 0,
"aggregations": {
"verbs": {
"terms": {
"size": 10000,
"field": "verb"
}
}
}
}
Get the 10 most recent events for a given verb: POST 0.0.0.0:9200/_search
{
"size": 10,
"sort": {
"timestamp": {
"order":"desc"
}
},
"query": {
"bool": {
"must": [
{ "match": { "verb": "visited_item" } }
]
}
}
}
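Inside the service, a metric class issues similar queries through the Ruby client. A schematic sketch (the class name and query are hypothetical; see lib/lanalytics/metric/ for the real implementations):

require 'elasticsearch'

# Hypothetical metric: counts visited_item events, mirroring the raw
# query above.
class VisitedItemCount
  def self.query
    client = Elasticsearch::Client.new(url: 'http://localhost:9200')
    response = client.search(
      body: {
        size: 0,
        query: { match: { verb: 'visited_item' } }
      }
    )
    { count: response['hits']['total']['value'] }
  end
end

puts VisitedItemCount.query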
The code for reports is placed in app/models/reports/.
The report UI is generated dynamically based on the exposed form_data.
Check the existing reports and their tests for examples.
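As a rough sketch of the idea (the class name and the form_data shape below are hypothetical; the real interface is defined by the existing reports and their tests):

# Hypothetical report in the style of app/models/reports/.
module Reports
  class CourseEventsReport
    # form_data drives the dynamically generated report UI.
    def self.form_data
      {
        type: 'course_events_report',
        name: 'Course Events Report',
        description: 'All learner interaction events of a course.',
        options: [
          { name: 'course_id', type: 'select_course' }
        ]
      }
    end
  end
end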
To generate reports:
- MinIO must be running properly to store reports in an S3 bucket.
- Available reports must be configured to be displayed in the core application. The default configuration can be found in app/xikolo.yml (see reports.types).
- A user must have the lanalytics.report.admin role to access this page.