This repository contains code and datasets for a novel approach to analyzing SPARQL query logs, providing deeper insights into user interactions with knowledge graphs.

MaastrichtU-IDS/KG_usage_metadata

Knowledge graph usage metadata: Insights from SPARQL log analysis

Datasets

RDF Knowledge Graphs (KGs)

Both versions were hosted on Blazegraph on a local server to analyze how SPARQL schema coverage changes over time.

SPARQL Query Logs

The query logs were retrieved from multiple sources:

Calculating Schema Coverage and Usage Analysis

To calculate SPARQL Schema Coverage (SC), use the following steps:

  1. Extract all schema elements by running the code in the KG-Schema-extractors folder.
  2. Extract used schema elements from SPARQL query logs by running the code in the Schema-coverage-method folder.
  3. Compute SC (%) using the formula:
    \[ \mathrm{SC}\,(\%) = \left( \frac{\mathrm{USE}}{\mathrm{TSE}} \right) \times 100 \] where:
    • TSE (Total Schema Elements): All distinct types and predicates in the KG.
    • USE (Used Schema Elements): The subset of schema elements found in user SPARQL queries.
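
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code shipped in the repository folders: the function names `used_schema_elements` and `schema_coverage`, the `http://example.org/` IRIs, and the sample queries are all hypothetical. It assumes that schema elements appear in queries as full IRIs in angle brackets, so prefixed names such as `ex:Drug` would need to be expanded first.

```python
import re

# Matches full IRIs written as <...> in a SPARQL query (assumption: no prefixed names).
IRI_PATTERN = re.compile(r"<([^<>\s]+)>")


def used_schema_elements(queries, tse):
    """Return USE: the schema elements from TSE that occur in the given queries."""
    found = set()
    for query in queries:
        found.update(IRI_PATTERN.findall(query))
    # Keep only IRIs that are actual schema elements of the KG.
    return found & set(tse)


def schema_coverage(use, tse):
    """Return SC (%) = (|USE| / |TSE|) * 100."""
    tse = set(tse)
    if not tse:
        raise ValueError("TSE must contain at least one schema element")
    return 100.0 * len(set(use) & tse) / len(tse)


# Hypothetical schema: two types and two predicates.
tse = {
    "http://example.org/Drug",
    "http://example.org/Gene",
    "http://example.org/targets",
    "http://example.org/label",
}

# Hypothetical query log entries.
queries = [
    "SELECT ?d WHERE { ?d a <http://example.org/Drug> . }",
    "SELECT ?d ?g WHERE { ?d <http://example.org/targets> ?g . "
    "?d <http://example.org/unknown> ?x }",
]

use = used_schema_elements(queries, tse)
sc = schema_coverage(use, tse)
print(f"SC = {sc:.1f}%")  # SC = 50.0%
```

Here two of the four schema elements are used, so SC is 50%. The IRI `http://example.org/unknown` is ignored because it is not part of TSE, which keeps USE a subset of TSE as the definition requires.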

To perform the usage pattern analysis as proposed in the paper, run the code in the KG-Usage-analysis folder.

The generated usage metadata for Bio2RDF and Wikidata KGs can be found in the generated-usage-metadata folder.
