Mats Sjöberg edited this page Mar 18, 2016 · 44 revisions

DiMe server API

The Digital Me platform is centered around the user's events and information elements being uploaded, downloaded and processed. The DiMe data model is explained on a separate page.

The most important parts of the API are (click the titles to see the detailed API format):

  • Uploading an event. This is the main way that applications and loggers insert time-specific data into DiMe, e.g. "the user has looked at a document", "the user's pulse is so-and-so". Some events contain linked InformationElements, such as the document that was looked at.

  • Uploading an information element. In some cases there is no associated event, e.g. you simply want to import your documents into DiMe. Then you can upload the InformationElements directly.

  • Get a single event or single information element. This is how to access a single object uploaded to DiMe previously.

  • Get multiple events or multiple information elements by simple filtering. This is how to access e.g. events generated by a particular program, or events of a given type.

  • Add and remove tags on information elements or events.

  • Search information elements or events based on their text contents. Search the user's DiMe for InformationElement or Event objects that match a given text.

  • Fetch an answer. Retrieve an answer (feature) calculated from the uploaded data, such as the number of document views yesterday or the average pulse between 9 and 10 this morning.

API call format

All API calls are HTTP GET or POST requests to a given DiMe server endpoint:

http://<SERVER_HOST>/api/<ENDPOINT_HERE>

Here <SERVER_HOST> is the hostname and port number where the DiMe server is running, typically localhost:8080 when developing on your local machine.

The payload (if any) is always a JSON formatted object.
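As a minimal sketch of this URL scheme, the helper below joins a server host and an endpoint path. The name `dime_url` is hypothetical (it is not part of DiMe), and the default host is simply the local development address mentioned above.

```python
def dime_url(endpoint, server_host='localhost:8080'):
    """Return the full URL for a DiMe API endpoint such as '/ping'."""
    # Accept endpoints with or without a leading slash.
    return 'http://{}/api/{}'.format(server_host, endpoint.lstrip('/'))


print(dime_url('/ping'))
print(dime_url('data/event', server_host='dime.example.org:8080'))
```

The Python examples on this page assume a `server_url` variable holding the part up to and including `/api`.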

Authentication

At the moment we use simple HTTP username and password. In the future we will likely use OAuth 2.0. Some general API calls may not need authentication. Any calls accessing or uploading data for a given person need authentication.
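The Python examples on this page pass `auth=(username, password)` to the requests library, which sends the credentials using HTTP Basic authentication. The sketch below shows what that header looks like when built by hand with the standard library. `basic_auth_header` is a hypothetical helper name, and the sketch assumes the server accepts standard Basic credentials.

```python
import base64


def basic_auth_header(username, password):
    """Build an HTTP Basic 'Authorization' header for the given credentials."""
    credentials = '{}:{}'.format(username, password).encode('utf-8')
    token = base64.b64encode(credentials).decode('ascii')
    return {'Authorization': 'Basic ' + token}


headers = basic_auth_header('testuser', 'secret')  # example credentials only
print(headers)
```

Basic credentials are only encoded, not encrypted, so over plain HTTP they are readable by anyone on the network path.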

API endpoints

Ping

A way to "ping" the dime-server to check that it is running and reachable over the network.

Endpoint: /ping

Method: GET or POST

Authentication: not needed

Returns: A dummy JSON object.

Python example (using the requests library):

import requests

server_url = 'http://localhost:8080/api'  # assumed local development server
r = requests.post(server_url + '/ping',
                  timeout=10)         # e.g. in case of network problems
ok = r.status_code == requests.codes.ok

Example of response:

{
  "message": "pong"
}
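A logger that starts before the server may want to wait until a ping succeeds. The sketch below is one possible way to do that. `wait_for_server` and its `ping` argument are hypothetical names; `ping` stands for any callable that returns True on a successful ping, for example a function wrapping the requests call above.

```python
import time


def wait_for_server(ping, attempts=5, delay=1.0):
    """Call ping() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        try:
            if ping():
                return True
        except OSError:
            # Connection-level failures count as "not up yet"
            # (the exceptions raised by requests derive from OSError).
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


# Stand-in for a real ping: fails twice, then succeeds.
results = iter([False, False, True])
print(wait_for_server(lambda: next(results), delay=0))
```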

Upload event

To upload an event from a logger to be stored in the DiMe server for the authenticated user.

This is the main way that applications and loggers insert time-specific data into DiMe, e.g. "the user has looked at a document", "the user's pulse is so-and-so". Some events contain linked InformationElements, such as the document that was looked at.

Endpoint: /data/event

Method: POST

Authentication: required

Returns: the uploaded JSON object as it was inserted (some fields, in particular the unique id, may be modified or added in the server)

The POST data is the JSON of the event being uploaded. It must have a '@type' property that specifies the data type of the event. The data type should be one of the event classes in DiMe, which are documented on the separate DiMe data page.

Here is a Python example for a SearchEvent, which is sent to the /api/data/event endpoint. A full working example can be found in the git repository.

import json
import requests

# Set all the standard event fields
payload = {
    '@type':    'SearchEvent',
    'actor':    'logger-example',
    'origin':   'my_machine.hiit.fi',
    'type':     'http://www.hiit.fi/ontologies/dime/#ExampleSearchEvent',
    'start':    '2015-05-08T14:03:42+0300',
    'duration': 0,
    'query':    'dummy search'
}

# server_url, server_username and server_password are set by the caller
r = requests.post(server_url + '/data/event',
                  data=json.dumps(payload),
                  headers={'content-type': 'application/json'},
                  auth=(server_username, server_password),
                  timeout=10)
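The `start` field in the example above is written by hand. A logger would normally generate it from the current time, and the sketch below shows one way to produce a time stamp in the same format. `make_search_event` is a hypothetical helper, and the field values are copied from the example above rather than from the data model documentation.

```python
import json
from datetime import datetime, timedelta, timezone


def make_search_event(query, start, actor='logger-example',
                      origin='my_machine.hiit.fi'):
    """Return a SearchEvent dict with 'start' formatted like the example."""
    if start.tzinfo is None:
        raise ValueError('start must be timezone-aware')
    return {
        '@type':    'SearchEvent',
        'actor':    actor,
        'origin':   origin,
        'type':     'http://www.hiit.fi/ontologies/dime/#ExampleSearchEvent',
        # %z renders the UTC offset as +HHMM, e.g. +0300
        'start':    start.strftime('%Y-%m-%dT%H:%M:%S%z'),
        'duration': 0,
        'query':    query,
    }


utc_plus_3 = timezone(timedelta(hours=3))
event = make_search_event('dummy search',
                          datetime(2015, 5, 8, 14, 3, 42, tzinfo=utc_plus_3))
print(json.dumps(event, indent=2))
```

Requiring a timezone-aware datetime avoids silently uploading local times without an offset.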

Upload many events at once

In some situations you may need to upload many events at once. For example, if your application generates several events per second, it doesn't make sense to establish a new HTTP connection for each one.

To upload several events at once from a logger, to be stored in the DiMe server for the authenticated user.

Endpoint: /data/events

Method: POST

Authentication: required

Returns: the uploaded JSON objects in an array

The POST data is a JSON array of event objects being uploaded. The objects themselves are given as JSON exactly as described above in the section about uploading single events.
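A logger that buffers events could split them into batches before uploading. The following is a minimal sketch of that idea; `batch_payloads` is a hypothetical helper, and the batch size is an arbitrary choice since no server-side limit is documented here.

```python
import json


def batch_payloads(events, batch_size=100):
    """Yield JSON array strings, each holding at most batch_size events."""
    if batch_size < 1:
        raise ValueError('batch_size must be at least 1')
    for i in range(0, len(events), batch_size):
        yield json.dumps(events[i:i + batch_size])


events = [{'@type': 'SearchEvent', 'query': 'search %d' % n} for n in range(5)]
for body in batch_payloads(events, batch_size=2):
    # Each body would be POSTed to /data/events like a single event upload.
    print(body)
```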

Upload information element

To directly upload an information element to be stored in the DiMe server for the authenticated user.

Endpoint: /data/informationelement

Method: POST

Authentication: required

Returns: the uploaded JSON object as it was inserted (some fields, in particular the unique id, may be modified or added in the server)

The POST data is the JSON of the information element being uploaded. It must have a '@type' property that specifies the data type of the element. The data type should be one of the information element classes in DiMe, which are documented on the separate DiMe data page.
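As a rough sketch, an upload payload might look like the following. The field names are taken from the search response example further down this page. The '@type' value 'Document' is an assumption and should be checked against the DiMe data page.

```python
import json

# Field names follow the search response example on this page.
# ASSUMPTION: 'Document' is used here as the '@type'; check the DiMe data
# page for the actual information element classes.
payload = {
    '@type':            'Document',
    'uri':              'file:///home/testuser/some_file.txt',
    'plainTextContent': 'Some text content\n',
    'mimeType':         'text/plain',
    'type': 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#TextDocument',
    'isStoredAs': 'http://www.semanticdesktop.org/ontologies/nfo#FileDataObject',
}

body = json.dumps(payload)  # POST this to /data/informationelement
print(body)
```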

Upload many information elements at once

To upload several information elements at once, to be stored in the DiMe server for the authenticated user.

Endpoint: /data/informationelements

Method: POST

Authentication: required

Returns: the uploaded JSON objects in an array

The POST data is a JSON array of information element objects being uploaded. The objects themselves are given as JSON exactly as described above in the section about uploading single elements.

Get a single event

Get a single event based on the DiMe id.

Endpoint: /data/event/{id}

Method: GET

Authentication: required

Returns: The Event object with the corresponding id.

Here is a Python example.

# event_id is the DiMe id of the event, e.g. taken from an upload response
r = requests.get(server_url + '/data/event/' + str(event_id),
                 headers={'content-type': 'application/json'},
                 auth=(server_username, server_password),
                 timeout=10)
data = r.json()

Get multiple events by filtering

Get multiple events based on simple filtering.

Endpoint: /data/events

Method: GET

Parameters:

Optional parameters for filtering:

  • appid: exact text matching for appId
  • elemid: the numeric id of the related information element
  • actor, origin, type, query: exact text matching, e.g. actor=Firefox
  • tag: exact tag matching (just one tag needs to match)
  • after: matches events occurring after this time stamp, the time stamp format is the same as for the start and end properties of the Data objects
  • before: matches events occurring before this time stamp (can be combined with after to get a time interval)
  • includePlainTextContent: set to 'true' if you wish to include the plainTextContent of the InformationElements linked to the Events (these are normally removed to reduce verbosity)

Authentication: required

Returns: A list of Event objects matching the filtering criteria.

An example query that finds all events generated by "Firefox" and tagged with the "reknow" tag:

GET /data/events?actor=Firefox&tag=reknow
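When building such queries in code, it is safer to let a library encode the parameters, because time stamps for after and before contain characters such as + and : that need to be percent-encoded in a URL. Below is a minimal sketch using only the standard library. `events_query` is a hypothetical helper; the parameter names are the ones listed above.

```python
from urllib.parse import urlencode, parse_qs


def events_query(**filters):
    """Return a '/data/events?...' path, skipping filters set to None."""
    params = {key: value for key, value in filters.items() if value is not None}
    query = urlencode(params)
    return '/data/events' + ('?' + query if query else '')


path = events_query(actor='Firefox', tag='reknow',
                    after='2015-05-08T14:03:42+0300', before=None)
print(path)
```

With the requests library, passing the same dictionary as the `params=` argument of `requests.get` has the same effect.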

Get information element

Get a single information element based on the DiMe id.

Endpoint: /data/informationelement/{id}

Method: GET

Parameters:

Optional parameters:

  • keywords: set to "tfidf", "tf", "idf" or "df" in order to include indexing terms (as tfidf, term frequency, inverse document frequency or document frequency)

Authentication: required

Returns: The InformationElement object with the corresponding id.

Get multiple information elements by filtering

Get multiple information elements based on simple filtering.

Endpoint: /data/informationelements

Method: GET

Parameters:

Optional parameters for filtering:

  • uri, plainTextContent, isStoredAs, type, mimeType, title: exact text matching
  • tag: exact tag matching (just one tag needs to match)

Authentication: required

Returns: A list of InformationElement objects matching the filtering criteria.

An example query that finds all information elements that have the tag "dime":

GET /data/informationelements?tag=dime

Add tag

Add a single tag (POSTed object) to the information element or event with the specified id.

Endpoint: /data/informationelement/{id}/addtag OR /data/event/{id}/addtag

Method: POST

Authentication: required

Returns: The InformationElement or Event object with the tag added to it.

The Tag object POSTed is defined by the Tag Java class and uploaded as a JSON object. Here is an example:

{
    "text": "mytag",
    "auto": true,
    "date": "2016-03-16T21:22:13.000Z"
}
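A Tag object like the one above could be built in Python as sketched below. `make_tag` is a hypothetical helper; the field names and the date format are copied from the example, and the exact meaning of `auto` is not documented on this page.

```python
import json
from datetime import datetime, timezone


def make_tag(text, auto=False, when=None):
    """Return a Tag dict with the date formatted like the example above."""
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # Millisecond precision in UTC, with a trailing 'Z'.
    date = when.strftime('%Y-%m-%dT%H:%M:%S') + '.%03dZ' % (when.microsecond // 1000)
    return {'text': text, 'auto': auto, 'date': date}


tag = make_tag('mytag', auto=True,
               when=datetime(2016, 3, 16, 21, 22, 13, tzinfo=timezone.utc))
print(json.dumps(tag))  # POST this to .../{id}/addtag
```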

Add tags

Add a list of tags (POSTed object) to the information element or event with the specified id.

Endpoint: /data/informationelement/{id}/addtags OR /data/event/{id}/addtags

Method: POST

Authentication: required

Returns: The InformationElement or Event object with the tags added to it.

The POSTed data is a JSON list of Tag objects (see example in the previous section).

Remove tag

Remove a tag (POSTed object) from the information element or event with the specified id.

Endpoint: /data/informationelement/{id}/removetag OR /data/event/{id}/removetag

Method: POST

Authentication: required

Returns: The InformationElement or Event object with the tag removed from it.

Information element search

Perform a text search on existing InformationElements (documents, posts, etc) in the DiMe. The search is performed using Lucene in the DiMe backend.

Endpoint: /search?query=<QUERY TEXT>

Method: GET

Parameters:

Optional parameters:

  • limit: limit the number of results
  • includeTerms: set to "true" in order to include indexing terms

Authentication: required

Returns: An object that contains some meta-data and in the "docs" element a list of InformationElement objects together with a score indicating the relevance of the object to the search query. The list is sorted by this score, descending.

Here is a quick Python example which performs a text search for the word "dime". A full working example can be found in the git repository.

r = requests.get(server_url + '/search?query=dime',
                 headers={'content-type': 'application/json'},
                 auth=(server_username, server_password),
                 timeout=10)

Example of returned JSON:

{
  "docs": [
    {
      "plainTextContent": "Some text content\n",
      "user": {
        "id": "5524d8ede4b06e42cc0e0aca",
        "role": "USER",
        "username": "testuser"
      },
      "uri": "file:///home/testuser/some_file.txt",
      "type": "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#TextDocument",
      "mimeType": "text/plain",
      "timeCreated": 1430142736819,
      "id": "d8b5b874e4bae5a6f6260e1042281e91c69d305e",
      "timeModified": 1430142736819,
      "score": 0.75627613,
      "isStoredAs": "http://www.semanticdesktop.org/ontologies/nfo#FileDataObject"
    },
    {
      "plainTextContent": "Some other text content",
      "user": {
        "id": "5524d8ede4b06e42cc0e0aca",
        "role": "USER",
        "username": "testuser"
      },
      "uri": "file:///home/testuser/another_file.txt",
      "type": "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#TextDocument",
      "mimeType": "text/plain",
      "timeCreated": 1430142737246,
      "id": "99db4832be27cff6b08a1f91afbf0401cad49d15",
      "timeModified": 1430142737246,
      "score": 0.75342464,
      "isStoredAs": "http://www.semanticdesktop.org/ontologies/nfo#FileDataObject"
    }
  ],
  "numFound": 2
}
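A client will usually only need a few fields from such a response. The sketch below pulls out the uri and score of each hit; `summarize_results` and `min_score` are hypothetical names, and in practice `response` would come from `r.json()` in the Python example above.

```python
def summarize_results(response, min_score=0.0):
    """Return (uri, score) pairs from a search response, best match first."""
    docs = response.get('docs', [])
    hits = [(doc['uri'], doc['score']) for doc in docs
            if doc['score'] >= min_score]
    # The server already sorts by score; sorting again keeps this robust.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Trimmed-down version of the example response above.
response = {
    'docs': [
        {'uri': 'file:///home/testuser/some_file.txt', 'score': 0.75627613},
        {'uri': 'file:///home/testuser/another_file.txt', 'score': 0.75342464},
    ],
    'numFound': 2,
}
for uri, score in summarize_results(response, min_score=0.755):
    print(uri, score)
```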

Event search

Perform a text search on existing Events in the DiMe. Most events do not contain any text content by themselves, and for these the search is actually performed against the InformationElement objects they are linked to. For example, if a document matches the search, all the events that refer to that document are returned. The search is performed using Lucene in the DiMe backend.

The endpoint returns a JSON list of Event objects with their linked InformationElement object included. Note: since the InformationElements may be repeated several times in the results and their text content can be very long, their plainTextContent field is removed. (It can be fetched separately by id.)

Endpoint: /eventsearch?query=<QUERY TEXT>

Method: GET

Parameters:

Optional parameters:

  • limit: limit the number of results
  • includeTerms: set to "tfidf", "tf", "idf" or "df" in order to include indexing terms with each result (as tfidf, term frequency, inverse document frequency or document frequency)

Authentication: required

Returns: The same format as for the InformationElement search.

Fetch an answer

Endpoint: /answer/<FEATURENAME>

Method: GET

Authentication: required

Returns: a calculated answer (i.e. feature), typically given some GET parameters. Answers could be calculated at given times ("cron"), at each upload, or when needed ("just-in-time"). Currently only the last of these is implemented.
