Hung-Han Chen edited this page May 19, 2016 · 31 revisions

Overview

DiMe is centered around loggers uploading events, which are represented by the Event class and its subclasses. For example, an event could be:

  • the user has looked at a PDF document
  • the user's pulse is so-and-so

Some events contain linked objects, such as the document that was looked at. These objects can appear in several events, e.g. when the user later closes the same document, and are represented by the InformationElement class and its subclasses.

The image below illustrates several events being uploaded over time. Some of them refer to information elements, and many events may naturally refer to the same information element, e.g. opening and closing the same document.

An event is uploaded to DiMe encoded as a JSON object using the uploading API. It is also possible to upload information elements directly, although that is considered a less common use case.

The DiMe data model is implemented as a set of Java classes in the source directory: src/main/java/fi/hiit/dime/data. You can also view the class hierarchy from the JavaDoc page.

If you need to add new data objects for your logger or application, please read the page on "Adding new data classes".

Common fields

In DiMe, both Event and InformationElement derive from a common superclass, DiMeData, so they share a set of fundamental common fields:

  • @type: the Java class, e.g. SearchEvent. This is a mandatory field.

  • id: an internal ID given by the DiMe server. The logger should not specify it unless you want to replace an existing item (of the same exact class). Typically, when you upload an item, DiMe responds with the same item with the id field filled in.

  • appId: an optional ID string that can be used by the application or logger. The value is entirely up to the application developer and can be any text string, as long as it is unique. If you later upload another object with the same appId (for the same user), it will replace the old one, as long as it is of the same exact class.
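
As a rough illustration of these shared fields, the sketch below builds the minimal dictionary that any uploaded object would carry. The helper names make_app_id and base_object are hypothetical, and hashing with SHA-1 is only one assumed way of obtaining a stable, unique appId; according to the description above, any unique string works.

```python
import hashlib
import json
from typing import Optional


def make_app_id(*parts: str) -> str:
    """Hypothetical helper: derive a stable appId from application-chosen parts."""
    # The unit separator keeps ("a", "bc") distinct from ("ab", "c").
    joined = "\x1f".join(parts)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def base_object(java_class: str, app_id: Optional[str] = None) -> dict:
    """Build the fields shared by all DiMeData objects."""
    obj = {"@type": java_class}  # mandatory
    if app_id is not None:
        obj["appId"] = app_id    # optional, chosen by the application
    # "id" is deliberately left out: the server fills it in on upload.
    return obj


doc = base_object("Document", make_app_id("my-logger", "http://www.reknow.fi/"))
print(json.dumps(doc, indent=2))
```

Deriving the appId from stable inputs means that uploading the same document twice would produce the same appId, so the later upload replaces the earlier one instead of creating a duplicate.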

Events

The actual event uploaded to DiMe needs to be one of the concrete subclasses of the Event class, which represent different types of events. To see the full list of currently implemented events check the JavaDoc page. New classes can be added as needed.

All event subclasses are documented on the JavaDoc page. The most important ones are described below, together with their most important fields. Almost all fields can be left empty if they do not make sense for the particular application.

Common fields

All events have these fields:

  • actor: the program that produced the event, e.g. "Firefox". This is a highly recommended field,

  • origin: typically the host name of the computer where the event was generated. This is a highly recommended field,

  • type: detailed type of event, using the Semantic Desktop ontology, see: http://www.semanticdesktop.org/ontologies/2010/01/25/nuao. This is a highly recommended field.

  • start: time stamp when the event was started. This is a highly recommended field. The time stamp is interpreted by the Jackson JSON parser which accepts for example the following formats:

    • yyyy-MM-dd'T'HH:mm:ss.SSSZ
    • yyyy-MM-dd'T'HH:mm:ss.SSS'Z'
    • EEE, dd MMM yyyy HH:mm:ss zzz
    • yyyy-MM-dd
    • Epoch timestamp, i.e. the number of milliseconds since January 1st, 1970, UTC. This is the output format of DiMe as well.
  • end: time stamp when the event ended - DiMe can fill this if duration was supplied.

  • duration: duration of event in seconds - DiMe can fill this if end time was supplied.

  • tags: list of free form tag names - can be used for application specific purposes
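
The time stamp formats and the relationship between start, end and duration can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the input is assumed to be a timezone-aware datetime, and end_from_duration merely shows the arithmetic, not how the DiMe server actually fills in the missing field.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def iso_millis(ts: datetime) -> str:
    """Format an aware datetime as yyyy-MM-dd'T'HH:mm:ss.SSSZ, in UTC."""
    ts = ts.astimezone(timezone.utc)
    return f"{ts:%Y-%m-%dT%H:%M:%S}.{ts.microsecond // 1000:03d}+0000"


def epoch_millis(ts: datetime) -> int:
    """Milliseconds since January 1st, 1970, UTC."""
    return (ts - EPOCH) // timedelta(milliseconds=1)


def end_from_duration(start: datetime, duration_seconds: float) -> datetime:
    """Illustrative only: end = start + duration (duration is in seconds)."""
    return start + timedelta(seconds=duration_seconds)


start = datetime(2015, 8, 11, 12, 56, 53, tzinfo=timezone.utc)
end = end_from_duration(start, 90)

print(iso_millis(start))    # 2015-08-11T12:56:53.000+0000
print(epoch_millis(start))  # 1439297813000
print(iso_millis(end))      # 2015-08-11T12:58:23.000+0000
```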

SearchEvent

A search event, i.e. the user doing a text query. In addition to the common fields, a search event has:

  • query: the text of the query.
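
A minimal SearchEvent might be assembled as in the sketch below. The actor, origin and query values are made up for illustration, and the type field is omitted because the appropriate ontology URI depends on the application.

```python
import json

search_event = {
    "@type": "SearchEvent",
    "actor": "example-search-ui",    # hypothetical program name
    "origin": "example-host.local",  # hypothetical host name
    "start": "2015-08-11T12:56:53.000+0000",
    "query": "knowledge work",       # the text the user searched for
}

payload = json.dumps(search_event)
print(payload)
```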

DesktopEvent

A desktop event, such as opening a document in the computer's graphical environment.

  • targettedResource: the InformationElement object that is targeted by this event.

FeedbackEvent

An event representing explicit feedback from the user, e.g. rating a document as relevant.

  • targettedResource: the InformationElement object that is targeted by this event.

  • relatedEvent: a related event, e.g. the SearchEvent that introduced the document that we are giving feedback on.

  • value: the feedback value, e.g. the relevance of the document.
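
The sketch below shows how the three fields could fit together. It rests on assumptions that the text above does not confirm: that value is numeric, and that the related event can be embedded as a nested object identified by its appId. The helper name feedback_event and all field values are hypothetical.

```python
import json


def feedback_event(document: dict, related_event: dict, value: float) -> dict:
    """Hypothetical helper: wrap a document and a related event in a FeedbackEvent."""
    return {
        "@type": "FeedbackEvent",
        "targettedResource": document,  # the InformationElement being rated
        "relatedEvent": related_event,  # assumed: embedded as a nested object
        "value": value,                 # assumed: a numeric relevance score
    }


document = {"@type": "Document", "uri": "http://www.reknow.fi/"}
search = {"@type": "SearchEvent", "appId": "search-0001", "query": "knowledge work"}

event = feedback_event(document, search, 1.0)
print(json.dumps(event, indent=2))
```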

BookmarkEvent

An event representing the user adding or removing a bookmark, e.g. bookmarking a document to read it later.

  • targettedResource: the InformationElement object that is targeted by this event.

  • relatedEvent: a related event, e.g. the SearchEvent that introduced the document that is being bookmarked.

  • add: whether the bookmark was added or removed.
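
As a sketch, adding and later removing a bookmark for the same document could be expressed as two events that differ in the add field. That add is a boolean, with true meaning "added", is an assumption here; the helper name bookmark_event and the appId value are hypothetical.

```python
def bookmark_event(document: dict, added: bool, start: str) -> dict:
    """Hypothetical helper: build a BookmarkEvent for the given document."""
    return {
        "@type": "BookmarkEvent",
        "start": start,
        "targettedResource": document,
        "add": added,  # assumed: True = bookmark added, False = removed
    }


# Reusing the same appId in both events refers to the same document.
document = {"@type": "Document", "appId": "doc-0001", "uri": "http://www.reknow.fi/"}

added = bookmark_event(document, True, "2015-08-11T12:56:53.000+0000")
removed = bookmark_event(document, False, "2015-08-12T09:00:00.000+0000")
print(added["add"], removed["add"])
```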

Information elements

Many events refer to information elements, such as the document that was opened. The actual information elements uploaded to DiMe (typically as part of the event upload) are represented by the subclasses of the InformationElement class.

All information element subclasses are documented on the JavaDoc page. The most important ones are described below, together with their most important fields. Almost all fields can be left empty if they do not make sense for the particular application.

Common fields

  • uri: URI of the information element, e.g. a path on the computer or a web URL. Either uri or plainTextContent is mandatory for information elements.

  • plainTextContent: plain text content of the information element. This is indexed for text search. Either uri or plainTextContent is mandatory for information elements.

  • isStoredAs: form of storage according to the Semantic Desktop ontology: http://www.semanticdesktop.org/ontologies/2007/03/22/nfo

  • type: detailed data type according to the Semantic Desktop ontology: http://www.semanticdesktop.org/ontologies/2007/03/22/nfo

  • tags: list of free form tag names - can be used for application specific purposes
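
A logger could check these constraints on the client side before uploading. The function below is a hypothetical sketch of such a check, not part of DiMe; it only encodes the rules stated above (a mandatory @type, at least one of uri and plainTextContent, and tags given as a list).

```python
from typing import List


def check_information_element(elem: dict) -> List[str]:
    """Return a list of problems found; an empty list means the element looks fine."""
    problems = []
    if "@type" not in elem:
        problems.append("missing @type")
    # At least one of the two fields must be present and non-empty.
    if not elem.get("uri") and not elem.get("plainTextContent"):
        problems.append("need uri or plainTextContent")
    if not isinstance(elem.get("tags", []), list):
        problems.append("tags must be a list")
    return problems


good = {"@type": "Document", "uri": "http://www.reknow.fi/"}
bad = {"@type": "Document", "tags": "news"}

print(check_information_element(good))
print(check_information_element(bad))
```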

Document

Example

Below is an example of a DesktopEvent with its corresponding Document. This example represents a user having accessed a web page, i.e. the document is the web page in this case, and the event specifies when the web page was accessed.

The example is given below in JSON, which is the data format used for uploading to and downloading from DiMe. The fields and their meanings are explained in the comments. The comments are there for explanation only, since standard JSON does not allow comments.

{
    // this corresponds to the Java event class
    "@type": "DesktopEvent",
    // the program that produced the event, here the web browser
    "actor": "Firefox",
    // a unique id; if left empty, DiMe will generate a random id
    "appId": "f9654c54d7f38acfe179b04de8c0554ea1d6481b",
    // typically the host name of the computer where the event was generated
    "origin": "hp8x-15.cs.helsinki.fi",
    // time stamp when the event was started
    "start": "2015-08-11T12:56:53Z",
    // type using the Semantic Desktop ontology
    "type": "http://www.semanticdesktop.org/ontologies/2010/01/25/nuao#UsageEvent",
    // the contained InformationElement
    "targettedResource": {
        // the Java class
        "@type": "Document",
        // a unique id; if left empty, DiMe will generate a random id
        "appId": "d74fa5afb9c04e148fc75a640348f8648c17812b",
        // form of storage according to the Semantic Desktop ontology
        "isStoredAs": "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#RemoteDataObject",
        // MIME type, optional
        "mimeType": "text/html",
        // the plain text, used for search
        "plainTextContent": "The revolution has begun...",
        // title of the document
        "title": "Revolution of Knowledge Work | Revolution of Knowledge Work",
        // type using the Semantic Desktop ontology
        "type": "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Document",
        // the URI of the web page or document
        "uri": "http://www.reknow.fi/"
    }
}

WebDocument

Below is an example of a DesktopEvent produced by the DiMe browser extension, together with its corresponding WebDocument. WebDocument is an extended version of Document that describes the information the extension extracts from web pages.

{
    // this corresponds to the Java event class
    "@type": "DesktopEvent",
    // the program that produced the event, here the browser extension
    "actor": "DiMe browser extension",
    // time stamp when the event was started
    "start": "2015-08-11T12:56:53Z",
    // type using the Semantic Desktop ontology
    "type": "http://www.semanticdesktop.org/ontologies/2010/01/25/nuao#UsageEvent",
    "targettedResource": {
        "@type": "WebDocument",
        // title of the web page
        "title": "Join, or Die - Wikipedia, the free encyclopedia",
        // form of storage according to the Semantic Desktop ontology
        "isStoredAs": "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#RemoteDataObject",
        // the plain text in the web page
        "plainTextContent": "From Wikipedia, the free encyclopedia....",
        // MIME type
        "mimeType": "text/html",
        // the URI of the web page or document
        "uri": "https://en.wikipedia.org/wiki/Join,_or_Die",
        // type using the Semantic Desktop ontology
        "type": "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#HtmlDocument",
        // a list of 8 tags (Tag objects) holding the most frequent terms on the page
        "tags": [{"@type": "Tag", "text": "wikipedia"}, ...],
        // a list of terms in the web page, ranked by frequency
        "frequentTerms": ["wikipedia", "hurrican", ...],
        // abstract/excerpt of the page
        "abstract": "",
        // a string of plain HTML with class/id/styles removed
        "HTML": "<p>From Wikipedia, the free encyclopedia</p> ...",
        // a list of images in the page
        "imgURLs": [{"url": "http://.../a.jpg", "text": "a pic"}, ...],
        // a list of hyperlinks in the page
        "hyperLinks": [{"url": "http://.../", "text": "a link"}, ...],
        // Open Graph protocol properties, see http://ogp.me/
        "OpenGraphProtocol": {
            "image": "https://www.facebook.com/images/fb_icon_325x325.png",
            "url": "https://www.facebook.com/",
            "site_name": "Facebook",
            "locale": "en_US"
        },
        // a list of HTML meta tags, see http://www.w3schools.com/tags/tag_meta.asp
        "MetaTags": [{"name": "description", "content": "Free Web tutorials"}]
    }
}
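
The frequentTerms and tags fields can be approximated with a simple frequency count, as sketched below. This is not the extension's actual algorithm, which is not described here: the sketch assumes plain lowercase tokenization with no stemming or stop-word removal, and the function names are hypothetical.

```python
import re
from collections import Counter
from typing import List


def frequent_terms(text: str) -> List[str]:
    """Return the distinct terms of the text, most frequent first."""
    words = re.findall(r"[a-z]+", text.lower())
    return [term for term, _ in Counter(words).most_common()]


def top_tags(text: str, limit: int = 8) -> List[dict]:
    """Wrap the `limit` most frequent terms in Tag objects."""
    return [{"@type": "Tag", "text": term} for term in frequent_terms(text)[:limit]]


text = "Join, or Die. Join or die: a cartoon. The cartoon shows a snake; join!"

print(frequent_terms(text))
print(top_tags(text, limit=2))
```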