Data
DiMe is centered around loggers uploading events, represented by the Event class and its subclasses. For example, an event could be:
- the user has looked at a PDF document
- the user's pulse is so-and-so
Some events contain linked objects, such as the document that was looked at. These objects, which can appear in several events (e.g. when the user later closes the same document), are represented by the InformationElement class and its subclasses.
The image below illustrates several events being uploaded over time. Some events refer to information elements, and many events may naturally refer to the same information element, e.g. opening and closing the same document.
An event is uploaded to DiMe encoded as a JSON object using the uploading API. It is also possible to upload information elements directly, although that is considered a less common use case.
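The relationship between events and information elements described above can be sketched in a few lines of Java. This is only an illustration: SketchEvent, SketchElement and SharedElementDemo are hypothetical stand-ins, not the real classes in fi.hiit.dime.data, and the file URI is made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for an information element (hypothetical, not the DiMe class). */
final class SketchElement {
    final String uri;

    SketchElement(String uri) {
        this.uri = uri;
    }
}

/** Simplified stand-in for an event that may link to one information element. */
final class SketchEvent {
    final String description;
    final SketchElement targettedResource; // null when the event has no linked object

    SketchEvent(String description, SketchElement targettedResource) {
        this.description = description;
        this.targettedResource = targettedResource;
    }
}

final class SharedElementDemo {
    /** Builds three events; the first two link to the same element. */
    static List<SketchEvent> sampleEvents() {
        SketchElement pdf = new SketchElement("file:///home/user/paper.pdf"); // made-up URI
        List<SketchEvent> events = new ArrayList<>();
        events.add(new SketchEvent("user opened the document", pdf));
        events.add(new SketchEvent("user closed the document", pdf));
        events.add(new SketchEvent("pulse measured", null));
        return events;
    }
}
```

The open and close events share one element object, while the pulse event carries no linked object at all.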
The DiMe data model is implemented as a set of Java classes in the source directory src/main/java/fi/hiit/dime/data. You can also view the class hierarchy from the JavaDoc page (requires a HIIT username that is a member of the reknow group).
The actual event uploaded to DiMe needs to be one of the concrete subclasses of the Event class, which represent different types of events. To see the full list of currently implemented events, check the JavaDoc page. New classes can be added as needed.
All event subclasses are documented in the JavaDoc page; the most important ones are described below together with their most important fields. Almost all fields can be left empty if they do not make sense for the particular application.
All events have these fields:
- @type: the Java class, e.g. SearchEvent. This is a mandatory field.
- actor: the program that produced the event, e.g. "Firefox". This is a highly recommended field.
- origin: typically the host name of the computer where the event was generated. This is a highly recommended field.
- type: detailed type of event, using the Semantic Desktop ontology (see http://www.semanticdesktop.org/ontologies/2010/01/25/nuao). This is a highly recommended field.
- start: time stamp when the event was started. This is a highly recommended field.
- end: time stamp when the event ended. DiMe can fill this in if the duration was supplied.
- duration: duration of the event in seconds. DiMe can fill this in if the end time was supplied.
- tags: list of free-form tag names, which can be used for application-specific purposes.
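The relationship between start, end and duration can be sketched as follows. This is a minimal illustration of how one missing value follows from the other two, not DiMe's actual implementation; the class EventTimes and its method complete are hypothetical names.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper (not part of DiMe): derives whichever of end/duration is missing. */
final class EventTimes {
    final Instant start;
    final Instant end;            // null if it could not be determined
    final Double durationSeconds; // null if it could not be determined

    private EventTimes(Instant start, Instant end, Double durationSeconds) {
        this.start = start;
        this.end = end;
        this.durationSeconds = durationSeconds;
    }

    /** Pass null for end or durationSeconds to have it computed from the other value. */
    static EventTimes complete(Instant start, Instant end, Double durationSeconds) {
        if (start != null) {
            if (end == null && durationSeconds != null) {
                // end = start + duration
                end = start.plusMillis(Math.round(durationSeconds * 1000.0));
            } else if (end != null && durationSeconds == null) {
                // duration = end - start, in seconds
                durationSeconds = Duration.between(start, end).toMillis() / 1000.0;
            }
        }
        return new EventTimes(start, end, durationSeconds);
    }
}
```

For example, `EventTimes.complete(Instant.parse("2015-08-11T12:56:53Z"), null, 90.0).end` is the instant 2015-08-11T12:58:23Z.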
A search event (SearchEvent) represents the user doing a text query. In addition to the common fields, a search event has:
- query: the text of the query.
A desktop event (DesktopEvent) represents an action such as opening a document in the computer's graphical environment. In addition to the common fields, it has:
- targettedResource: the InformationElement object that is targeted by this event.
An event representing explicit feedback from the user, e.g. ranking a document as relevant. It has these fields:
- targettedResource: the InformationElement object that is targeted by this event.
- relatedEvent: a related event, e.g. the SearchEvent that introduced the document that the feedback is about.
- value: the feedback value, e.g. the relevance of the document.
Many events refer to information elements, such as the document that was opened. The actual information elements uploaded to DiMe (typically as part of the event upload) are represented by the subclasses of the InformationElement class.
All information element subclasses are documented in the JavaDoc page; the most important ones are described below together with their most important fields. Almost all fields can be left empty if they do not make sense for the particular application.
- @type: the Java class, e.g. Document. This is a mandatory field.
- uri: URI of the information element, e.g. a path on the computer or a web URL. At least one of uri and plainTextContent must be given for an information element.
- plainTextContent: plain text content of the information element. This is indexed for text search. At least one of uri and plainTextContent must be given for an information element.
- isStoredAs: form of storage according to the Semantic Desktop ontology: http://www.semanticdesktop.org/ontologies/2007/03/22/nfo
- type: detailed data type according to the Semantic Desktop ontology: http://www.semanticdesktop.org/ontologies/2007/03/22/nfo
- tags: list of free-form tag names, which can be used for application-specific purposes.
A Document additionally has these fields:
- mimeType: MIME type of the document, see: https://en.wikipedia.org/wiki/MIME#Content-Type
- title: the title of the document.
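The two requirements stated for information elements (a mandatory @type, and at least one of uri and plainTextContent) can be checked on the client side before uploading. The sketch below assumes the element is held as a plain Map of field names to values; ElementCheck is a hypothetical helper, and DiMe's own server-side validation may behave differently.

```java
import java.util.Map;

/** Hypothetical pre-upload check; not part of DiMe. */
final class ElementCheck {
    /** Returns true if the field exists and is a non-blank string. */
    static boolean hasText(Map<String, Object> element, String field) {
        Object value = element.get(field);
        return value instanceof String && !((String) value).trim().isEmpty();
    }

    /** Requires @type plus at least one of uri and plainTextContent. */
    static boolean isUploadable(Map<String, Object> element) {
        return hasText(element, "@type")
                && (hasText(element, "uri") || hasText(element, "plainTextContent"));
    }
}
```

An element with only @type and title would be rejected by this check, while adding either a uri or some plainTextContent makes it pass.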
Below is an example of a DesktopEvent with its corresponding Document. This example represents a user having accessed a web page, i.e. the document is the web page in this case, and the event specifies when the web page was accessed.
The example is given below in JSON, which is the data format used for uploading to and downloading from DiMe. The fields and their meaning are explained in the comments. The comments are for illustration only, since standard JSON has no comment syntax.
{
  // this corresponds to the Java event class
  "@type": "DesktopEvent",
  // the program that produced the event, here the web browser
  "actor": "Firefox",
  // a unique id, if left empty DiMe will generate a random id
  "id": "f9654c54d7f38acfe179b04de8c0554ea1d6481b",
  // typically the host name of the computer where the event was generated
  "origin": "hp8x-15.cs.helsinki.fi",
  // time stamp when the event was started
  "start": "2015-08-11T12:56:53Z",
  // type using the Semantic Desktop ontology
  "type": "http://www.semanticdesktop.org/ontologies/2010/01/25/nuao#UsageEvent",
  // the contained InformationElement
  "targettedResource": {
    // the Java class
    "@type": "Document",
    // a unique id, if left empty DiMe will generate a random id
    "id": "d74fa5afb9c04e148fc75a640348f8648c17812b",
    // form of storage using the Semantic Desktop ontology
    "isStoredAs": "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#RemoteDataObject",
    // mime type, optional
    "mimeType": "text/html",
    // the plain text, used for search
    "plainTextContent": "The revolution has begun...",
    // title of the document
    "title": "Revolution of Knowledge Work | Revolution of Knowledge Work",
    // type using the Semantic Desktop ontology
    "type": "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Document",
    // the URI of the web page or document
    "uri": "http://www.reknow.fi/"
  }
}
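A logger written in Java has to produce this kind of nested JSON itself. The following is a minimal sketch using only the standard library, assuming that all field values are either strings or nested objects; EventJson is a hypothetical helper, and in practice a JSON library would normally be used instead.

```java
import java.util.Map;

/** Hypothetical minimal JSON writer for string-valued, possibly nested, maps. */
final class EventJson {
    /** Wraps a string in double quotes and escapes characters JSON does not allow raw. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // remaining control characters become numeric escapes
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /** Serializes the map in iteration order; nested maps become nested objects. */
    static String toJson(Map<String, Object> obj) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : obj.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append(quote(e.getKey())).append(':');
            Object value = e.getValue();
            if (value instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> nested = (Map<String, Object>) value;
                sb.append(toJson(nested));
            } else {
                sb.append(quote(String.valueOf(value)));
            }
        }
        return sb.append('}').toString();
    }
}
```

Using a LinkedHashMap for both the event and its targettedResource keeps the fields in insertion order, which makes the output easier to compare with the example above.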
If you need a new data class, it should be implemented as a subclass of the existing ones. It is probably easiest to take a look at the current class structure, and the [implementation of the existing classes][data-data].
A few things need to be taken into account, since DiMe uses an SQL database backend with Hibernate as the persistence framework. These are mentioned briefly below.
You need to add some annotations to certain types of data object members.
- Text strings which may contain arbitrarily long strings supplied by the user should be defined as "longtext". The default string definition may be limited to only 255 characters (depending a bit on the SQL database used). For example:
@Column(columnDefinition="longtext")
public String plainTextContent;
- Simple self-defined objects can be embedded if they don't require their own table in the database, i.e. they are intrinsically part of the main object and should not be accessible on their own. For this use the @Embedded annotation. The class itself needs to be tagged as @Embeddable. For example:
@Embedded
public Location location;
- Collections of simple (embeddable) objects need to be annotated with @ElementCollection. For example:
@ElementCollection(targetClass = String.class)
public Set<String> tags;
- Unidirectional many-to-one relationships: e.g. an event may have a link to a single information element, but many events may point to the same information element. The relationship is called unidirectional if the information is only stored in the "many" end. Use the @ManyToOne annotation. For example:
@ManyToOne(fetch = FetchType.EAGER, cascade=CascadeType.ALL)
@JoinColumn(name = "resource_id")
public InformationElement targettedResource;
- Unidirectional one-to-many relationships: use the @OneToMany annotation. E.g. a message may have many attachments:
@OneToMany(cascade=CascadeType.ALL)
@JoinColumn(name="message_id", referencedColumnName="id")
public List<InformationElement> attachments;