Labelling

The labelling process requires a segmentation to be selected as the default.

### Labelling Overview ###

As described in the publication of [Gehring, Tiago V., et al.](http://www.nature.com/articles/srep14562), the segmentation process generates a large number of segments to be classified, and labelling all of them manually becomes intractable. For this reason, a semi-supervised clustering algorithm is used to classify the segments; it requires only a small number of manually labelled segments as input in order to classify the rest.

### The Classification Panel ###

[Figure: labelling overview]

* **Browse Trajectories** opens a new window in which the user can visualize the segments in order to label them (for more information refer to Browse Trajectories).
* **Default Labels** lists all the labellings of this project. If a labelling is not shown, press the Refresh button. The default labelling is the one with which the user wants to proceed. Its name encodes the number of labels entered and the segmentation to which they apply: for example, `labels_1301_250_09-19OCT2016.mat` means that 1301 labels were given to the segments of the segmentation with a segment length of 250 cm and an overlap of 90%, while the part of the name after the `-` is a custom note.
* **Labelling Quality** runs the classification procedure using 10-fold cross-validation ten times and generates three graphs that indicate the labelling quality (for more information refer to Labelling Quality).
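The naming convention of the labelling files can be illustrated with a small parser. This is only a sketch in Python, not part of the program itself: the function name `parse_labelling_name` and the `LabellingName` type are hypothetical, and it assumes that the overlap field is the overlap fraction written without its decimal point (so `09` stands for 0.9, i.e. 90%), which is inferred from the single example above.

```python
import re
from typing import NamedTuple


class LabellingName(NamedTuple):
    """Fields encoded in a labelling file name (hypothetical helper type)."""
    n_labels: int
    segment_length_cm: int
    overlap: float
    note: str


# labels_<number of labels>_<segment length>_<overlap>-<custom note>.mat
_PATTERN = re.compile(r"^labels_(\d+)_(\d+)_(\d+)-(.*)\.mat$")


def parse_labelling_name(filename: str) -> LabellingName:
    """Split a labelling file name into its documented parts."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a labelling file name: {filename!r}")
    n_labels, seg_len, overlap_digits, note = match.groups()
    # Assumption: the overlap is a fraction with the decimal point dropped,
    # so "09" is read as "0.9".
    overlap = float(overlap_digits[0] + "." + overlap_digits[1:])
    return LabellingName(int(n_labels), int(seg_len), overlap, note)


info = parse_labelling_name("labels_1301_250_09-19OCT2016.mat")
print(info)
```

For the example name, the parser yields 1301 labels, a segment length of 250 cm, an overlap of 0.9 and the note `19OCT2016`.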

### Labelling Quality ###

In the publication of Gehring, Tiago V., et al., the algorithm used to classify the segments needed a pre-defined number of clusters. To find a number that would yield optimal results, the classification procedure was run several times with different numbers of clusters (8 times, for cluster counts from 30 to 120 in increments of 10). For each number of clusters, 10-fold cross-validation was used to generate three metrics for that specific classification:

* **Classification Error (%)**: The percentage of classification error.
* **Undefined Segments (%)**: The percentage of segments belonging to clusters that could not be mapped to a single class.
* **Trajectory Coverage (%)**: The percentage of the full swimming paths that are covered by at least one segment of a known class.
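The last two metrics can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the program's own code: the function names are hypothetical, a cluster that could not be mapped to a single class is represented by `None`, and coverage is interpreted as the fraction of a path's length that lies inside at least one classified segment, with each segment given as a `(start, end)` interval along the path.

```python
from typing import Dict, List, Optional, Tuple


def undefined_segments_pct(
    segment_clusters: List[int],
    cluster_class: Dict[int, Optional[str]],
) -> float:
    """Percentage of segments whose cluster has no single class (None)."""
    undefined = sum(1 for c in segment_clusters if cluster_class.get(c) is None)
    return 100.0 * undefined / len(segment_clusters)


def trajectory_coverage_pct(
    path_length: float,
    classified_segments: List[Tuple[float, float]],
) -> float:
    """Percentage of one path covered by at least one classified segment.

    Overlapping segments are merged so that no stretch is counted twice.
    Segment positions are assumed to be non-negative.
    """
    covered = 0.0
    current_end = 0.0  # right edge of the stretch already counted
    for start, end in sorted(classified_segments):
        start = max(start, current_end)
        end = min(end, path_length)
        if end > start:
            covered += end - start
            current_end = end
    return 100.0 * covered / path_length


# Toy values, made up purely for illustration.
print(undefined_segments_pct([0, 0, 1, 2], {0: "A", 1: None, 2: "B"}))
print(trajectory_coverage_pct(1000.0, [(0, 250), (25, 275), (500, 750)]))
```

Merging the overlapping intervals matters here because segments overlap by construction (90% in the example above), so simply summing segment lengths would overstate the coverage.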

[Figure: quality metrics]

Afterwards, the number of clusters was chosen as the one whose classification had the lowest error, the fewest undefined segments and the maximum coverage.
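The text does not state how the three criteria are weighed against each other, so the following Python sketch makes an explicit assumption: it adds the error, the undefined percentage and the uncovered percentage into a single score and picks the cluster count with the smallest score. The names `Metrics` and `pick_cluster_count` and all numeric values are hypothetical and serve only as an illustration.

```python
from typing import Dict, NamedTuple


class Metrics(NamedTuple):
    error_pct: float      # Classification Error (%)
    undefined_pct: float  # Undefined Segments (%)
    coverage_pct: float   # Trajectory Coverage (%)


def pick_cluster_count(results: Dict[int, Metrics]) -> int:
    """Return the cluster count with the best combined score.

    Assumption: the three metrics are weighted equally; lower error,
    fewer undefined segments and higher coverage all lower the score.
    """
    def score(n_clusters: int) -> float:
        m = results[n_clusters]
        return m.error_pct + m.undefined_pct + (100.0 - m.coverage_pct)

    return min(results, key=score)


# Made-up metrics for three cluster counts, for illustration only.
results = {
    30: Metrics(error_pct=12.0, undefined_pct=20.0, coverage_pct=70.0),
    40: Metrics(error_pct=8.0, undefined_pct=10.0, coverage_pct=85.0),
    50: Metrics(error_pct=9.0, undefined_pct=15.0, coverage_pct=80.0),
}
print(pick_cluster_count(results))
```

Other weightings are equally plausible; the point is only that the choice trades off the three metrics rather than optimising one of them alone.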

In this version of the program, this procedure is used only as an indication of the labelling quality: in case of high error and low coverage, the user may consider providing more labels.
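The repeated 10-fold cross-validation behind this quality check can be sketched as follows. This is a generic Python illustration of the technique, not the program's implementation: the function name `repeated_k_fold`, the `seed` parameter and the plain random shuffling are assumptions. Each repeat shuffles the labelled segments, splits them into ten folds, and uses every fold once as the held-out test set.

```python
import random
from typing import Iterator, List, Tuple


def repeated_k_fold(
    n_items: int,
    k: int = 10,
    repeats: int = 10,
    seed: int = 0,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for `repeats` rounds of k-fold CV.

    Assumes n_items >= k so that no fold is empty.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_items))
        rng.shuffle(indices)
        for fold in range(k):
            # Every k-th shuffled index, starting at `fold`, is held out.
            test = indices[fold::k]
            train = [i for pos, i in enumerate(indices) if pos % k != fold]
            yield train, test


# 25 labelled segments (a made-up count), 10 folds, 10 repeats.
splits = list(repeated_k_fold(25))
print(len(splits))
```

In each split, a classifier would be fitted on the `train` labels and its predictions compared with the held-out `test` labels to obtain the classification error.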
