README.md
+34-34
@@ -1,34 +1,41 @@
1
1
Multimedia Geotagging
2
2
======
3
3
4
-
Contains the implementation of algorithms that estimate the geographic location of multimedia items based on their textual content and metadata. It includes the <a href="http://ceur-ws.org/Vol-1263/mediaeval2014_submission_44.pdf">participation</a> in the <a href="http://www.multimediaeval.org/mediaeval2014/placing2014/">MediaEval Placing Task 2014</a>. The project's paper can be found <a href="http://link.springer.com/chapter/10.1007/978-3-319-18455-5_2">here</a>.
4
+
This repository contains the implementation of algorithms that estimate the geographic location of multimedia items based on their textual content. The approach is described <a href="http://ceur-ws.org/Vol-1436/Paper58.pdf">here</a> and <a href="http://link.springer.com/chapter/10.1007/978-3-319-18455-5_2">here</a>. It was submitted to the <a href="http://www.multimediaeval.org/mediaeval2016/placing/">MediaEval Placing Task 2016</a>.
5
5
6
6
7
7
8
8
<h2>Main Method</h2>
9
9
10
10
The approach is a refined language model that includes feature selection, weighting schemes and heuristic techniques that improve the accuracy at finer granularities. It is a text-based method in which a complex geographical-tag model is built from the tags, titles and locations of a massive number of geotagged images included in a training set, in order to estimate the location of each query image included in a test set.
11
11
12
-
The main approach comprises two major processing steps, an offline and an online one. A pre-processing step is first applied to all images. All punctuation and symbols are removed (e.g. “.%!&”), all characters are transformed to lower case and then all images from the training set with empty tags and title are filtered out.
12
+
The main approach comprises two major processing steps, an offline and an online one.
* remove accents, punctuation and symbols (e.g. “.%!&”)
19
+
* discard terms that are purely numeric or shorter than three characters
20
+
16
21
* Language Model
17
22
* divide the earth's surface into rectangular cells with a side length of 0.01°
18
-
* calculate tag-cell probabilities based on the users that used the tag inside the cell
23
+
* calculate term-cell probabilities based on the users that used the term inside the cell
19
24
20
25
* Feature selection
21
-
* cross-validation scheme using the training set only
22
-
* rank tags based on their accuracy for predicting the location of items in the withheld fold
23
-
* select tags that surpass a predefined threshold
26
+
* calculate the locality score of every term in the dataset
27
+
* locality is based on the term's frequency and on the neighboring users that have used it in the cell distribution
28
+
* the final set of selected terms consists of the terms with a locality score greater than zero
24
29
25
30
* Feature weighting using spatial entropy
26
-
* calculate entropy values by applying the Shannon entropy formula to the tag-cell probabilities
27
-
* build a Gaussian weight function based on the values of the spatial tag entropy
31
+
* calculate the spatial entropy of every term by applying the Shannon entropy formula to the term-cell probabilities
32
+
* spatial entropy weights derive from a Gaussian weight function over the spatial entropy of terms
33
+
* locality weights derive from the relative position of each term in the ranking by locality score
34
+
* combine the locality and spatial entropy weights to generate the final weights
28
35
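The language model and spatial entropy steps listed above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the class and method names are hypothetical, the three-decimal cell ID format is an assumption, and the term-cell probability is assumed to be the number of distinct users of a term in a cell divided by the sum of that count over all cells. The natural logarithm is used for the entropy, which is also an assumption.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the offline language model; names are illustrative only.
final class TermCellModelSketch {
    static final double CELL_SIDE = 0.01; // cell side length in degrees

    // Maps a coordinate to a cell ID built from the cell centre ("lon_lat").
    static String cellId(double lon, double lat) {
        double centreLon = (Math.floor(lon / CELL_SIDE) + 0.5) * CELL_SIDE;
        double centreLat = (Math.floor(lat / CELL_SIDE) + 0.5) * CELL_SIDE;
        return String.format(Locale.ROOT, "%.3f_%.3f", centreLon, centreLat);
    }

    // Records that a user used a term inside a cell; repeated uses count once.
    static void addUsage(Map<String, Map<String, Set<String>>> usage,
                         String term, String cell, String user) {
        usage.computeIfAbsent(term, t -> new HashMap<>())
             .computeIfAbsent(cell, c -> new HashSet<>())
             .add(user);
    }

    // Assumed estimate: distinct users in the cell over distinct users summed across cells.
    static Map<String, Double> cellProbabilities(Map<String, Set<String>> usersPerCell) {
        double total = 0;
        for (Set<String> users : usersPerCell.values()) {
            total += users.size();
        }
        Map<String, Double> probs = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : usersPerCell.entrySet()) {
            probs.put(e.getKey(), e.getValue().size() / total);
        }
        return probs;
    }

    // Shannon entropy of a term's probability distribution over cells.
    static double spatialEntropy(Map<String, Double> probs) {
        double entropy = 0;
        for (double p : probs.values()) {
            if (p > 0) {
                entropy -= p * Math.log(p);
            }
        }
        return entropy;
    }
}
```

A term concentrated in a single cell gets an entropy of zero, while a term spread evenly over many cells gets a high entropy, which is what makes the value usable as an input to a weighting function.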
29
36
<h3>Online Processing Step</h3>
30
37
31
-
* Language Model based estimation
38
+
* Language Model based estimation (prior-estimation)
32
39
* the probability of each cell is calculated
33
40
* the Most Likely Cell (MLC) is the cell with the highest probability and is used to produce the estimation
34
41
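As a rough illustration of the language-model-based estimation, the sketch below scores every cell for a query and returns the highest-scoring one as the MLC. It assumes that a cell's score is the weighted sum of the term-cell probabilities of the query's terms; the scoring rule, the class name and the method name are assumptions made for this example and may differ from the actual implementation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the Most Likely Cell selection; names are illustrative only.
final class MostLikelyCellSketch {

    // Returns the ID of the best-scoring cell, or null if no query term is in the model.
    static String mostLikelyCell(List<String> queryTerms,
                                 Map<String, Map<String, Double>> termCellProbs,
                                 Map<String, Double> termWeights) {
        Map<String, Double> cellScores = new HashMap<>();
        for (String term : queryTerms) {
            Map<String, Double> cells = termCellProbs.get(term);
            if (cells == null) {
                continue; // term was not selected or never seen in training
            }
            double weight = termWeights.getOrDefault(term, 1.0);
            for (Map.Entry<String, Double> e : cells.entrySet()) {
                // Assumed scoring: accumulate weight * probability per cell.
                cellScores.merge(e.getKey(), weight * e.getValue(), Double::sum);
            }
        }
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : cellScores.entrySet()) {
            if (e.getValue() > bestScore) {
                bestScore = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }
}
```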
@@ -46,45 +53,38 @@ The main approach comprises two major processing steps, an offline and an online
46
53
To run the project, you have to set all the necessary arguments in the <a href="https://github.com/socialsensor/multimedia-geotagging/blob/master/config.properties">configuration file</a>, following the instructions given for each argument. The default values may be used.
47
54
48
55
49
-
_Input File_
50
-
The dataset's records, which are fed to the algorithm as training and test sets, have to be in the following format. The metadata fields are separated with a _tab_ character.
51
-
52
-
imageID imageHashID userID title tags machineTags lon lat description
53
-
54
-
`imageID`: the ID of the image<br>
55
-
`imageHashID`: the Hash ID of the image that was provided by the organizers (optional)<br>
56
-
`userID`: the ID of the user that uploaded the image<br>
57
-
`title`: image's title<br>
58
-
`tags`: image's tags<br>
59
-
`machineTags`: image's machine tags<br>
60
-
`lon`: image's longitude<br>
61
-
`lat`: image's latitude<br>
62
-
`description`: image's description, if it is provided.
56
+
_Input File_<br>
57
+
The input files must be in the same format as the <a href="https://webscope.sandbox.yahoo.com/catalog.php?datatype=i&did=67">YFCC100M dataset</a>.
63
58
64
59
65
-
_Output File of the Offline Step_
66
-
At the end of the training process, the algorithm creates a folder named `TagCellProbabilities` and inside the folder another folder named `scale_(s)`, named appropriately based on the scale `s` of the language model's cells. The format of this file is the following.
60
+
_Output Files_<br>
61
+
At the end of the training process, the algorithm creates a folder named `TermCellProbs` and inside the folder another folder named `scale_(s)`, named appropriately based on the scale `s` of the language model's cells. The format of this file is the following.
67
62
68
-
tag ent-value cell1-lon_cell1-lat>cell1-probcell2-lon_cell2-lat>cell2-prob...
63
+
term cell1-lon_cell1-lat>cell1-prob>cell1-users cell2-lon_cell2-lat>cell2-prob>cell2-users...
69
64
70
-
`tag`: the actual name of the tag<br>
71
-
`ent-value`: the value of the tag's entropy<br>
65
+
`term`: the actual name of the term<br>
72
66
`cellx`: the x-th most probable cell.<br>
73
67
`cellx-lon_cellx-lat`: the longitude and latitude of the center of `cellx`, which are used as the cell ID<br>
74
-
`cellx-prob`: the probability of the `cellx` for the specific tag
68
+
`cellx-prob`: the probability of the `cellx` for the specific term<br>
69
+
`cellx-users`: the number of users that used the specific term in the `cellx`
75
70
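A line of the term-cell probability file described above could be read with a small parser such as the one below. This is only a sketch under stated assumptions: the term and the cell entries are assumed to be separated by whitespace (for example tabs), the user count is assumed to be an integer, and the class names `TermCellLineParser`, `TermRecord` and `CellEntry` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for lines of the form:
// term  lon_lat>prob>users  lon_lat>prob>users ...
final class TermCellLineParser {

    static final class CellEntry {
        final double lon;   // longitude of the cell centre
        final double lat;   // latitude of the cell centre
        final double prob;  // probability of the cell for the term
        final int users;    // number of users that used the term in the cell

        CellEntry(double lon, double lat, double prob, int users) {
            this.lon = lon;
            this.lat = lat;
            this.prob = prob;
            this.users = users;
        }
    }

    static final class TermRecord {
        final String term;
        final List<CellEntry> cells;

        TermRecord(String term, List<CellEntry> cells) {
            this.term = term;
            this.cells = cells;
        }
    }

    static TermRecord parse(String line) {
        String[] fields = line.trim().split("\\s+"); // assumed whitespace-separated
        List<CellEntry> cells = new ArrayList<>();
        for (int i = 1; i < fields.length; i++) {
            String[] parts = fields[i].split(">");   // cellId > prob > users
            String[] lonLat = parts[0].split("_");   // cell ID is "lon_lat"
            cells.add(new CellEntry(
                    Double.parseDouble(lonLat[0]),
                    Double.parseDouble(lonLat[1]),
                    Double.parseDouble(parts[1]),
                    Integer.parseInt(parts[2])));
        }
        return new TermRecord(fields[0], cells);
    }
}
```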
76
-
The output of the cross-validation scheme is a file named `tagAccuracies_range_1.0`, found in the project's directory. The output file contains the tags with their accuracies in the range of 1km and is used for the feature selection.
71
+
The output of the feature weighting scheme is a folder named `Weights` that contains two files, `locality_weights` and `spatial_entropy_weights`, holding the locality weights and the spatial entropy weights, respectively. Each row contains a term and its corresponding weight, separated with a tab.
77
72
78
-
The files that are described above are given as input to the Language Model estimation process. During this process, a folder named `resultsLM` is created and, inside that folder, two files named `resultsLM_scale(s)` that contain the MLCs of the query images. Every row contains the imageID and the MLC, separated with a `;`, of the image that corresponds to the respective line in the training set. Also, a file named `confidence_associated_tags` is created in the root directory, containing the confidence and the tags associated with the MLC for every query image.
73
+
The files that are described above are given as input to the Language Model estimation process. During this process, a folder named `resultsLM` is created and, inside that folder, two files named `resultsLM_scale(s)` that contain the MLCs of the query images. Every row contains the imageID and the MLC (tab-separated) of the image that corresponds to the respective line in the test set. Also, a file named `resultsLM_scale(s)_conf_evid` is created in the same folder, containing the confidence and the evidence that led to the estimated MLC for every query image.
79
74
80
75
Having estimated the MLCs for both granularity grids, the files are fed to the Multiple Resolution Grids technique, which produces a file named `resultsLM_mg(cs)-(fs)`, where `(cs)` and `(fs)` stand for the coarser and finer granularity grid, respectively. Every row of this file contains the image ID, the MLC of the coarser language model and the result of the Multiple Resolution Grids technique, separated with a `>`.
81
76
82
-
Finally, the file created by the Multiple Resolution Grids technique is used for the final process of the algorithm, Similarity Search. During this process, a folder named `resultSS` is created, containing the similarity values and the locations of the images that are contained in the MLG of every image in the test set. The final results are saved in the file specified in the arguments, and the records in each row are the ID of the query image, the estimated latitude, the estimated longitude and the distance between the real and the estimated locations, all separated with the symbol `;`.
77
+
Finally, the file created by the Multiple Resolution Grids technique is used for the final process of the algorithm, Similarity Search. During this process, a folder named `resultSS` is created, containing the similarity values and the locations of the images that are contained in the MLG of every image in the test set. The final results are saved in the file specified in the arguments. The records in each row are tab-separated and consist of the ID of the query image, the real longitude and latitude, and the estimated longitude and latitude.
83
78
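Since the final results hold both the real and the estimated coordinates, the geolocation error of each row can be derived from them. The sketch below assumes the column order given above (ID, real longitude, real latitude, estimated longitude, estimated latitude) and uses the haversine formula with a mean earth radius of 6371 km. The class and method names are hypothetical, and the distance measure used by the project itself may differ.

```java
// Hypothetical helper that computes the error in km for one row of the final results.
final class ResultRowSketch {
    static final double EARTH_RADIUS_KM = 6371.0; // mean earth radius (assumption)

    // Great-circle distance between two points using the haversine formula.
    static double haversineKm(double lon1, double lat1, double lon2, double lat2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        // Clamp to 1 to guard against rounding slightly above the valid range.
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    // Assumed row layout: id <tab> realLon <tab> realLat <tab> estLon <tab> estLat
    static double errorKm(String row) {
        String[] f = row.split("\t");
        return haversineKm(
                Double.parseDouble(f[1]), Double.parseDouble(f[2]),
                Double.parseDouble(f[3]), Double.parseDouble(f[4]));
    }
}
```

Counting the rows whose error falls below thresholds such as 1 km gives an accuracy figure at that range.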
84
-
<h3>Demo Version</h3>
79
+
<h3>Evaluation Framework</h3>
85
80
86
-
A <a href="https://github.com/socialsensor/multimedia-geotagging/tree/demo">demo version</a> and a <a href="https://github.com/socialsensor/multimedia-geotagging/tree/storm">storm module</a> of the approach have been developed.
81
+
This <a href="https://github.com/MKLab-ITI/multimedia-geotagging/tree/develop/src/main/java/gr/iti/mklab/mmcomms16">package</a> contains the implementations of the sampling strategies described in the <a href="http://dl.acm.org/citation.cfm?doid=2983554.2983558">MMCommons 2016 paper</a>. To run the evaluation framework, you have to set all the necessary arguments in the <a href="https://github.com/MKLab-ITI/multimedia-geotagging/blob/master/eval.properties">configuration file</a>, following the instructions given for each argument. To run the code, the <a href="https://github.com/MKLab-ITI/multimedia-geotagging/blob/master/src/test/java/gr/iti/mklab/main/Evaluation.java">Evaluation class</a> has to be executed.
82
+
83
+
Additionally, this <a href="https://github.com/MKLab-ITI/multimedia-geotagging/blob/master/samples/">folder</a> contains the <a href="https://github.com/MKLab-ITI/multimedia-geotagging/blob/master/samples/samples.zip">zip file</a> with the collections generated by the different sampling strategies and the <a href="https://github.com/MKLab-ITI/multimedia-geotagging/blob/master/samples/building_concepts.txt">file</a> of the building concepts. Keep in mind that the geographical uniform sampling, the user uniform sampling and the text diversity sampling generate different files in every code execution, because they involve random selections and permutations.
84
+
85
+
<h3>Demo Version</h3>
87
86
87
+
A <a href="https://github.com/socialsensor/multimedia-geotagging/tree/demo">demo version</a> and a <a href="https://github.com/socialsensor/multimedia-geotagging/tree/storm">storm module</a> of the approach have been developed.
88
88
89
89
<h3>Contact for further details about the project</h3>