<!DOCTYPE html>
<html lang="en">
<head>
<!-- Basic Page Needs
–––––––––––––––––––––––––––––––––––––––––––––––––– -->
<meta charset="utf-8">
<title>CamoVid</title>
<meta name="description" content="">
<meta name="author" content="">
<!-- Mobile Specific Metas
–––––––––––––––––––––––––––––––––––––––––––––––––– -->
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- FONT
–––––––––––––––––––––––––––––––––––––––––––––––––– -->
<link href="https://fonts.googleapis.com/css?family=Raleway:400,300,600" rel="stylesheet" type="text/css">
<!-- CSS
–––––––––––––––––––––––––––––––––––––––––––––––––– -->
<link rel="stylesheet" href="css/normalize.css">
<link rel="stylesheet" href="css/skeleton.css">
<link rel="stylesheet" href="css/footable.standalone.min.css">
<!-- Favicon
–––––––––––––––––––––––––––––––––––––––––––––––––– -->
<link rel="icon" type="image/png" href="files/chameleon_icon.png">
<!-- Google icon -->
<link rel="stylesheet" href="https://fonts.googleapis.com/icon?family=Material+Icons">
<!-- Analytics -->
<script>
(function (i, s, o, g, r, a, m) {
i['GoogleAnalyticsObject'] = r; i[r] = i[r] || function () {
(i[r].q = i[r].q || []).push(arguments)
}, i[r].l = 1 * new Date(); a = s.createElement(o),
m = s.getElementsByTagName(o)[0]; a.async = 1; a.src = g; m.parentNode.insertBefore(a, m)
})(window, document, 'script', 'https://www.google-analytics.com/analytics.js', 'ga');
ga('create', 'UA-86869673-1', 'auto');
ga('send', 'pageview');
</script>
<script type="text/javascript" async=""
src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.4/MathJax.js?config=TeX-MML-AM_CHTML"></script>
<!-- Hover effect: https://codepen.io/nxworld/pen/ZYNOBZ -->
<style>
h1 {
text-align: center;
margin-bottom: 20px;
}
.icon {
width: 72px;
/* Adjust size as needed */
vertical-align: bottom;
margin-right: 5px;
/* Adjust spacing as needed */
}
/* img {
display: block;
} */
.column-50 {
float: left;
width: 50%;
}
.row-50:after {
content: "";
display: table;
clear: both;
}
.floating-teaser {
float: left;
width: 30%;
text-align: center;
padding: 15px;
}
.venue strong {
color: #99324b;
}
.benchmark {
width: 100%;
max-width: 960px;
overflow: scroll;
overflow-y: hidden;
}
</style>
</head>
<body>
<!-- Primary Page Layout
–––––––––––––––––––––––––––––––––––––––––––––––––– -->
<div class="container">
<h4 style="text-align:center"><img src="files/chameleon_logo.png" alt="Icon" class="icon">CamoVid60K: A Large-Scale Video Dataset for Moving Camouflaged Animals Understanding</h4>
<p align="center" , style="margin-bottom:12px;">
<a class="simple" href="https://tuananh1007.github.io/">Tuan-Anh Vu</a><sup>1,3</sup>
<a class="simple" href="https://zhengziqiang.github.io/">Ziqiang Zheng</a><sup>1</sup>
<a class="simple" href="https://openreview.net/profile?id=~Chengyang_Song2">Chengyang Song</a><sup>2</sup>
<a class="simple" href="https://tsingqguo.github.io/">Qing Guo</a><sup>3</sup>
<a class="simple" href="https://www.a-star.edu.sg/cfar/about-cfar/management/prof-ivor-tsang">Ivor
Tsang</a><sup>3</sup>
<a class="simple" href="https://saikit.org/">Sai-Kit Yeung</a><sup>1</sup>
</p>
<p align="center" style="margin-bottom:20px;">
<sup>1</sup>The Hong Kong University of Science and Technology, Hong Kong SAR
<br>
<sup>2</sup>Ocean University of China, China
<span style="display:inline-block; width: 32px"></span>
<sup>3</sup>CFAR & IHPC, A*STAR, Singapore
<!-- <span style="display:inline-block; width: 32px"></span>
<sup>4</sup>Trinity College Dublin, Ireland
<br> -->
<!-- <sup>#</sup>co-first author
<span style="display:inline-block; width: 32px"></span>
<sup>📧</sup>corresponding author -->
<br>
</p>
<!-- <div class="venue">
<p align="center"><h6 style="text-align:center"><b>CV4Animals: Computer Vision for Animal Behavior</b></h6>
<strong>(Oral Presentation)</strong>
</p>
</div> -->
<br>
<figure>
<img src="files/Category_full.png" style="width:100%"></img>
</figure>
<p align="center">Category distribution and some visual examples (extracted animal masks) of our dataset. </p>
<div id="teaser" class="container" style="width:100%; margin:0; padding:0">
<h5>Abstract</h5>
<p align="justify">
Neural networks trained on large-scale data have achieved remarkable success across a wide range of
computer vision tasks.
However, much less attention has been paid to monitoring camouflaged animals, the masters of hiding
themselves in the background.
Robust and precise camouflaged animal segmentation is non-trivial even for domain experts, because these
animals closely resemble their backgrounds.
Although several efforts have addressed camouflaged animal image segmentation, there is, to the best of
our knowledge, only limited work on camouflaged animal video segmentation.
Biologists usually favor videos, whose redundant information and temporal consistency support the
monitoring and understanding of animal behavior and events.
The scarcity of such labeled video data is the main obstacle.
To address these challenges, we present <b>CamoVid60K</b>, a diverse, large-scale, and accurately annotated
video dataset of camouflaged animals.
The dataset comprises <b>218</b> videos with <b>62,774</b> finely annotated frames, covering <b>70</b> animal
categories, and surpasses all previous datasets in terms of the number of videos/frames and species included.
<b>CamoVid60K</b> also supports a diverse set of downstream computer vision tasks, such as camouflaged animal
classification, detection, and task-specific segmentation (semantic, referring, and motion segmentation).
We have benchmarked several state-of-the-art algorithms on the proposed <b>CamoVid60K</b> dataset, and the
experimental results provide valuable insights into future research directions.
Our dataset serves as a novel and challenging test bed to stimulate more powerful camouflaged animal video
segmentation algorithms, and there is still large room for improvement.
</p>
</div>
<div class="section">
<h5>Materials</h5>
<div class="container" style="width:95%">
<!-- Icon row -->
<div class="row">
<div class="three columns">
<a href="files/NeurIPS24_Camo_Vid_Dataset_Preprint.pdf"><img
style="border: 1px solid #ddd; border-radius: 4px; padding: 2px; width: 108px;"
src="files/page1.png"></a>
</div>
<!-- <div class="four columns">
<a href="files/CamoVid60K_CV4Animal_Poster.pdf"><img
style="border: 1px solid #ddd; border-radius: 4px; padding: 2px; width: 200px;"
src="files/poster_thumbnail.png"></a>
</div> -->
<div class="four columns">
<a href="https://camovid.hkustvgd.com"><img
style="border: 1px solid #ddd; border-radius: 4px; padding: 2px; width: 108px;"
src="files/Dataset.png"></a>
</div>
</div>
<!-- Link row -->
<div class="row">
<div class="three columns">
<a href="files/NeurIPS24_Camo_Vid_Dataset_Preprint.pdf">Paper</a>
</div>
<!-- <div class="four columns">
<a href="files/CamoVid60K_CV4Animal_Poster.pdf">Poster</a>
</div> -->
<div class="four columns">
<a href="https://camovid.hkustvgd.com">Dataset (comming soon)</a>
</div>
</div>
</div>
</div>
<br>
<div id="teaser" class="container" style="width:100%; margin:0; padding:0">
<h5>Our CamoVid60K dataset</h5>
<center>
<div class="caption">
<p align="justify">
Camouflage is a powerful biological mechanism for avoiding detection and identification. In nature,
camouflage tactics are employed to deceive the sensory and cognitive processes of both prey and predators.
Wild animals employ these tactics in various ways, from blending into the surrounding environment to
using disruptive patterns and colouration. Identifying camouflage is pivotal in many wildlife surveillance
applications, as it helps locate hidden individuals for study and protection.
</p>
<p align="justify">
Concealed scene understanding (CSU) is an active computer vision topic that aims to learn discriminative
features for discerning camouflaged target objects from their surroundings. The MoCA dataset is the most
extensive compilation of videos featuring camouflaged objects, yet it only provides detection labels.
Consequently, researchers often evaluate the efficacy of sophisticated segmentation models by converting
segmentation masks into detection bounding boxes. With the recent advent of MoCA-Mask, there has been a
shift towards video segmentation in concealed scenes. Despite these advancements, however, the available
annotations remain insufficient in both volume and accuracy for developing a reliable video model capable
of handling complex concealed scenes effectively. The table below compares our proposed dataset with
previous ones, showing that CamoVid60K surpasses all previous datasets in terms of the number of
videos/frames and species included.
</p>
</div>
</center>
<center>
<img src="files/Comparison.png" style="width:100%"></img>
</center>
<br>
<p align="center"><b>Comparison with existing video animal datasets.</b> Class.: Classification Label, B.Box:
Bounding Box, Motion: Motion of Animal, Coarse OF: Coarse Optical Flow, Expres.: Expression. </p>
<p align="justify"><b>Note that,</b> MVK dataset mostly consists of normal marine animals with only some
camouflaged animals. The frequency of annotations refers to how often each frame is annotated. For instance,
MoCA-Mask provides annotations for every five frames, resulting in 4,691 annotated frames. In contrast, our
CamoVid60K dataset offers a significantly larger volume of data with more frequent annotations and a wider
variety of annotation types.</p>
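<p align="justify">To make the annotation-frequency comparison concrete, the helper below relates an annotation step (e.g., every fifth frame) to the resulting annotation count and density. It is an illustrative sketch only and contains no dataset-specific numbers.</p>
<pre style="margin:0">
<code># Illustrative only: relating annotation step to annotation count/density.
def annotated_count(total_frames, step):
    # Every `step`-th frame is annotated (step=1 means every frame).
    return (total_frames + step - 1) // step

def annotation_density(total_frames, step):
    # Fraction of frames that carry annotations.
    return annotated_count(total_frames, step) / total_frames
</code>
</pre>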
<br>
<center>
<img src="files/Data_pipeline.png" style="width:100%"></img>
</center>
<br>
<p align="center"><b>CamoVid60K data pipeline.</b> Stage I includes data curation, filtering irrelevant videos,
and
extracting all frames. Stage II includes data annotation, generation, and filtering.</p>
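<p align="justify">For illustration, the frame-extraction step of Stage I can be reproduced with standard tooling. The snippet below is a minimal sketch using OpenCV; the paths and naming scheme are placeholders and not the pipeline's actual implementation.</p>
<pre style="margin:0">
<code># Minimal Stage I sketch: dump every frame of a video as an image sequence.
# Assumes OpenCV (cv2) is installed; paths and naming are illustrative only.
import os
import cv2

def extract_frames(video_path, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    count = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(out_dir, "%05d.png" % count), frame)
        count += 1
    cap.release()
    return count

# Example: extract_frames("arctic_fox_1.webm", "frames/arctic_fox_1")
</code>
</pre>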
<center>
<img src="files/dataset_organization.png" style="width:60%"></img>
</center>
<br>
<p align="center">Data organization of our dataset.</p>
<center>
<img src="files/WordCloud.png" style="width:100%"></img>
</center>
<br>
<p align="center">Word cloud of category distribution of camouflaged animals.</p>
<center>
<img src="files/radial_chart.png" style="width:100%"></img>
</center>
<br>
<p align="center">Taxonomic structure of our dataset.</p>
</div>
<div id="teaser" class="container" style="width:100%; margin:0; padding:0">
<h5>Visualizations</h5>
<p align="center">Please see <a href="more_results.html">this page</a> for more results.</p>
<style>
.video-mouseover {
-webkit-filter: brightness(100%);
-moz-filter: brightness(100%);
-o-filter: brightness(100%);
-ms-filter: brightness(100%);
filter: brightness(100%);
}
.video-mouseover:hover {
-webkit-filter: brightness(70%);
-moz-filter: brightness(70%);
-o-filter: brightness(70%);
-ms-filter: brightness(70%);
filter: brightness(70%);
cursor: pointer;
}
</style>
<div class="row ">
<div class="six columns" id="vid0" align="center">
<video width='100%' autoplay="false" preload="false" loop class="video-mouseover"
poster="files/arabian_horn_viper.jpg">
<source src='files/arabian_horn_viper.webm' type='video/webm'>
</video>
<p>Arabian Horn Viper</p>
</div>
<div class="six columns" id="vid1" align="center">
<video width='100%' autoplay="false" preload="false" loop class="video-mouseover"
poster="files/arctic_fox_1.jpg">
<source src='files/arctic_fox_1.webm' type='video/webm'>
</video>
<p>Arctic Fox</p>
</div>
</div>
<div class="row ">
<div class="six columns" id="vid2" align="center">
<video width='100%' autoplay="false" preload="false" loop class="video-mouseover"
poster="files/flatfish_0.jpg">
<source src='files/flatfish_0.webm' type='video/webm'>
</video>
<p>Flatfish</p>
</div>
<div class="six columns" id="vid3" align="center">
<video width='100%' autoplay="false" preload="false" loop class="video-mouseover"
poster="files/flounder_5.jpg">
<source src='files/flounder_5.webm' type='video/webm'>
</video>
<p>Flounder</p>
</div>
</div>
<div class="row ">
<div class="six columns" id="vid2" align="center">
<video width='100%' autoplay="false" preload="false" loop class="video-mouseover"
poster="files/eastern_screech_owl_1.jpg">
<source src='files/eastern_screech_owl_1.webm' type='video/webm'>
</video>
<p>Eastern Screech Owl</p>
</div>
<div class="six columns" id="vid3" align="center">
<video width='100%' autoplay="false" preload="false" loop class="video-mouseover"
poster="files/grasshopper_2.jpg">
<source src='files/grasshopper_2.webm' type='video/webm'>
</video>
<p>Grasshopper</p>
</div>
</div>
</div>
<div class="section">
<h5>Citation</h5>
<pre style="margin:0">
<code>@inproceedings{tavu2024camovid,
title={CamoVid60K: A Large-Scale Video Dataset for Moving Camouflaged Animals Understanding},
author={Tuan-Anh Vu and Ziqiang Zheng and Chengyang Song and Qing Guo and Ivor Tsang and Sai-Kit Yeung},
booktitle={preprint},
year={2024}
}</code>
</pre>
</div>
<!-- -->
<br>
<div class="section">
<h5>Acknowledgements</h5>
<p>
This work is supported by an internal grant from HKUST (R9429). Part of this work was done while Tuan-Anh Vu was
a research resident at CFAR & IHPC, A*STAR, Singapore. The website is adapted from this <a
href="https://tuananh1007.github.io/RFNet-4D/">template</a>.
</p>
<script type="text/javascript" id="clustrmaps"
src="//clustrmaps.com/map_v2.js?d=nTZuh9Dq8LfoVGJmFbKa3DwQhT-EInzajSZP6GdqXBE&cl=ffffff&w=a"></script>
</div>
</div>
<script type="text/javascript" src="../js/jquery.min.js"></script>
<script type="text/javascript" src="../js/footable.min.js"></script>
<script>
// jQuery's .hover() with a single handler fires on both mouseenter and
// mouseleave, so this toggles the native playback controls on while the
// cursor is over a preview video and off again when it leaves.
$('.video-mouseover').hover(function () {
  if (this.hasAttribute("controls")) {
    this.removeAttribute("controls");
  } else {
    this.setAttribute("controls", "controls");
  }
});
</script>
<script type="text/javascript">
jQuery(function ($) {
$('.table').footable();
});
</script>
<!-- End Document
–––––––––––––––––––––––––––––––––––––––––––––––––– -->
</body>
</html>