Add SODA-A dataset #2575
@shaunyuan22 Thank you for this nice dataset and all the work. We aim to make the dataset more easily usable in torchgeo, and would appreciate it if you have any comments, corrections etc.
if self.bbox_orientation == 'oriented':
    # TODO different keys for oriented and horizontal boxes
    sample['boxes'] = boxes
@adamjstewart what should we do for oriented bounding boxes? Kornia only has boxes_xyxy and boxes_xywh.
Kornia box keys are bbox_xyxy and bbox_xywh.
Open question: how to map the polygons into a common oriented bounding box schema.
Asked about oriented bboxes on Slack. We may need to add support to Kornia for this ourselves. Until Kornia supports it natively, I guess it doesn't matter what the format looks like. But let's use the same key names that Kornia uses.
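One possible reading of the thread above is to keep the Kornia-style key for horizontal boxes and store oriented boxes under a separate, provisional key. The sketch below only illustrates that idea in plain Python lists rather than tensors. The key `bbox_poly`, the helper names `polygon_to_xyxy` and `build_sample`, and the corner-list polygon layout are all assumptions made for illustration; none of them is TorchGeo or Kornia API.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def polygon_to_xyxy(polygon: Sequence[Point]) -> List[float]:
    """Return the axis-aligned box [xmin, ymin, xmax, ymax] enclosing a polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return [min(xs), min(ys), max(xs), max(ys)]


def build_sample(
    polygons: Sequence[Sequence[Point]], bbox_orientation: str
) -> Dict[str, list]:
    """Store boxes under a key that depends on the requested orientation."""
    if bbox_orientation == 'oriented':
        # 'bbox_poly' is a hypothetical key: the thread notes that Kornia
        # has no native oriented-box key yet.
        return {'bbox_poly': [[list(pt) for pt in poly] for poly in polygons]}
    if bbox_orientation == 'horizontal':
        # 'bbox_xyxy' is the Kornia key name mentioned in the thread.
        return {'bbox_xyxy': [polygon_to_xyxy(poly) for poly in polygons]}
    raise ValueError(f'unknown bbox_orientation: {bbox_orientation!r}')


# A diamond-shaped (rotated) box given by its four corners.
diamond = [(2.0, 0.0), (4.0, 2.0), (2.0, 4.0), (0.0, 2.0)]
horizontal_sample = build_sample([diamond], 'horizontal')
oriented_sample = build_sample([diamond], 'oriented')
```

Keeping the two cases under different keys means downstream code never has to guess which layout a `boxes` entry holds, which seems to be the concern behind the TODO in the diff.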
@@ -49,6 +49,7 @@ Dataset,Task,Source,License,# Samples,# Classes,Size (px),Resolution (m),Bands
`SKIPP'D`_,R,"Fish-eye","CC-BY-4.0","363,375",-,64x64,-,RGB
`SkyScript`_,IC,"NAIP, orthophotos, Planet SkySat, Sentinel-2, Landsat 8--9",MIT,5.2M,-,100--1000,0.1--30,RGB
`So2Sat`_,C,Sentinel-1/2,"CC-BY-4.0","400,673",17,32x32,10,"SAR, MSI"
`SODAA`_,OD,Aerial,"CC-BY-SA","2513",9,"varying","RGB"
- `SODAA`_,OD,Aerial,"CC-BY-SA","2513",9,"varying","RGB"
+ `SODA`_,OD,Aerial,"CC-BY-SA","2513",9,"varying","RGB"
I see CC-BY-SA here, but I wish we knew which CC-BY-SA version. I usually use a size range instead of "varying".
@shaunyuan22 do you know?
P.S. We are adding TorchGeo data loaders for your excellent SODA-A dataset, hopefully this makes it even easier for people to use and cite your paper!
If you use this dataset in your research, please cite the following paper:

* https://ieeexplore.ieee.org/document/10168277
Should we mention that pyarrow is required? Is this a requirement from the dataset authors or from you?
From me
I'm hesitant to add extra dependencies unless they are absolutely required. Does it add a significant speed boost?
Alright, I'll change it to CSV. Parquet is more performant and is today's standard, but switching to CSV will remove the dependency.
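For context on the trade-off, here is a minimal sketch of reading box annotations from CSV using only the Python standard library, so no pyarrow is needed. The column names (`image`, `label`, `xmin`, `ymin`, `xmax`, `ymax`), the sample rows, and the helper `load_annotations` are made up for illustration and do not reflect the actual annotation layout used in this PR.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, TextIO, Tuple

# Hypothetical annotation rows, inlined so the example is self-contained.
CSV_TEXT = """image,label,xmin,ymin,xmax,ymax
00001.jpg,airplane,10,20,30,40
00001.jpg,ship,5,5,15,25
00002.jpg,vehicle,0,0,8,8
"""

Box = Tuple[str, List[float]]


def load_annotations(handle: TextIO) -> Dict[str, List[Box]]:
    """Group (label, [xmin, ymin, xmax, ymax]) entries by image name."""
    boxes: Dict[str, List[Box]] = defaultdict(list)
    for row in csv.DictReader(handle):
        # csv yields strings, so coordinates are converted explicitly.
        coords = [float(row[key]) for key in ('xmin', 'ymin', 'xmax', 'ymax')]
        boxes[row['image']].append((row['label'], coords))
    return dict(boxes)


annotations = load_annotations(io.StringIO(CSV_TEXT))
```

With a real file, the same function would take the handle returned by `open(path, newline='')`. The cost relative to Parquet is that every value arrives as a string and must be converted by hand, as the `float(...)` calls show.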
This PR adds the SODA-A dataset. Dataset rehosted on HF.
Dataset features:
Dataset format:
TODOS:
Annotations/train/01874.json:
Example plot:
[Image: soda]