With major contributions from versatran01.
Data downloader and converter for the DeepMind GQN dataset (https://github.com/deepmind/gqn-datasets), for use with libraries other than TensorFlow.
Don't hesitate to make a pull request.
Dependencies
You need to install:
Download the tfrecord dataset
If you want to download the entire dataset:

    gsutil -m cp -R gs://gqn-dataset/<dataset> .

If you want to download only a proportion of the dataset:

    python download_gqn.py <dataset> <proportion>

Convert the raw dataset
Command line options:

    usage: convert2file.py [-h] [-b BATCH_SIZE] [-n FIRST_N] [-m MODE]
                           base_dir dataset

    Convert gqn tfrecords to gzip files.

    positional arguments:
      base_dir              base directory of gqn dataset
      dataset               datasets to convert, eg. shepard_metzler_5_parts

    optional arguments:
      -h, --help            show this help message and exit
      -b BATCH_SIZE, --batch-size BATCH_SIZE
                            number of sequences in each output file
      -n FIRST_N, --first-n FIRST_N
                            convert only the first n tfrecords if given
      -m MODE, --mode MODE  whether to convert train or test

Convert all records with all sequences in sm5 train (400 records, 2000 seq each):

    python convert2file.py ~/gqn_dataset shepard_metzler_5_parts

Convert first 20 records with batch size of 128 in sm5 test:

    python convert2file.py ~/gqn_dataset shepard_metzler_5_parts -n 20 -b 128 -m test

Size of the datasets:
| Dataset | Size |
|---|---|
| total | 1.45 TB |
| jaco | 198.97 GB |
| mazes | 136.23 GB |
| rooms_free_camera_no_object_rotations | 255.75 GB |
| rooms_free_camera_with_object_rotations | 598.75 GB |
| rooms_ring_camera | 250.89 GB |
| shepard_metzler_5_parts | 21.09 GB |
| shepard_metzler_7_parts | 23.68 GB |
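
To get a rough idea of how much disk space a partial download needs, the sizes in the table can be scaled by the requested proportion. The sketch below is not part of this repository: `DATASET_SIZES` and `estimate_download_size` are hypothetical names, and it assumes that `<proportion>` is a fraction between 0 and 1 and that records within a dataset are of similar size.

```python
# Sizes copied from the table above, in the same unit as the table.
DATASET_SIZES = {
    "jaco": 198.97,
    "mazes": 136.23,
    "rooms_free_camera_no_object_rotations": 255.75,
    "rooms_free_camera_with_object_rotations": 598.75,
    "rooms_ring_camera": 250.89,
    "shepard_metzler_5_parts": 21.09,
    "shepard_metzler_7_parts": 23.68,
}


def estimate_download_size(dataset: str, proportion: float = 1.0) -> float:
    """Return a rough size estimate for a partial download.

    Assumption: `proportion` is a fraction in (0, 1] and the size
    scales linearly with the number of records downloaded.
    """
    if dataset not in DATASET_SIZES:
        raise KeyError(f"unknown dataset: {dataset}")
    if not 0.0 < proportion <= 1.0:
        raise ValueError("proportion must be in (0, 1]")
    return DATASET_SIZES[dataset] * proportion


if __name__ == "__main__":
    # Print the estimated size of a 10% download for every dataset.
    for name in sorted(DATASET_SIZES):
        size = estimate_download_size(name, 0.1)
        print(f"{name}: about {size:.2f} for a 10% download")
```

The estimate covers only the raw tfrecords; the converted files take additional space that is not listed here.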