PyTorch implementation for the ICRA 2022 paper StructFormer: Learning Spatial Structure for Language-Guided Semantic Rearrangement of Novel Objects. [PDF] [Video] [Website]

StructFormer rearranges unknown objects into semantically meaningful spatial structures based on high-level language instructions and partial-view point cloud observations of the scene. The model uses multi-modal transformers to predict both which objects to manipulate and where to place them.
The source code is released under the NVIDIA Source Code License. The dataset is released under CC BY-NC 4.0.
```
pip install -r requirements.txt
pip install -e .
```
- `h5py==2.10`: this specific version is needed.
- `omegaconf==2.1`: some functions used in this repo are from newer versions.
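If your environment ends up with different versions of these packages, the pins noted above can be installed explicitly (a minimal sketch; `requirements.txt` may already handle this):

```
pip install h5py==2.10 omegaconf==2.1
```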
The code has been tested on Ubuntu 18.04 with NVIDIA driver 460.91, CUDA 11.0, Python 3.6, and PyTorch 1.7.
Source code in the StructFormer package is mainly organized as:
- data loaders: `data`
- models: `models`
- training scripts: `training`
- inference scripts: `evaluation`

Parameters for data loaders and models are defined in OmegaConf yaml files stored in `configs`.

Trained models are stored in `/experiments`.
- Set the package root dir:
  ```
  export STRUCTFORMER=/path/to/StructFormer
  ```
- Download pretrained models from this link and unzip to the `$STRUCTFORMER/models` folder.
- Download the test split of the dataset from this link and unzip to `$STRUCTFORMER/data_new_objects_test_split`.
```
cd $STRUCTFORMER/scripts/
python run_full_pipeline.py \
  --dataset_base_dir $STRUCTFORMER/data_new_objects_test_split \
  --object_selection_model_dir $STRUCTFORMER/models/object_selection_network/best_model \
  --pose_generation_model_dir $STRUCTFORMER/models/structformer_circle/best_model \
  --dirs_config $STRUCTFORMER/configs/data/circle_dirs.yaml
```
Where `{model_name}` is one of `structformer_no_encoder`, `structformer_no_structure`, `object_selection_network`, or `structformer`, and `{structure}` is one of `circle`, `line`, `tower`, or `dinner`:
```
cd $STRUCTFORMER/src/structformer/evaluation/
python test_{model_name}.py \
  --dataset_base_dir $STRUCTFORMER/data_new_objects_test_split \
  --model_dir $STRUCTFORMER/models/{model_name}_{structure}/best_model \
  --dirs_config $STRUCTFORMER/configs/data/{structure}_dirs.yaml
```
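For example, to evaluate the full StructFormer model on circle structures (a concrete instantiation of the template above, assuming the `structformer_circle` pretrained model has been unzipped to `$STRUCTFORMER/models`):

```
cd $STRUCTFORMER/src/structformer/evaluation/
python test_structformer.py \
  --dataset_base_dir $STRUCTFORMER/data_new_objects_test_split \
  --model_dir $STRUCTFORMER/models/structformer_circle/best_model \
  --dirs_config $STRUCTFORMER/configs/data/circle_dirs.yaml
```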
Where `{structure}` is as above:
```
cd $STRUCTFORMER/src/structformer/evaluation/
python test_object_selection_network.py \
  --dataset_base_dir $STRUCTFORMER/data_new_objects_test_split \
  --model_dir $STRUCTFORMER/models/object_selection_network/best_model \
  --dirs_config $STRUCTFORMER/configs/data/{structure}_dirs.yaml
```
- Download vocabulary list `type_vocabs_coarse.json` from this link and unzip to `$STRUCTFORMER/data_new_objects`.
- Download all data for circle and unzip to `$STRUCTFORMER/data_new_objects`.
Where `{model_name}` is one of `structformer_no_encoder`, `structformer_no_structure`, `object_selection_network`, or `structformer`, and `{structure}` is one of `circle`, `line`, `tower`, or `dinner`:
```
cd $STRUCTFORMER/src/structformer/training/
python train_{model_name}.py \
  --dataset_base_dir $STRUCTFORMER/data_new_objects \
  --main_config $STRUCTFORMER/configs/{model_name}.yaml \
  --dirs_config $STRUCTFORMER/configs/data/{structure}_dirs.yaml
```
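For example, to train the full StructFormer model on the circle data downloaded above (a concrete instantiation of the template; any of the model and structure names listed above can be substituted):

```
cd $STRUCTFORMER/src/structformer/training/
python train_structformer.py \
  --dataset_base_dir $STRUCTFORMER/data_new_objects \
  --main_config $STRUCTFORMER/configs/structformer.yaml \
  --dirs_config $STRUCTFORMER/configs/data/circle_dirs.yaml
```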
```
cd $STRUCTFORMER/src/structformer/training/
python train_object_selection_network.py \
  --dataset_base_dir $STRUCTFORMER/data_new_objects \
  --main_config $STRUCTFORMER/configs/object_selection_network.yaml \
  --dirs_config $STRUCTFORMER/configs/data/circle_dirs.yaml
```
If you find our work useful in your research, please cite:
```
@inproceedings{structformer2022,
  title     = {StructFormer: Learning Spatial Structure for Language-Guided Semantic Rearrangement of Novel Objects},
  author    = {Liu, Weiyu and Paxton, Chris and Hermans, Tucker and Fox, Dieter},
  year      = {2022},
  booktitle = {ICRA 2022}
}
```