Skip to content

Multiple affordance detection using the interaction tensor and deep scene saliency

License

Notifications You must be signed in to change notification settings

eduard626/interaction-tensor-affordances

Repository files navigation

Multiple affordance detection code and data.

This repo contains the implementation of our paper "Geometric Affordance Perception: Leveraging Deep 3D Saliency With the Interaction Tensor", from Eduardo Ruiz and Walterio Mayol-Cuevas at the University of Bristol, UK.

Demo

If you use our code or method in your work, please cite the following:

@article{Ruiz-Mayol2020,
author={Ruiz, Eduardo and Mayol-Cuevas, Walterio},   	 
title={Geometric Affordance Perception: Leveraging Deep 3D Saliency With the Interaction Tensor},      
journal={Frontiers in Neurorobotics},      
volume={14},      
pages={45},     
year={2020},             
doi={10.3389/fnbot.2020.00045},      	
issn={1662-5218},   

Contents

  1. Dependencies
  2. Data
  3. Saliency training
  4. Descriptor computation
  5. Multiple affordance prediction
  6. Multiple affordance prediction

Dependencies

You can find all the dependencies for training our saliency-based method in the file:

conda_environment.yml

The setup was last tested with Ubuntu 16.04 on June 2020.

Data

As per our paper, we train affordance saliency on pointclouds extracted with our method from the ICRA'18 paper, code here. Essentially, we run single affordance prediction on 20 RGB-D scenes, which are a combination of our own captures and publicly available data. These pointclouds are available of upon request, but you can also use your own and follow our previous work to produce data.

For testing multiple affordance prediction, we employed a subset of the ScanNet dataset, we used the subset of scenes with the following ids: scannet_subset Should you require the same data, please follow the instructions of ScanNet project.

We provide a smaller training/validation dataset (and some auxiliary data) to get you started. This data can be found in the data dir of mPointNet

+-- README.md
+-- _mPointNet/
|   +-- _data/
|   |   +-- _new_data_centered/
|   |   |   +-- MultilabelDataSet_splitTest.h5
|   |   |   +-- MultilabelDataSet_splitTrain.h5
|   |   |   +-- ...   
|   +-- log/
|   +-- dump/
Single-affordance Tensors

The 84 tensors for all the affordances considered for our experiments is available here. If want to start building your own, have a look at our previous work. You will need CAD models for training the affordances/interactions of interest.

Multiple-affordance multi-label datasets

We here provide a script to generate multiple-affordance datasets from sets of single affordance predictions, method used for our paper.

You need to run single-affordance predictions on an input scene of your choice. For that, you will need single-affordance tensors (from above) and the code from our previous method, which you can get here. The code will store results on the data directory. Once you have computed individual affordance predictions you can run the following script

python multiple_affordance_dataset.py 

This script will look for results from individual predictions and use that data to create a multi-affordance multi-label dataset suitable for training saliency. The code will create one small dataset per scene, you can use as many of these small datasets as you need in order to gather enough data to train saliency, for instance.

python create_split.py 90 10

This will look for all the small datasets put them together and create a 90/10 split for training and validation.

Saliency training

Saliency training is based on the PointNet++ architecture with the modifications discussed in the paper. The code for this stage of our method can be found in the mPointNet directory. Please refer to the original PointNet++ instructions on how to compile the tf_operators, you should be fine compiling and running everything with the provided conda environment from above.

In order to train the network you want to run:

python train_affordances.py

With the provided dataset, this command will train the network with the default configuration as in our paper, e.g. 84 affordances + background, cross-entropy loss with L2-normalization, batch size=32, sampling 1024 points, etc.

In order to evaluate the network and extract scene saliency you need to run:

python evaluate_affordances.py

Which will write results and useful data into the dump/ directory

Descriptor Computation

With the learned saliency and our Interaction Tensors we compute a multiple-affordance descriptor for fast affordance prediction. The code to compute this descriptor needs the data crated by the previous step and single-affordance interaction tensors (see point 2 above). To compute the descriptor run:

python backprojection.py

This code projects the saliency learned by network back into the corresponding tensors. The resulting descriptor will be written to the data/ directory.

Multiple Affordance Prediction

Once the descriptor has been computed, you can carry on and detect multiple affordance at fast rates with our C++ code. The code for that is the same as in our previous work, the difference being the descriptor that you feed in (e.g. the one produced by the previous step). Instructions and code for predictions are found here

Support

If you have questions or problems with the code, please create an issue in this repo.

Author

Eduardo Ruiz

About

Multiple affordance detection using the interaction tensor and deep scene saliency

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published