SAP3D: The More You See in 2D, the More You Perceive in 3D

We present SAP3D, which reconstructs the 3D shape and texture of an object from a variable number of real input images. The quality of the 3D shape and texture improves as more views are provided.

The More You See in 2D, the More You Perceive in 3D
Xinyang Han*¹, Zelin Gao*², Angjoo Kanazawa¹, Shubham Goel†³, Yossi Gandelsman†¹
¹ UC Berkeley, ² Zhejiang University, ³ Avataar
CVPR 2024 (Highlight)

project page | arxiv | bibtex

Installation

See installation instructions.

Dataset Preparation

See Preparing Datasets for SAP3D.

Method Overview

Overview of SAP3D. We first compute coarse relative camera poses using an off-the-shelf model. We fine-tune a view-conditioned 2D diffusion model on the input images and simultaneously refine the camera poses via optimization. The resulting instance-specific diffusion model and camera poses enable 3D reconstruction and novel view synthesis from an arbitrary number of input images.

Pipeline

The pipeline has three stages, covering pose estimation and reconstruction:

1. Pose Estimation Initialization: A scaled-up RelposePP model initializes the camera poses of the input images.
2. Pose Refinement and Diffusion Model TTT: The initial poses are refined while the diffusion model is personalized to the input object.
3. 3D Reconstruction: The 3D object is reconstructed from the estimated poses and the fine-tuned diffusion model.

System Requirements

Memory Considerations: The pipeline needs at least 38GB of available memory to run smoothly.
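A quick pre-flight check can catch an undersized machine before a long run starts. The sketch below is not part of the repository. It assumes the 38GB figure refers to system RAM on Linux (the README does not say whether RAM or GPU memory is meant) and treats it as GiB. The function names are hypothetical.

```python
def mem_available_gib(meminfo_text: str) -> float:
    """Return the MemAvailable value from /proc/meminfo-style text, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # the kernel reports this field in kB (KiB)
            return kib / 1024**2
    raise ValueError("MemAvailable field not found")


def has_enough_memory(meminfo_text: str, required_gib: float = 38.0) -> bool:
    """True if available RAM meets the threshold (38 is taken from the README;
    reading it as GiB of system RAM is an assumption)."""
    return mem_available_gib(meminfo_text) >= required_gib


# Usage on Linux:
#   with open("/proc/meminfo") as f:
#       print(has_enough_memory(f.read()))
```

If the requirement is in fact GPU memory, the same threshold check would need to read the GPU's free memory instead.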

Initial Setup

Configuring the Working Directory: Set ROOT_DIR as an environment variable before launching the pipeline, for example with echo 'export ROOT_DIR=Your_ROOT_DIR' >> ~/.bashrc. Then open a new shell or run source ~/.bashrc so the variable takes effect.
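Because a missing ROOT_DIR may only surface partway through a run, it can help to verify it up front. The following is a minimal sketch, not code from the repository. The helper name resolve_root_dir is hypothetical, and the check that ROOT_DIR must be an existing directory is an assumption.

```python
import os
from pathlib import Path


def resolve_root_dir(env=None) -> Path:
    """Return ROOT_DIR as a Path, or raise if it is unset or not a directory."""
    env = os.environ if env is None else env
    value = env.get("ROOT_DIR", "").strip()
    if not value:
        raise RuntimeError("ROOT_DIR is not set; export it before running the pipeline")
    root = Path(value).expanduser()
    if not root.is_dir():
        raise RuntimeError(f"ROOT_DIR does not point to a directory: {root}")
    return root


# Usage: call resolve_root_dir() before launching a run; it raises early
# with a readable message instead of failing inside the pipeline.
```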

Reconstruction and Evaluation

Reconstructing Individual Objects: To process a specific object, run:

sh run_pipeline.sh GSO_demo OBJECT_NAME INPUT_VIEWS GPU_INDEX

For instance:

sh run_pipeline.sh GSO_demo Crosley_Alarm_Clock_Vintage_Metal 5 0
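Since reconstruction quality depends on the number of input views, it can be useful to run the same object with several view counts. The sketch below only wraps the command shown above, with the arguments in the same order. The helper names and the example view counts are hypothetical, and the README does not state which view counts are supported.

```python
import subprocess


def build_command(object_name, input_views, gpu_index, object_type="GSO_demo"):
    """Build the argv for run_pipeline.sh in the order shown in the README."""
    if input_views < 1:
        raise ValueError("input_views must be at least 1")
    return ["sh", "run_pipeline.sh", object_type, object_name,
            str(input_views), str(gpu_index)]


def run_sweep(object_name, view_counts, gpu_index=0, dry_run=True):
    """Run the pipeline once per view count; with dry_run, only return the commands."""
    commands = [build_command(object_name, n, gpu_index) for n in view_counts]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # stop at the first failing run
    return commands


# Usage (view counts are illustrative, not values confirmed by the README):
#   run_sweep("Crosley_Alarm_Clock_Vintage_Metal", [2, 3, 5], dry_run=False)
```

The dry_run default returns the commands without executing them, which makes it easy to inspect a sweep before committing GPU time.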

Batch Processing: To execute the pipeline for all examples in the dataset/data/train/GSO_demo directory, please run:

python run_pipeline.py --object_type GSO_demo

Results and Numbers

The pipeline writes its outputs to the following locations:

2D NVS Outputs: Stored in the directory camerabooth/experiments_nvs/GSO_demo.
3D NVS Outputs: Stored in folders with names like 3D_Recon/threestudio/experiments_GSO_demo_view_5_nerf.
Evaluation Metrics: Quantitative results are stored in the results folder.
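To locate outputs programmatically, the paths above can be assembled from the object type and view count. This is a sketch under assumptions: the naming pattern experiments_{object_type}_view_{input_views}_nerf is inferred from the single example given and may not hold for every configuration. The helper name is hypothetical, and whether the paths are relative to the repository root or to ROOT_DIR is not stated.

```python
from pathlib import Path


def expected_outputs(object_type="GSO_demo", input_views=5):
    """Return the output locations listed in the README as relative paths.

    The 3D folder name pattern is inferred from one example and is an assumption.
    """
    return {
        "nvs_2d": Path("camerabooth/experiments_nvs") / object_type,
        "nvs_3d": Path("3D_Recon/threestudio")
        / f"experiments_{object_type}_view_{input_views}_nerf",
        "metrics": Path("results"),
    }


# Usage: report which of the expected locations exist after a run.
#   for name, path in expected_outputs().items():
#       print(name, path, path.exists())
```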

For reproducibility, results for all test objects are provided in results_standard/GSO_demo. Because a full run is computationally expensive (8 A100 GPUs for 1-2 days), we suggest processing only a subset of the data. A subset is enough to verify your setup and to compare your numbers against the provided results.

To generate tables that summarize the numbers across the different settings, run:

python results_standard/run/summarize.py

Gradio Demo

To reconstruct in-the-wild objects through a Gradio interface, run gradio demo/sap3d/app.py. Results can take up to an hour.

Citation

If you find our work inspiring or use our codebase in your research, please consider giving a star ⭐ and a citation.

@inproceedings{han2024more,
  title={The More You See in 2D the More You Perceive in 3D},
  author={Han, Xinyang and Gao, Zelin and Kanazawa, Angjoo and Goel, Shubham and Gandelsman, Yossi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={20912--20922},
  year={2024}
}
