
Commit e6f923c

release
1 parent f23bae2 commit e6f923c


56 files changed (+7338 additions, −3 deletions)

.gitignore

Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
code/run_logs

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

exps/*
exps*
evals*
data/DTU
data/BlendedMVS
data/Replica
data/tnt_advanced
data/

code/tmp_build

code/.idea/
.DS_Store
._.DS_Store
.idea/

*.png
*.ply
*.txt
*.jpg
*.npy
*.npz
*.tar
uploadtnt_*/

*.json
*.csv
dtu_eval/Offical_DTU_Dataset/
media/

README.md

Lines changed: 120 additions & 3 deletions
@@ -28,16 +28,133 @@ We demonstrate that state-of-the-art depth and normal cues extracted from monocu
</p>
<br>

- ### Code coming soon
# Setup

## Installation
Clone the repository and create an anaconda environment called monosdf using:
```
git clone git@github.com:autonomousvision/monosdf.git
cd monosdf

conda create -y -n monosdf python=3.8
conda activate monosdf

conda install pytorch torchvision cudatoolkit=11.3 -c pytorch
conda install cudatoolkit-dev=11.3 -c conda-forge

pip install -r requirements.txt
```
The hash encoder will be compiled on the fly when you run the code.

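For reference, such on-the-fly builds typically go through PyTorch's JIT extension loader; below is a minimal sketch of that general mechanism (the source file names are hypothetical, not the repo's actual layout — see the repository for what actually gets compiled):
```python
# Generic on-the-fly CUDA extension build with PyTorch's JIT loader.
# The file names here are hypothetical placeholders.
from torch.utils.cpp_extension import load

hash_encoder = load(
    name="hash_encoder",
    sources=["hash_encoder.cpp", "hash_encoder_kernel.cu"],  # hypothetical
    extra_cuda_cflags=["-O3"],
    verbose=True,  # print the ninja build log on the first run
)
```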
## Dataset
To download the preprocessed data, run the following script. The data for DTU, Replica, and Tanks and Temples is adapted from [VolSDF](https://github.com/lioryariv/volsdf), [Nice-SLAM](https://github.com/cvg/nice-slam), and [Vis-MVSNet](https://github.com/jzhangbs/Vis-MVSNet), respectively.
```
bash scripts/download_dataset.sh
```

# Training

Run the following command to train monosdf:
```
cd ./code
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node 1 --nnodes=1 --node_rank=0 training/exp_runner.py --conf CONFIG --scan_id SCAN_ID
```
where CONFIG is a config file in `code/confs` and SCAN_ID is the id of the scene to reconstruct.

We provide example commands for training on the DTU, ScanNet, and Replica datasets:
```
# DTU scan65
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node 1 --nnodes=1 --node_rank=0 training/exp_runner.py --conf confs/dtu_mlp_3views.conf --scan_id 65

# ScanNet scan 1 (scene_0050_00)
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node 1 --nnodes=1 --node_rank=0 training/exp_runner.py --conf confs/scannet_mlp.conf --scan_id 1

# Replica scan 1 (room0)
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node 1 --nnodes=1 --node_rank=0 training/exp_runner.py --conf confs/replica_mlp.conf --scan_id 1
```

We created individual config files for the Tanks and Temples dataset, so you don't need to set the scan_id. Run training on the courtroom scene with:
```
CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node 1 --nnodes=1 --node_rank=0 training/exp_runner.py --conf confs/tnt_mlp_1.conf
```

We also generated high-resolution monocular cues for the courtroom scene; it's better to train on them with more GPUs. First, download the dataset:
```
bash scripts/download_highres_TNT.sh
```

Then run training with 8 GPUs:
```
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node 8 --nnodes=1 --node_rank=0 training/exp_runner.py --conf confs/tnt_highres_grids_courtroom.conf
```
Of course, you can also train all other scenes with multiple GPUs.
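As background, a script started through `torch.distributed.launch` is spawned once per GPU; the sketch below shows the generic per-process setup such a launcher expects (illustrative only, not the contents of `training/exp_runner.py`):
```python
# Generic per-process setup for torch.distributed.launch (illustrative).
import argparse
import torch
import torch.distributed as dist

parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)  # injected by the launcher
args = parser.parse_args()

torch.cuda.set_device(args.local_rank)   # bind this process to one GPU
dist.init_process_group(backend="nccl")  # rank/world size come from env vars
print(f"rank {dist.get_rank()} of {dist.get_world_size()} ready")
```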

# Evaluations

## DTU
First, download the ground-truth DTU point clouds:
```
bash scripts/download_dtu_ground_truth.sh
```
Then you can evaluate the quality of the extracted meshes (take scan 65, for example):
```
python evaluate_single_scene.py --input_mesh scan65_mesh.ply --scan_id 65 --output_dir dtu_scan65
```

We also provide a script for evaluating all DTU scenes:
```
python evaluate.py
```
Evaluation results will be saved to `evaluation/DTU.csv` by default; please check the script for more details.
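For intuition, DTU-style mesh evaluation boils down to a Chamfer-type distance between points sampled from the predicted mesh and the ground-truth point cloud; here is a minimal sketch of that core metric (illustrative — the repo's evaluation script additionally handles details such as masks, culling, and downsampling):
```python
# Core of a Chamfer-style mesh-to-point-cloud metric (illustrative).
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(pred_pts: np.ndarray, gt_pts: np.ndarray) -> float:
    acc, _ = cKDTree(gt_pts).query(pred_pts)    # accuracy: pred -> GT
    comp, _ = cKDTree(pred_pts).query(gt_pts)   # completeness: GT -> pred
    return 0.5 * (acc.mean() + comp.mean())
```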

## Replica
Evaluate one scene (take scan 1, room0, for example):
```
cd replica_eval
python evaluate_single_scene.py --input_mesh replica_scan1_mesh.ply --scan_id 1 --output_dir replica_scan1
```

We also provide a script for evaluating all Replica scenes:
```
cd replica_eval
python evaluate.py
```
Please check the script for more details.

## ScanNet
```
cd scannet_eval
python evaluate.py
```
Please check the script for more details.

## Tanks and Temples
You need to submit the reconstruction results to the [official evaluation server](https://www.tanksandtemples.org); please follow their guidance. We also provide an example of our submission [here](https://drive.google.com/file/d/1Cr-UVTaAgDk52qhVd880Dd8uF74CzpcB/view?usp=sharing) for reference.

# Custom dataset
We provide an example of how to preprocess ScanNet into the MonoSDF format. First, run the script to subsample training images, normalize camera poses, etc.:
```
cd preprocess
python scannet_to_monosdf.py
```
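For intuition, "normalize camera poses" here usually means recentering and rescaling the world so the scene fits the bounded region the implicit network models; a minimal sketch under that assumption (not the exact logic of `scannet_to_monosdf.py`):
```python
# Illustrative pose normalization: recenter camera centers and scale them
# to fit inside a sphere of the given radius (the repo's preprocessing
# script may differ in details, e.g. using scene bounds instead).
import numpy as np

def normalize_poses(c2w: np.ndarray, radius: float = 3.0) -> np.ndarray:
    centers = c2w[:, :3, 3]                 # (N, 4, 4) poses -> (N, 3) centers
    offset = centers.mean(axis=0)           # recenter at the origin
    scale = radius / np.linalg.norm(centers - offset, axis=1).max()
    out = c2w.copy()
    out[:, :3, 3] = (centers - offset) * scale
    return out
```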

Then we can extract monocular depths and normals (please install the [omnidata model](https://github.com/EPFL-VILAB/omnidata) before running the commands):
```
python extract_monocular_cues.py --task depth --img_path ../data/custom/scan1 --output_path ../data/custom/scan1 --omnidata_path YOUR_OMNIDATA_PATH --pretrained_models PRETRAINED_MODELS
python extract_monocular_cues.py --task normal --img_path ../data/custom/scan1 --output_path ../data/custom/scan1 --omnidata_path YOUR_OMNIDATA_PATH --pretrained_models PRETRAINED_MODELS
```
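Assuming the cues are written next to the images as per-frame NumPy arrays (the exact file names depend on the extraction script, so treat the paths below as hypothetical), a quick sanity check might look like:
```python
# Hypothetical sanity check of extracted monocular cues; the file names
# are assumptions, not guaranteed by extract_monocular_cues.py.
import numpy as np

depth = np.load("../data/custom/scan1/000000_depth.npy")
normal = np.load("../data/custom/scan1/000000_normal.npy")
print("depth:", depth.shape, depth.min(), depth.max())  # relative depth map
print("normal:", normal.shape)                          # per-pixel normals
```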


# Acknowledgements
This project is built upon [VolSDF](https://github.com/lioryariv/volsdf). We use pretrained [Omnidata](https://omnidata.vision) models for monocular depth and normal extraction. The CUDA implementation of multi-resolution hash encoding is based on [torch-ngp](https://github.com/ashawkey/torch-ngp). Evaluation scripts for DTU, Replica, and ScanNet are taken from [DTUeval-python](https://github.com/jzhangbs/DTUeval-python), [Nice-SLAM](https://github.com/cvg/nice-slam), and [manhattan-sdf](https://github.com/zju3dv/manhattan_sdf), respectively. We thank all the authors for their great work and repos.

- <br>

# Citation
If you find our code or paper useful, please cite
```bibtex
@article{Yu2022MonoSDF,
  author  = {Yu, Zehao and Peng, Songyou and Niemeyer, Michael and Sattler, Torsten and Geiger, Andreas},
  title   = {MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction},
- journal = {arXiv:2022.00665},
  journal = {Advances in Neural Information Processing Systems (NeurIPS)},
  year    = {2022},
}
```

code/confs/dtu_grids_3views.conf

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
train{
    expname = dtu_grids_3views
    dataset_class = datasets.scene_dataset.SceneDatasetDN
    model_class = model.network.MonoSDFNetwork
    loss_class = model.loss.MonoSDFLoss
    learning_rate = 5.0e-4
    lr_factor_for_grid = 1.0
    num_pixels = 1024
    checkpoint_freq = 100
    plot_freq = 10
    split_n_pixels = 1024
}
plot{
    plot_nimgs = 1
    resolution = 512
    grid_boundary = [-1.2, 1.2]
}
loss{
    rgb_loss = torch.nn.L1Loss
    eikonal_weight = 0.1
    smooth_weight = 0.005
    depth_weight = 0.1
    normal_l1_weight = 0.05
    normal_cos_weight = 0.05
    end_step = 12800
}
dataset{
    data_dir = DTU
    img_res = [384, 384]
    scan_id = 65
    center_crop_type = center_crop_for_dtu
    num_views = 3
}
model{
    feature_vector_size = 256
    scene_bounding_sphere = 5.0

    Grid_MLP = True

    implicit_network
    {
        d_in = 3
        d_out = 1
        dims = [ 256, 256]
        geometric_init = True
        bias = 0.6
        skip_in = [4]
        weight_norm = True
        multires = 6
        use_grid_feature = True
        divide_factor = 5.0 # 1.5 for replica, 6 for dtu, 3.5 for tnt, 1.5 for bmvs, we need it to normalize the points range for multi-res grid
    }
    rendering_network
    {
        mode = idr
        d_in = 9
        d_out = 3
        dims = [ 256, 256] #, 256, 256]
        weight_norm = True
        multires_view = 4
    }
    density
    {
        params_init{
            beta = 0.1
        }
        beta_min = 0.0001
    }
    ray_sampler
    {
        near = 2.0
        N_samples = 64
        N_samples_eval = 128
        N_samples_extra = 32
        eps = 0.1
        beta_iters = 10
        max_total_iters = 5
    }
}
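The `loss{}` block above corresponds to a weighted sum of the reconstruction terms; schematically (illustrative only — see `model/loss.py`, the configured `loss_class`, for the real implementation, which for example aligns scale and shift before comparing rendered depth to the monocular depth cue):
```python
# Schematic combination of the loss weights configured above (illustrative).
def total_loss(rgb_l1, eikonal, smooth, depth, normal_l1, normal_cos):
    return (rgb_l1
            + 0.1 * eikonal       # eikonal_weight
            + 0.005 * smooth      # smooth_weight
            + 0.1 * depth         # depth_weight
            + 0.05 * normal_l1    # normal_l1_weight
            + 0.05 * normal_cos)  # normal_cos_weight
```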

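These `.conf` files use HOCON syntax; assuming the `pyhocon` package (an assumption here — VolSDF-derived code commonly uses it, but check `requirements.txt`), one can be inspected like this:
```python
# Sketch: parse the HOCON config with pyhocon (assumed dependency).
from pyhocon import ConfigFactory

conf = ConfigFactory.parse_file("code/confs/dtu_grids_3views.conf")
print(conf.get_float("train.learning_rate"))  # 0.0005
print(conf.get_list("dataset.img_res"))       # [384, 384]
print(conf.get_string("loss.rgb_loss"))       # torch.nn.L1Loss
```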