GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
Xiao Fu*, Wei Yin*, Mu Hu*, Kaixuan Wang, Yuexin Ma, Ping Tan, Shaojie Shen, Dahua Lin† , Xiaoxiao Long†
* Equal contribution; † Corresponding authors
arXiv Preprint, 2024
We tested our code in the following environment: Ubuntu 22.04, Python 3.9.18, CUDA 11.8.
- Clone this repository.
git clone git@github.com:fuxiao0719/GeoWizard.git
cd GeoWizard
- Install packages
conda create -n geowizard python=3.9
conda activate geowizard
pip install -r requirements.txt
cd geowizard
Place your images in a directory, for example input/example (where we have prepared several cases), and run the following inference command. The depth and normal outputs will be stored in output/example.
python run_infer.py \
--input_dir ${input path} \
--output_dir ${output path} \
--ensemble_size ${ensemble size} \
--denoise_steps ${denoising steps} \
--domain ${data type}
# e.g.
python run_infer.py \
--input_dir input/example \
--output_dir output \
--ensemble_size 3 \
--denoise_steps 10 \
--domain "indoor"
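For scripted or batch runs, the command above can also be assembled from Python. The following is a minimal sketch, not part of the GeoWizard codebase: `build_infer_command` and `VALID_DOMAINS` are hypothetical names introduced here, and the sketch only uses the flags documented in this README. It assumes it is launched from the geowizard directory, where run_infer.py lives.

```python
import subprocess

# Domain values listed in this README.
VALID_DOMAINS = ("indoor", "outdoor", "object")


def build_infer_command(input_dir, output_dir, ensemble_size=3,
                        denoise_steps=10, domain="indoor"):
    """Return the argument list for run_infer.py.

    Hypothetical helper for illustration; the defaults mirror the
    defaults stated in this README.
    """
    if domain not in VALID_DOMAINS:
        raise ValueError(f"domain must be one of {VALID_DOMAINS}, got {domain!r}")
    if ensemble_size < 1 or denoise_steps < 1:
        raise ValueError("ensemble_size and denoise_steps must be >= 1")
    return [
        "python", "run_infer.py",
        "--input_dir", str(input_dir),
        "--output_dir", str(output_dir),
        "--ensemble_size", str(ensemble_size),
        "--denoise_steps", str(denoise_steps),
        "--domain", domain,
    ]


if __name__ == "__main__":
    cmd = build_infer_command("input/example", "output")
    print(" ".join(cmd))
    # Uncomment to actually launch inference (requires the installed environment):
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces, and validating the domain up front catches typos before the model is loaded.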
Inference settings:
- --domain: Data type. Options: "indoor", "outdoor", and "object". Note that "object" works best for background-free objects, such as those in Objaverse. We find that "indoor" suits most scenarios. Default: "indoor".
- --ensemble_size and --denoise_steps: Trade-off arguments between speed and accuracy. More ensemble members and denoising steps give higher accuracy at the cost of slower inference. Defaults: 3 and 10.
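To get a feel for the speed side of this trade-off, the sketch below estimates relative inference cost. It rests on an assumption not stated in this README: that each ensemble member runs the full denoising schedule, so the number of denoiser passes per image is roughly ensemble_size × denoise_steps. The function names are hypothetical, and actual runtime also depends on batching, resolution, and hardware.

```python
def denoiser_passes(ensemble_size=3, denoise_steps=10):
    """Rough count of denoiser passes per image.

    Assumes every ensemble member runs all denoising steps
    (an assumption for illustration, not a measured figure).
    """
    return ensemble_size * denoise_steps


def relative_cost(ensemble_size, denoise_steps):
    """Cost relative to the README defaults (ensemble_size=3, denoise_steps=10)."""
    return denoiser_passes(ensemble_size, denoise_steps) / denoiser_passes()


if __name__ == "__main__":
    for ens, steps in [(1, 10), (3, 10), (10, 50)]:
        print(f"ensemble_size={ens}, denoise_steps={steps}: "
              f"{denoiser_passes(ens, steps)} passes, "
              f"{relative_cost(ens, steps):.2f}x the default cost")
```

Under this assumption, a quick preview with a single ensemble member costs about a third of the default setting, while large values of both arguments multiply together.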
- Add inference code for 3D reconstruction.
- Add training code.
- Test on more local environments.
We also encourage readers to follow these exciting concurrent works.
- Marigold: a finetuned diffusion model for estimating monocular depth.
- Wonder3D: generate multi-view normal maps and color images and reconstruct high-fidelity textured mesh.
- HyperHuman: a latent structural diffusion and a structure-guided refiner for high-resolution human generation.
- GenPercept: a finetuned UNet for many downstream image understanding tasks.
@article{fu2024geowizard,
title={GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image},
author={Fu, Xiao and Yin, Wei and Hu, Mu and Wang, Kaixuan and Ma, Yuexin and Tan, Ping and Shen, Shaojie and Lin, Dahua and Long, Xiaoxiao},
journal={arXiv preprint arXiv:2403.12013},
year={2024}
}