ZeroRF: Fast Sparse View 360° Reconstruction with Zero Pretraining
CVPR 2024
UC San Diego
*Equal contribution



ZeroRF performs novel view synthesis from only a few views (6 in the figure) with high quality and at high speed: it reaches competitive results within 2 minutes and finishes in around 25 minutes at the full $800^2$ resolution. At resolutions common in 3D generation applications, such as $256^2$ or $320^2$, ZeroRF reconstructs an object from sparse-view generations in only 30 seconds.

6-View Reconstruction Speed Comparison

Abstract

We present ZeroRF, a novel per-scene optimization method addressing the challenge of sparse-view 360° reconstruction in neural field representations. Recent breakthroughs such as Neural Radiance Fields (NeRF) demonstrate high-fidelity image synthesis but struggle with sparse input views. Existing methods, such as Generalizable NeRFs and per-scene optimization approaches, face limitations in data dependency, computational cost, and generalization across diverse scenarios. To overcome these challenges, we propose ZeroRF, whose key idea is to integrate a tailored Deep Image Prior into a factorized NeRF representation. Unlike traditional methods, ZeroRF parametrizes feature grids with a neural network generator, enabling efficient sparse-view 360° reconstruction without any pretraining or additional regularization. Extensive experiments showcase ZeroRF's versatility and superiority in terms of both quality and speed, achieving state-of-the-art results on benchmark datasets. ZeroRF's significance extends to applications in 3D content generation and editing.

Method


Architecture of ZeroRF. It parametrizes the TensoRF-VM tensors with randomly initialized deep generator networks (Sec. 4.3). The input to each network is Gaussian noise that is sampled at the start of training and then kept frozen. The system performs per-scene optimization using the standard volume rendering procedure with a plain rendering loss.
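To make the caption concrete, the following NumPy sketch shows how the pieces could fit together. It is a toy illustration under stated assumptions, not the authors' implementation: a two-layer per-cell MLP stands in for the deep generator (so it does not reproduce the spatial prior of a real generator network), grid lookup is nearest-voxel rather than interpolated, softplus is assumed for the density activation, and mean squared error is assumed for the "plain rendering loss". All names and sizes (`RES`, `RANK`, `build_factors`, `density_at`, `composite`, `rendering_loss`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
RES, RANK, NOISE_CH, HIDDEN = 16, 4, 8, 32  # hypothetical sizes, chosen for illustration

def init_generator(in_ch, hidden, out_ch):
    """Random weights for a tiny stand-in generator (assumption: the real one is a deep network)."""
    w1 = rng.normal(0.0, in_ch ** -0.5, (in_ch, hidden))
    w2 = rng.normal(0.0, hidden ** -0.5, (hidden, out_ch))
    return w1, w2

def generate(params, noise):
    """Map a noise tensor (..., in_ch) to a feature tensor (..., out_ch)."""
    w1, w2 = params
    return np.maximum(noise @ w1, 0.0) @ w2  # linear, ReLU, linear

# Frozen Gaussian noise: sampled once here and never updated afterwards.
plane_noise = [rng.normal(size=(RES, RES, NOISE_CH)) for _ in range(3)]
line_noise = [rng.normal(size=(RES, NOISE_CH)) for _ in range(3)]
# Generator weights: the only quantities an optimizer would update.
plane_gens = [init_generator(NOISE_CH, HIDDEN, RANK) for _ in range(3)]
line_gens = [init_generator(NOISE_CH, HIDDEN, RANK) for _ in range(3)]

def build_factors():
    """Produce the VM factors: three planes (XY, XZ, YZ) and three lines (Z, Y, X)."""
    planes = [generate(g, n) for g, n in zip(plane_gens, plane_noise)]
    lines = [generate(g, n) for g, n in zip(line_gens, line_noise)]
    return planes, lines

def density_at(planes, lines, i, j, k):
    """VM density at voxel (i, j, k); nearest-voxel lookup instead of interpolation."""
    feat = (planes[0][i, j] * lines[0][k]      # XY plane with Z line
            + planes[1][i, k] * lines[1][j]    # XZ plane with Y line
            + planes[2][j, k] * lines[2][i])   # YZ plane with X line
    return np.logaddexp(0.0, feat.sum())       # softplus keeps density non-negative (assumption)

def composite(sigmas, colors, deltas):
    """Standard volume rendering quadrature along one ray."""
    alphas = 1.0 - np.exp(-sigmas * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))  # transmittance
    weights = trans * alphas
    return weights @ colors, weights

def rendering_loss(pred_rgb, target_rgb):
    """Mean squared error between rendered and target colors (assumed loss form)."""
    return np.mean((pred_rgb - target_rgb) ** 2)
```

In this sketch the planes and lines are never stored as free parameters. They are regenerated from the frozen noise on each call, so gradients from the rendering loss would reach only the generator weights.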

6-View Reconstruction (25min)

Text / Image to 3D (30s)


Acknowledgements

The website template was borrowed from Lior Yariv.