Data Augmentation for NeRFs in the Low Data Limit

Northwestern University, Center for Robotics & Biosystems

Hallucinations, occlusions, and other visual artifacts are common when training with sparse data; only our data augmentation method generates reasonable scene reconstructions without them. a) shows ground truth, b) shows our method, c) shows hallucinations (data augmented by the FisherRF method), and d) shows white spots (data augmented by the Entropy method).

Abstract

Current methods based on Neural Radiance Fields fail in the low data limit, particularly when training on incomplete scene data.

Prior works augment training data only in next-best-view applications, which leads to hallucinations and model collapse with sparse data. In contrast, we propose adding a set of views during training by rejection sampling from a posterior uncertainty distribution, generated by combining a volumetric uncertainty estimator with spatial coverage. We validate our results on partially observed scenes; on average, our method performs 39.9% better with 87.5% less variability across established scene reconstruction benchmarks, as compared to state-of-the-art baselines. We further demonstrate that augmenting the training set by sampling from any distribution leads to better, more consistent scene reconstruction in sparse environments.

This work is foundational for robotic tasks where augmenting a dataset with informative data is critical in resource-constrained, a priori unknown environments.
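To make the sampling step above concrete, here is a minimal sketch of rejection sampling views from a posterior proportional to uncertainty times coverage. The scoring functions below are hypothetical placeholders (the paper's actual volumetric uncertainty estimator and spatial coverage term are not reproduced here), and views are reduced to scalars purely for illustration.

```python
import numpy as np

# Hypothetical stand-ins for the paper's two ingredients: a volumetric
# uncertainty estimate and a spatial-coverage score for a candidate view.
def volumetric_uncertainty(view):
    # Placeholder: e.g., could be mean per-ray entropy of the density field.
    return np.exp(-0.5 * (view - 2.0) ** 2)

def spatial_coverage(view, selected):
    # Placeholder: reward views far from those already selected.
    if not selected:
        return 1.0
    return min(abs(view - s) for s in selected) / 10.0 + 1e-3

def sample_views(candidates, n_views, rng):
    """Rejection-sample n_views from an (unnormalized) posterior
    proportional to uncertainty(view) * coverage(view, selected)."""
    selected = []
    while len(selected) < n_views:
        # Posterior changes as views are added, so rescore each round.
        scores = np.array([
            volumetric_uncertainty(v) * spatial_coverage(v, selected)
            for v in candidates
        ])
        m = scores.max()
        i = rng.integers(len(candidates))  # uniform proposal
        # Accept with probability score / max score (classic rejection step).
        if candidates[i] not in selected and rng.random() < scores[i] / m:
            selected.append(candidates[i])
    return selected

rng = np.random.default_rng(0)
views = sample_views(np.linspace(0.0, 4.0, 9), n_views=3, rng=rng)
```

Rescoring after each accepted view is what pushes the sampled set toward both high-uncertainty and well-spread viewpoints, rather than clustering around a single uncertainty peak.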

See how our augmentation compares with others!

FisherRF vs. Ours · Entropy vs. Ours · Uniform vs. Ours

Method Comparisons

Evaluation results of standard image-quality metrics for our method and three SOTA baselines. Each metric was evaluated across the 200 images in the evaluation dataset for each of the three scenes. A higher score is better for PSNR and SSIM, and a lower score is better for LPIPS. We achieve the best median performance and the lowest interquartile range of any method across every scene, with the single exception of material SSIM vs. Entropy, where Entropy has a lower interquartile range. Our method performs better with statistical significance (p < 0.05 with a Bonferroni correction of 3), except for lego LPIPS vs. Uniform, chair LPIPS vs. FisherRF, and SSIM vs. Uniform and FisherRF.

Related Links

This work builds on a lot of excellent prior work.

Our architecture uses Nerfstudio, a great Python package for end-to-end NeRF training.

FisherRF introduces an idea similar to ours, but fails in the low data limit with partially observed scenes.

Lee et al. first introduced the idea of calculating entropy with NeRF architectures, but their approach is limited to modeling only in-distribution uncertainty.

There are probably many more by the time you are reading this. Check out Dr. Irshad's amazing list of Robotic NeRF papers, and this 2024 survey on NeRF papers in Robotics.