Plenoptic PNG: Real-time Neural Radiance Fields in 150KB

Jae Yong Lee1*, Yuqun Wu1, Chuhang Zou2, Derek Hoiem1, Shenlong Wang1
1UIUC 2Amazon Inc. *Currently at Apple Inc.


Abstract

The goal of Plenoptic PNG (PPNG) is to encode a 3D scene into an extremely compact representation from 2D images and to enable its transmission, decoding, and rendering in real time across various platforms. Despite the progress in NeRFs and Gaussian Splats, their large model size and specialized renderers make it challenging to distribute free-viewpoint 3D content as easily as images. To address this, we have designed a novel 3D representation that encodes the plenoptic function into sinusoidal-function-indexed dense volumes. This approach facilitates feature sharing across different locations, improving compactness over traditional spatial voxels. The memory footprint of the dense 3D feature grid can be further reduced using spatial decomposition techniques. This design combines the strengths of spatial hashing functions and voxel decomposition, resulting in a model size as small as 150 KB for each 3D scene. Moreover, PPNG features a lightweight rendering pipeline with only 300 lines of code that decodes its representation into standard GL textures and fragment shaders. This enables real-time rendering using the traditional GL pipeline, ensuring universal compatibility and efficiency across various platforms without additional dependencies.

Training

Training PPNG models requires a CUDA-capable device. We implemented PPNG-1, -2, and -3 on tiny-cuda-nn and ported them to the Instant-NGP platform. Training typically takes from about 5 minutes (PPNG-3) to 13 minutes (PPNG-1) to complete. We provide code to translate trained model weights (.ingp files) into our custom format (.ppng files). The translation requires little computation and completes almost instantly.

Usage

Once trained and translated, a PPNG model can be easily integrated into any browser supporting WebGL 2. All demo pages are built on the custom element ppng-viewer, which can be used as follows:

 <ppng-viewer src="/path_to_ppng_file.ppng" width="400" height="400"></ppng-viewer>
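The viewer can also be embedded programmatically. A minimal sketch of a helper that builds the same markup as shown above (this helper is our own illustration, not part of the PPNG API; it assumes the viewer script registering the ppng-viewer element has already been loaded):

```typescript
// Hypothetical helper (not part of the PPNG distribution): builds the
// embed markup for a viewer, mirroring the attributes shown above.
function ppngViewerTag(src: string, width = 400, height = 400): string {
  return `<ppng-viewer src="${src}" width="${width}" height="${height}"></ppng-viewer>`;
}
```

For example, `ppngViewerTag("/scenes/lego.ppng")` produces a 400x400 viewer tag pointing at that file, which can then be inserted into the page (e.g. via innerHTML) once the viewer script is loaded.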

Real-Time Demo

We provide real-time demos of PPNG on various datasets. Each page offers 3 models with different quality and performance trade-offs.

  • P1 represents PPNG-1, our lightest, CP-decomposed version of PPNG-3 (128 KB model size).
  • P2 represents PPNG-2, an intermediate, tri-plane-decomposed version of PPNG-3 (2.46 MB model size).
  • P3 represents the original PPNG-3 without any decomposition (32.8 MB model size).
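The size gap between the three variants follows from how each one decomposes the dense feature volume: a full 3D grid scales as N^3, three 2D planes as N^2, and three 1D vectors (CP-style) as N. A rough sketch of this parameter-count arithmetic, with illustrative resolution N and feature dimension F (not the exact PPNG configuration):

```typescript
// Rough parameter counts for a feature volume at resolution N with F
// features per entry. Illustrative only; PPNG's actual layouts differ.
function fullGridParams(n: number, f: number): number {
  return n * n * n * f; // dense 3D grid: O(N^3), like P3
}
function triPlaneParams(n: number, f: number): number {
  return 3 * n * n * f; // three axis-aligned 2D planes: O(N^2), like P2
}
function cpParams(n: number, f: number): number {
  return 3 * n * f; // three 1D vectors (CP-style): O(N), like P1
}
```

At N = 128 and F = 4, for instance, this gives 8,388,608 parameters for the full grid versus 196,608 for tri-plane and 1,536 for CP, which mirrors the MB-to-KB gap between P3, P2, and P1 above.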

Each model includes an RLE-encoded occupancy-grid cache, which is around 15–100 KB for object datasets and 400–500 KB for 360 datasets.
(Note that P3 is not available for some scenes due to size limits on GitHub Pages.)
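Run-length encoding compresses the occupancy grid well because occupancy is binary and spatially coherent, so long runs of empty or occupied cells collapse to a few pairs. A sketch of the idea on a flattened binary grid (the actual byte layout of the .ppng cache is not specified here):

```typescript
// Run-length encode a flattened binary occupancy grid into [value, count]
// pairs. Illustrates the technique only; PPNG's on-disk format may differ.
function rleEncode(bits: number[]): [number, number][] {
  const runs: [number, number][] = [];
  for (const b of bits) {
    const last = runs[runs.length - 1];
    if (last && last[0] === b) {
      last[1] += 1; // extend the current run
    } else {
      runs.push([b, 1]); // start a new run
    }
  }
  return runs;
}

// Inverse: expand [value, count] pairs back into the flat bit array.
function rleDecode(runs: [number, number][]): number[] {
  return runs.flatMap(([value, count]) => Array(count).fill(value));
}
```

For example, `rleEncode([0, 0, 0, 1, 1, 0])` yields `[[0, 3], [1, 2], [0, 1]]`, and `rleDecode` recovers the original array.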

To demonstrate our efficiency, we provide 2 different ways to load the models. The All Objects section loads all objects in a dataset at the same time, 4 to 8 objects simultaneously depending on the dataset.
(Warning: this may take a while to load for P3!)
The Single Object section loads a single object from each dataset at a larger render size.

All Objects

Loads all objects in the dataset at the same time!
Synthetic NeRF
P1 P2 P3
Synthetic NSVF
P1 P2 P3
Blended MVS
P1 P2 P3
Tanks and Temples
P1 P2 P3
MIPNeRF 360
P1 P2 P3

Single Object

Loads a single object at a larger render size.