- A novel framework for volumetric scene prefiltering and level-of-detail rendering.
- Accurate far-field aggregated voxel appearance with an efficient factorization.
- Preservation of both local and global spatial correlation for accurate visibility.

Level-of-detail (LoD) rendering is a classic topic in computer graphics. The basic idea is simple: when we render a large scene, it is often too expensive (and unnecessary) to render the full version of it. Instead, it is better to render only a simplified version, either to match the image resolution or to meet some other budget. To achieve this, we need to build simplified versions of the original scene. A common technique is mesh simplification, and a modern example is the Nanite system in UE5. However, mesh simplification does not account for materials or appearance. Volumetric methods, like ours, consider both geometry and materials simultaneously by converting the scene into volumes at different resolutions. During rendering, a suitable level is picked with the goal of saving memory and computation while preserving the original appearance.
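As a rough sketch of how a level could be picked to match the pixel footprint (function and parameter names are ours for illustration, not from any particular engine):

```python
import math

def pick_lod_level(footprint_world, finest_voxel_size, num_levels):
    """Pick the LoD whose voxel size matches the projected pixel footprint.

    footprint_world: world-space size of one pixel at the voxel's distance.
    Level 0 is the finest; each coarser level doubles the voxel size.
    """
    if footprint_world <= finest_voxel_size:
        return 0
    level = int(math.log2(footprint_world / finest_voxel_size))
    return min(num_levels - 1, level)
```

For example, a footprint four times the finest voxel size selects level 2, and a footprint below the finest voxel size falls back to level 0.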

We voxelize the scene at multiple resolutions so that, given a pixel footprint, we can find the resolution that matches it. For each voxel, we model its far-field appearance with the Aggregated BSDF (or ABSDF). Under some assumptions, we propose a closed-form factorization of the ABSDF. Because a ray traverses multiple voxels, we need to accumulate the outgoing radiance from all of them, which requires handling the spatial correlation that exists at different ranges. To preserve local spatial correlation, we use a truncated ellipsoid primitive that describes the intra-voxel geometric distribution. To preserve long-range correlation, we precompute global aggregated visibility functions. Together, these components lead to accurate voxel accumulation.

The general light transport of a voxel can be described as an 8D function of incident/outgoing positions and directions. We apply the far-field assumption, meaning that
cameras and light sources are sufficiently far away, so we can safely drop the positional dependency. The resulting far-field appearance is a 4D function which
we call the **Aggregated Bidirectional Scattering Distribution Function** (ABSDF).
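In symbols (our notation, for illustration), with positions and directions each parameterized in 2D, the reduction reads:

```latex
% Full voxel light transport: 8D in incident/outgoing positions and directions
S(\mathbf{x}_i, \omega_i, \mathbf{x}_o, \omega_o)
\;\xrightarrow{\text{far field}}\;
f_{\mathrm{ABSDF}}(\omega_i, \omega_o)
% The far-field assumption drops \mathbf{x}_i and \mathbf{x}_o,
% leaving a 4D function of directions only.
```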

In the paper, we show exactly how to derive the ABSDF. In general, the ABSDF can be complicated, so we present a factorization that allows efficient evaluation (and importance sampling).

It is kinda crazy to think about it, but why are voxels always cubes? Essentially, a cube makes the simplest assumption: that geometry is uniformly distributed inside each voxel. However, this assumption does not really hold when we abstract real geometry as voxels. Say we want to voxelize a big diagonal plane. Because the voxelized version has “thickness”, some rays will “double count” the contributions from more than one voxel, and this results in artifacts. Maybe we just don’t have enough resolution? Not really: the error persists at any finite resolution.

The fundamental problem is that geometries are not distributed uniformly inside a voxel. However, simple voxels fail to capture this information and result in
systematic error. To improve the accuracy of voxel accumulation, we fit a bounding ellipsoid for the geometries in each voxel. We define the new
primitive as the intersection of the cube and the ellipsoid, so we call it the **truncated ellipsoid primitive**. The new primitive is much more effective at
adapting to different geometric distributions: when the voxel includes a flat surface, the primitive now provides a much tighter fit; when the voxel includes
a lot of random geometries, it falls back to a cube. Of course, it’s not perfect, but it is a very cost-effective improvement.
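A sketch of how intersecting a ray with such a primitive could work (our own illustration, not the paper's implementation, and using an axis-aligned ellipsoid for simplicity where the fitted one would generally be oriented): intersect the ray with the cube and the ellipsoid separately, then intersect the two parameter intervals.

```python
import math

def ray_aabb(o, d, lo, hi):
    """Return (t_near, t_far) of ray o + t*d against an axis-aligned box."""
    t0, t1 = -math.inf, math.inf
    for k in range(3):
        if abs(d[k]) < 1e-12:  # ray parallel to this slab
            if not (lo[k] <= o[k] <= hi[k]):
                return None
            continue
        a = (lo[k] - o[k]) / d[k]
        b = (hi[k] - o[k]) / d[k]
        t0, t1 = max(t0, min(a, b)), min(t1, max(a, b))
    return (t0, t1) if t0 <= t1 else None

def ray_ellipsoid(o, d, c, r):
    """Return (t0, t1) against an axis-aligned ellipsoid (center c, radii r)."""
    # Scale space so the ellipsoid becomes the unit sphere, then solve
    # the standard quadratic for a ray-sphere intersection.
    oo = [(o[k] - c[k]) / r[k] for k in range(3)]
    dd = [d[k] / r[k] for k in range(3)]
    A = sum(x * x for x in dd)
    B = 2.0 * sum(oo[k] * dd[k] for k in range(3))
    C = sum(x * x for x in oo) - 1.0
    disc = B * B - 4.0 * A * C
    if disc < 0.0:
        return None
    s = math.sqrt(disc)
    return ((-B - s) / (2.0 * A), (-B + s) / (2.0 * A))

def ray_truncated_ellipsoid(o, d, lo, hi, c, r):
    """Intersection interval with (cube ∩ ellipsoid), or None on a miss."""
    box = ray_aabb(o, d, lo, hi)
    ell = ray_ellipsoid(o, d, c, r)
    if box is None or ell is None:
        return None
    t0, t1 = max(box[0], ell[0]), min(box[1], ell[1])
    return (t0, t1) if t0 <= t1 else None
```

Note how a thin ellipsoid rejects rays that the cube alone would have accepted, which is exactly the tighter fit for a flat surface.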

Compared to the reference, LoD with cube voxels results in artifacts on the red plane. The truncated ellipsoid primitives produce artifact-free results with a tighter silhouette.

The problem of spatial correlation also exists at long range. Consider the 2D example below:

We have two configurations. Both contain three voxels, and we observe from the left. If we only consider the “occupancy” of individual voxels, the two configurations are virtually identical: all we know is that the first two voxels have 50% occupancy and the last one has 100% occupancy. However, the two configurations end up with different results. In the first configuration, the first two voxels are negatively correlated because the green surface does not block the brown one at all. In the second configuration, they are positively correlated because the green surface perfectly blocks the brown surface. Note that regular exponential transmittance (aka “alpha blending”) cannot capture either case.
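A toy numeric version of this example (our own illustration): uncorrelated alpha blending predicts the same transmittance past the first two voxels in both configurations, while the true visibility differs.

```python
# Per-voxel occupancy alone: 0.5, 0.5, 1.0 in both configurations.

# Uncorrelated ("alpha blending") transmittance past the first two voxels:
alphas = [0.5, 0.5]
t_uncorrelated = 1.0
for a in alphas:
    t_uncorrelated *= (1.0 - a)
print(t_uncorrelated)  # 0.25, identical for both configurations

# Configuration 1: the two half-surfaces cover disjoint halves of the
# cross-section (negative correlation) -> together they block everything.
t_config1 = 0.0

# Configuration 2: the second surface sits exactly behind the first
# (positive correlation) -> together they block only half.
t_config2 = 0.5

print(t_config1, t_config2)  # neither equals the uncorrelated 0.25
```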

Our solution is to precompute the **global aggregated visibility**. For each voxel we record the average visibility from points on the surfaces inside a voxel
through the entire scene along a direction. In this way, we obtain true correlated visibility. However, this method comes at a price because global visibility
is high-dimensional. In the paper, we provide compression strategies to keep the memory requirement in check.
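The precomputation can be sketched as a Monte Carlo estimate (a minimal sketch with our own names; `occluded` is a hypothetical ray query against the whole scene, and the compression from the paper is omitted):

```python
import random

def aggregated_visibility(surface_points, direction, occluded, n_samples=256):
    """Estimate a voxel's average visibility along `direction`.

    surface_points: points sampled on the geometry inside the voxel.
    occluded(p, d): hypothetical callback that traces p + t*d through the
    ENTIRE scene and returns True if anything is hit for t > 0.
    """
    visible = 0
    for _ in range(n_samples):
        p = random.choice(surface_points)
        if not occluded(p, direction):
            visible += 1
    return visible / n_samples
```

Because the rays are traced through the whole scene rather than voxel by voxel, the estimate bakes in the true long-range correlation that per-voxel occupancy discards.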

We show accurate LoD rendering using our representation, closely matching the ground truth at all scales. More results and comparisons can be found in the paper and the supplemental video.

```
@article{zhou2024sceneagn,
  title={Efficient Scene Appearance Aggregation for Level-of-Detail Rendering},
  author={Yang Zhou and Tao Huang and Ravi Ramamoorthi and Pradeep Sen and Ling-Qi Yan},
  year={2024},
  eprint={2409.03761},
  archivePrefix={arXiv},
}
```