SIGGRAPH 2016 Course: Part 2

← Part 1

This course note is the second set of the 2016 SIGGRAPH course on Frequency Analysis of Light Transport. If you haven’t checked the previous part, I recommand you do so.

Representations of the Fourier Spectrum

Most of the application we will review here rely on different representation of the spectrum. Since it is not possible to compute it exactly, we always rely on approximation. The covariance is one of many representation that we can use.

For example it is possible to extract the axis aligned bounding box of the spectrum \([B_x, B_u]\) as done by Metha et al. [2012,2013,2014] and Yan et al. [2015]. Note that this correspond to the bandwidth of the spectrum. It is possible to be less conservative when the spectrum has an infinite bandwidth but not much energy in the tail of the distribution using a percentile of the spectrum as done by Bagher et al. [2011].

Another interesting representation arise when we want to study a spectrum that summarize many of the same shear operator (see Egan et al. [2009,2011,2011). For example blockers at different distance from us. The Fourier transform is then enclosed in a wedge shape defined by the min and max distance to the blockers.

Lastly, we can represent the spectrum using second order statistics such as the covariance matrix or its inverse known as the information matrix [Belcour et al. 2013].

I all those case, two methods are possible to determine the spectrum during rendering: either we can use a closed form method, or a tracing method. We will use a closed form method when the transport operators are known before hand (to study the last bounce with a BRDF assuming a directional light source for example). There, we determine the resulting spectrum by concatenating the operator and expressing them as a single expression. On the other hand, we will use the tracing method when the transport operators sequence is unknown (such as multiple bounces). There, we update the representation using the atomic operators as we perform rendering.

Adaptive Sampling & Reconstruction

Adaptive sampling & reconstruction algorithms adapt the Nyquist-Shannong sampling theorem to predict, from the Fourier analysis, a per pixel sampling rate and an associated reconstruction filter (for example in Durand et al. [2005], Soler et al. [2009]). In some other cases, we adapt the required sampling rate for integration (to integrate motion for example, see Egan et al. [2009] or Belcour et al. [2013]).

The sampling rate in image space is directly extracted from the bandwidth of the Fourier spectrum using the Nyquist rate: \(N = \frac{1}{2 B_i}\).

Note: \(B_i\) is the bandwidth in pixel space. To get it, we use the spatio-angular bandwidth \([B_x, B_u]\) and depending on the sensor type, we scale either the angular of the spatial component to convert frequency measured in \(m^{-1}\) to \(\mbox{pixel}^{-1}\). For the case of the pinehole camera, the convertion formula is:

\[B_i = \frac{\tan(f_H)}{H} B_u,\]

where \(f_H\) is the angular field of view in either width or height and \(H\) is either the screen width of height in pixels.

Example: Adaptive Lens Integration - In this example, we will see how we can render and image with defocus with a minimal number of samples. This example is similar to the method of Soler et al. [2009]. Using this example, we will show the different technique available to generate the picture. Indeed, there are multiple ways to use the frequency analysis to adapt the samples’ count and reconstruct the final image.

First, our setup is the following (and shown in the figure below), we have a sensor combining an array of pixels and a lens. This sensor looks at a scene illuminated by a light source. To generate a picture from this setup, we need to integrate every ray of light passing throught the lens and hitting a pixel. We can model this integration using the XU domain. Here for every point, the abscissa correspond to the sub-pixel position and the ordinate to the position on the lens.

Since the lens and the pixel array are plane parallel, we can look at the frequency content at the pixel array location. The direction componnent of the Fourier domain will correspond to the lens position.

Image space vs XYUVT space reconstruction - There exist two kind of reconstruction methods in the litterature: the one that perform a pixel filter on pixels (or the projected samples positions) [Durand et al. 2005, Belcour et al. 2013]; or the one that perform the reconstruction from the more global space of integration (we refer as XYUVT -for image, lens and time- here though it can cover some other integration domains) [Egan et al. 2009, Egan et al. 2011, and others]. Because the later work on a high dimensional domain, the reconstruction cost has to be accounted. It can be reduced using the method of Yan et al. [2015] that decompose the reconstruction into a set of 1D convolutions.

Sheared filters - It is to note that the wedge form produces the same filters as the covariance representation. Indeed, the resulting filter is the product of Gaussians along the various dimensions. But those Gaussian are correlated and can be expressed as a single Gaussian that would be resulting from the covariance analysis (see below and in Vaidyanathan et al. [2015]).

Mathematically, the filter \(h(\delta x, \delta y)\) is expressed as [Yan et al. 2015]:

\[h(x, y) = e^{- k_x (x - x_0)^2} e^{- k_y (y - y_0(x, x_0))^2},\]

where \(y_0 = \eta (x - x_0)\) is the sheared center of the second Gaussian. By expressing the product of those Gaussian, we can regroup the terms and express the filter using the following form:

\[h(x, y) = e^{- [x - x_0, y] \, \Sigma^{-1} \, [x - x_0, y]^T}, \; \mbox{where} \; \Sigma^{-1} = \left[\begin{array}{cc} k_x + k_y & \eta k_y \\ \eta k_y & k_y \end{array}\right].\]

This form is a Gaussian that can be extracted using the covariance formulation. All the optimization procedure described by Yan et al. [2015] to improve the efficiency of reconstruction can thus be applied to most of the anisotropic methods (Gaussian, Covariance).

Up-Sampling

Up-sampling is a particular case of adaptive sampling and reconstruction where shading samples are not evaluated at the same time, but in a hierarchical way (see Bagher et al. [2011]). The idea is to render an image using a mipmap, starting for the coarse scale to the finer scale and only computing pixels when the frequency content in it corresponds to the frequency content.

In up-sampling methods, the bandwidth of the Fourier spectrum is used to predict at what mipmap level a group pixel should be shaded. This strategy ensures that when merging the different levels, not pixel got unshaded. Image courtesy of Bagher et al. [2011]

This technique corresponds to splating pixel samples on the screen relative their frequency content. Using a pixel hierarchy ensures that no pixel will be left unshaded.

Lehtinen et al. [2011] applied splatting to reconstruct defocus and motion blur from a small set of samples in screen space. In terms of frequency analysis, this work assumes that the samples are emitted by diffuse surfaces and reproject them along the sheared line defined by motion and lens. Here, instead of splatting on a pixel quad, samples are splatted along planes in the combined domain of image space, time and lens (called XYUVT).

Density Estimation

Density estimation is a reconstruction method notoriously used in Photon Mapping and Progressive Photon Mapping to estimate approximately a density from a discrete set of samples. Traditionnaly, density estimation is done using a nearest neighbor search. Starting from a query point, all samples falling into a disk of radius \(r\) around the query point are used to estimate the reflected radiance.

Adaptive density estimation use a different radius for each query point (black cross) to gather photons and avoid overblurring of some features. For example, the caustic generated by the glass sphere here would be overblurred by non-adaptive density estimation.

Using Frequency Analysis, the radius using to gather samples can be adapted depending on the frequency content of samples aggregated. The optimal radius, for a threshold error \(\epsilon\), is defined by:

\[r_{\mbox{opt}} = \sqrt{ \frac{2 \epsilon}{\alpha \nabla^2(L)} },\]

where \(\alpha\) is a characteristic of the density estimation kernel and the Hessian of the radiance can be retreived from the Fourier transform.

This idea of adapting reconstruction kernels has been applied by Belcour et al. for surface based density [2011] and volumetric density estimation [2014]. Similarly, Kaplanyan and Dachsbacher [2013] also used the Hessian of the radiance to adapt the gathering radius. However, they used a fixed kernel density estimation to approximate the Hessian in the first place.

Antialiasing

In some case, when we are rendering high frequency elements such as texture or envmaps and we have to average to get the resulting appearance, it is possible to use an antialiasing (or prefiltering) model to avoid super sampling. This problem has a vast history in graphics. For example, Greene and Heckbert [1986] introduced elliptical filter to account for the pixel’s projection in texture space.

An antialiasing reflectance model works as follow: given a kernel defined over a surface footprint (here \(\mathcal{F}\)) we need to evaluate or approximate the integral of a spatially varying BSDF model over the spatially varying geometry (for example using a normal-map or a height-map). To do so, an antialiased BRDF model has an extra variable: the surface footprint.

Antialiased surface appearance approximate the integrated surface appearance of a high frequency surface. In this example case, a scratch normal map is averaged in the footprint \(\mathcal{F}\) to approximate the BRDF and avoid using supersampling to integrate it.

In terms of Fourier analysis, evaluating the antialiased BRDF is equivalent to do a low-pass filter to the finest scale BRDF. Concretly, in the Fourier domain, we are removing high frequency part of the finest scale BRDF. The limit at which frequency are cut off is given by the kernel. It is said that the kernel bandlimits the frequency content of the BRDF.

It is possible to bandlimit other signals than the BRDF. For example, Krivanek and Colbert [2008] showed that it was possible to bandlimit the envmap signal when doing importance sampling. They showed that the size of the kernel was directly linked to the importance function.