Mip-Splatting: Alias-Free 3D Gaussian Splatting (CVPR 2024 Paper)
Zehao Yu¹,², Anpei Chen†,¹,², Binbin Huang³, Torsten Sattler⁴, Andreas Geiger¹,²
¹University of Tübingen   ²Tübingen AI Center   ³ShanghaiTech University   ⁴Czech Technical University in Prague
† Corresponding author.
https://niujinshuchong.github.io/mip-splatting
[Figure 1: (a) faithful representation: a 3D object approximated by a 3D Gaussian and its faithful rendering after screen-space dilation of the 2D Gaussian; (b) degenerate representation: a degenerate (thin) 3D Gaussian whose dilated rendering looks similar; (c) decreased focal length: spokes too thick due to screen-space dilation, brightening; (d) increased focal length: erosion (brake cable too thin) and high-frequency artifacts.]

Figure 1. 3D Gaussian Splatting [18] renders images by representing 3D objects as 3D Gaussians which are projected onto the image plane, followed by 2D dilation in screen space, as shown in (a). Its intrinsic shrinkage bias leads to degenerate 3D Gaussians that exceed the sampling limit, as illustrated by the δ function in (b), while rendering similarly in 2D due to the dilation operation. However, when changing the sampling rate (via the focal length or camera distance), we observe strong dilation effects (c) and high-frequency artifacts (d).
Abstract

Recently, 3D Gaussian Splatting has demonstrated impressive novel view synthesis results, reaching high fidelity and efficiency. However, strong artifacts can be observed when changing the sampling rate, e.g., by changing the focal length or camera distance. We find that the source of this phenomenon can be attributed to the lack of 3D frequency constraints and the usage of a 2D dilation filter. To address this problem, we introduce a 3D smoothing filter that constrains the size of the 3D Gaussian primitives based on the maximal sampling frequency induced by the input views; it eliminates high-frequency artifacts when zooming in. Moreover, replacing 2D dilation with a 2D Mip filter, which simulates a 2D box filter, effectively mitigates aliasing and dilation issues. Our evaluation, including scenarios such as training on single-scale images and testing on multiple scales, validates the effectiveness of our approach.

1. Introduction

Novel View Synthesis (NVS) plays a critical role in computer graphics and computer vision, with various applications including virtual reality, cinematography, robotics, and more. A particularly significant advancement in this field is the Neural Radiance Field (NeRF) [28], introduced by Mildenhall et al. in 2020. NeRF utilizes a multi-layer perceptron (MLP) to represent geometry and view-dependent appearance effectively, demonstrating remarkable novel view rendering quality. Recently, 3D Gaussian Splatting (3DGS) [18] has gained attention as an appealing alternative to both MLP [28] and feature grid-based representations [4, 11, 24, 32, 46]. 3DGS stands out for its impressive novel view synthesis results while achieving real-time rendering at high resolutions. This effectiveness and efficiency, coupled with the potential integration into the standard rasterization pipeline of GPUs, represents a significant step towards practical usage of NVS methods.
Specifically, 3DGS represents complex scenes as a set of 3D Gaussians, which are rendered to screen space through splatting-based rasterization. The attributes of each 3D Gaussian, i.e., position, size, orientation, opacity, and color, are optimized through a multi-view photometric loss. Thereafter, a 2D dilation operation is applied in screen space for low-pass filtering. Although 3DGS has demonstrated impressive NVS results, it produces artifacts when camera views diverge from those seen during training, such as zooming in and zooming out, as illustrated in Figure 1. We find that the source of this phenomenon can be attributed to the lack of 3D frequency constraints and the usage of a 2D dilation filter. Specifically, zooming out leads to a reduced size of the projected 2D Gaussians in screen space, while applying the same amount of dilation results in dilation artifacts. Conversely, zooming in causes erosion artifacts: the projected 2D Gaussians expand, yet the dilation remains constant, resulting in incorrect gaps between Gaussians in the 2D projection.

[Figure 2: renderings at 8× resolution, full resolution, and 1/4 resolution for 3DGS [18], 3DGS + EWA [59], Mip-Splatting, and the reference.]

Figure 2. We trained all models on single-scale (full resolution here) images and rendered images at different resolutions by changing the focal length. While all methods show similar performance at the training scale, we observe strong artifacts in previous work [18, 59] when changing the sampling rate. By contrast, our Mip-Splatting renders faithful images across different scales.

To resolve these issues, we propose to regularize the 3D representation in 3D space. Our key insight is that the highest frequency of a 3D scene that can be reconstructed is inherently constrained by the sampling rates of the input images. We first derive the multi-view frequency bounds of each Gaussian primitive based on the training views according to the Nyquist-Shannon Sampling Theorem [33, 45]. By applying a low-pass filter to the 3D Gaussian primitives in 3D space during optimization, we effectively restrict the maximal frequency of the 3D representation to meet the Nyquist limit. Post-training, this filter becomes an intrinsic part of the scene representation, remaining constant regardless of viewpoint changes. Consequently, our method eliminates the artifacts present in 3DGS [18] when zooming in, as shown in the 8× higher resolution image in Figure 2.

Nonetheless, rendering the reconstructed scene at lower sampling rates (e.g., zooming out) results in aliasing. Previous work [1-3, 17] addresses aliasing by employing cone tracing and applying pre-filtering to the input positional or feature encoding, which is not applicable to 3DGS. Thus, we introduce a 2D Mip filter (à la "mipmap") specifically designed to ensure alias-free reconstruction and rendering across different scales. Our 2D Mip filter mimics the 2D box filter inherent to the actual physical imaging process [29, 37, 48] by approximating it with a 2D Gaussian low-pass filter. In contrast to previous work [1-3, 17] that relies on the MLP's ability to interpolate multi-scale signals during training with multi-scale images, our closed-form modification to the 3D Gaussian representation results in excellent out-of-distribution generalization: training at a single sampling rate enables faithful rendering at sampling rates different from those used during training, as demonstrated by the 1/4× down-sampled image in Figure 2.

In summary, we make the following contributions:
• We analyze and identify the root of 3DGS's artifacts when changing sampling rates.
• We introduce a 3D smoothing filter for 3DGS to effectively regularize the maximum frequency of 3D Gaussian primitives, resolving the artifacts observed in out-of-distribution renderings of prior methods [18, 59].
• We replace the 2D dilation filter with a 2D Mip filter to address aliasing and dilation artifacts.
• Experiments on challenging benchmark datasets [2, 28] demonstrate the effectiveness of Mip-Splatting when modifying the sampling rate.

2. Related Work

Novel View Synthesis: NVS is the process of generating new images from viewpoints different from those of the original captures [12, 22]. NeRF [28], which leverages volume rendering [10, 21, 25, 26], has become a standard technique in the field. NeRF utilizes MLPs [5, 27, 34] to model scenes as continuous functions, which, despite their compact representation, impede rendering speed due to the expensive MLP evaluation required for each ray point. Subsequent methods [16, 40, 41, 52, 54] distill a pretrained NeRF into a sparse representation, enabling real-time rendering of NeRFs. Further advancements have been made to improve the training and rendering of NeRF with advanced scene representations [4, 6, 11, 18, 19, 24, 32, 46, 51]. In particular, 3D Gaussian Splatting (3DGS) [18] demonstrated impressive novel view synthesis results while achieving real-time rendering at high-definition resolutions.
Importantly, 3DGS represents the scene explicitly as a collection of 3D Gaussians and uses rasterization instead of ray tracing. Nevertheless, 3DGS focuses on in-distribution evaluation, where training and testing are conducted at similar sampling rates (focal length / scene distance). In this paper, we study the out-of-distribution generalization of 3DGS, training models at a single scale and evaluating them across multiple scales.

Primitive-based Differentiable Rendering: Primitive-based rendering techniques, which rasterize geometric primitives onto the image plane, have been explored extensively due to their efficiency [13, 14, 38, 44, 59, 60]. Differentiable point-based rendering methods [20, 36, 39, 43, 49, 53, 57] offer great flexibility in representing intricate structures and are thus well-suited for novel view synthesis. Notably, Pulsar [20] stands out for its efficient sphere rasterization. The more recent 3D Gaussian Splatting (3DGS) work [18] utilizes anisotropic Gaussians [59] and introduces tile-based sorting for rendering, achieving remarkable frame rates. Despite its impressive results, 3DGS exhibits strong artifacts when rendering at a different sampling rate. We address this issue by introducing a 3D smoothing filter to constrain the maximal frequencies of the 3D Gaussian primitive representation, and a 2D Mip filter that approximates the box filter of the physical imaging process for alias-free rendering.

Anti-aliasing in Rendering: There are two principal strategies to combat aliasing: super-sampling, which increases the number of samples [7], and prefiltering, which applies low-pass filtering to the signal to meet the Nyquist limit [8, 15, 31, 47, 50, 59]. For example, EWA splatting [59] applies a Gaussian low-pass filter to the projected 2D Gaussian in screen space to produce a band-limited output respecting the Nyquist frequency of the image. While we also apply a band-limiting filter to the Gaussian primitives, our filter is applied in 3D space and its size is fully determined by the training images, not the images to be rendered. While our 2D Mip filter is also a Gaussian low-pass filter in screen space, it approximates the box filter of the physical imaging process, i.e., a single pixel. Conversely, the EWA filter limits the bandwidth of the signal in the rendered image, and the size of the filter is chosen empirically. A critical difference to [59] is that we tackle the reconstruction problem, optimizing the 3D Gaussian representation via inverse rendering, while EWA splatting only considers the rendering problem.

Recent neural rendering methods integrate pre-filtering to mitigate aliasing [1-3, 17, 58]. Mip-NeRF [1], for instance, introduced an integrated positional encoding (IPE) to attenuate high-frequency details. A similar idea is adapted for feature grid-based representations [3, 17, 58]. Note that these approaches require multi-scale images extracted from the original data for supervision. In contrast, our approach is based on 3DGS [18] and determines the necessary low-pass filter size based on the pixel size, allowing for alias-free rendering at scales unobserved during training.

3. Preliminaries

In this section, we first review the sampling theorem in Section 3.1, laying the foundation for understanding the aliasing problem. Subsequently, we introduce 3D Gaussian Splatting (3DGS) [18] and its rendering process in Section 3.2.

3.1. Sampling Theorem

The Sampling Theorem, also known as the Nyquist-Shannon Sampling Theorem [33, 45], is a fundamental concept in signal processing and digital communication that describes the conditions under which a continuous signal can be accurately represented or reconstructed from its discrete samples. To accurately reconstruct a continuous signal from its discrete samples without loss of information, the following conditions must be met:

Condition 1: The continuous signal must be band-limited and may not contain any frequency components above a certain maximum frequency ν.

Condition 2: The sampling rate ν̂ must be at least twice the highest frequency present in the continuous signal: ν̂ ≥ 2ν.

In practice, to satisfy these constraints when reconstructing a signal from discrete samples, a low-pass or anti-aliasing filter is applied to the signal before sampling. The filter eliminates any frequency components above ν̂/2 and attenuates high-frequency content that could lead to aliasing.

3.2. 3D Gaussian Splatting

Prior works [18, 59] propose to represent a 3D scene as a set of scaled 3D Gaussian primitives {G_k | k = 1, ..., K} and render an image using volume splatting. The geometry of each scaled 3D Gaussian G_k is parameterized by an opacity (scale) α_k ∈ [0, 1], a center p_k ∈ R^3, and a covariance matrix Σ_k ∈ R^{3×3} defined in world space:

\mathcal{G}_k(\mathbf{x}) = e^{-\frac{1}{2}(\mathbf{x}-\mathbf{p}_k)^T \Sigma_k^{-1} (\mathbf{x}-\mathbf{p}_k)}    (1)

To constrain Σ_k to the space of valid covariance matrices, a semi-definite parameterization Σ_k = O_k S_k S_k^T O_k^T is used. Here, S_k = diag(s_k) is a diagonal scaling matrix built from a scaling vector s_k ∈ R^3, and O_k ∈ R^{3×3} is a rotation matrix, parameterized by a quaternion [18].

To render an image for a given viewpoint defined by rotation R ∈ R^{3×3} and translation t ∈ R^3, the 3D Gaussians {G_k} are first transformed into camera coordinates:

\mathbf{p}'_k = \mathbf{R}\,\mathbf{p}_k + \mathbf{t}, \quad \Sigma'_k = \mathbf{R}\,\Sigma_k\,\mathbf{R}^T    (2)
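To make this parameterization concrete, the short sketch below builds the world-space covariance Σ_k from a unit quaternion and a scaling vector and applies the world-to-camera transform of Eq. (2). It is a minimal PyTorch illustration under our own naming, not the reference 3DGS or Mip-Splatting implementation.

```python
import torch

def quaternion_to_rotation(q: torch.Tensor) -> torch.Tensor:
    """Convert a unit quaternion (w, x, y, z) into the rotation matrix O_k."""
    w, x, y, z = (q / q.norm()).tolist()
    return torch.tensor([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def world_covariance(q: torch.Tensor, s: torch.Tensor) -> torch.Tensor:
    """Semi-definite parameterization: Sigma_k = O_k S_k S_k^T O_k^T with S_k = diag(s_k)."""
    O = quaternion_to_rotation(q)
    S = torch.diag(s)
    return O @ S @ S.T @ O.T

def to_camera(p: torch.Tensor, cov: torch.Tensor, R: torch.Tensor, t: torch.Tensor):
    """Eq. (2): transform the Gaussian center and covariance into camera coordinates."""
    return R @ p + t, R @ cov @ R.T
```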
Afterwards, they are projected to ray space via a local affine transformation

\Sigma''_k = \mathbf{J}_k\,\Sigma'_k\,\mathbf{J}_k^T    (3)

where the Jacobian matrix J_k is an affine approximation to the projective transformation defined by the center of the 3D Gaussian p'_k. By skipping the third row and column of Σ''_k, we obtain a 2D covariance matrix Σ^{2D}_k in ray space, and we use G^{2D}_k to refer to the corresponding scaled 2D Gaussian; see [18] for details.

Finally, 3DGS [18] utilizes spherical harmonics to model the view-dependent color c_k and renders images via alpha blending according to the primitives' depth order 1, ..., K:

\mathbf{c}(\mathbf{x}) = \sum_{k=1}^{K} \mathbf{c}_k\,\alpha_k\,\mathcal{G}^{2D}_k(\mathbf{x}) \prod_{j=1}^{k-1} \left(1 - \alpha_j\,\mathcal{G}^{2D}_j(\mathbf{x})\right)    (4)

Dilation: To avoid degenerate cases where the projected 2D Gaussians are too small in screen space, i.e., smaller than a pixel, the projected 2D Gaussians are dilated as follows:

\mathcal{G}^{2D}_k(\mathbf{x}) = e^{-\frac{1}{2}(\mathbf{x}-\mathbf{p}_k)^T (\Sigma^{2D}_k + s\,\mathbf{I})^{-1} (\mathbf{x}-\mathbf{p}_k)}    (5)

where I is a 2D identity matrix and s is a scalar dilation hyperparameter. Note that this operator adjusts the scale of the 2D Gaussian while leaving its maximum unchanged. As this effect is similar to that of dilation operators in morphology, we call it a 2D screen space dilation operation. (The dilation operation is not mentioned in the original paper.)

Reconstruction: As the rendering process is fast and differentiable, the 3D Gaussian parameters can be efficiently optimized using a multi-view loss. During optimization, 3D Gaussians are adaptively added and deleted to better represent the scene. We refer the reader to [18] for details.

4. Sensitivity to Sampling Rate

In traditional forward splatting, the centers p_k and colors c_k of Gaussian primitives are predetermined, whereas the 3D Gaussian covariances Σ_k are chosen empirically [42, 59]. In contrast, 3DGS [18] optimizes all parameters jointly through an inverse rendering framework by backpropagating a multi-view photometric loss.

We observe that this optimization suffers from ambiguities, as illustrated in Figure 1, which shows a simple example involving one object and an image sensor with 5 pixels. Consider the 3D object in (a), its approximation by a 3D Gaussian, and its projection into screen space (blue pixel). Due to screen space dilation (Eq. 5) with a Gaussian kernel (size ≈ 1 pixel), the degenerate 3D Gaussian represented by a Dirac δ function in (b) leads to a similar image. In order to represent high-frequency details in real-world scenes, the dilated 2D Gaussians would become small, since the smaller the Gaussian, the higher the frequency it represents, resulting in a systematic underestimation of its scale.

While this does not affect rendering at similar sampling rates (cf. Figure 1 (a) vs. (b)), it leads to erosion effects when zooming in or moving the camera closer. This is because the dilated 2D Gaussians become smaller in screen space. In this case, the rendered image exhibits high-frequency artifacts, rendering object structures thinner than they actually are, as illustrated in Figure 1 (d).

Conversely, screen space dilation also negatively affects rendering when decreasing the sampling rate, as illustrated in Figure 1 (c), which shows a zoomed-out version of (a). In this case, dilation spreads radiance across pixels in a physically incorrect way. Note that in (c), the area covered by the projection of the 3D object is smaller than a pixel, yet the dilated Gaussian is not attenuated, accumulating more light than what physically reaches the pixel. This leads to increased brightness and dilation artifacts which strongly degrade the appearance of the bicycle wheels' spokes.

The aforementioned scale ambiguity becomes particularly problematic in representations involving millions of Gaussians. However, simply discarding screen space dilation results in optimization challenges for complex scenes, such as those present in the Mip-NeRF 360 dataset [2], where a large number of small Gaussians are created by the density control mechanism [18], exceeding GPU capacity. Moreover, even if a model can be successfully trained without dilation, decreasing the sampling rate results in aliasing effects due to the lack of anti-aliasing [59].
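The scale ambiguity described above is easy to reproduce numerically: under the dilation of Eq. (5), a severely shrunken 2D Gaussian and a reasonably sized one produce almost the same screen-space response, because the added term s·I dominates the footprint while the peak value stays at 1. The snippet below is our own illustration (with s = 0.3, matching the total filter variance mentioned in Section 6.1), not code from the 3DGS rasterizer.

```python
import torch

def dilated_gaussian(x, p, cov2d, s=0.3):
    """Eq. (5): screen-space Gaussian evaluated after adding the dilation term s*I."""
    d = x - p
    cov = cov2d + s * torch.eye(2)
    return torch.exp(-0.5 * d @ torch.linalg.inv(cov) @ d)

p = torch.zeros(2)
x = torch.tensor([0.5, 0.0])            # half a pixel away from the Gaussian center
small      = 0.02 * torch.eye(2)        # primitive with a sub-pixel footprint
degenerate = 1e-6 * torch.eye(2)        # near-Dirac primitive (cf. Figure 1 (b))

# Both primitives render almost identically at the training sampling rate, so the
# photometric loss cannot tell them apart; only when the sampling rate changes do
# their renderings diverge (erosion or dilation artifacts).
print(dilated_gaussian(x, p, small))       # ~0.68
print(dilated_gaussian(x, p, degenerate))  # ~0.66
```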
5. Mip Gaussian Splatting

To overcome these challenges, we make two modifications to the original 3DGS model. In particular, we introduce a 3D smoothing filter that limits the frequency of the 3D representation to below half the maximum sampling rate determined by the training images, eliminating high-frequency artifacts when zooming in. Moreover, we demonstrate that replacing 2D screen space dilation with a 2D Mip filter, which approximates the box filter inherent to the physical imaging process, effectively mitigates aliasing and dilation issues. In combination, Mip-Splatting enables alias-free renderings across various sampling rates. (Note that we use "alias" to refer to multiple artifacts discussed in the paper, including dilation, erosion, oversmoothing, high-frequency artifacts, and aliasing itself.) We now discuss the 3D smoothing and 2D Mip filters in detail.

5.1. 3D Smoothing Filter

3D radiance field reconstruction from multi-view observations is a well-known ill-posed problem, as multiple distinctly different reconstructions can result in the same 2D projections [2, 55, 56]. Our key insight is that the highest frequency of a reconstructed 3D scene is limited by the sampling rate defined by the training views. Following the Nyquist theorem (Section 3.1), we aim to constrain the maximum frequency of the 3D representation during optimization.

[Figure 3: five cameras observing a primitive; cameras 1-3 share the same depth but have smaller/larger focal lengths f, while cameras 4 and 5 are placed at smaller/larger depths d.]

Figure 3. Sampling limits. A pixel corresponds to the sampling interval T̂. We band-limit the 3D Gaussians by the maximal sampling rate (i.e., minimal sampling interval) among all observations. This example shows 5 cameras at different depths d and with different focal lengths f. Here, camera 3 determines the minimal T̂ and hence the maximal sampling rate ν̂.

Multiview Frequency Bounds: Multi-view images are 2D projections of a continuous 3D scene. The discrete image grid determines where we sample points from the continuous 3D signal. This sampling rate is intrinsically related to the image resolution, the camera focal length, and the scene's distance from the camera. For an image with focal length f in pixel units, the sampling interval in screen space is 1. When this pixel interval is back-projected to the 3D world space, it results in a world space sampling interval T̂ at a given depth d, with sampling frequency ν̂ as its inverse:

\hat{T} = \frac{1}{\hat{\nu}} = \frac{d}{f}    (6)

As posited by the Nyquist theorem (Section 3.1), given samples drawn at frequency ν̂, reconstruction algorithms are able to reconstruct components of the signal with frequencies up to ν̂/2, i.e., f/(2d). Consequently, a primitive smaller than 2T̂ may result in aliasing artifacts during the splatting process, since its size is below twice the sampling interval.

To simplify, we approximate the depth d using the center of the primitive p_k and disregard the impact of occlusion for sampling interval estimation. Since the sampling rate of a primitive is depth-dependent and differs across cameras, we determine the maximal sampling rate for primitive k as

\hat{\nu}_k = \max\left(\left\{ \mathbb{1}_n(\mathbf{p}_k) \cdot \frac{f_n}{d_n} \right\}_{n=1}^{N}\right)    (7)

where N is the total number of images and 1_n(p_k) is an indicator function that assesses the visibility of a primitive; it is true if the Gaussian center p_k falls within the view frustum of the n-th camera. Intuitively, we choose the sampling rate such that there exists at least one camera that is able to reconstruct the respective primitive. This process is illustrated in Figure 3 for N = 5. In our implementation, we recompute the maximal sampling rate of each Gaussian primitive every m iterations, as we found that the 3D Gaussian centers remain relatively stable throughout training.

3D Smoothing: Given the maximal sampling rate ν̂_k for a primitive, we aim to constrain the maximal frequency of the 3D representation. This is achieved by applying a Gaussian low-pass filter G_low to each 3D Gaussian primitive G_k before projecting it onto screen space:

\mathcal{G}_k(\mathbf{x})_{\text{reg}} = (\mathcal{G}_k \otimes \mathcal{G}_{\text{low}})(\mathbf{x})    (8)

This operation is efficient, as convolving two Gaussians with covariance matrices Σ₁ and Σ₂ results in another Gaussian with covariance Σ₁ + Σ₂. Hence,

\mathcal{G}_k(\mathbf{x})_{\text{reg}} = \sqrt{\frac{|\Sigma_k|}{|\Sigma_k + \frac{s}{\hat{\nu}_k^2}\,\mathbf{I}|}}\; e^{-\frac{1}{2}(\mathbf{x}-\mathbf{p}_k)^T (\Sigma_k + \frac{s}{\hat{\nu}_k^2}\,\mathbf{I})^{-1} (\mathbf{x}-\mathbf{p}_k)}    (9)

Here, s is a scalar hyperparameter that controls the size of the filter. Note that the scale s/ν̂_k² of the 3D filter differs for each primitive, as it depends on the training views in which the primitive is visible. By employing 3D Gaussian smoothing, we ensure that the highest frequency component of any Gaussian does not exceed half of its maximal sampling rate for at least one camera. Note that G_low becomes an intrinsic part of the 3D representation, remaining constant post-training.

5.2. 2D Mip Filter

While our 3D smoothing filter effectively mitigates high-frequency artifacts [18, 59], rendering the reconstructed scene at lower sampling rates (e.g., zooming out or moving the camera further away) would still lead to aliasing. To overcome this, we replace the screen space dilation filter of 3DGS by a 2D Mip filter.

More specifically, we replicate the physical imaging process [37, Section 8], where photons hitting a pixel on the camera sensor are integrated over the pixel's area. While an ideal model would use a 2D box filter in image space, we approximate it with a 2D Gaussian filter for efficiency:

\mathcal{G}^{2D}_k(\mathbf{x})_{\text{mip}} = \sqrt{\frac{|\Sigma^{2D}_k|}{|\Sigma^{2D}_k + s\,\mathbf{I}|}}\; e^{-\frac{1}{2}(\mathbf{x}-\mathbf{p}_k)^T (\Sigma^{2D}_k + s\,\mathbf{I})^{-1} (\mathbf{x}-\mathbf{p}_k)}    (10)

where s is chosen to cover a single pixel in screen space.

While our Mip filter shares similarities with the EWA filter [59], their underlying principles are distinct. Our Mip filter is designed to replicate the box filter in the imaging process, targeting an exact approximation of a single pixel. Conversely, the EWA filter's role is to limit the signal's frequency bandwidth, and the size of the filter is chosen empirically. The EWA paper [15, 59] even advocates for an identity covariance matrix, effectively occupying a 3×3 pixel region on the screen. However, this approach leads to overly smooth results when zooming out, as we will show in our experiments.
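As a compact reference, the sketch below follows Eqs. (7), (9), and (10) using the filter variances reported in Section 6.1 (0.2 for the 3D smoothing filter, 0.1 for the 2D Mip filter). It is written in plain batched PyTorch for clarity; the actual implementation fuses these steps into the CUDA rasterizer, and the names, tensor layouts, and camera convention (z-forward depth) used here are our own assumptions.

```python
import torch

def max_sampling_rate(centers, world_to_cams, focals, in_frustum):
    """Eq. (7): maximal sampling rate per Gaussian over all N training cameras.

    centers:       (K, 3) Gaussian centers p_k in world space
    world_to_cams: (N, 4, 4) world-to-camera matrices of the training views
    focals:        (N,) focal lengths f_n in pixels
    in_frustum:    (N, K) boolean visibility indicators 1_n(p_k)
    """
    pts = torch.cat([centers, torch.ones_like(centers[:, :1])], dim=-1)   # (K, 4)
    cam_pts = torch.einsum('nij,kj->nki', world_to_cams, pts)             # (N, K, 4)
    depth = cam_pts[..., 2].clamp(min=1e-6)                               # d_n per view
    rate = focals[:, None] / depth                                        # f_n / d_n
    rate = torch.where(in_frustum, rate, torch.zeros_like(rate))
    return rate.max(dim=0).values                                         # (K,)

def smooth_3d(cov3d, opacity, nu_hat, s3d=0.2):
    """Eq. (9): bake the 3D low-pass filter into covariance and opacity."""
    filt = (s3d / nu_hat[:, None, None] ** 2) * torch.eye(3)              # (K, 3, 3)
    cov_reg = cov3d + filt
    # The coefficient keeps each Gaussian's total mass fixed while lowering its peak.
    coef = torch.sqrt(torch.det(cov3d) / torch.det(cov_reg))
    return cov_reg, opacity * coef

def mip_2d(cov2d, opacity, s2d=0.1):
    """Eq. (10): 2D Mip filter that replaces the screen-space dilation of Eq. (5)."""
    cov_mip = cov2d + s2d * torch.eye(2)
    coef = torch.sqrt(torch.det(cov2d) / torch.det(cov_mip))
    return cov_mip, opacity * coef
```

In this sketch, the 3D filter would be recomputed every m = 100 iterations during training and kept fixed afterwards, while the 2D Mip filter is applied per rendered view.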
Method | PSNR ↑ (Full / 1/2 / 1/4 / 1/8 Res. / Avg.) | SSIM ↑ (Full / 1/2 / 1/4 / 1/8 Res. / Avg.) | LPIPS ↓ (Full / 1/2 / 1/4 / 1/8 Res. / Avg.)
NeRF w/o L_area [1, 28] | 31.20 30.65 26.25 22.53 27.66 | 0.950 0.956 0.930 0.871 0.927 | 0.055 0.034 0.043 0.075 0.052
NeRF [28] | 29.90 32.13 33.40 29.47 31.23 | 0.938 0.959 0.973 0.962 0.958 | 0.074 0.040 0.024 0.039 0.044
Mip-NeRF [1] | 32.63 34.34 35.47 35.60 34.51 | 0.958 0.970 0.979 0.983 0.973 | 0.047 0.026 0.017 0.012 0.026
Plenoxels [11] | 31.60 32.85 30.26 26.63 30.34 | 0.956 0.967 0.961 0.936 0.955 | 0.052 0.032 0.045 0.077 0.051
TensoRF [4] | 32.11 33.03 30.45 26.80 30.60 | 0.956 0.966 0.962 0.939 0.956 | 0.056 0.038 0.047 0.076 0.054
Instant-NGP [32] | 30.00 32.15 33.31 29.35 31.20 | 0.939 0.961 0.974 0.963 0.959 | 0.079 0.043 0.026 0.040 0.047
Tri-MipRF [17]* | 32.65 34.24 35.02 35.53 34.36 | 0.958 0.971 0.980 0.987 0.974 | 0.047 0.027 0.018 0.012 0.026
3DGS [18] | 28.79 30.66 31.64 27.98 29.77 | 0.943 0.962 0.972 0.960 0.960 | 0.065 0.038 0.025 0.031 0.040
3DGS [18] + EWA [59] | 31.54 33.26 33.78 33.48 33.01 | 0.961 0.973 0.979 0.983 0.974 | 0.043 0.026 0.021 0.019 0.027
Mip-Splatting (ours) | 32.81 34.49 35.45 35.50 34.56 | 0.967 0.977 0.983 0.988 0.979 | 0.035 0.019 0.013 0.010 0.019

Table 1. Multi-scale Training and Multi-scale Testing on the Blender dataset [28]. Our approach achieves state-of-the-art performance in most metrics. It significantly outperforms 3DGS [18] and 3DGS + EWA [59]. * indicates that we retrained the model.
6. Experiments

We first present the implementation details of Mip-Splatting. We then assess its performance on the Blender dataset [28] and the challenging Mip-NeRF 360 dataset [2]. Finally, we discuss the limitations of our approach.

6.1. Implementation

We build our method upon the popular open-source 3DGS code base [18] (https://github.com/graphdeco-inria/gaussian-splatting). Following [18], we train our models for 30K iterations across all scenes and use the same loss function, Gaussian density control strategy, schedule, and hyperparameters. For efficiency, we recompute the sampling rate of each 3D Gaussian every m = 100 iterations. We choose the variance of our 2D Mip filter as 0.1, approximating a single pixel, and the variance of our 3D smoothing filter as 0.2, totaling 0.3 for a fair comparison with 3DGS [18] and 3DGS + EWA [59], which replaces the dilation of 3DGS with the EWA filter.

6.2. Evaluation on the Blender Dataset

Multi-scale Training and Multi-scale Testing: Following previous work [1, 17], we train our model with multi-scale data and evaluate on multi-scale data. Similar to [1, 17], where rays of full-resolution images are sampled more frequently than those of lower-resolution images, we sample 40 percent of the training rays from full-resolution images and 20 percent from each of the other image resolutions. Our quantitative evaluation is shown in Table 1. Our approach attains comparable or superior performance compared to state-of-the-art methods such as Mip-NeRF [1] and Tri-MipRF [17]. Notably, our method outperforms 3DGS [18] and 3DGS + EWA [59] by a substantial margin, owing to its 2D Mip filter.

Single-scale Training and Multi-scale Testing: Contrary to prior work that evaluates models trained on single-scale data at the same scale, we consider an important new setting that involves training on full-resolution images and rendering at various resolutions (i.e., 1×, 1/2, 1/4, and 1/8) to mimic zoom-out effects. In the absence of a public benchmark for this setting, we trained all baseline methods ourselves. We use NeRFAcc's [23] implementation of NeRF [28], Instant-NGP [32], and TensoRF [4] for its efficiency. Official implementations were employed for Mip-NeRF [1], Tri-MipRF [17], and 3DGS [18]. The quantitative results, as presented in Table 2, indicate that our method significantly outperforms all existing state-of-the-art methods. A qualitative comparison is provided in Figure 4. Methods based on 3DGS [18] capture fine details more effectively than Mip-NeRF [1] and Tri-MipRF [17], but only at the original training scale. Notably, our method surpasses both 3DGS [18] and 3DGS + EWA [59] in rendering quality at lower resolutions. In particular, 3DGS [18] exhibits dilation artifacts. EWA splatting [59] uses a large low-pass filter to limit the frequency of the rendered images, resulting in oversmoothed images, which becomes particularly apparent at lower resolutions.

6.3. Evaluation on the Mip-NeRF 360 Dataset

Single-scale Training and Multi-scale Testing: To simulate zoom-in effects, we train models on data downsampled by a factor of 8 and render at successively higher resolutions (1×, 2×, 4×, and 8×). In the absence of a public benchmark for this setting, we trained all baseline methods ourselves. We use the official implementations of Mip-NeRF 360 [2] and 3DGS [18] and a community reimplementation of Zip-NeRF [3] (https://github.com/SuLvXiangXin/zipnerf-pytorch), as the official code is not available. The results in Table 3 show that our method performs comparably to prior work at the training scale (1×) and significantly exceeds all state-of-the-art methods at higher resolutions. As depicted in Figure 5, our method generates high-fidelity imagery without high-frequency artifacts. Notably, both Mip-NeRF 360 [2] and Zip-NeRF [3] exhibit subpar performance at increased resolutions, likely due to their MLPs' inability to extrapolate to out-of-distribution frequencies. While 3DGS [18] introduces notable erosion artifacts due to dilation operations, 3DGS + EWA [59] performs better while still yielding pronounced high-frequency artifacts. In contrast, our method avoids such artifacts, yielding aesthetically pleasing images that more closely resemble ground truth. It is important to remark that rendering at higher resolutions is a super-resolution task, and models should not hallucinate high-frequency details absent from the training data.
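For reference, the zoom-in and zoom-out test renderings in these evaluations amount to jointly rescaling the image resolution and focal length of a pinhole camera, which changes the screen-space sampling rate f/d without moving the camera. The helper below is our own sketch of this protocol under assumed intrinsics, not code from the released benchmark.

```python
def scale_camera(width, height, fx, fy, cx, cy, factor):
    """Rescale a pinhole camera by `factor` (0.5 = render at 1/2 resolution to mimic
    zoom-out, 2.0 = render at 2x resolution to mimic zoom-in)."""
    return (int(round(width * factor)), int(round(height * factor)),
            fx * factor, fy * factor, cx * factor, cy * factor)

# Example: 1/4-resolution test renderings from a hypothetical 800x800 training camera.
w, h, fx, fy, cx, cy = scale_camera(800, 800, 1111.1, 1111.1, 400.0, 400.0, 0.25)
```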
[Figure 4: qualitative comparisons at Full, 1/2, 1/4, and 1/8 resolution for Mip-NeRF [1], Tri-MipRF [17], 3DGS [18], 3DGS [18] + EWA [59], Mip-Splatting (ours), and GT.]

Figure 4. Single-scale Training and Multi-scale Testing on the Blender Dataset [28]. All methods are trained at full resolution and evaluated at different (smaller) resolutions to mimic zoom-out. Methods based on 3DGS capture fine details better than Mip-NeRF [1] and Tri-MipRF [17] at training resolution. Mip-Splatting surpasses both 3DGS [18] and 3DGS + EWA [59] at lower resolutions.
Method | PSNR ↑ (Full / 1/2 / 1/4 / 1/8 Res. / Avg.) | SSIM ↑ (Full / 1/2 / 1/4 / 1/8 Res. / Avg.) | LPIPS ↓ (Full / 1/2 / 1/4 / 1/8 Res. / Avg.)
NeRF [28] | 31.48 32.43 30.29 26.70 30.23 | 0.949 0.962 0.964 0.951 0.956 | 0.061 0.041 0.044 0.067 0.053
Mip-NeRF [1] | 33.08 33.31 30.91 27.97 31.31 | 0.961 0.970 0.969 0.961 0.965 | 0.045 0.031 0.036 0.052 0.041
TensoRF [4] | 32.53 32.91 30.01 26.45 30.48 | 0.960 0.969 0.965 0.948 0.961 | 0.044 0.031 0.044 0.073 0.048
Instant-NGP [32] | 33.09 33.00 29.84 26.33 30.57 | 0.962 0.969 0.964 0.947 0.961 | 0.044 0.033 0.046 0.075 0.049
Tri-MipRF [17] | 32.89 32.84 28.29 23.87 29.47 | 0.958 0.967 0.951 0.913 0.947 | 0.046 0.033 0.046 0.075 0.050
3DGS [18] | 33.33 26.95 21.38 17.69 24.84 | 0.969 0.949 0.875 0.766 0.890 | 0.030 0.032 0.066 0.121 0.063
3DGS [18] + EWA [59] | 33.51 31.66 27.82 24.63 29.40 | 0.969 0.971 0.959 0.940 0.960 | 0.032 0.024 0.033 0.047 0.034
Mip-Splatting (ours) | 33.36 34.00 31.85 28.67 31.97 | 0.969 0.977 0.978 0.973 0.974 | 0.031 0.019 0.019 0.026 0.024

Table 2. Single-scale Training and Multi-scale Testing on the Blender Dataset [28]. All methods are trained on full-resolution images and evaluated at four different (smaller) resolutions, with lower resolutions simulating zoom-out effects. While Mip-Splatting yields comparable results at training resolution, it significantly surpasses previous work at all other scales.
Single-scale Training and Same-scale Testing: We further evaluate our method on the Mip-NeRF 360 dataset [2] following the widely used setting, where models are trained and tested at the same scale, with indoor scenes downsampled by a factor of two and outdoor scenes by four. As shown in Table 4, our method performs on par with 3DGS [18] and 3DGS + EWA [59] in this challenging benchmark, without any decrease in performance. This confirms our method's effectiveness in handling various settings.

6.4. Limitations

Our method employs a Gaussian filter as an approximation to a box filter for efficiency. However, this approximation introduces errors, particularly when the Gaussian is small in screen space. This issue correlates with our experimental findings, where increased zooming out leads to larger errors, as evidenced in Table 2. Additionally, there is a slight increase in training overhead, as the sampling rate for each 3D Gaussian must be calculated every m = 100 iterations. Currently, this computation is performed using PyTorch [35], and a more efficient CUDA implementation could potentially reduce this overhead. Designing a better data structure for precomputing and storing the sampling rate, as it depends solely on the camera poses and intrinsics ...
[Figure 5: qualitative comparisons for Mip-NeRF 360 [2], Zip-NeRF [3], 3DGS [18], 3DGS [18] + EWA [59], Mip-Splatting (ours), and GT.]

Figure 5. Single-scale Training and Multi-scale Testing on the Mip-NeRF 360 Dataset [2]. All models are trained on images downsampled by a factor of eight and rendered at full resolution to demonstrate zoom-in / moving-closer effects. In contrast to prior work, Mip-Splatting renders images that closely approximate ground truth. Please also note the high-frequency artifacts of 3DGS + EWA [59].
Method | PSNR ↑ (1× / 2× / 4× / 8× Res. / Avg.) | SSIM ↑ (1× / 2× / 4× / 8× Res. / Avg.) | LPIPS ↓ (1× / 2× / 4× / 8× Res. / Avg.)
Instant-NGP [32] | 26.79 24.76 24.27 24.27 25.02 | 0.746 0.639 0.626 0.698 0.677 | 0.239 0.367 0.445 0.475 0.382
Mip-NeRF 360 [2] | 29.26 25.18 24.16 24.10 25.67 | 0.860 0.727 0.670 0.706 0.741 | 0.122 0.260 0.370 0.428 0.295
Zip-NeRF [3] | 29.66 23.27 20.87 20.27 23.52 | 0.875 0.696 0.565 0.559 0.674 | 0.097 0.257 0.421 0.494 0.318
3DGS [18] | 29.19 23.50 20.71 19.59 23.25 | 0.880 0.740 0.619 0.619 0.715 | 0.107 0.243 0.394 0.476 0.305
3DGS [18] + EWA [59] | 29.30 25.90 23.70 22.81 25.43 | 0.880 0.775 0.667 0.643 0.741 | 0.114 0.236 0.369 0.449 0.292
Mip-Splatting (ours) | 29.39 27.39 26.47 26.22 27.37 | 0.884 0.808 0.754 0.765 0.803 | 0.108 0.205 0.305 0.392 0.252

Table 3. Single-scale Training and Multi-scale Testing on the Mip-NeRF 360 Dataset [2]. All methods are trained on the smallest scale (1×) and evaluated across four scales (1×, 2×, 4×, and 8×), with evaluations at higher sampling rates simulating zoom-in effects. While our method yields comparable results at the training resolution, it significantly surpasses all previous work at all other scales.
References

[1] Jonathan T. Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, and Pratul P. Srinivasan. Mip-NeRF: A multiscale representation for anti-aliasing neural radiance fields. In ICCV, 2021.
[2] Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, and Peter Hedman. Mip-NeRF 360: Unbounded anti-aliased neural radiance fields. In CVPR, 2022.
[3] Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, and Peter Hedman. Zip-NeRF: Anti-aliased grid-based neural radiance fields. In Proc. of the IEEE International Conf. on Computer Vision (ICCV), 2023.
[4] Anpei Chen, Zexiang Xu, Andreas Geiger, Jingyi Yu, and Hao Su. TensoRF: Tensorial radiance fields. 2022.
[5] Zhiqin Chen and Hao Zhang. Learning implicit fields for generative shape modeling. 2019.
[6] Zhang Chen, Zhong Li, Liangchen Song, Lele Chen, Jingyi Yu, Junsong Yuan, and Yi Xu. NeuRBF: A neural fields representation with adaptive radial basis functions. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4182-4194, 2023.
[7] Robert L. Cook. Stochastic sampling in computer graphics. ACM Trans. Graph., 5(1):51-72, 1986.
[8] Franklin C. Crow. Summed-area tables for texture mapping. In Proceedings of the 11th Annual Conference on Computer Graphics and Interactive Techniques, pages 207-212. Association for Computing Machinery, 1984.
[9] Boyang Deng, Jonathan T. Barron, and Pratul P. Srinivasan. JaxNeRF: An efficient JAX implementation of NeRF, 2020.
[10] Robert A. Drebin, Loren Carpenter, and Pat Hanrahan. Volume rendering. ACM SIGGRAPH Computer Graphics, 22(4):65-74, 1988.
[11] Sara Fridovich-Keil, Alex Yu, Matthew Tancik, Qinhong Chen, Benjamin Recht, and Angjoo Kanazawa. Plenoxels: Radiance fields without neural networks. In CVPR, 2022.
[12] Steven J. Gortler, Radek Grzeszczuk, Richard Szeliski, and Michael F. Cohen. The Lumigraph. In Seminal Graphics Papers: Pushing the Boundaries, Volume 2, pages 453-464. 2023.
[13] Markus Gross and Hanspeter Pfister. Point-Based Graphics. Elsevier, 2011.
[14] Jeffrey P. Grossman and William J. Dally. Point sample rendering. In Rendering Techniques '98: Proceedings of the Eurographics Workshop, pages 181-192. Springer, 1998.
[15] Paul S. Heckbert. Fundamentals of texture mapping and image warping. 1989.
[16] Peter Hedman, Pratul P. Srinivasan, Ben Mildenhall, Jonathan T. Barron, and Paul Debevec. Baking neural radiance fields for real-time view synthesis. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5875-5884, 2021.
[17] Wenbo Hu, Yuling Wang, Lin Ma, Bangbang Yang, Lin Gao, Xiao Liu, and Yuewen Ma. Tri-MipRF: Tri-Mip representation for efficient anti-aliasing neural radiance fields. In Proc. of the IEEE International Conf. on Computer Vision (ICCV), 2023.
[18] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3D Gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics, 42(4), 2023.
[19] Jonas Kulhanek and Torsten Sattler. Tetra-NeRF: Representing neural radiance fields using tetrahedra. arXiv preprint arXiv:2304.09987, 2023.
[20] Christoph Lassner and Michael Zollhöfer. Pulsar: Efficient sphere-based neural rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1440-1449, 2021.
[21] Marc Levoy. Efficient ray tracing of volume data. ACM Transactions on Graphics (TOG), 9(3):245-261, 1990.
[22] Marc Levoy and Pat Hanrahan. Light field rendering. In Seminal Graphics Papers: Pushing the Boundaries, Volume 2, pages 441-452. 2023.
[23] Ruilong Li, Hang Gao, Matthew Tancik, and Angjoo Kanazawa. NerfAcc: Efficient sampling accelerates NeRFs. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 18537-18546, 2023.
[24] Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, and Christian Theobalt. Neural sparse voxel fields. In NeurIPS, 2020.
[25] Nelson Max. Optical models for direct volume rendering. IEEE Transactions on Visualization and Computer Graphics, 1(2):99-108, 1995.
[26] Nelson Max and Min Chen. Local and global illumination in the volume rendering integral. Technical report, Lawrence Livermore National Lab. (LLNL), Livermore, CA, United States, 2005.
[27] Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. Occupancy networks: Learning 3D reconstruction in function space. 2019.
[28] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. In ECCV, 2020.
[29] Ben Mildenhall, Peter Hedman, Ricardo Martin-Brualla, Pratul P. Srinivasan, and Jonathan T. Barron. NeRF in the dark: High dynamic range view synthesis from noisy raw images. In CVPR, 2022.
[30] Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, Peter Hedman, Ricardo Martin-Brualla, and Jonathan T. Barron. MultiNeRF: A code release for Mip-NeRF 360, Ref-NeRF, and RawNeRF, 2022.
[31] Klaus Mueller, Torsten Möller, J. Edward Swan, Roger Crawfis, Naeem Shareef, and Roni Yagel. Splatting errors and antialiasing. IEEE Transactions on Visualization and Computer Graphics, 4(2):178-191, 1998.
[32] Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. Instant neural graphics primitives with a multiresolution hash encoding. ACM Trans. Graph., 41(4):102:1-102:15, 2022.
[33] Harry Nyquist. Certain topics in telegraph transmission theory. Transactions of the American Institute of Electrical Engineers, 1928.
[34] Jeong Joon Park, Peter Florence, Julian Straub, Richard A. Newcombe, and Steven Lovegrove. DeepSDF: Learning continuous signed distance functions for shape representation. 2019.
[35] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems 32, pages 8024-8035. Curran Associates, Inc., 2019.
[36] Songyou Peng, Chiyu "Max" Jiang, Yiyi Liao, Michael Niemeyer, Marc Pollefeys, and Andreas Geiger. Shape as points: A differentiable Poisson solver. In Advances in Neural Information Processing Systems (NeurIPS), 2021.
[37] Peter Shirley, Trevor David Black, and Steve Hollasch. Ray tracing in one weekend. https://raytracing.github.io/books/RayTracingInOneWeekend.html, 2023.
[38] Hanspeter Pfister, Matthias Zwicker, Jeroen van Baar, and Markus Gross. Surfels: Surface elements as rendering primitives. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, pages 335-342, 2000.
[39] Sergey Prokudin, Qianli Ma, Maxime Raafat, Julien Valentin, and Siyu Tang. Dynamic point fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 7964-7976, 2023.
[40] Christian Reiser, Songyou Peng, Yiyi Liao, and Andreas Geiger. KiloNeRF: Speeding up neural radiance fields with thousands of tiny MLPs. 2021.
[41] Christian Reiser, Richard Szeliski, Dor Verbin, Pratul P. Srinivasan, Ben Mildenhall, Andreas Geiger, Jonathan T. Barron, and Peter Hedman. MERF: Memory-efficient radiance fields for real-time view synthesis in unbounded scenes. In SIGGRAPH, 2023.
[42] Liu Ren, Hanspeter Pfister, and Matthias Zwicker. Object space EWA surface splatting: A hardware accelerated approach to high quality point rendering. In Computer Graphics Forum, pages 461-470. Wiley Online Library, 2002.
[43] Darius Rückert, Linus Franke, and Marc Stamminger. ADOP: Approximate differentiable one-pixel point rendering. arXiv preprint arXiv:2110.06635, 2021.
[44] Miguel Sainz and Renato Pajarola. Point-based rendering techniques. Computers & Graphics, 28(6):869-879, 2004.
[45] Claude E. Shannon. Communication in the presence of noise. Proceedings of the IRE, 1949.
[46] Cheng Sun, Min Sun, and Hwann-Tzong Chen. Direct voxel grid optimization: Super-fast convergence for radiance fields reconstruction. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2022.
[47] J. Edward Swan, Klaus Mueller, Torsten Möller, Naeem Shareef, Roger Crawfis, and Roni Yagel. An anti-aliasing technique for splatting. In Proceedings Visualization '97, pages 197-204. IEEE, 1997.
[48] Richard Szeliski. Computer Vision: Algorithms and Applications. Springer Nature, 2022.
[49] Olivia Wiles, Georgia Gkioxari, Richard Szeliski, and Justin Johnson. SynSin: End-to-end view synthesis from a single image. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
[50] Lance Williams. Pyramidal parametrics. Pages 1-11. Association for Computing Machinery, 1983.
[51] Qiangeng Xu, Zexiang Xu, Julien Philip, Sai Bi, Zhixin Shu, Kalyan Sunkavalli, and Ulrich Neumann. Point-NeRF: Point-based neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5438-5448, 2022.
[52] Lior Yariv, Peter Hedman, Christian Reiser, Dor Verbin, Pratul P. Srinivasan, Richard Szeliski, Jonathan T. Barron, and Ben Mildenhall. BakedSDF: Meshing neural SDFs for real-time view synthesis. arXiv, 2023.
[53] Wang Yifan, Felice Serena, Shihao Wu, Cengiz Öztireli, and Olga Sorkine-Hornung. Differentiable surface splatting for point-based geometry processing. ACM Transactions on Graphics (Proceedings of ACM SIGGRAPH Asia), 38(6), 2019.
[54] Alex Yu, Ruilong Li, Matthew Tancik, Hao Li, Ren Ng, and Angjoo Kanazawa. PlenOctrees for real-time rendering of neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5752-5761, 2021.
[55] Zehao Yu, Songyou Peng, Michael Niemeyer, Torsten Sattler, and Andreas Geiger. MonoSDF: Exploring monocular geometric cues for neural implicit surface reconstruction. In Advances in Neural Information Processing Systems (NeurIPS), 2022.
[56] Kai Zhang, Gernot Riegler, Noah Snavely, and Vladlen Koltun. NeRF++: Analyzing and improving neural radiance fields. arXiv:2010.07492, 2020.
[57] Yufeng Zheng, Wang Yifan, Gordon Wetzstein, Michael J. Black, and Otmar Hilliges. PointAvatar: Deformable point-based head avatars from videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
[58] Yiyu Zhuang, Qi Zhang, Ying Feng, Hao Zhu, Yao Yao, Xiaoyu Li, Yan-Pei Cao, Ying Shan, and Xun Cao. Anti-aliased neural implicit surfaces with encoding level of detail. arXiv preprint arXiv:2309.10336, 2023.
[59] Matthias Zwicker, Hanspeter Pfister, Jeroen van Baar, and Markus Gross. EWA volume splatting. In Proceedings Visualization 2001 (VIS '01), pages 29-538. IEEE, 2001.
[60] Matthias Zwicker, Hanspeter Pfister, Jeroen van Baar, and Markus Gross. Surface splatting. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, pages 371-378, 2001.