review date: 2019/07/26 (by Meyong-Gyu.LEE @Soongsil Univ.)
English and Korean review of 'Interactive Reconstruction of Monte Carlo Image Sequences using a Recurrent Denoising Autoencoder' (SIGGRAPH 2017)
(Paper Review) Interactive Reconstruction of Monte Carlo Image Sequences using a Recurrent Denoising Autoencoder
1. Interactive Reconstruction of Monte Carlo Image Sequences using a Recurrent Denoising Autoencoder
(Real-time reconstruction of Monte Carlo-rendered image sequences via a recurrent denoising AE)
2019/07/26, Myeong-Gyu Lee, CGLAB
2. INDEX
01 Introduction
02 Recurrent AE
03 Proposed Method
04 Experiments
05 Conclusion
3. Part 01: Introduction
1. Paper overview
2. Related work summary
3. Monte Carlo Rendering
4. 1-1 Paper Overview (venue information)
• Venue: SIGGRAPH 2017
• Authors: Chakravarty R. Alla Chaitanya et al. (NVIDIA, University of Montreal, and McGill University)
• Citations: 63 (at the time of this review)
• A study that uses a recurrent autoencoder to denoise the noise caused by low sample counts (spp) in Monte Carlo rendering.
5. 1-1 Paper Overview (video)
https://www.youtube.com/watch?v=YjjTPV2pXY0
7. 1-2 Related Work Summary: Image Denoising
• Offline denoising for MC rendering
  • Non-linear image-space filters applied to indirect diffuse illumination (Jensen et al.)
  • Frequency analysis of light transport (Egan et al.)
  • Training the parameters of a non-local means filter with machine learning
  → Good quality, but slow.
• Interactive denoising for MC rendering
  • Separate direct/indirect illumination and filter the latter with edge-avoiding filters
  • Edge-avoiding À-trous wavelets, adaptive manifolds, guided image filters
  → Local detail may be lost.
8. 1-2 Related Work Summary: Reconstruction of Images
• Image restoration using deep learning
  • Image Restoration Using Convolutional Auto-encoders with Symmetric Skip Connections [Mao et al.]
  • Denoising images corrupted by Gaussian noise is an active research topic.
  • In the MC rendering setting targeted here, however, some samples carry very high energy while most areas remain black.
• Video super-resolution
  • Using RNNs (Huang et al. 2015) or an LSTM block in the bottleneck of the AE (Pătrăucean et al.)
9. 1-3 Monte Carlo Rendering: the Monte Carlo integral
10. 1-3 Monte Carlo Rendering: the Monte Carlo integral (figures from https://www.cs.rpi.edu/~cutler/classes/advancedgraphics/S08/lectures/17_monte_carlo.pdf)
11. 1-3 Monte Carlo Rendering: the rendering equation and the BRDF (Bidirectional Reflectance Distribution Function)
12. 1-3 Monte Carlo Rendering (figure slide)
13. 1-3 Monte Carlo Rendering: our problem (figure slide)
14. 1-3 Monte Carlo Rendering (figure slide)
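To make the Monte Carlo integral concrete, here is a minimal Python sketch (my own illustration, not the paper's renderer) of the basic estimator: the integral of $f$ is approximated by $\frac{1}{N}\sum_i f(x_i)/p(x_i)$ for samples $x_i \sim p$. With only a handful of samples per pixel the estimate is exactly the kind of noisy signal this paper tries to denoise.

```python
import numpy as np

# Minimal Monte Carlo estimator sketch (illustration only):
# integral of f ≈ (1/N) * sum f(x_i) / p(x_i) for x_i ~ p.
def mc_estimate(f, sampler, pdf, n):
    x = sampler(n)
    return np.mean(f(x) / pdf(x))

# Example: integrate cos(theta) over [0, pi/2] with uniform sampling
# (p = 2/pi on that interval); the exact value is 1.
f = np.cos
sampler = lambda n: np.random.uniform(0.0, np.pi / 2, n)
pdf = lambda x: np.full_like(x, 2.0 / np.pi)

for n in (1, 16, 4096):   # 1 "spp" is very noisy; more samples converge
    print(n, mc_estimate(f, sampler, pdf, n))
```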
15. Part 02: Recurrent AE
1. AutoEncoder
2. RCNN (Recurrent CNN)
3. Recurrent AE
4. Additional slides
16. 2-1 AutoEncoder: linear vs. non-linear dimension reduction
(figure from https://www.jeremyjordan.me/autoencoders/)
"Why use an AE in this paper?"
17. 2-1 AutoEncoder: concept
• An AE learns a vector field that maps the input data onto a lower-dimensional manifold*.
• In other words, the goal is to automatically discover the low-dimensional manifold, per the manifold hypothesis, along which the data is distributed in the high-dimensional space.
*Manifold: the surface of the space on which the data is distributed (locally homeomorphic to Euclidean space).
• Self-supervised learning (the input itself is used as the target)
• Bottleneck (holds the important features)
(figure labels: Target, Predicted)
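As a reading aid, a minimal convolutional autoencoder in PyTorch. This only illustrates the encoder-bottleneck-decoder idea; the paper's network is far larger and adds skip connections and recurrent blocks.

```python
import torch
import torch.nn as nn

# Minimal convolutional autoencoder sketch (illustration only).
class TinyAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(   # compress toward the bottleneck
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(   # reconstruct from the bottleneck
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Self-supervised: the input serves as its own target.
model = TinyAE()
x = torch.rand(1, 3, 64, 64)
loss = nn.functional.l1_loss(model(x), x)
```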
18. 2-2 RCNN (Recurrent CNN): concept of RNNs
• Introduced for modeling sequence data.
• The difference from conventional networks is that an RNN keeps a hidden state (≈ memory).
• The hidden state is a summary of all input data seen up to the current step.
• The hidden state is revised every time a new input arrives.
(figure labels: new hidden state, input data)
19. 2-2 RCNN (Recurrent CNN): the RNN recurrence
(https://pythonkim.tistory.com/57)
$h_t = f_W(h_{t-1}, x_t)$
where $h_t$ is the new state, $f_W$ is some function with parameters $W$, $h_{t-1}$ is the old state, and $x_t$ is the input vector at time step $t$.
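The recurrence above in a few lines of numpy (a textbook vanilla RNN step; the sizes and the tanh nonlinearity are illustrative choices, not anything specified by the paper):

```python
import numpy as np

# The recurrence h_t = f_W(h_{t-1}, x_t) as a vanilla RNN step.
H, D = 8, 4                           # hidden size, input size
Whh = 0.1 * np.random.randn(H, H)     # old state -> new state
Wxh = 0.1 * np.random.randn(H, D)     # input -> new state
b = np.zeros(H)

def rnn_step(h_prev, x_t):
    # The hidden state summarizes all inputs seen so far.
    return np.tanh(Whh @ h_prev + Wxh @ x_t + b)

h = np.zeros(H)                       # initial state
for x_t in np.random.randn(5, D):     # a length-5 input sequence
    h = rnn_step(h, x_t)              # state is revised at every input
```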
21. (video) https://www.youtube.com/watch?v=UNmqTiOnRfg
22. (figure: one-hot vector example from the video above; the maximum entry is set to 1 and the rest to 0)
https://www.youtube.com/watch?v=UNmqTiOnRfg
23. 2-2 RCNN (Recurrent CNN): limitations of RNNs
• Vanishing gradients arise as the distance between the input and output steps grows,
  because an RNN remembers the most recent inputs most strongly.
• RCNN = RNN + CNN
• Variants such as LSTM have been proposed to address these weaknesses, but this paper uses vanilla RNN blocks,
  judging that gated units could adversely affect the large, variously sized regions within an image.
24. 2-3 Recurrent AE: why AE + RCNN?
• Grafting an RCNN onto the AE structure increases temporal stability.
• With an AE, end-to-end learning also lets the network learn, without supervision, to make good use of auxiliary pixel features automatically.
• Auxiliary pixel features: depth, normals, etc.
25. 2-4 Additional Slides: what are auxiliary pixel features?
• Information stored in the G-buffer, covering the scene geometry.
• Data sent from the rasterization path to the reconstruction algorithm:
  HDR RGB image, depth, roughness, view-space shading normal
• 3 (FP16) + 4 (1 FP16 + 3 FP8) = 7 scalar values per pixel
26. 2-4 Additional Slides: what is the geometry buffer (G-buffer)?
• A buffer that stores all the information needed for per-pixel lighting:
  normal, position, diffuse/specular albedo, …
27. Part 03: Proposed Method
1. Interactive Path Tracer
2. Network Architecture
3. Training Data
4. Loss Functions
5. Analysis
28. 3-1 Interactive Path Tracer: overview
Pipeline: visible-surface rasterization → path tracing with the NVIDIA OptiX GPU ray tracer
• Uses a 1-sample unidirectional path tracer (one indirect bounce):
  one direct-lighting path (camera→surface→light) and one indirect path (camera→surface→surface→light).
• DoF and motion blur would introduce noise into the G-buffer, so they are handled in a post-process.
• Samples the light source and the scattering directions (low-discrepancy Halton sequences).
• Applies path-space regularization to glossy and specular materials.
29. 3-2 Network Architecture: problems with prior image restoration approaches
Problems with the CNN with hierarchical skip connections [Mao et al.]:
• Very slow at full resolution (1080p).
• Weak on the spatially very sparse samples that are common in MC rendering.
• Frames are processed independently, so the results are temporally unstable.
(Image Restoration Using Convolutional Auto-encoders with Symmetric Skip Connections: https://arxiv.org/pdf/1606.08921.pdf)
30. 3-2 Network Architecture: overview (AE + RCNN)
• An RNN structure is added to the denoising AE to bring in the notion of time.
• Recurrent connections accumulate lighting information over time.
• Since the input is sparser at the encoder, the recurrent structure is applied to the encoder only.
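A rough PyTorch sketch of what a recurrent connection on the encoder side can look like: the block's previous hidden state is concatenated with the incoming features and convolved, so lighting information can accumulate across frames. The layer shapes and exact wiring here are my assumptions, not the paper's published block design.

```python
import torch
import torch.nn as nn

# Sketch of a recurrent convolutional block for the encoder side.
class RecurrentConvBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv_in = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv_rec = nn.Conv2d(2 * ch, ch, 3, padding=1)  # mixes features + state

    def forward(self, x, hidden=None):
        feat = torch.relu(self.conv_in(x))
        if hidden is None:                    # reset at the start of a sequence
            hidden = torch.zeros_like(feat)
        hidden = torch.relu(self.conv_rec(torch.cat([feat, hidden], dim=1)))
        return hidden, hidden                 # output features, carried state
```

The carried state is reset to zeros at the start of each sequence; the decoder stays purely feed-forward, matching the slide's point that recurrence is only needed where the input is sparse.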
31. 3-2 Network Architecture: inputs (1024×1024, 7 channels per pixel)
• Noisy HDR RGB (FP16, 3ch)
• View-space shading normals (FP8, 2ch)
• Depth map (FP16, 1ch)
• Material roughness map (FP8, 1ch)
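How the 7-channel input might be assembled per frame (a sketch with dummy arrays; numpy has no FP8 dtype, so the FP8 channels are approximated with float16 here):

```python
import numpy as np

# Assembling the 7-channel per-pixel network input (dummy values).
H, W = 1024, 1024
rgb = np.zeros((3, H, W), np.float16)        # noisy HDR RGB (FP16, 3ch)
normals = np.zeros((2, H, W), np.float16)    # view-space normals (FP8 in the paper, 2ch)
depth = np.zeros((1, H, W), np.float16)      # depth map (FP16, 1ch)
roughness = np.zeros((1, H, W), np.float16)  # roughness (FP8 in the paper, 1ch)

net_input = np.concatenate([rgb, normals, depth, roughness])  # shape (7, H, W)
```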
32. 3-3 Training Data: overview
• Uses 7 fly-through video sequences.
• Random time ranges within each sequence are picked and used as training sub-sequences.
• Sequences are randomly played forward/backward with varied camera motion.
• Data augmentation:
  • Random 90/180/270-degree rotations applied to randomly chosen sequences.
  • Random per-color-channel modulation in the range 0-2, applied to all sequences
    (encourages the network to learn channel independence and the linear input-target relationship).
33. 3-4 Loss Functions: overview
$\mathcal{L} = w_s \mathcal{L}_s + w_g \mathcal{L}_g + w_t \mathcal{L}_t$
where $\mathcal{L}_s$ is a spatial $L_1$ loss, $\mathcal{L}_g$ a gradient-domain $L_1$ loss, and $\mathcal{L}_t$ a temporal $L_1$ loss; $w_s$, $w_g$, $w_t$ are weights.
34. 3-4 Loss Functions: losses on isolated images (spatial)
$\mathcal{L}_s = \frac{1}{N}\sum_{i}^{N} \lvert P_i - T_i \rvert$  ($P_i$: predicted, $T_i$: target)
• Using an $L_1$ loss instead of $L_2$ reduces splotchy artifacts in the reconstructed image.
35. 3-4 Loss Functions: losses on isolated images (gradient domain)
$\mathcal{L}_g = \frac{1}{N}\sum_{i}^{N} \lvert \nabla P_i - \nabla T_i \rvert$
• $\nabla$ is computed with HFEN (High Frequency Error Norm), which uses a Laplacian for edge detection.
• Because the Laplacian is sensitive to noise, images are pre-smoothed with a Gaussian filter ($\sigma = 1.5$).
36. 3-4 Loss Functions: a loss that penalizes temporal incoherence
$\mathcal{L}_t = \frac{1}{N}\sum_{i}^{N} \left\lvert \frac{\partial P_i}{\partial t} - \frac{\partial T_i}{\partial t} \right\rvert$  ($P_i$: predicted, $T_i$: target)
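Putting the three terms together, a hedged PyTorch sketch of the combined loss. The $\sigma = 1.5$, the $L_1$ forms, and the default weights follow the slides; the 9×9 kernel size and the finite-difference time derivative are my assumptions.

```python
import torch
import torch.nn.functional as F

# Sketch of the combined loss L = w_s*L_s + w_g*L_g + w_t*L_t.
def log_kernel(sigma=1.5, size=9):
    # Laplacian-of-Gaussian: Gaussian pre-smoothing folded into the Laplacian.
    ax = torch.arange(size, dtype=torch.float32) - size // 2
    xx, yy = torch.meshgrid(ax, ax, indexing="ij")
    r2 = xx**2 + yy**2
    g = torch.exp(-r2 / (2 * sigma**2))
    log = (r2 / sigma**4 - 2 / sigma**2) * g
    return (log - log.mean()).view(1, 1, size, size)  # zero-sum filter

KERNEL = log_kernel()

def hfen(img):
    # Apply the LoG filter channel-wise; img has shape (B, C, H, W).
    c = img.shape[1]
    w = KERNEL.repeat(c, 1, 1, 1)
    return F.conv2d(img, w, padding=KERNEL.shape[-1] // 2, groups=c)

def combined_loss(pred, target, ws=0.8, wg=0.1, wt=0.1):
    # pred/target: (B, T, C, H, W) sequences of predicted/reference frames.
    b, t, c, h, w = pred.shape
    l_s = (pred - target).abs().mean()                          # spatial L1
    l_g = (hfen(pred.reshape(b * t, c, h, w)) -
           hfen(target.reshape(b * t, c, h, w))).abs().mean()   # gradient L1
    dp = pred[:, 1:] - pred[:, :-1]                             # dP/dt
    dt = target[:, 1:] - target[:, :-1]                         # dT/dt
    l_t = (dp - dt).abs().mean()                                # temporal L1
    return ws * l_s + wg * l_g + wt * l_t
```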
37. 3-4 Loss Functions: spatial $\mathcal{L}_s$ alone vs. the combined loss
• The scales of $w_s$, $w_g$, $w_t$ are set roughly to 0.8, 0.1, 0.1.
• Weighting the loss more heavily toward the end of a video sequence amplifies the temporal gradient;
  the per-frame weights follow a Gaussian curve (0.011, 0.044, 0.135, 0.325, 0.607, 0.882, 1).
• Result: spatial $\mathcal{L}_s$ only: 0.9335; combined loss: 0.9417.
38. 3-5 Analysis: auxiliary features
• Using untextured lighting improves the convergence speed.
• Normals help the network detect object silhouettes.
• Adding depth and roughness brings further improvements.
39. 3-5 Analysis: network properties
(figure: ablation plots; the best-performing configurations are marked)
40. Part 04: Experiments
1. Reconstruction quality with low sample counts
2. Performance
41. 4-1 Reconstruction quality with low sample counts: overview of the network
(figure slide)
42. 4-1 Reconstruction quality with low sample counts (figure slide)
43. 4-2 Performance: environment
• Training
  • Trained on an NVIDIA DGX-1 ($149,000; https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/dgx-1/dgx-1-print-infographic-738238-nvidia-web.pdf)
  • 16 hours to train 500 epochs (plus 1 hour of data preprocessing)
  • Optimizer: ADAM (lr 0.001, decay rates $\beta_1 = 0.9$, $\beta_2 = 0.99$)
  • Parameters initialized with He et al.'s method
  • LeakyReLU (slope $\alpha = 0.1$, except the last layer) + max pooling
• Reconstruction performance
  • CUDA kernels + cuDNN 5.1
  • 720p reconstruction takes 54.9 ms (Titan X)
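The optimizer and initialization settings above, written out as a PyTorch sketch (the tiny two-layer `model` is a placeholder, not the paper's architecture):

```python
import torch
import torch.nn as nn

# Training setup sketch: He init, LeakyReLU (slope 0.1) except the last
# layer, max pooling, and Adam with lr=0.001, betas=(0.9, 0.99).
model = nn.Sequential(
    nn.Conv2d(7, 32, 3, padding=1), nn.LeakyReLU(0.1), nn.MaxPool2d(2),
    nn.Conv2d(32, 3, 3, padding=1),               # last layer: no activation
)

for m in model.modules():
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, a=0.1)  # He init (LeakyReLU slope)
        nn.init.zeros_(m.bias)

optimizer = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.99))
```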
44. Part 05: Conclusion
1. Conclusion
2. Limitations
45. 5-1 Conclusion: summary
• Conclusion
  • First application of a recurrent denoising AE.
  • Produces noise-free, temporally coherent animation sequences with global illumination (GI).
• Future work
  • Feed lens and time coordinates to the network so that effects such as motion blur and DoF are handled as well.
46. 5-2 Limitations of the paper
• For fine geometry such as hair, the network cannot recover image structure that is already destroyed at low spp.
• Flickering occurs when the training data is scarce.
(video comparison: right: noisy 1 spp input RGB sequence; middle: reconstructed sequence; left: reference 4096 spp sequence)
https://github.com/yuyingyeh/rdae