Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis

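Copyright © 2020 Harmonious Systems Engineering Laboratory (調和系工学研究室), Research Group of Synergetic Information Engineering, Division of Computer Science and Information Technology, Faculty of Information Science and Technology, Hokkaido University – All rights reserved.
DL Seminar
Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis
Faculty of Information Science and Technology, Hokkaido University
Division of Computer Science and Information Technology, Research Group of Synergetic Information Engineering, Harmonious Systems Engineering Laboratory
4th-year undergraduate: 大倉博貴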

• Authors
– Bingchen Liu, Yizhe Zhu, Kunpeng Song, Ahmed Elgammal
• Venue
– ICLR 2021
• Paper URL
– https://openreview.net/pdf?id=1Fqg133qRaI
• GitHub
– https://github.com/odegeasslbc/FastGAN-pytorch
Paper introduction 2

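• Proposes techniques for quickly training a high-resolution image generation model (GAN) on a small amount of data
– A lightweight generator that can be trained efficiently
– A regularization that lets the discriminator be trained effectively even with little data
Paper overview 3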

• Generates data close to real data by making a generator and a discriminator compete
What is a GAN? 4
[Figure: GAN diagram. Labels: noise, generator, real data, discriminator, "real or fake?"]
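For reference, the competition described on this slide is usually written as the standard minimax objective below. This is the textbook GAN formulation, not a formula taken from these slides; here G denotes the generator, D the discriminator, p_data the real-data distribution, and p_z the noise distribution.

```latex
\min_{G}\max_{D}\;
\mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
+ \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```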

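• To generate high-resolution images, the generator becomes a deep model
– The number of convolution layers grows with each up-sampling step
• Training a deep model calls for ResBlocks that use skip connections, but
– skip connections can only link features of the same resolution
– skip connections at high resolution are costly
• Skip-Layer Excitation (SLE) is proposed
Generator design 5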

• Skip-Layer Excitation (SLE)
– Implements a low-cost skip connection that works across different resolutions
Proposed generator 6

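• Each channel of the high-resolution feature map is reweighted based on the low-resolution feature map
– Because the weighting is per channel, the two resolutions may differ
– Gradients are preserved between layers
• The weight computation is lightweight because it collapses the spatial dimensions early
Skip-Layer Excitation (SLE) 7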
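The channel-wise reweighting described on this slide can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' PyTorch implementation: the function names, the 4×4 pooling size, the LeakyReLU slope of 0.1, and the random weights `w1` and `w2` are all hypothetical choices made for the example.

```python
import numpy as np

def adaptive_avg_pool(x, out_size=4):
    """Average-pool a (C, H, W) map to (C, out_size, out_size).

    Assumes H and W are divisible by out_size.
    """
    c, h, w = x.shape
    x = x.reshape(c, out_size, h // out_size, out_size, w // out_size)
    return x.mean(axis=(2, 4))

def sle(x_low, x_high, w1, w2, slope=0.1):
    """Gate each channel of x_high with a weight computed from x_low."""
    pooled = adaptive_avg_pool(x_low, 4)               # spatial dims collapsed early
    hidden = np.einsum("ocij,cij->o", w1, pooled)      # 4x4 conv without padding -> (C_high,)
    hidden = np.where(hidden > 0, hidden, slope * hidden)  # LeakyReLU (slope is an assumption)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))        # 1x1 conv as a matmul, then sigmoid
    return x_high * gate[:, None, None]                # broadcast over height and width

rng = np.random.default_rng(0)
x_low = rng.standard_normal((8, 8, 8))        # low-resolution features: 8 channels at 8x8
x_high = rng.standard_normal((4, 128, 128))   # high-resolution features: 4 channels at 128x128
w1 = rng.standard_normal((4, 8, 4, 4)) * 0.1  # hypothetical weights
w2 = rng.standard_normal((4, 4)) * 0.1
y = sle(x_low, x_high, w1, w2)
print(y.shape)
```

Because the gate has one scalar per channel, the low- and high-resolution maps never need matching spatial sizes, and the only operation at full resolution is a single multiplication.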

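• The idea is to treat the discriminator as an encoder and train it together with a small decoder (auto-encoding)
– Extracting image features that can be reconstructed well prevents overfitting on small datasets
• A Self-Supervised Discriminator is proposed
Idea for the discriminator 8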

• Self-Supervised Discriminator
– Reconstructs the input image (a downsampled version or a cropped region of it) from the discriminator's features
Proposed discriminator 9
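The two kinds of reconstruction target mentioned here (a downsampled image and a cropped region) can be sketched as below. This is an illustration only: the function names, the quadrant-based cropping, and the plain L1 distance are assumptions made for the example, and the decoder output is replaced by a stand-in array rather than a trained network.

```python
import numpy as np

def downsample(img, size):
    """Target option 1: average-pool a (C, H, W) image to (C, size, size)."""
    c, h, w = img.shape
    return img.reshape(c, size, h // size, size, w // size).mean(axis=(2, 4))

def crop_quadrant(img, index):
    """Target option 2: cut out one of the four quadrants (index 0..3)."""
    _, h, w = img.shape
    top = (index // 2) * (h // 2)
    left = (index % 2) * (w // 2)
    return img[:, top:top + h // 2, left:left + w // 2]

def reconstruction_loss(decoded, target):
    """Mean absolute error between decoder output and target (an assumed distance)."""
    return float(np.mean(np.abs(decoded - target)))

rng = np.random.default_rng(0)
img = rng.random((3, 256, 256))      # stand-in for a real input image
small = downsample(img, 128)         # downsampled target
part = crop_quadrant(img, 3)         # bottom-right quadrant as the cropped target
fake_decoded = small + 0.1           # stand-in for a decoder output
loss = reconstruction_loss(fake_decoded, small)
print(small.shape, part.shape)
```

In training, the decoder would receive discriminator features instead of the stand-in array, so a low loss forces those features to retain enough information to redraw the image.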

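• Trained with a reconstruction loss and the hinge loss[1]
– The hinge loss was the fastest to compute
Learning formulation 10
f: features extracted by the discriminator
x: input image to the discriminator
G(): decoder (the reconstruction step)
T(): downsampling or extracting a partial region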
[1] Jae Hyun Lim and Jong Chul Ye. Geometric gan. arXiv preprint arXiv:1705.02894, 2017.
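The equations on this slide did not survive extraction. Using the symbols defined above and the standard hinge form of the adversarial loss, the objectives can be written as below. This is a reconstruction that should be checked against the paper: the decoder G() and the processing T() from the slide are typeset as \mathcal{G} and \mathcal{T} to keep them apart from the generator G, and I_real denotes the set of real images.

```latex
\mathcal{L}_{\text{recons}} =
  \mathbb{E}_{f \sim D_{\text{encode}}(x),\; x \sim I_{\text{real}}}
  \bigl[\, \lVert \mathcal{G}(f) - \mathcal{T}(x) \rVert \,\bigr]

\mathcal{L}_{D} =
  -\,\mathbb{E}_{x \sim I_{\text{real}}}\bigl[\min(0,\, -1 + D(x))\bigr]
  -\,\mathbb{E}_{\hat{x} \sim G(z)}\bigl[\min(0,\, -1 - D(\hat{x}))\bigr]
  + \mathcal{L}_{\text{recons}}

\mathcal{L}_{G} = -\,\mathbb{E}_{z \sim \mathcal{N}}\bigl[ D(G(z)) \bigr]
```

The reconstruction term is added only to the discriminator loss, which matches the slide's description of the decoder as a regularizer for the discriminator.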

• StyleGAN2[1][2]
– State-of-the-art model
• Baseline
– A strong model based on DCGAN[3] that integrates various techniques
• Ours
– The Baseline with Skip-Layer Excitation and the Self-Supervised Discriminator added
Models used in the experiments 11
[1] Tero Karras, Miika Aittala, Janne Hellsten, Samuli Laine, Jaakko Lehtinen, and Timo Aila. Training generative adversarial networks with
limited data. arXiv preprint arXiv:2006.06676, 2020a.
[2] Shengyu Zhao, Zhijian Liu, Ji Lin, Jun-Yan Zhu, and Song Han. Differentiable augmentation for data-efficient gan training. arXiv
preprint arXiv:2006.10738, 2020.
[3] Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial
networks. arXiv preprint arXiv:1511.06434, 2015.

• Run with PyTorch[1]
• Results
– Computation time is roughly 1/4 that of StyleGAN2, and the model size is roughly half
Experiment 1: Computational cost comparison 12
[1] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga,
and Adam Lerer. Automatic differentiation in pytorch. 2017.

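• Datasets
– 12 image sets of varied content at 256×256 and 1024×1024
– FFHQ[1], Animal-Face Dog and Cat[2], Oxford-flowers[3], etc.
• Evaluation metric
– FID[4]: the most common metric, measuring the feature distance between real and generated images
Experiment 2: Quantitative and qualitative evaluation 13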
[1] Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In Proceedings of
the IEEE conference on computer vision and pattern recognition, pp. 4401–4410, 2019.
[2] Zhangzhang Si and Song-Chun Zhu. Learning hybrid image templates (hit) by information projection. IEEE Transactions on pattern
analysis and machine intelligence, 34(7):1354–1367, 2011.
[3] Maria-Elena Nilsback and Andrew Zisserman. A visual vocabulary for flower classification. In IEEE Conference on Computer Vision and
Pattern Recognition, volume 2, pp. 1447–1454, 2006.
[4] Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. Gans trained by a two time-scale
update rule converge to a local nash equilibrium. In Advances in neural information processing systems, pp. 6626–6637, 2017.
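As a rough illustration of the FID metric cited above, the Fréchet distance between two sets of feature vectors can be computed in NumPy as follows. This is a sketch under assumptions: in practice the features come from a pretrained Inception network, whereas here random vectors stand in for them, and the function name is hypothetical.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two (N, D) feature sets."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr(sqrt(cov_a @ cov_b)) equals the sum of square roots of the product's
    # eigenvalues, which are real and non-negative for covariance matrices.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)

rng = np.random.default_rng(0)
real_feats = rng.standard_normal((200, 4))  # stand-in for features of real images
fake_feats = real_feats + 3.0               # same spread, mean shifted by 3 in each dimension
d_same = frechet_distance(real_feats, real_feats)
d_shift = frechet_distance(real_feats, fake_feats)
print(d_same, d_shift)
```

Identical feature sets give a distance of approximately zero, and a pure mean shift contributes only the squared distance between the means, so lower values indicate generated images whose feature statistics are closer to the real ones.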

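• 256×256 (few-shot data)
• 1024×1024 (few-shot data)
– High performance with short computation time
– "decode" (the self-supervised decoder) brings a larger benefit than "Skip" (SLE)
Experiment 2: Results (quantitative evaluation) 14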

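• 1024×1024 (larger datasets)
– As the amount of data grows, StyleGAN2 gradually gains the advantage
• Comparison of Self-Supervised Discriminator variants
– Input reconstruction (auto-encoding) performs best
Experiment 2: Results (quantitative evaluation) 15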

• 1024×1024 (10 hours of training)
Experiment 2: Results (qualitative evaluation) 16

• 1024×1024 (10 hours of training)
Experiment 2: Results (qualitative evaluation) 17

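• Proposes techniques for quickly training a high-resolution image generation model (GAN) on a small amount of data
– A lightweight generator that can be trained efficiently
– A regularization that lets the discriminator be trained effectively even with little data
• The proposed method outperforms StyleGAN2 on few-shot data
Summary 18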

Thank you
