This document discusses generative adversarial networks (GANs) and their relationship to reinforcement learning. It begins with an introduction to GANs, explaining how they can generate images without explicitly defining a probability distribution by using an adversarial training process. The second half relates GANs to actor-critic methods and inverse reinforcement learning, showing how training a generator to fool a discriminator parallels the way a policy is trained against a learned critic or reward signal.
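As a rough illustration of that adversarial setup (my own sketch, not code from the document), the snippet below trains a toy generator and discriminator on 1D Gaussian data with PyTorch; the architectures, hyperparameters, and variable names are all illustrative assumptions.

```python
import torch
import torch.nn as nn

# Toy setup: G maps 8-D noise to 1-D samples, D scores real vs. fake samples.
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = 0.5 * torch.randn(64, 1) + 2.0   # samples from the target distribution N(2, 0.5^2)
    fake = G(torch.randn(64, 8))            # generator samples; no explicit density is ever written down

    # Discriminator step: push real scores toward 1 and (detached) fake scores toward 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: try to fool the discriminator (non-saturating GAN loss).
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```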
This document discusses various methods for calculating Wasserstein distance between probability distributions, including:
- Sliced Wasserstein distance, which projects distributions onto lower-dimensional spaces to enable efficient 1D optimal transport calculations.
- Max-sliced Wasserstein distance, which replaces averaging over random projections with the single most discriminative projection direction.
- Generalized sliced Wasserstein distance, which replaces the linear projections of the standard Radon transform with more flexible nonlinear projections (a generalized Radon transform).
- Augmented sliced Wasserstein distance, which applies a learned transformation to distributions before projecting, allowing more expressive matching between distributions.
These sliced/generalized Wasserstein distances have been used as loss functions for generative models with promising results; a minimal numerical sketch of the basic sliced variant is shown below.
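To make the first bullet concrete, here is a small NumPy sketch (not from the document; function and variable names are my own) of the basic sliced Wasserstein estimate: sample random projection directions, solve the 1D optimal transport problem by sorting, and average over directions.

```python
import numpy as np

def sliced_wasserstein_sq(X, Y, n_projections=100, seed=0):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance between
    two empirical distributions X, Y of shape (n_samples, dim).
    Assumes both point clouds contain the same number of samples."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    # Random unit vectors on the sphere: the projection ("slicing") directions.
    dirs = rng.normal(size=(n_projections, dim))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    # Project each point cloud onto every direction -> 1D empirical distributions.
    X_proj = np.sort(X @ dirs.T, axis=0)   # shape (n_samples, n_projections)
    Y_proj = np.sort(Y @ dirs.T, axis=0)
    # 1D optimal transport between equal-size samples = match sorted values.
    return np.mean((X_proj - Y_proj) ** 2)

# Toy usage with two Gaussian point clouds.
rng = np.random.default_rng(1)
X = rng.normal(0.0, 1.0, size=(500, 3))
Y = rng.normal(0.5, 1.0, size=(500, 3))
print(sliced_wasserstein_sq(X, Y))
```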
This is the material from a presentation on Deep Learning given at the AI subcommittee of the JUAS (Japan Users Association of Information Systems) Business Data Study Group. It is intended for a general audience rather than specialists.
Note that this material is a revised version of the presentation given at JUAS in December 2015.
* Satoshi Hara and Kohei Hayashi. Making Tree Ensembles Interpretable: A Bayesian Model Selection Approach. AISTATS'18 (to appear).
arXiv ver.: https://arxiv.org/abs/1606.09066#
* GitHub
https://github.com/sato9hara/defragTrees
28. Analytic LISTA (ALISTA)[8] (1/3)
• Title
– J. Liu, et al., "ALISTA: Analytic Weights are as Good as Learned Weights in LISTA" (2019)
• Overview
– An extension of LISTA-CP[7]
– Proves that LISTA's convergence rate cannot be improved beyond linear (linear convergence is optimal)
– Proposes a LISTA variant with fewer parameters (TiLISTA)
– Claims that the TiLISTA weights achieving linear convergence can be determined analytically from the dictionary alone (ALISTA)
(A rough sketch of the unrolled LISTA update appears after this slide.)
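As an illustration of what an unrolled ISTA with (learned or analytic) weights looks like, here is a minimal NumPy sketch; it is my own example, not code from [7] or [8], and the names and toy dimensions are assumptions. In LISTA all of W, S, and the thresholds are learned; in ALISTA the weight matrices are computed analytically from the dictionary and only step sizes and thresholds are trained.

```python
import numpy as np

def soft_threshold(x, theta):
    """Elementwise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def lista_forward(y, W, S, thetas):
    """Forward pass through len(thetas) unrolled LISTA-style layers.

    y      : observation vector, shape (m,)
    W      : input weight, shape (n, m)      (plain ISTA would use (1/L) A^T)
    S      : recurrent weight, shape (n, n)  (plain ISTA would use I - (1/L) A^T A)
    thetas : per-layer thresholds
    """
    x = soft_threshold(W @ y, thetas[0])
    for theta in thetas[1:]:
        x = soft_threshold(W @ y + S @ x, theta)
    return x

# Toy usage: random dictionary with ISTA-style (untrained) weights.
rng = np.random.default_rng(0)
m, n, K = 20, 50, 16
A = rng.normal(size=(m, n)) / np.sqrt(m)
L = np.linalg.norm(A, 2) ** 2            # spectral norm squared: Lipschitz constant of the gradient
W = A.T / L
S = np.eye(n) - (A.T @ A) / L
y = A @ (2.0 * np.eye(n)[:, 0])          # observation of a 1-sparse signal
x_hat = lista_forward(y, W, S, [0.05] * K)
```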
31. Survey [9] (1/1)
• Title
– V. Monga, et al., "Algorithm Unrolling: Interpretable, Efficient Deep Learning for Signal and Image Processing" (to appear)
• Overview
– A comprehensive survey of applications to signal and image processing
• Much of the material in these slides draws on this reference
– Covers unrolling methods for various optimization algorithms in addition to LISTA
• ADMM, NMF, PDHG, etc.
– Discusses connections to related methods and recent trends
32. References
• Books
– [1] H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd ed. Springer International Publishing, 2017.
– [2] A. Beck, First-Order Methods in Optimization. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2017.
– [3] M. Elad and 玉木徹, スパースモデリング : l1/l0ノルム最小化の基礎理論と画像処理への応用. 共立出版, 2016.
• Papers
– [4] A. Beck and M. Teboulle, "A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems," SIAM J. Imaging Sci., vol. 2, no. 1, pp. 183–202, 2009, doi: 10.1137/080716542. Available: http://epubs.siam.org/doi/10.1137/080716542
– [5] K. Gregor and Y. LeCun, "Learning fast approximations of sparse coding," in Proceedings of the 27th International Conference on Machine Learning, 2010, pp. 399–406.
– [6] Z. Wang, D. Liu, J. Yang, W. Han, and T. Huang, "Deep Networks for Image Super-Resolution with Sparse Prior," arXiv:1507.08905 [cs], Oct. 2015. Available: http://arxiv.org/abs/1507.08905
– [7] X. Chen, J. Liu, Z. Wang, and W. Yin, "Theoretical Linear Convergence of Unfolded ISTA and its Practical Weights and Thresholds," arXiv:1808.10038 [cs, stat], Nov. 2018. Available: http://arxiv.org/abs/1808.10038
– [8] J. Liu, X. Chen, Z. Wang, and W. Yin, "ALISTA: Analytic Weights Are As Good As Learned Weights in LISTA," in International Conference on Learning Representations, New Orleans, LA, 2019. Available: https://openreview.net/forum?id=B1lnzn0ctQ
– [9] V. Monga, Y. Li, and Y. C. Eldar, "Algorithm Unrolling: Interpretable, Efficient Deep Learning for Signal and Image Processing," arXiv:1912.10557 [cs, eess], Dec. 2019. Available: http://arxiv.org/abs/1912.10557