Minimax statistical learning with Wasserstein distances (NeurIPS2018 Reading ...) (Kenta Oono)
This document summarizes a presentation on the paper "Minimax statistical learning with Wasserstein distances" which develops a distributionally robust risk minimization framework using Wasserstein distances. It minimizes the worst-case risk over distributions close to the true distribution, as measured by the p-Wasserstein distance. The paper shows that the excess risk rate of the proposed estimator is the same as the non-robust case, at O(n^{-1/2}). The presentation highlights the key ideas and lemmas used in the paper's analysis.
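For concreteness, the robust risk described above can be written as follows. The notation here is reconstructed from the summary rather than copied from the paper: P is the data distribution, P_n its empirical counterpart from n samples, W_p the p-Wasserstein distance, and rho the radius of the ambiguity ball.

```latex
% Distributionally robust (worst-case) risk over a p-Wasserstein ball of
% radius rho around P, and the corresponding empirical minimax estimator.
R_\rho(f) = \sup_{Q \,:\, W_p(Q, P) \le \rho} \mathbb{E}_{Z \sim Q}\bigl[\ell(f, Z)\bigr],
\qquad
\hat{f}_n \in \operatorname*{arg\,min}_{f \in \mathcal{F}}
  \sup_{Q \,:\, W_p(Q, P_n) \le \rho} \mathbb{E}_{Z \sim Q}\bigl[\ell(f, Z)\bigr].
```

The paper's result, as summarized above, is that the excess robust risk of this estimator decays at the same O(n^{-1/2}) rate as in the non-robust setting.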
Deep learning for molecules, introduction to Chainer Chemistry (Kenta Oono)
1) The document introduces machine learning and deep learning techniques for predicting chemical properties, contrasting rule-based approaches with learning-based approaches built on neural message passing algorithms.
2) It discusses several graph neural network models, such as NFP, GGNN, WeaveNet, and SchNet, that can be applied to molecular graphs to predict properties. These models update atom representations through message passing and graph convolution operations (a minimal sketch follows this list).
3) Chainer Chemistry is introduced as a deep learning library built on Chainer for chemical property prediction with these graph neural network models. Example tasks include drug discovery and molecular generation.
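As a rough illustration of the message passing mentioned in 2), here is a minimal single update step on a molecular graph in plain NumPy. This is a generic sketch, not the exact NFP, GGNN, WeaveNet, or SchNet update rule.

```python
import numpy as np

def message_passing_step(H, A, W):
    """One generic message passing / graph convolution step.

    H: (num_atoms, dim) current atom representations
    A: (num_atoms, num_atoms) adjacency matrix of the molecular graph
    W: (dim, dim) learnable weight matrix
    """
    messages = A @ H              # each atom aggregates its neighbors' representations
    return np.tanh(messages @ W)  # transform and apply a nonlinearity

# Toy molecule with 3 atoms and 4-dimensional atom features
H = np.random.randn(3, 4)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
W = np.random.randn(4, 4)
H = message_passing_step(H, A, W)  # updated atom representations
print(H.shape)  # (3, 4)
```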
Overview of Machine Learning for Molecules and Materials Workshop @ NIPS2017 (Kenta Oono)
The document provided an overview of the Machine Learning for Molecules and Materials Workshop at NIPS 2017. It discussed recent advances in using machine learning for molecular and materials applications, including molecule generation with variational autoencoders and learning graph-structured molecular data with graph convolution networks. The workshop featured talks on topics such as deep learning approaches for chemistry, kernel learning with structured data, and machine learning applications in drug discovery and material informatics.
Comparison of deep learning frameworks from a viewpoint of double backpropagation (Kenta Oono)
This document compares deep learning frameworks from the perspective of double backpropagation. It discusses the typical technology stacks and design choices of frameworks such as Chainer, PyTorch, and TensorFlow. It also provides a primer on double backpropagation, explaining how the gradient of a loss function with respect to the inputs can itself be differentiated again. Code examples of double backpropagation are shown for Chainer, PyTorch, and TensorFlow.
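The slides contain their own examples for each framework; as a stand-in, here is a minimal double backpropagation sketch in PyTorch (not taken from the slides): the gradient of a loss with respect to the input is computed with create_graph=True and then differentiated a second time through a gradient penalty.

```python
import torch

x = torch.randn(5, requires_grad=True)
w = torch.randn(5, requires_grad=True)

loss = ((w * x).sum() - 1.0) ** 2

# First backpropagation: d(loss)/dx, keeping the graph so the gradient
# computation itself can be differentiated again.
gx, = torch.autograd.grad(loss, x, create_graph=True)

# Use the gradient in a new scalar (e.g. a gradient penalty) ...
penalty = (gx ** 2).sum()

# ... and backpropagate a second time, now populating w.grad.
penalty.backward()
print(w.grad)
```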
The deep learning framework Chainer is introduced. Chainer allows neural networks to be defined as Python programs, which makes model construction flexible. It supports both CPU and GPU computation, and various libraries, such as ChainerRL for reinforcement learning, have been developed on top of it. The development team maintains and improves Chainer through frequent releases and community events.
GTC Japan 2016: Chainer feature introduction (Kenta Oono)
This document introduces Chainer's new trainer and dataset abstraction features which provide a standardized way to implement training loops and access datasets. The key aspects are:
- Trainer handles the overall training loop and allows extensions to customize checkpoints, logging, evaluation etc.
- Updater handles fetching mini-batches and model optimization within each loop.
- Iterators handle accessing datasets and returning mini-batches.
- Extensions can be added to the trainer for tasks like evaluation, visualization, and saving snapshots.
This abstraction makes implementing training easier and more customizable while still allowing manual control when needed. Common iterators, updaters, and extensions are provided to cover most use cases.
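A minimal sketch of how these pieces fit together in Chainer, using a hypothetical toy dataset and model rather than code from the slides:

```python
import numpy as np
import chainer
import chainer.functions as F
import chainer.links as L
from chainer import training
from chainer.training import extensions

# Hypothetical toy regression dataset and model, just to show the wiring.
x = np.random.randn(100, 3).astype(np.float32)
y = x.sum(axis=1, keepdims=True).astype(np.float32)
dataset = chainer.datasets.TupleDataset(x, y)

model = L.Classifier(L.Linear(3, 1), lossfun=F.mean_squared_error)
model.compute_accuracy = False

optimizer = chainer.optimizers.SGD(lr=0.01)
optimizer.setup(model)

# Iterator: yields mini-batches from the dataset.
train_iter = chainer.iterators.SerialIterator(dataset, batch_size=10)

# Updater: fetches a mini-batch from the iterator and runs one optimization step.
updater = training.StandardUpdater(train_iter, optimizer)

# Trainer: drives the overall loop; extensions hook into it.
trainer = training.Trainer(updater, (5, 'epoch'), out='result')
trainer.extend(extensions.LogReport())
trainer.extend(extensions.PrintReport(['epoch', 'main/loss']))
trainer.extend(extensions.snapshot(), trigger=(5, 'epoch'))

trainer.run()
```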
This document discusses benchmarking deep learning frameworks like Chainer. It begins by defining benchmarks and their importance for framework developers and users. It then examines examples like convnet-benchmarks, which objectively compares frameworks on metrics like elapsed time. It discusses challenges in accurately measuring elapsed time for neural network functions, particularly those with both Python and GPU components. Finally, it introduces potential solutions like Chainer's Timer class and mentions the DeepMark benchmarks for broader comparisons.
This document provides an overview and agenda for a tutorial on deep learning implementations and frameworks. The tutorial is split into two sessions: the first covers the basics of neural networks, common design aspects of neural network implementations, and differences between deep learning frameworks; the second consists of coding examples in different frameworks and a conclusion. Slide decks and resources are provided for each of these topics. The tutorial aims to introduce the fundamentals of deep learning and compare popular frameworks.
This document provides an overview of VAE-type deep generative models, especially RNNs combined with VAEs. It begins with notations and abbreviations used. The agenda then covers the mathematical formulation of generative models, the Variational Autoencoder (VAE), variants of VAE that combine it with RNNs (VRAE, VRNN, DRAW), a Chainer implementation of Convolutional DRAW, other related models (Inverse DRAW, VAE+GAN), and concludes with challenges of VAE-like generative models.
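For reference, the VAE objective shared by these models, written as a minimal Chainer sketch (not code from the slides; the one-layer encoder and decoder are placeholders):

```python
import numpy as np
import chainer
import chainer.functions as F
import chainer.links as L

class TinyVAE(chainer.Chain):
    """A deliberately tiny VAE: one linear encoder, one linear decoder."""

    def __init__(self, n_in=784, n_latent=20):
        super().__init__()
        with self.init_scope():
            self.enc_mu = L.Linear(n_in, n_latent)
            self.enc_ln_var = L.Linear(n_in, n_latent)
            self.dec = L.Linear(n_latent, n_in)

    def loss(self, x):
        mu, ln_var = self.enc_mu(x), self.enc_ln_var(x)
        z = F.gaussian(mu, ln_var)                           # reparameterization trick
        rec = F.bernoulli_nll(x, self.dec(z)) / len(x)       # reconstruction term
        kl = F.gaussian_kl_divergence(mu, ln_var) / len(x)   # KL(q(z|x) || p(z))
        return rec + kl                                      # negative ELBO

x = np.random.rand(8, 784).astype(np.float32)
model = TinyVAE()
print(model.loss(x))
```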
Common Design of Deep Learning Frameworks (Kenta Oono)
The document provides an overview of a tutorial on deep learning implementations and frameworks. It discusses:
1) The agenda of the tutorial, which covers an introduction to neural networks, common designs of frameworks, and differences between frameworks.
2) Key steps in training neural networks, including preparing data, computing the loss and gradients, and updating parameters, as well as common underlying technology such as computational graphs and automatic differentiation (a minimal sketch follows this list).
3) Common components of deep learning frameworks, such as graphical interfaces, workflow management, computational graph handling, array libraries, and hardware support like GPUs.
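To make item 2) concrete, here is a minimal hand-written training step of the kind most frameworks implement, sketched in Chainer; the toy data and model are placeholders, not material from the tutorial.

```python
import numpy as np
import chainer
import chainer.functions as F
import chainer.links as L

# Prepare data (a toy regression problem).
x = np.random.randn(32, 3).astype(np.float32)
t = x.sum(axis=1, keepdims=True).astype(np.float32)

model = L.Linear(3, 1)
optimizer = chainer.optimizers.SGD(lr=0.1)
optimizer.setup(model)

for step in range(100):
    y = model(x)                         # forward pass builds the computational graph
    loss = F.mean_squared_error(y, t)    # compute the loss
    model.cleargrads()                   # clear old gradients
    loss.backward()                      # automatic differentiation (backward pass)
    optimizer.update()                   # update parameters from the gradients
print(loss)
```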
9. Examples of techniques built with Caffe (1)
Deep Q-Network* (reinforcement learning with deep learning)
Pong / Space Invaders
Yasuhiro Fujita, "Tried deep reinforcement learning by implementing a Deep Q-Network in Caffe"**
Eiichi Matsumoto, PFI internship 2014 final presentation***
* Playing Atari with Deep Reinforcement Learning. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller. NIPS Deep Learning Workshop, 2013
** http://d.hatena.ne.jp/muupan/20141021/1413850461
*** http://www.ustream.tv/recorded/53153399
11. Terminology: architecture
• Net: the entire architecture of a neural net (NN)
• Blob: a node or neuron; (confusingly) sometimes also called a "... layer"
• Layer: a module that connects Blobs at different levels of the net
[Figure: a Net made of Blobs (x_1 ... x_N, h_1 ... h_H, k_1 ... k_M, y_1 ... y_M, t_1 ... t_M) connected by Layers, with arrows indicating the Forward and Backward passes]
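A minimal pycaffe sketch of the Net / Blob / Layer terminology above; 'deploy.prototxt' is a placeholder model definition, and the Python interface shown is the one in BVLC Caffe, which may differ between versions (this example is not from the slides).

```python
import caffe

net = caffe.Net('deploy.prototxt', caffe.TEST)   # Net: the whole architecture

# Blobs hold the data flowing between layers (x, h, y, ... in the figure above).
for name, blob in net.blobs.items():
    print(name, blob.data.shape)

# Layers connect blobs; their learnable parameters live in net.params.
for name, params in net.params.items():
    print(name, [p.data.shape for p in params])

net.forward()    # Forward pass through all layers
net.backward()   # Backward pass (gradients flow back through the blobs)
```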
31. Handling high-dimensional input data
Q. Is there a way to train efficiently on high-dimensional, sparse input?
A. There is a Sparse Blob implementation, but it has not been merged
An input Blob for feeding sparse data has already been implemented as a Sparse Blob and proposed as a pull request*. However, since it has not been merged yet, you would have to use it at your own risk.
During the talk I said I might be confusing this with Torch7; in fact, a layer that takes sparse input and computes dot products does exist in Torch7 as SparseLinear (master branch, commit 704684)** ***
* Sparse Data Support #937 (https://github.com/BVLC/caffe/pull/937)
** https://github.com/torch/nn/blob/master/SparseLinear.lua
*** https://github.com/torch/nn/blob/master/doc/simple.md#nn.SparseLinear
32. Running Caffe in parallel with maf
Q. In the demo that ran Caffe through maf, why was the -j1 option, which restricts execution to a single parallel job, specified?
A. Because the leveldb file used by the Data Layer cannot be accessed concurrently
When training data is fed through a Data Layer, the input files must first be converted to an appropriate format (Caffe provides the conversion commands). One of those formats, leveldb, does not allow multiple processes to access data in the same file at the same time (see the issue below*).
In this demo many different prototxt files were generated, but they all shared the same input file. The demo was therefore run with a single job to prevent contention between tasks running in parallel.
According to the issue, using the LMDB format instead of leveldb avoids this problem, so parallel execution should work fine.
* Parallel access to Leveldb #695 (https://github.com/BVLC/caffe/issues/695)
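As an aside (not from the slides), the difference is easy to see from the reader side: LMDB allows many concurrent read-only transactions, so several processes can share one database file. A minimal sketch assuming the lmdb Python package and a database directory train_lmdb/ produced by Caffe's conversion tool:

```python
import lmdb

# Each worker process can open the same database read-only.
# lock=False skips the writer lock, which pure readers do not need.
env = lmdb.open('train_lmdb', readonly=True, lock=False)

with env.begin() as txn:               # a read-only transaction
    cursor = txn.cursor()              # iterate over (key, serialized datum) pairs
    for key, value in cursor:
        print(key, len(value))
        break                          # just peek at the first record
env.close()
```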
33. Installing Caffe on Mac OS 10.9 (1/2)
Q. Is there an easy way to install Caffe on Mac OS 10.9?
A. If you do not use the GPU, the procedure is almost the same as on 10.8
As the installation instructions on the official site explain, the installation procedure on Mac OS 10.9 differs from that on 10.8 and earlier.
One of the reasons is the C++ standard library used by default: on Mac OS 10.9 the default C++ compiler, Clang++, uses libc++ by default, whereas NVIDIA's CUDA can only be linked against libraries compiled with libstdc++.