The document discusses FactorVAE, a method for disentangling latent representations in variational autoencoders (VAEs). It introduces Total Correlation (TC) as a penalty term that encourages independence between latent variables. TC is added to the standard VAE objective function to guide the model to learn disentangled representations. The document provides details on how TC is defined and computed based on the density-ratio trick from generative adversarial networks. It also discusses how FactorVAE uses TC to learn disentangled representations and can be evaluated using a disentanglement metric.
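An overview of SVI, which is useful for inference on large-scale datasets.
SVI is a variational Bayes method carried out within the framework of stochastic optimization.
Updated from time to time.
References
[1] Matthew D Hoffman, David M Blei, Chong Wang, and John Paisley. Stochastic variational inference. The Journal of Machine Learning Research, Vol. 14, No. 1, pp. 1303–1347, 2013.
[2] Issei Sato. トピックモデルによる統計的意味解析 (Statistical Semantic Analysis with Topic Models). Corona Publishing, 2015.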
On the Dynamics of Machine Learning Algorithms and Behavioral Game Theory (Rikiya Takahashi)
Presentation material used in a guest lecture at the University of Tsukuba on September 17, 2016.
The target audience is part-time PhD students working on machine learning, data mining, or agent-based simulation projects.
Sublabel-Accurate Convex Relaxation of Vectorial Multilabel Energies (Fujimoto Keisuke)
This document summarizes a presentation on the paper "Sublabel-Accurate Convex Relaxation of Vectorial Multilabel Energies". It discusses how the paper proposes a method to efficiently solve high-dimensional, nonlinear vectorial labeling problems by approximating them as convex problems. Specifically, it divides the problem domain into subregions and approximates each subregion with a convex function, yielding an overall approximation that is still non-convex but with higher accuracy. This lifting technique transforms the variables into a higher-dimensional space to formulate the data and regularization terms in a way that allows solving the problem as a convex optimization.
Fractality of Massive Graphs: Scalable Analysis with Sketch-Based Box-Coverin... (Kenko Nakamura)
This document proposes a sketch-based box-covering algorithm to efficiently analyze the fractality of massive graphs. It summarizes that some real-world networks have been found to be fractal in nature, but existing algorithms for determining fractality are too slow for large networks. The proposed method uses min-hash to represent boxes implicitly and solves the box-covering problem efficiently in the sketch space using a binary search tree and heap, allowing fractality analysis of networks with millions of edges for the first time.
The document proposes a novel hand posture recognition method based on curriculum learning with deep convolutional neural networks. The key ideas are:
1) Train a DCNN using curriculum learning with two heterogeneous tasks - segmentation and classification - to transfer knowledge between the tasks.
2) The network is first trained on the segmentation task and then transferred to the classification task, where its parameters are updated further.
3) Experiments show the proposed method achieves better hand shape classification accuracy compared to training without curriculum learning.
18. What sparked the attention (1)
Top results on speech recognition and image recognition benchmarks
Speech recognition (2011)
F. Seide, G. Li and D. Yu, "Conversational Speech Transcription Using Context-Dependent Deep Neural Networks", INTERSPEECH2011.
Multiple (seven) connected layers, with pre-training
Generic object recognition (2012)
A. Krizhevsky, I. Sutskever and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks", NIPS, Vol. 1, No. 2, 2012.
A multi-layer CNN greatly outperforms previous methods
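129. Normalize Layer
Local contrast normalization
[Figure labels: Convolutional layer, Normalize layer]
Normalizes within a local region of the same feature map:
$$v_{j,k} = x_{j,k} - \sum_{p,q} w_{p,q}\, x_{j+p,k+q}, \qquad \sum_{p,q} w_{p,q} = 1$$
$$y_{j,k} = \frac{v_{j,k}}{\max(C, \sigma_{jk})}, \qquad \sigma_{jk} = \sqrt{\sum_{p,q} w_{p,q}\, v_{j+p,k+q}^{2}}$$
K. Jarrett, K. Kavukcuoglu, M. Ranzato and Y. LeCun, "What is the Best Multi-Stage Architecture for Object Recognition?", ICCV2009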
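The local contrast normalization on this slide can be sketched in NumPy as follows. This is a minimal illustration, not the cited paper's implementation: the uniform window weights, the zero padding at the borders, the function name, and the default value of C are all assumptions made for the example.

```python
import numpy as np

def local_contrast_normalize(x, size=3, C=1e-2):
    """Subtractive then divisive normalization of one 2-D feature map.

    Assumptions (not from the slide): uniform weights w_{p,q} = 1/size^2,
    an odd window size, zero padding at the borders, arbitrary constant C.
    """
    x = np.asarray(x, dtype=float)
    w = np.full((size, size), 1.0 / size**2)   # weights sum to 1
    r = size // 2

    def weighted_sum(a):
        # sum_{p,q} w_{p,q} * a_{j+p, k+q} for every position (j, k)
        padded = np.pad(a, r, mode="constant")
        out = np.zeros_like(a)
        for p in range(size):
            for q in range(size):
                out += w[p, q] * padded[p:p + a.shape[0], q:q + a.shape[1]]
        return out

    v = x - weighted_sum(x)                    # subtract the local mean
    sigma = np.sqrt(weighted_sum(v ** 2))      # local standard deviation
    return v / np.maximum(C, sigma)            # divide, with floor C
```

On a constant feature map the interior of the output is (numerically) zero, since the local mean equals the value itself.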
130. Normalize Layer
Local response normalization
[Figure labels: Convolutional layer, Normalize layer]
Normalizes across different feature maps at the same position:
$$y^{i}_{j,k} = \Bigl(1 + \alpha \sum_{l=i-N/2}^{i+N/2} \bigl(y^{l}_{j,k}\bigr)^{2}\Bigr)^{\beta}$$
G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever and R. R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors", arXiv 2012
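A rough NumPy sketch of normalization across feature maps follows. It computes the factor (1 + α Σ_l (·)²)^β over the N neighbouring maps; dividing the input by that factor is the commonly used form and is an assumption here, as are the function name, the clipping of the window at the first and last maps, and the default values of N, alpha, and beta.

```python
import numpy as np

def local_response_normalize(x, N=5, alpha=1e-4, beta=0.75):
    """x has shape (maps, height, width); normalizes across maps.

    Assumptions: the input is divided by the factor, and N, alpha, beta
    are illustrative defaults rather than values from the slide.
    """
    x = np.asarray(x, dtype=float)
    maps = x.shape[0]
    out = np.empty_like(x)
    for i in range(maps):
        lo = max(0, i - N // 2)              # clip the window at the ends
        hi = min(maps, i + N // 2 + 1)
        sq_sum = np.sum(x[lo:hi] ** 2, axis=0)
        factor = (1.0 + alpha * sq_sum) ** beta
        out[i] = x[i] / factor
    return out
```

Each output value depends only on the activations at the same spatial position in neighbouring maps, which is the contrast with local contrast normalization.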
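206. Features of SSD
Detects objects using feature maps at multiple resolutions
→ can detect objects ranging from small to large
Uses many candidate bounding boxes
→ can detect many objects in a single image
Number of candidate boxes in YOLO v1
7 x 7 x 2 = 98 (7 x 7 x 3 = 147 in the latest version of the code)
Number of candidate boxes in SSD
(38 x 38 x 4) + (19 x 19 x 6) + (10 x 10 x 6) + (5 x 5 x 6) + (3 x 3 x 4) + (1 x 1 x 4) = 8,732
SSD: Single Shot MultiBox Detector, ECCV2016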
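The box counts on this slide are simple sums of (feature map size)² × (boxes per cell). The short check below reproduces them; the variable and function names are illustrative only.

```python
# (feature map size, candidate boxes per cell) for each SSD feature map
ssd_layers = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]

def count_boxes(layers):
    """Total candidate boxes: sum of size * size * boxes_per_cell."""
    return sum(size * size * boxes for size, boxes in layers)

yolo_v1 = count_boxes([(7, 2)])   # 7 x 7 grid, 2 boxes per cell
ssd = count_boxes(ssd_layers)
print(yolo_v1, ssd)  # 98 8732
```

Most of the SSD total (5,776 of 8,732) comes from the 38 x 38 map, the highest-resolution one.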
227. Data generation using games
G. Ros, "The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes", CVPR2016
A virtual city created with CG
Reproduces the look of cities in various countries, as well as weather and seasons
13,400 frames of data, labeled with 13 classes
Omnidirectional cameras
Depth images
(not yet released)
Dataset URL : http://synthia-dataset.net
238. Curriculum learning (5)
At recognition time, only the network for the classification task is used
Input data: grayscale image
Output: class label (5 in the figure)
T. Yamashita, "Hand Posture Recognition Based on Bottom-up Structured Deep Convolutional Neural Network with Curriculum Learning", ICIP2014
239. Curriculum learning (6)
[Figure: two panels, "without curriculum learning" and "with curriculum learning"; axes: ground truth class and classification class]
T. Yamashita, "Hand Posture Recognition Based on Bottom-up Structured Deep Convolutional Neural Network with Curriculum Learning", ICIP2014
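286. Natural language processing with RNNs
Recurrent neural network based language model
Word Embedding
The input word is represented as a vector
Vector length: the number of words in the dictionary
The element for the input word is 1
All other elements are 0
The past history (context) is kept as a vector
Outputs the probability of each word
Uses softmax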
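The points on this slide (one-hot input, a context vector carrying the history, softmax output over the dictionary) can be sketched as one step of a simple recurrent language model. This is a toy illustration under assumptions: the sigmoid hidden activation, the weight names U, W, V, the sizes, and the random initialization are chosen for the example and are not taken from the slide.

```python
import numpy as np

def one_hot(index, vocab_size):
    """Vector of length vocab_size with a 1 at the input word's index."""
    v = np.zeros(vocab_size)
    v[index] = 1.0
    return v

def softmax(a):
    e = np.exp(a - np.max(a))   # shift for numerical stability
    return e / e.sum()

def rnn_lm_step(word_index, h_prev, U, W, V):
    """One step: update the context vector and predict the next word."""
    x = one_hot(word_index, U.shape[1])
    h = 1.0 / (1.0 + np.exp(-(U @ x + W @ h_prev)))  # context (history)
    p = softmax(V @ h)                               # probability per word
    return h, p

rng = np.random.default_rng(0)
vocab_size, hidden_size = 10, 4      # toy sizes, chosen for illustration
U = rng.normal(scale=0.1, size=(hidden_size, vocab_size))
W = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
V = rng.normal(scale=0.1, size=(vocab_size, hidden_size))

h = np.zeros(hidden_size)
for word in [3, 1, 4]:               # a toy sequence of word indices
    h, p = rnn_lm_step(word, h, U, W, V)
```

After the loop, `p` holds a probability for each of the 10 dictionary entries, and `h` summarizes the three words seen so far.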
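323. Generative Adversarial Network
Training the Generator
Train the Generator to produce images that fool the Discriminator
Backpropagate the error of the generated data as (1 − realness)
Real data is not used
[Figure labels: z, Generator, G(z), Discriminator, D(x), training data; realness 0.9, 0.7; callout: parameters are fixed]
$$\min_G \max_D \; \mathbb{E}_{x \sim P_{data}(x)}\bigl[\ln D(x)\bigr] + \mathbb{E}_{z \sim P_z(z)}\bigl[\ln\bigl(1 - D(G(z))\bigr)\bigr]$$
I. Goodfellow, Generative Adversarial Networks, 2014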
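As a small numerical illustration of the objective on this slide, the sketch below estimates the two expectation terms from discriminator outputs. The function names and the "realness" scores are hypothetical values made up for the example; no network is trained here.

```python
import numpy as np

def gan_value(d_real, d_fake):
    """Sample estimate of E[ln D(x)] + E[ln(1 - D(G(z)))]."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

def generator_term(d_fake):
    """Only the second term depends on G, so real data is not needed."""
    d_fake = np.asarray(d_fake, dtype=float)
    return np.mean(np.log(1.0 - d_fake))

# Hypothetical discriminator outputs ("realness" scores in (0, 1))
d_real = [0.9, 0.8]
d_fake_before = [0.2, 0.3]   # the Discriminator is rarely fooled
d_fake_after = [0.7, 0.9]    # the Discriminator is fooled more often

print(generator_term(d_fake_before), generator_term(d_fake_after))
```

The generator term is lower for `d_fake_after` than for `d_fake_before`, which matches the direction of the minimization over G: higher realness for generated data means a smaller value of ln(1 − D(G(z))).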
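324. Improving the Generator
Laplacian Pyramid of Generative Adversarial Net
Takes z3 as input and generates a low-resolution image
Generates high-frequency components from z2 and the low-resolution image, and combines them with the low-resolution image
Repeats this step to increase the resolution
E. Denton, Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks, 2015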
325. Training LPGAN
Laplacian Pyramid of Generative Adversarial Net
Train a Discriminator that judges whether the high-frequency components of the high-resolution image are real
Then train the Discriminators for progressively lower resolutions
E. Denton, Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks, 2015
326. DCGAN
Deep Convolutional Generative Adversarial Nets
The Generator is built from a CNN
A. Radford, Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015
327. DCGAN
Deep Convolutional Generative Adversarial Nets
A. Radford, Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, 2015
343. Installing Caffe (1)
Prepare the setup file
Copy Makefile.config.example to Makefile.config
cp Makefile.config.example Makefile.config
Edit Makefile.config
## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!
# cuDNN acceleration switch (uncomment to build with cuDNN).
# USE_CUDNN := 1
# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1
# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0
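Remove the "#" from USE_CUDNN to use cuDNN
Remove the "#" from CPU_ONLY to run on the CPU only
Remove the "#" from the USE_OPENCV / USE_LEVELDB / USE_LMDB lines that apply to how input data is supplied (as the comment above notes, uncommenting disables that dependency)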
344. Installing Caffe (2)
Edit Makefile.config (continued)
# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := atlas
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas
# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib
Specify the BLAS package
Specify the location of the BLAS package
(if it is not in a standard location)
Settings mainly for macOS
345. Installing Caffe (3)
Edit Makefile.config (continued)
# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app
# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
PYTHON_INCLUDE := /usr/include/python2.7 \
    /usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
# ANACONDA_HOME := $(HOME)/anaconda
# PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
# $(ANACONDA_HOME)/include/python2.7 \
# $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include \
Only when using Caffe from MATLAB:
specify the MATLAB location
Specify the Python location
When using Anaconda,
specify the Python location here instead
346. Installing Caffe (4)
Edit Makefile.config (continued)
# Uncomment to use Python 3 (default is Python 2)
# PYTHON_LIBRARIES := boost_python3 python3.5m
# PYTHON_INCLUDE := /usr/include/python3.5m \
# /usr/lib/python3.5/dist-packages/numpy/core/include
# We need to be able to find libpythonX.X.so or .dylib.
PYTHON_LIB := /usr/lib
# PYTHON_LIB := $(ANACONDA_HOME)/lib
# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib
Settings for using Python 3
Specify the location of the Python libraries
No changes are needed in particular
347. Installing Caffe (5)
Edit Makefile.config (continued)
# Uncomment to support layers written in Python (will link against Python libs)
# WITH_PYTHON_LAYER := 1
# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib
When using custom layers written in Python
When libraries and the like are placed
in non-standard locations
No changes are needed in particular
366. Changes to create_mnist.sh
#!/usr/bin/env sh
# This script converts the mnist data into lmdb/leveldb format,
# depending on the value assigned to $BACKEND.
set -e
EXAMPLE=examples/mnist
DATA=data/mnist
BUILD=build/examples/mnist
BACKEND="lmdb"
echo "Creating ${BACKEND}..."
rm -rf $EXAMPLE/mnist_train_${BACKEND}
rm -rf $EXAMPLE/mnist_test_${BACKEND}
$BUILD/convert_mnist_data.bin $DATA/train-images-idx3-ubyte \
  $DATA/train-labels-idx1-ubyte $EXAMPLE/mnist_train_${BACKEND} --backend=${BACKEND}
$BUILD/convert_mnist_data.bin $DATA/t10k-images-idx3-ubyte \
  $DATA/t10k-labels-idx1-ubyte $EXAMPLE/mnist_test_${BACKEND} --backend=${BACKEND}
echo "Done."
Edit here: set BUILD to the directory that contains convert_mnist_data.bin
393. TensorFlow
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (placeholders and sessions)

# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
linear_model = W * x + b
# Loss: sum of the squares
loss = tf.reduce_sum(tf.square(linear_model - y))
# Optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
# Training data
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]
# Training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)  # set W and b to their (deliberately wrong) initial values
for i in range(1000):
    sess.run(train, {x: x_train, y: y_train})
# Evaluate the result of training
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train})
print("W: %s b: %s loss: %s" % (curr_W, curr_b, curr_loss))
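The training data on this slide lie exactly on the line y = −x + 1, so gradient descent should approach W = −1 and b = 1 with a loss near 0. As a cross-check that does not need TensorFlow, the same least-squares problem can be solved in closed form with NumPy; the design-matrix construction below is one possible way to set it up.

```python
import numpy as np

x_train = np.array([1, 2, 3, 4], dtype=float)
y_train = np.array([0, -1, -2, -3], dtype=float)

# Design matrix with a column of ones, so that A @ [W, b] = W * x + b
A = np.stack([x_train, np.ones_like(x_train)], axis=1)
(W, b), *_ = np.linalg.lstsq(A, y_train, rcond=None)

# Same loss as on the slide: sum of squared residuals
loss = np.sum((W * x_train + b - y_train) ** 2)
print(W, b, loss)
```

Comparing these values with the `curr_W`, `curr_b`, and `curr_loss` printed by the TensorFlow script gives a quick sanity check that training converged.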
394. TensorFlow
import argparse
import sys

import tensorflow as tf  # TensorFlow 1.x API
from tensorflow.examples.tutorials.mnist import input_data

FLAGS = None


def main(_):
    # Load MNIST with one-hot labels
    mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
    # Create the model: a single linear layer (softmax regression)
    x = tf.placeholder(tf.float32, [None, 784])
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    y = tf.matmul(x, W) + b
    # Define loss and optimizer
    y_ = tf.placeholder(tf.float32, [None, 10])
    cross_entropy = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
    train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
    sess = tf.InteractiveSession()
    tf.global_variables_initializer().run()
    # Train on mini-batches of 100 images
    for _ in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
    # Test trained model
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print(sess.run(accuracy, feed_dict={x: mnist.test.images,
                                        y_: mnist.test.labels}))


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--data_dir', type=str, default='/tmp/tensorflow/mnist/input_data',
                        help='Directory for storing input data')
    FLAGS, unparsed = parser.parse_known_args()
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
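To make the loss and accuracy computations in this script concrete, here is a small NumPy sketch of the same two quantities: the mean softmax cross-entropy over one-hot labels, and the fraction of rows whose argmax matches the label. The function names and the toy logits are made up for illustration; this is not how TensorFlow implements them internally.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean cross-entropy between one-hot labels and softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -np.mean(np.sum(labels * log_probs, axis=1))

def accuracy(logits, labels):
    """Fraction of rows whose argmax matches the one-hot label."""
    return np.mean(np.argmax(logits, axis=1) == np.argmax(labels, axis=1))

# Toy batch: 3 examples, 3 classes (values chosen for illustration)
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.1, 3.0],
                   [1.0, 1.5, 0.3]])
labels = np.array([[1, 0, 0],
                   [0, 0, 1],
                   [1, 0, 0]], dtype=float)

print(softmax_cross_entropy(logits, labels), accuracy(logits, labels))
```

With all-zero logits, as produced by the zero-initialized W and b before training, the cross-entropy equals ln(number of classes), which is a useful reference value for the first training step.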