DEEP LEARNING JP
[DL Papers]
http://deeplearning.jp/
Deep Learning JP paper reading session (論文乱読会)
2018/4/6
Aman Sinha, Hongseok Namkoong, John Duchi (Stanford University)
“Certifying Some Distributional Robustness with Principled Adversarial Training” (ICLR2018, Oral)
Overview
Proposes an efficient method that certifies robustness of inference to perturbations of the data distribution, using the viewpoint of distributional robustness over a Wasserstein ball.
Key Point of the Proposed Method
(1) Rewrite the distributionally robust objective with a Lagrangian relaxation (a sketch of this form is given below).
(2) The exact inner maximization is NP-hard in general, so it is approximated in a tractable way => (the presenter notes they have not fully understood this part, sorry).
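For reference, a sketch of the Lagrangian form behind steps (1)-(2), reconstructed from the paper's setup rather than taken from the slide (P_0 is the data distribution, c a transport cost, W_c the corresponding Wasserstein distance):

```latex
% Duality for the Wasserstein-ball robust risk (Sinha et al., Proposition 1):
% the hard constraint is exchanged for a penalty, and training minimizes the
% surrogate \phi_\gamma with a fixed multiplier \gamma.
\sup_{P:\,W_c(P,P_0)\le\rho} \mathbb{E}_P[\ell(\theta;Z)]
  = \inf_{\gamma\ge 0}\Big\{\gamma\rho + \mathbb{E}_{Z\sim P_0}\big[\phi_\gamma(\theta;Z)\big]\Big\},
\qquad
\phi_\gamma(\theta;z_0) := \sup_{z}\big\{\ell(\theta;z) - \gamma\,c(z,z_0)\big\}.
```

For sufficiently large γ (relative to the smoothness of the loss), the inner supremum becomes strongly concave, so a few gradient-ascent steps approximate it well; this is what sidesteps the NP-hardness of the exact maximization in (2).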
Difference from Previous Work
More efficient than existing methods built on distributional robustness (which depend on how the set of distributions P is chosen?), and applicable to a wider class of losses.
Unlike input-perturbation heuristics such as the fast gradient sign method (FGSM) [Goodfellow, 2015], it does not rest on a linear approximation of the loss around the input.
Main Insights / Results
岩澤
Hongseok Namkoong, John Duchi (Stanford University)
“Variance-based Regularization with Convex Objectives” (NIPS2017, Oral, Best Paper)
Overview / Main Insights
Proposes a method for performing inference that is robust to the data-generating process P. Example applications include adversarial examples and domain adaptation.
Difference from Previous Work
Methods existed that directly minimize a variance penalty (the expression sketched below), but that penalty is non-convex even when the loss is convex. The proposed method approximates the same objective while remaining convex.
Key Point of the Proposed Method
Recast the problem from the distributional-robustness viewpoint, together with an efficient training procedure for it [Duchi & Namkoong, 2016].
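As a rough reconstruction of the expression the slide refers to (the slide's own equation image is not preserved here), the variance-penalized risk and its convex distributionally robust surrogate look roughly like this, with \hat P_n the empirical distribution and C = \sqrt{2\rho}:

```latex
% Variance-penalized empirical risk (non-convex in \theta in general) ...
\min_\theta\;
  \mathbb{E}_{\hat P_n}\!\big[\ell(\theta;X)\big]
  + C\sqrt{\tfrac{1}{n}\,\mathrm{Var}_{\hat P_n}\!\big(\ell(\theta;X)\big)}
% ... and the convex robust surrogate over a chi-square divergence ball,
% which matches it up to lower-order terms with high probability:
\;\approx\;
\min_\theta\;
  \sup_{P:\,D_{\chi^2}(P\,\|\,\hat P_n)\le \rho/n}
  \mathbb{E}_P\!\big[\ell(\theta;X)\big].
```

The right-hand side stays convex in θ whenever the loss is convex, which is the point made under "Difference from Previous Work" above.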
Theoretical Results
1. With high probability, the proposed objective is controlled by (and closely tracks) the variance-penalized optimum, yielding generalization guarantees.
2. Convergence is also fast.
Experimental Results (selected)
1. Protease cleavage experiment
2. Document classification on the Reuters corpus
Saeid Motiian, Quinn Jones, Seyed Mehdi Iranmanesh, Gianfranco Doretto (West Virginia University)
“Few-Shot Adversarial Domain Adaptation” (NIPS2017)
https://arxiv.org/abs/1711.02536
Overview
Proposes a domain adaptation method for the case where only a few target-domain samples are available, but they are labeled. The key ingredients are class-aware pairing (Fig. 1) and the loss built on those pairs (Eq. 4); a sketch of the pairing follows below.
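A minimal sketch of how I read the class-aware pairing of Fig. 1 (the group names and helper below are my own, not the paper's notation): sample pairs are bucketed by whether they share a domain and whether they share a class, and a pair discriminator is then trained on, and adversarially confused over, these groups.

```python
import random

def build_pair_groups(source, target, n_pairs=1000):
    """Bucket random sample pairs by (same/different domain, same/different class).

    source, target: lists of (x, y) examples; in the few-shot setting the target
    list holds only a handful of labeled examples per class.
    Group ids (illustrative naming):
      'ss_same': both source, same class      'ss_diff': both source, different class
      'st_same': source+target, same class    'st_diff': source+target, different class
    """
    pairs = []
    for _ in range(n_pairs):
        xa, ya = random.choice(source)
        if random.random() < 0.5:          # source-source pair
            xb, yb = random.choice(source)
            gid = 'ss_same' if ya == yb else 'ss_diff'
        else:                              # source-target pair
            xb, yb = random.choice(target)
            gid = 'st_same' if ya == yb else 'st_diff'
        pairs.append(((xa, xb), gid))
    return pairs
```

The discriminator tries to tell the source-source groups from the corresponding source-target groups, while the encoder is updated to make them indistinguishable, so the domains are aligned without mixing up classes.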
阿久澤
Yuxuan Wang, Daisy Stanton, Yu Zhang, RJ Skerry-Ryan, Eric Battenberg,
Joel Shor, Ying Xiao, Fei Ren, Ye Jia, Rif A. Saurous (Google)
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Overview
Speech synthesis suffers from the problem that a single piece of text can be read aloud in many different ways (prosody).
The authors extend the Tacotron synthesis system so that a single text can be rendered with a variety of prosodic styles.
Key Point of the Proposed Method
Difference from Previous Work
Naive seq2seq speech-synthesis models have no mechanism at all for controlling prosody.
Unlike prior work that learns prosody representations by clustering and then feeds them to the synthesis model as conditioning, the proposed method can be trained end-to-end.
Main Insights / Results
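Since the "key point" panel of the original slide is a figure, here is a minimal sketch of the global style token (GST) idea as I understand it: a reference encoder summarizes a reference utterance, attention over a small bank of learned token embeddings yields a style embedding, and that embedding conditions the Tacotron encoder. The single attention head, dimensions, and names below are simplifying assumptions.

```python
import torch
import torch.nn as nn

class StyleTokenLayer(nn.Module):
    """Attend over a learned bank of style tokens to produce a style embedding."""

    def __init__(self, ref_dim=128, num_tokens=10, token_dim=256):
        super().__init__()
        # Learned "global style tokens": no style labels, trained end-to-end.
        self.tokens = nn.Parameter(torch.randn(num_tokens, token_dim))
        self.query_proj = nn.Linear(ref_dim, token_dim)

    def forward(self, ref_embedding):
        # ref_embedding: (batch, ref_dim) summary of a reference utterance.
        query = self.query_proj(ref_embedding)               # (batch, token_dim)
        scores = query @ self.tokens.t()                     # (batch, num_tokens)
        weights = torch.softmax(scores, dim=-1)              # attention over tokens
        style_embedding = weights @ torch.tanh(self.tokens)  # (batch, token_dim)
        return style_embedding, weights

# At inference, prosody can be controlled either by conditioning on a reference
# clip or by feeding hand-picked token weights directly (style transfer/control).
```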
鹿山
Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, Ronald M. Summers
ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification
and Localization of Common Thorax Diseases (CVPR 2017, arXiv 2017.12)
Overview
Builds a large-scale chest X-ray image database and, using that data, proposes baseline CNN models for classifying diseases from the images.
Key Point of the Proposed Method
Difference from Previous Work
Created labels for 8 disease categories on 112,120 chest X-ray images from 32,717 patients (the label set was later expanded to 14 categories).
- Previously available collections had at most around 7,400 images, and without disease labels.
- Natural language processing is applied to the radiology reports attached to the images to estimate the eight common disease labels.
Comprehensive classification-accuracy baselines are validated with existing image-classification models: AlexNet, GoogLeNet, VGGNet, and ResNet.
Main Insights / Results
To handle the label imbalance in the data, re-weighting the loss according to the positive counts improves model accuracy (see the sketch below).
By combining the DNorm algorithm, which normalizes disease-name mentions, with the MetaMap algorithm, which extracts the biomedical context expressed by a sentence, and by handling the characteristic negation patterns found in reports, the text mining achieves roughly 90% precision, recall, and F1 on the 8-category disease labeling.
→ This makes automatic label generation at scale possible.
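A minimal sketch of the positive/negative re-weighting mentioned above, assuming per-batch balancing coefficients of the usual (|P|+|N|)/|P| and (|P|+|N|)/|N| form; the exact scheme in the paper may differ in detail.

```python
import torch

def weighted_multilabel_bce(logits, targets, eps=1e-6):
    """Multi-label BCE with positive/negative balancing computed per batch.

    logits, targets: tensors of shape (batch, num_classes), targets in {0, 1}.
    Positive findings are rare, so the positive term is up-weighted to keep it
    from being swamped by the many negative labels.
    """
    probs = torch.sigmoid(logits)
    n_pos = targets.sum()
    n_neg = targets.numel() - n_pos
    beta_pos = (n_pos + n_neg) / (n_pos + eps)   # up-weight rare positives
    beta_neg = (n_pos + n_neg) / (n_neg + eps)
    loss_pos = -(targets * torch.log(probs + eps)).sum()
    loss_neg = -((1 - targets) * torch.log(1 - probs + eps)).sum()
    return (beta_pos * loss_pos + beta_neg * loss_neg) / targets.numel()
```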
Qingji Guan, Yaping Huang, Zhun Zhong, Zhedong Zheng, Liang Zheng and Yi Yang
Diagnose like a Radiologist: Attention Guided Convolutional Neural Network
for Thorax Disease Classification (arXiv 2018.1)
Overview
For classifying diseases from chest X-ray images, proposes a CNN that fuses information from the whole image with information from a local region.
Currently state of the art on the ChestX-ray14 [Wang 2017] dataset.
Key Point of the Proposed Method
Difference from Previous Work
The classification problem is solved using not only the whole image but also a specific local crop extracted with an attention mechanism (a sketch of the cropping step follows below).
- Lesions often occupy only a small part of a chest X-ray.
- Depending on acquisition, images can be distorted or have misaligned borders.
Learning to extract the attended region requires no ground-truth bounding boxes.
Main Insights / Results
Local x Global (DenseNet-121) achieves state of the art.
Training in the order Global -> Local -> Fusion, freezing the earlier weights at each stage, gave the best accuracy.
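A minimal sketch of the attention-guided cropping step as I understand it (the threshold value, channel pooling, and function names are my own assumptions): the global branch's last feature maps are collapsed into a heat map, thresholded into a mask, and the tightest bounding box around the mask defines the local crop fed to the local branch.

```python
import numpy as np

def attention_guided_crop(image, feature_maps, threshold=0.7):
    """Crop the region highlighted by the global branch's feature maps.

    image:        (H, W) or (H, W, C) array, the input chest X-ray.
    feature_maps: (C_f, h, w) array, last conv features of the global branch.
    Returns the local patch to be fed to the local branch.
    """
    # Collapse channels into a single attention heat map and normalize it.
    heat = np.max(np.abs(feature_maps), axis=0)
    heat = (heat - heat.min()) / (heat.max() - heat.min() + 1e-8)

    # Binarize and take the tightest box containing all above-threshold cells.
    mask = heat >= threshold
    ys, xs = np.where(mask)
    if len(ys) == 0:                      # nothing above threshold: keep the whole image
        return image
    h, w = heat.shape
    H, W = image.shape[:2]
    y0, y1 = int(ys.min() * H / h), int((ys.max() + 1) * H / h)
    x0, x1 = int(xs.min() * W / w), int((xs.max() + 1) * W / w)
    return image[y0:y1, x0:x1]
```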
松嶋
Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra (Facebook Research)
https://arxiv.org/abs/1711.11543
“Embodied Question Answering” (arXiv, 2017)
Overview
Proposes Embodied Question Answering (EmbodiedQA), a new QA task set in 3D environments.
The simulator is released on GitHub.
https://github.com/facebookresearch/house3d
Key Point of the Proposed Method
Differences from existing QA tasks:
1) The state is given as a first-person view.
2) Answering the question requires taking actions.
As a (preliminary) experiment, results are reported for hierarchical RL consisting of a planner and a controller.
- The navigation and QA modules are first trained separately (supervised or imitation learning) and then combined and trained jointly.
Main Insights
The design philosophy of the task:
"The long-term goal is to build intelligent agents that can perceive their environment, communicate, and act."
- Active perception is required.
- "Common-sense" reasoning is required.
  e.g., asked about the car, so head for the garage.
- Grounding is required (mapping language to actions and the world).
Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra (Facebook Research)
https://arxiv.org/abs/1711.11543
“Embodied Question Answering” (arXiv, 2017)
Overview
This paper proposes the Embodied Question Answering
(EmbodiedQA) task, set in 3D environments.
The simulator is available on GitHub.
https://github.com/facebookresearch/house3d
Key Point of Proposed Method
Differences from existing QA tasks:
1) The state is presented as a first-person view.
2) The agent needs to act in order to answer correctly.
In the experiments, they use hierarchical RL consisting of a
planner and a controller (a sketch of this loop follows below).
- The navigation and QA modules are trained separately first,
then joined and trained together.
Main Insights
Design philosophy of the task:
"The long-term objective is to build intelligent agents that
can perceive, communicate, and act."
- Requires active perception.
- Requires "common-sense" reasoning.
  e.g., if asked about the car, the agent should head for the garage.
- Requires grounding symbols in the real world.
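A minimal, heavily simplified sketch of the planner/controller split described above (the interface and names are my own assumptions, not the paper's implementation): the planner picks a primitive such as 'forward' or 'turn_left', the controller keeps executing that primitive frame by frame until it hands control back, and the final view feeds a VQA-style answering module.

```python
def run_episode(env, planner, controller, vqa, question, max_steps=200):
    """Hierarchical navigation loop: the planner picks primitives, the controller executes them."""
    obs = env.reset(question)
    for _ in range(max_steps):
        primitive = planner.act(obs, question)    # e.g. 'forward', 'turn_left', 'stop'
        if primitive == 'stop':
            break
        while True:
            obs = env.step(primitive)             # execute the current primitive once
            if controller.done(obs, primitive):   # controller decides when to stop repeating
                break
    return vqa.answer(obs, question)              # answer from the final first-person view
```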
David Ha, Jürgen Schmidhuber
https://arxiv.org/abs/1803.10122
“World Models” (arXiv, 2018)
Overview
In reinforcement learning, the model of the environment and the agent's controller are learned separately.
- The environment dynamics are modeled with a VAE and a mixture-of-Gaussians (mixture-density) RNN.
- The controller part trained by RL can therefore be kept simple.
Because a model of the environment is learned, policies can be trained without the real environment (a "hallucinated dream"), and the learned policy can then be transferred back to the real environment.
Key Point of the Proposed Method
Splitting the system into an environment model and a controller keeps the controller simple.
- The VAE compresses the input into a low-dimensional code.
- The distribution of the latent representation z is predicted by the mixture-of-Gaussians RNN.
- The controller can then be a simple model (a linear model on the concatenation of z and h).
Difference from Previous Work
Large RNNs have high capacity for sequential data.
But in RL there is a credit-assignment problem, so comparatively small networks have been used.
- Smaller models tend to find good policies faster.
In the proposed method, separating the environment model from the controller makes it possible to use large, high-capacity networks for the model.
Main Results
- First to exceed the required (solving) score on CarRacing-v0.
- The task can be carried out using only the learned environment model.
David Ha, Jürgen Schmidhuber
https://arxiv.org/abs/1803.10122
“World Models” (arXiv, 2018)
Overview
This paper proposes to learn dynamics of environment
and control of agent separately in RL settings.
- model dynamics of environment using VAE and
mixture gaussian RNN
- We can make controller simpler (with fewer
parameters)
By learning model of environment, the agent can learn
policies without interacting real environment
(hallucinated dream), then even transfer into real
settings.
Key Point of Proposed Method
Making the controller simpler by dividing models into
“World Model” with a RNN, and controller with small
number of parameters
- dimension reduction with VAE
- predict latent representation z using Gaussian
Mixture RNN
- simple controller with linear model
Difference from Previous Work
Large RNNs have high capacity, but in the RL setting there is
a credit-assignment problem, so existing methods tended to
use smaller networks.
In the proposed method, the environment model and the
controller are separated, so large RNNs can be used for the
model.
Main Insights
- First model to achieve the required score on the
CarRacing-v0 task.
- The task can be solved using only the learned environment
model.
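A minimal sketch of the V-M-C rollout described above. The `vae` and `mdn_rnn` interfaces are my assumptions; the linear controller form a = W_c [z, h] + b_c follows the paper, where its few parameters are optimized with an evolution strategy (CMA-ES) on cumulative reward, which is feasible precisely because C is so small.

```python
import numpy as np

class LinearController:
    """C: maps the concatenated [z, h] to an action with a single linear layer."""

    def __init__(self, z_dim, h_dim, action_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(action_dim, z_dim + h_dim))
        self.b = np.zeros(action_dim)

    def act(self, z, h):
        return np.tanh(self.W @ np.concatenate([z, h]) + self.b)

def rollout(env, vae, mdn_rnn, controller, max_steps=1000):
    """One episode: V encodes the frame, C acts on [z, h], M tracks latent dynamics."""
    obs = env.reset()
    h = mdn_rnn.initial_state()
    total_reward = 0.0
    for _ in range(max_steps):
        z = vae.encode(obs)                    # V: compress the observation
        action = controller.act(z, h)          # C: tiny linear policy on [z, h]
        obs, reward, done = env.step(action)   # assumed env interface
        h = mdn_rnn.step(z, action, h)         # M: update the recurrent world model
        total_reward += reward
        if done:
            break
    return total_reward
```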