This document summarizes a research paper on scaling laws for neural language models. Some key findings of the paper include:
- Language model performance depends strongly on model scale and only weakly on model shape. With enough compute and data, performance scales as a power law in the number of parameters, the amount of compute, and the dataset size (see the sketch after this list).
- Overfitting is universal, with a penalty that depends predictably on the ratio of parameters to data.
- Larger models are more sample-efficient, reaching the same performance with fewer optimization steps and fewer data points.
- The paper motivated subsequent work by OpenAI on applying scaling laws to other domains such as computer vision, and on developing increasingly large language models such as GPT-3.
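To make the power-law claim concrete: for parameter count N, the paper fits loss curves of the form L(N) = (N_c / N)^alpha_N. Below is a minimal Python sketch of that functional form and of recovering the exponent from measurements; the constants are the rough values reported for the paper's L(N) fit, and the "measurements" here are synthetic, not real data.

```python
import numpy as np

def power_law_loss(n_params, n_c=8.8e13, alpha=0.076):
    """Loss as a power law of parameter count: L(N) = (N_c / N) ** alpha."""
    return (n_c / n_params) ** alpha

# Synthetic losses at several model sizes (illustrative only).
sizes = np.array([1e6, 1e7, 1e8, 1e9, 1e10])
losses = power_law_loss(sizes)

# In log-log space a power law is a straight line, so a linear fit
# recovers the exponent.
slope, intercept = np.polyfit(np.log(sizes), np.log(losses), 1)
print(f"recovered exponent alpha = {-slope:.3f}")
```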
Reinforcement Learning Study Group, Paper Introduction (Session 50): Optimal Asset Allocation using Adaptive Dynamic Programming (Naoki Nishimura)
Optimal Asset Allocation using Adaptive Dynamic Programming
Neuneier, Ralph. In Advances in Neural Information Processing Systems, 1996.
Enhancing Q-Learning for Optimal Asset Allocation
Neuneier, Ralph. In Advances in Neural Information Processing Systems, 1998.
This document contains notes from a machine learning discussion. It includes:
1. An introduction to BakFoo Inc. CEO Yuta Kashino's background in astrophysics, Python, and realtime data platforms.
2. References to papers and researchers in Bayesian deep learning and probabilistic programming, including Edward library creators Dustin Tran and Blei Lab.
3. An overview of how Edward combines TensorFlow for deep learning with probabilistic programming to support Bayesian modeling, inference via variational inference (VI) and MCMC, and model criticism (an example follows).
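To make the Edward workflow concrete, here is a minimal sketch of Bayesian linear regression with variational inference, assuming Edward 1.x on TensorFlow 1.x; the dataset shapes and synthetic data are illustrative assumptions, not code from the talk.

```python
import edward as ed
import numpy as np
import tensorflow as tf
from edward.models import Normal

N, D = 100, 5  # illustrative dataset size and feature count
X_train = np.random.randn(N, D).astype(np.float32)
y_train = X_train.dot(np.ones(D)).astype(np.float32)

# Model: Bayesian linear regression with Gaussian priors on the weights.
X = tf.placeholder(tf.float32, [N, D])
w = Normal(loc=tf.zeros(D), scale=tf.ones(D))
b = Normal(loc=tf.zeros(1), scale=tf.ones(1))
y = Normal(loc=ed.dot(X, w) + b, scale=tf.ones(N))

# Variational family: fully factorized Gaussians.
qw = Normal(loc=tf.get_variable("qw/loc", [D]),
            scale=tf.nn.softplus(tf.get_variable("qw/scale", [D])))
qb = Normal(loc=tf.get_variable("qb/loc", [1]),
            scale=tf.nn.softplus(tf.get_variable("qb/scale", [1])))

# Inference: minimize KL(q || p) with Edward's KLqp.
inference = ed.KLqp({w: qw, b: qb}, data={X: X_train, y: y_train})
inference.run(n_iter=500)
```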
Tensor Decomposition and its Applications (Keisuke OTAKI)
This document discusses tensor factorizations and decompositions and their applications in data mining. It introduces tensors as multi-dimensional arrays and covers 2nd-order tensors (matrices) and 3rd-order tensors. It describes how decompositions such as the Tucker model and the CANDECOMP/PARAFAC (CP) model factor a tensor into a small core and factor matrices whose components help interpret the data. It also discusses singular value decomposition (SVD) as a way to decompose a matrix and reduce its dimensionality while approximating the original matrix.
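As a concrete instance of the SVD-based dimensionality reduction mentioned above, here is a minimal numpy sketch of a rank-k approximation; the matrix and the rank are arbitrary illustrative choices.

```python
import numpy as np

def rank_k_approximation(A, k):
    """Best rank-k approximation of A (in Frobenius norm) via truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return U[:, :k] * s[:k] @ Vt[:k, :]

A = np.random.randn(50, 30)
A5 = rank_k_approximation(A, 5)
print("relative error:", np.linalg.norm(A - A5) / np.linalg.norm(A))
```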
Convex Optimization Modelling with CVXOPT (andrewmart11)
An introduction to convex optimization modelling with cvxopt in an IPython environment, using the facility location problem as a worked example.
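As a taste of the cvxopt modelling style (not the facility location model from the slides, which needs more machinery), here is a small linear program with made-up coefficients:

```python
from cvxopt import matrix, solvers

# minimize    2*x1 + x2
# subject to  x1 >= 0, x2 >= 0, x1 + x2 >= 1
# cvxopt expects G x <= h; matrix() takes columns as inner lists.
c = matrix([2.0, 1.0])
G = matrix([[-1.0, 0.0, -1.0],   # coefficients of x1 in the three rows
            [0.0, -1.0, -1.0]])  # coefficients of x2
h = matrix([0.0, 0.0, -1.0])

sol = solvers.lp(c, G, h)
print(sol['x'])  # optimal point, approximately (0, 1)
```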
13. Activity Forecasting
Learns the routes people prefer from observed human trajectories.
✔ Given a specified destination, the model can estimate which route a person will take.
✔ Because it estimates the value of scene attributes (grass, sidewalk, and so on), it can also be applied to new scenes (see the reward sketch below).
Kris Kitani, Brian D. Ziebart, J. Andrew Bagnell, and Martial Hebert, "Activity Forecasting," European Conference on Computer Vision (ECCV), October 2012.
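Scene transfer works because the reward is a weighted sum of per-location scene attributes rather than a lookup tied to one map. A minimal sketch of that representation follows, with invented attribute names and weights (the real features and values are learned by the paper's IRL procedure):

```python
import numpy as np

# Hypothetical semantic attributes for a grid cell (scores in [0, 1]).
ATTRIBUTES = ["grass", "sidewalk", "road", "obstacle"]

def reward(phi, theta):
    """Linear reward r(s) = theta . phi(s) over scene attributes.

    theta weights attributes, not map locations, so the same theta
    can score routes in a completely different scene.
    """
    return float(np.dot(theta, phi))

theta = np.array([-0.5, 1.0, 0.2, -5.0])  # invented: prefers sidewalks
phi = np.array([0.0, 1.0, 0.0, 0.0])      # a pure sidewalk cell
print(reward(phi, theta))                 # 1.0
```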
16. Paper Overview
Title: Maximum Entropy Deep Inverse Reinforcement Learning
Authors: Markus Wulfmeier, Peter Ondruska, Ingmar Posner
✔ Extends Maximum Entropy IRL, one of the standard approaches to IRL
✔ Uses a neural network to approximate complex, nonlinear reward functions (a simplified gradient sketch follows)
✔ In simple experiments, it matched or exceeded the accuracy of the then state-of-the-art method (GPIRL), and ran faster
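The key observation is that the gradient of the MaxEnt IRL objective with respect to the per-state reward is simply the demonstrations' visitation frequencies minus the expected visitations under the current reward, so it can be backpropagated through a network. A heavily simplified numpy sketch, where a linear map stands in for the network and a softmax stub replaces the paper's soft-value-iteration forward pass:

```python
import numpy as np

def expected_visitations(rewards):
    """Stub: the paper computes this by soft value iteration under the
    current reward; a softmax over states stands in for it here."""
    e = np.exp(rewards - rewards.max())
    return e / e.sum()

def deep_maxent_irl_step(features, mu_demo, W, lr=0.1):
    """One gradient step of deep MaxEnt IRL (simplified sketch).

    features: (n_states, n_features) state feature matrix
    mu_demo:  (n_states,) empirical visitation frequencies from demos
    W:        (n_features,) weights of a linear 'network' r(s) = W . phi(s)
    """
    r = features @ W                       # forward pass: per-state rewards
    mu_expected = expected_visitations(r)  # forward RL pass (stubbed)
    grad_r = mu_demo - mu_expected         # dL/dr from the MaxEnt objective
    W += lr * (features.T @ grad_r)        # backprop through the 'network'
    return W

n_states, n_features = 8, 3
features = np.random.rand(n_states, n_features)
mu_demo = np.ones(n_states) / n_states
W = np.zeros(n_features)
for _ in range(100):
    W = deep_maxent_irl_step(features, mu_demo, W)
```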