Glm talk Tomas


The document discusses a distributed implementation of generalized linear models (GLMs) on the H2O platform. GLMs generalize linear regression by adding a link function that relates the response to the linear predictor and by allowing the noise variance to depend on the mean. The H2O implementation fits GLMs by iteratively reweighted least squares (IRLSM): the outer loop is a map-reduce task that sums each chunk's contribution to the weighted Gram matrix, and the inner loop solves the resulting penalized least squares problem with an alternating direction method of multipliers (ADMM) solver. Regularization is added through elastic net penalties to avoid overfitting and obtain sparse solutions.


Distributed GLM Implementation
on H2O platform
Tomas Nykodym, 0xDATA

Linear Regression
Data:
x, y + noise
Goal:
predict y using x
i.e. find a,b s.t.
y = a*x + b

Linear Regression
Least Squares Fit
Real Relation:
y=3x+10+N(0,20)
Best Fit:
y = 3.08*x + 6
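The least squares fit on this slide can be sketched with the closed-form slope and intercept formulas. This is a minimal illustration, not code from the talk: the class name and the sample points are assumptions, and the points are noise-free so the fit recovers the true line exactly.

```java
// Minimal sketch: closed-form least squares for y = a*x + b.
// The class name and the data are illustrative assumptions.
final class SimpleLeastSquares {
    /** Returns {a, b} minimizing sum_i (y_i - a*x_i - b)^2. */
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) { meanX += x[i]; meanY += y[i]; }
        meanX /= n;
        meanY /= n;
        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double a = sxy / sxx;          // slope
        double b = meanY - a * meanX;  // intercept
        return new double[] {a, b};
    }

    public static void main(String[] args) {
        // Noise-free points on y = 3x + 10.
        double[] x = {0, 1, 2, 3, 4};
        double[] y = {10, 13, 16, 19, 22};
        double[] ab = fit(x, y);
        System.out.println("a = " + ab[0] + ", b = " + ab[1]);
    }
}
```

With noisy data, as on the slide, the recovered a and b deviate from the true values, which is why the best fit there differs from the real relation.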

Prostate Cancer Example
Data:
x = PSA
(prostate-specific antigen)
y = CAPSULE
0 = no tumour
1 = tumour
Goal:
predict y using x

Prostate Cancer Example
Linear Regression Fit
Data:
x = PSA
(prostate-specific antigen)
y = CAPSULE
0 = no tumour
1 = tumour
Fit:
Least squares fit

Generalized Linear Model
Generalizes linear regression by:
– adding a link function g to transform the output
z = g(y) – new response variable
– noise (i.e. variance) does not have to be constant
– fit is maximum likelihood instead of least squares

Prostate Cancer
Logistic Regression Fit
Data:
x = PSA
(prostate-specific antigen)
y = CAPSULE
0 = no tumour
1 = tumour
GLM Fit:
– Binomial family
– Logit link
– Predict probability
of CAPSULE=1.
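How the logit link turns a linear predictor into a probability of CAPSULE=1 can be sketched as follows. The class name and the coefficient values are hypothetical; they are not fitted to the prostate data.

```java
// Minimal sketch of the logit link and the probability it predicts.
// Class name and coefficients are illustrative assumptions.
final class LogitLink {
    /** Link: eta = g(mu) = log(mu / (1 - mu)). */
    static double link(double mu) {
        return Math.log(mu / (1.0 - mu));
    }

    /** Inverse link: mu = 1 / (1 + exp(-eta)). */
    static double linkInv(double eta) {
        return 1.0 / (1.0 + Math.exp(-eta));
    }

    /** P(CAPSULE = 1 | psa) for a given intercept and slope. */
    static double predict(double intercept, double slope, double psa) {
        return linkInv(intercept + slope * psa);
    }

    public static void main(String[] args) {
        double intercept = -1.0, slope = 0.05; // hypothetical, not fitted
        for (double psa : new double[] {1, 10, 50}) {
            System.out.println("PSA " + psa + " -> " + predict(intercept, slope, psa));
        }
    }
}
```

Unlike the straight line of the least squares fit, the prediction always stays between 0 and 1.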

Implementation - Solve GLM by IRLSM
Input:
– X: data matrix N*P
– Y: response vector (N rows)
– family, link function, α, β
OUTER LOOP:
While β changes, compute:
$z_{k+1} = X\beta_k + (y - \mu_k)\,\frac{d\eta}{d\mu}$
$W_{k+1}^{-1} = \left(\frac{d\eta}{d\mu}\right)^{2} \mathrm{Var}(\mu_k)$
$XX = X^T W_{k+1} X$
$Xz = X^T W_{k+1} z_{k+1}$
INNER LOOP:
Solve elastic net:
ADMM (Boyd 2010, page 43):
$\gamma^{l+1} = (X^T W X + \rho I)^{-1}\left(X^T W z + \rho(\beta^{l} - u^{l})\right)$
$\beta^{l+1} = S_{\lambda/\rho}(\gamma^{l+1} + u^{l})$
$u^{l+1} = u^{l} + \gamma^{l+1} - \beta^{l+1}$
Output:
– β: vector of coefficients, solution to the maximum-likelihood problem
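The outer loop can be sketched for the simplest case: binomial family, logit link, an intercept plus one covariate, and no penalty, so the inner problem reduces to a 2x2 weighted least squares solve instead of ADMM. All names and the data are assumptions made for illustration, and the sketch expects data that are not perfectly separable.

```java
// Minimal IRLS sketch for logistic regression with one covariate.
// Simplifications (assumptions): no regularization, no ADMM, 2x2 solve by Cramer's rule.
final class IrlsLogistic {
    /** Fits P(y=1|x) = 1/(1+exp(-(b0 + b1*x))); returns {b0, b1}. */
    static double[] fit(double[] x, double[] y, int maxIter, double tol) {
        double b0 = 0, b1 = 0;
        for (int k = 0; k < maxIter; k++) {
            // Accumulate XX = X^T W X (symmetric 2x2) and Xz = X^T W z.
            double s00 = 0, s01 = 0, s11 = 0, t0 = 0, t1 = 0;
            for (int i = 0; i < x.length; i++) {
                double eta = b0 + b1 * x[i];
                double mu = 1.0 / (1.0 + Math.exp(-eta));
                double w = mu * (1.0 - mu);        // weight for logit link
                double z = eta + (y[i] - mu) / w;  // working response
                s00 += w;
                s01 += w * x[i];
                s11 += w * x[i] * x[i];
                t0 += w * z;
                t1 += w * x[i] * z;
            }
            double det = s00 * s11 - s01 * s01;
            double n0 = (s11 * t0 - s01 * t1) / det;
            double n1 = (s00 * t1 - s01 * t0) / det;
            double change = Math.abs(n0 - b0) + Math.abs(n1 - b1);
            b0 = n0;
            b1 = n1;
            if (change < tol) break; // "while beta changes"
        }
        return new double[] {b0, b1};
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5, 6};
        double[] y = {0, 0, 1, 0, 1, 1};
        double[] b = fit(x, y, 50, 1e-10);
        System.out.println("b0 = " + b[0] + ", b1 = " + b[1]);
    }
}
```

At convergence the score equations hold, i.e. the residuals y − μ sum to zero both unweighted and weighted by x, which is a convenient check on the result.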

H2O Implementation
Outer Loop: (Map Reduce Task)

    // Slide excerpt: field declarations, the construction of xyPrime,
    // the convergence check and the return statement are not shown.
    public class SimpleGLM extends MRTask {
        // Accumulates this chunk's contribution to X^T W X.
        @Override public void map(Chunk c) {
            res = new double[p][p];
            for (double[] x : c.rows()) {
                double eta, mu, var;
                eta = computeEta(x);
                mu = _link.linkInv(eta);
                var = Math.max(1e-5, _family.variance(mu)); // keep the variance away from zero
                double dp = _link.linkInvDeriv(eta);
                double w = dp * dp / var; // IRLS weight
                for (int i = 0; i < x.length; ++i)
                    for (int j = 0; j < x.length; ++j)
                        res[i][j] += x[i] * x[j] * w;
            }
        }
        // Sums the partial matrices computed on different chunks.
        @Override public void reduce(SimpleGLM g) {
            for (int i = 0; i < res.length; ++i)
                for (int j = 0; j < res.length; ++j)
                    res[i][j] += g.res[i][j];
        }
    }

Inner Loop: (ADMM solver)

    public double[] solve(Matrix xx, Matrix xy) {
        // ADMM LSM Solve
        CholeskyDecomposition lu; // cache decomp!
        lu = new CholeskyDecomposition(xx);
        for (int i = 0; i < 1000; ++i) {
            // Solve using cached Cholesky decomposition!
            xm = lu.solve(xyPrime);
            // compute u and z update
            for (int j = 0; j < N - 1; ++j) {
                double x_hat = xm.get(j, 0);
                x_norm += x_hat * x_hat;
                double zold = z[j];
                z[j] = shrinkage(x_hat + u[j], kappa);
                u[j] += x_hat - z[j];
                u_norm += u[j] * u[j];
            }
        }
    }

    // Soft-thresholding operator S_k(x).
    double shrinkage(double x, double k) {
        return Math.max(0, x - k) - Math.max(0, -x - k);
    }
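The ADMM updates are easiest to follow in one dimension, where the Gram matrix is a single number a and the right-hand side a single number b. The sketch below is not the H2O solver: the class name, the scalar setting, and the objective scaling 0.5*a*β² − b*β + λ|β| are assumptions chosen so that the result can be checked by hand.

```java
// Scalar ADMM sketch for 0.5*a*beta^2 - b*beta + lambda*|beta|, with a > 0.
// Illustrative only; the objective scaling is an assumption.
final class ScalarAdmmLasso {
    /** Soft-thresholding operator S_k(x), as in the slide's shrinkage(). */
    static double shrinkage(double x, double k) {
        return Math.max(0, x - k) - Math.max(0, -x - k);
    }

    static double solve(double a, double b, double lambda, double rho, int iters) {
        double beta = 0, u = 0;
        for (int l = 0; l < iters; l++) {
            // gamma-update: ridge-like solve, scalar analogue of (XX + rho*I)^-1 (Xz + rho*(beta - u))
            double gamma = (b + rho * (beta - u)) / (a + rho);
            // beta-update: soft-thresholding with threshold lambda / rho
            beta = shrinkage(gamma + u, lambda / rho);
            // dual update
            u += gamma - beta;
        }
        return beta;
    }

    public static void main(String[] args) {
        System.out.println(solve(2.0, 3.0, 1.0, 1.0, 1000)); // |b| > lambda: nonzero solution
        System.out.println(solve(2.0, 0.5, 1.0, 1.0, 1000)); // |b| <= lambda: shrunk to zero
    }
}
```

For this scalar problem the minimizer is S_λ(b)/a, so the iterates can be compared against (3 − 1)/2 = 1 in the first call and 0 in the second. In the matrix case the division by a + ρ becomes the cached Cholesky solve.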

Regularization
Elastic Net (Zou, Hastie, 2005):
● Added L1 and L2 penalty to β to:
– avoid overfitting, reduce variance
– obtain sparse solution (L1 penalty)
– avoid problems with correlated covariates
No longer has an analytical solution.
Options: LARS, ADMM, Generalized Gradient, ...
$\beta = \operatorname{argmin}_{\beta}\; (X\beta - y)^T (X\beta - y) + \alpha \|\beta\|_1 + (1 - \alpha) \|\beta\|_2^2$
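To make the penalized objective concrete, the sketch below evaluates it for a given β exactly as written on the slide, with α weighting the L1 term and 1 − α weighting the squared L2 term. The class and method names are assumptions, and any overall penalty strength is left out because the slide's formula does not show one.

```java
// Evaluates the elastic net objective as written on the slide (illustrative names).
final class ElasticNetObjective {
    static double value(double[][] x, double[] y, double[] beta, double alpha) {
        double rss = 0;
        for (int i = 0; i < x.length; i++) {
            double fit = 0;
            for (int j = 0; j < beta.length; j++) fit += x[i][j] * beta[j];
            double r = fit - y[i];
            rss += r * r; // (X beta - y)^T (X beta - y)
        }
        double l1 = 0, l2sq = 0;
        for (double b : beta) {
            l1 += Math.abs(b); // ||beta||_1
            l2sq += b * b;     // ||beta||_2^2
        }
        return rss + alpha * l1 + (1 - alpha) * l2sq;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 0}, {0, 1}};
        double[] y = {1, 2};
        double[] beta = {1, -1};
        // Residuals are 0 and -3, so rss = 9; l1 = 2; l2sq = 2.
        System.out.println(value(x, y, beta, 0.5));
    }
}
```

The absolute value in the L1 term is not differentiable at zero, which is why the closed-form least squares solution no longer applies and an iterative solver is needed.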

Linear Regression
Least Squares Method
Find β by minimizing the sum of squared errors:
$\beta = \operatorname{argmin}_{\beta}\; (X\beta - y)^T (X\beta - y)$
Analytical solution:
$\beta = (X^T X)^{-1} X^T y = \left(\frac{1}{n} \sum_i x_i x_i^T\right)^{-1} \frac{1}{n} \sum_i x_i y_i$
Easily parallelized if $X^T X$ is reasonably small.
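The parallelization works because both sums decompose over rows: each chunk of data can compute its own partial sums, and the partial results are simply added. A minimal plain-Java sketch of this idea follows; it does not use the H2O API, and the class name, method names, and data are assumptions. The 1/n factors cancel, so they are omitted.

```java
// Chunk-wise accumulation of X^T X and X^T y, then a direct solve (illustrative names).
final class GramTask {
    final double[][] xtx;
    final double[] xty;

    GramTask(int p) {
        xtx = new double[p][p];
        xty = new double[p];
    }

    /** "Map": add one chunk's sum of x_i x_i^T and sum of x_i y_i. */
    GramTask map(double[][] rows, double[] ys) {
        for (int r = 0; r < rows.length; r++) {
            double[] x = rows[r];
            for (int i = 0; i < x.length; i++) {
                xty[i] += x[i] * ys[r];
                for (int j = 0; j < x.length; j++) xtx[i][j] += x[i] * x[j];
            }
        }
        return this;
    }

    /** "Reduce": merge the partial sums of another chunk. */
    GramTask reduce(GramTask other) {
        for (int i = 0; i < xty.length; i++) {
            xty[i] += other.xty[i];
            for (int j = 0; j < xty.length; j++) xtx[i][j] += other.xtx[i][j];
        }
        return this;
    }

    /** Solves (X^T X) beta = X^T y by Gaussian elimination with partial pivoting. */
    double[] solve() {
        int p = xty.length;
        double[][] a = new double[p][p + 1];
        for (int i = 0; i < p; i++) {
            System.arraycopy(xtx[i], 0, a[i], 0, p);
            a[i][p] = xty[i];
        }
        for (int c = 0; c < p; c++) {
            int piv = c;
            for (int r = c + 1; r < p; r++)
                if (Math.abs(a[r][c]) > Math.abs(a[piv][c])) piv = r;
            double[] tmp = a[c]; a[c] = a[piv]; a[piv] = tmp;
            for (int r = c + 1; r < p; r++) {
                double f = a[r][c] / a[c][c];
                for (int k = c; k <= p; k++) a[r][k] -= f * a[c][k];
            }
        }
        double[] beta = new double[p];
        for (int i = p - 1; i >= 0; i--) {
            double s = a[i][p];
            for (int k = i + 1; k < p; k++) s -= a[i][k] * beta[k];
            beta[i] = s / a[i][i];
        }
        return beta;
    }

    public static void main(String[] args) {
        // Two chunks of noise-free rows (1, x) with y = 10 + 3x.
        GramTask t1 = new GramTask(2).map(new double[][] {{1, 0}, {1, 1}, {1, 2}}, new double[] {10, 13, 16});
        GramTask t2 = new GramTask(2).map(new double[][] {{1, 3}, {1, 4}}, new double[] {19, 22});
        double[] beta = t1.reduce(t2).solve();
        System.out.println("intercept = " + beta[0] + ", slope = " + beta[1]);
    }
}
```

Only the P×P matrix and the length-P vector travel between nodes, never the N rows themselves, which is why the approach depends on $X^T X$ being reasonably small.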

Generalized Linear Model
● Generalizes linear regression by:
– adding a link function g to transform the response
z = g(y) – new response variable
η = Xβ – linear predictor
μ = g^{-1}(η)
– y has a distribution in the exponential family
– variance depends on μ
e.g. Var(μ) = μ(1−μ) for the Binomial family.
– fit by maximizing the likelihood of the model
