R Packages For Machine Learning
Random Forests : The reference implementation of the random forest algorithm for
regression and classification is available in package randomForest. Package ipred has
bagging for regression, classification and survival analysis as well as bundling, a
combination of multiple models via ensemble learning. In addition, a random forest
variant for response variables measured at arbitrary scales based on conditional
inference trees is implemented in package party. randomForestSRC implements a
unified treatment of Breiman's random forests for survival, regression and classification
problems. Quantile regression forests quantregForest allow to regress quantiles of a
numeric response on exploratory variables via a random forest approach. For binary
data, LogicForest is a forest of logic regression trees (package LogicReg. The varSelRF
and Boruta packages focus on variable selection by means for random forest
algorithms. In addition, packages ranger and Rborist offer R interfaces to fast C++
implementations of random forests.
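To illustrate the kind of interface these packages share, a minimal classification fit
with randomForest might look as follows (the iris data and all settings are
illustrative only, not a recommendation):

    # Fit a classification forest on the built-in iris data
    library(randomForest)

    set.seed(42)
    fit <- randomForest(Species ~ ., data = iris,
                        ntree = 500, importance = TRUE)
    print(fit)                           # OOB error estimate and confusion matrix
    importance(fit)                      # variable importance measures
    predict(fit, newdata = iris[1:5, ])  # predictions for new observations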
Regularized and Shrinkage Methods : Regression models with some constraint on the
parameter estimates can be fitted with the lasso2 and lars packages. Lasso with
simultaneous updates for groups of parameters (groupwise lasso) is available in
package grplasso; the grpreg package implements a number of other group
penalization models, such as group MCP and group SCAD. The L1 regularization path
for generalized linear models and Cox models can be obtained from functions available
in package glmpath; the entire lasso or elastic-net regularization path (also in elasticnet)
for linear regression, logistic and multinomial regression models can be obtained from
package glmnet. The penalized package provides an alternative implementation of
lasso (L1) and ridge (L2) penalized regression models (both GLM and Cox models).
Package RXshrink can be used to identify and display TRACEs for a specified
shrinkage path and to determine the appropriate extent of shrinkage. Semiparametric
additive hazards models under lasso penalties are offered by package ahaz. A
generalisation of the Lasso shrinkage technique for linear regression is called relaxed
lasso and is available in package relaxo. Fisher's LDA projection with an optional
LASSO penalty to produce sparse solutions is implemented in package penalizedLDA.
The shrunken centroids classifier and utilities for gene expression analyses are
implemented in package pamr. An implementation of multivariate adaptive regression
splines is available in package earth. Variable selection through clone selection in SVMs
in penalized models (SCAD or L1 penalties) is implemented in package penalizedSVM.
Various forms of penalized discriminant analysis are implemented in packages hda, rda,
and sda. Package LiblineaR offers an interface to the LIBLINEAR library. The ncvreg
package fits linear and logistic regression models under the SCAD and MCP
regression penalties using a coordinate descent algorithm. High-throughput ridge
regression (i.e., penalization with many predictor variables) and heteroskedastic effects
models are the focus of the bigRR package. An implementation of bundle methods for
regularized risk minimization is available from package bmrm.
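As a sketch of the typical workflow, the lasso path and a cross-validated penalty
choice with glmnet could be obtained as follows (simulated data; all settings are
illustrative):

    # Lasso path on simulated data with glmnet
    library(glmnet)

    set.seed(1)
    x <- matrix(rnorm(100 * 20), 100, 20)               # 100 obs, 20 predictors
    y <- drop(x[, 1:3] %*% c(2, -1, 0.5)) + rnorm(100)  # 3 true signals

    fit <- glmnet(x, y, alpha = 1)       # alpha = 1 selects the lasso penalty
    plot(fit, xvar = "lambda")           # coefficient paths along the penalty
    cvfit <- cv.glmnet(x, y)             # cross-validated choice of lambda
    coef(cvfit, s = "lambda.min")        # sparse coefficient vector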
Boosting : Various forms of gradient boosting are implemented in package gbm (tree-based functional gradient descent boosting). The hinge loss is optimized by the
boosting implementation in package bst. Package GAMBoost can be used to fit
generalized additive models by a boosting algorithm. An extensible boosting framework
for generalized linear, additive and nonparametric models is available in package
mboost. Likelihood-based boosting for Cox models is implemented in CoxBoost and for
mixed models in GMMBoost. GAMLSS models can be fitted using boosting by
gamboostLSS.
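A minimal gbm call for tree-based gradient boosting might look like this (the
mtcars data and all tuning settings are illustrative; cross-validation picks the
number of trees):

    # Gradient boosting for regression with gbm
    library(gbm)

    set.seed(7)
    fit <- gbm(mpg ~ ., data = mtcars, distribution = "gaussian",
               n.trees = 1000, interaction.depth = 2, shrinkage = 0.01,
               n.minobsinnode = 5, cv.folds = 5)
    best <- gbm.perf(fit, method = "cv")   # CV-selected number of trees
    predict(fit, newdata = mtcars[1:3, ], n.trees = best)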
Support Vector Machines and Kernel Methods : The function svm() from e1071 offers
an interface to the LIBSVM library and package kernlab implements a flexible
framework for kernel learning (including SVMs, RVMs and other kernel learning
algorithms). An interface to the SVMlight implementation (only for one-against-all
classification) is provided in package klaR. The relevant dimension in kernel feature
spaces can be estimated using rdetools which also offers procedures for model
selection and prediction.
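A basic RBF-kernel SVM via svm() could be fitted as follows (a random train/test
split of iris; the cost and gamma values are illustrative):

    # RBF-kernel SVM via e1071's interface to LIBSVM
    library(e1071)

    set.seed(3)
    idx   <- sample(nrow(iris), 100)       # random training subset
    model <- svm(Species ~ ., data = iris[idx, ],
                 kernel = "radial", cost = 1, gamma = 0.25)
    pred  <- predict(model, iris[-idx, ])
    table(pred, iris$Species[-idx])        # held-out confusion matrix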
Bayesian Methods : Bayesian Additive Regression Trees (BART), where the final model
is defined in terms of the sum over many weak learners (not unlike ensemble methods),
are implemented in package BayesTree. Bayesian nonstationary, semiparametric
nonlinear regression and design by treed Gaussian processes including Bayesian
CART and treed linear models are made available by package tgp. Discrete Bayesian
networks can be fitted using bnclassify.
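A sketch of a BART fit with BayesTree on simulated data (the sample size and
MCMC settings below are illustrative):

    # Bayesian Additive Regression Trees with BayesTree
    library(BayesTree)

    set.seed(99)
    x <- matrix(runif(200 * 5), 200, 5)
    y <- 10 * sin(pi * x[, 1] * x[, 2]) + rnorm(200)

    fit <- bart(x.train = x, y.train = y, ndpost = 500, nskip = 100)
    summary(fit$yhat.train.mean - y)   # residuals of the posterior mean fit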
Optimization using Genetic Algorithms : Packages rgp and rgenoud offer optimization
routines based on genetic algorithms. The package Rmalschains implements memetic
algorithms with local search chains, which are a special type of evolutionary algorithms,
combining a steady state genetic algorithm with local search for real-valued parameter
optimization.
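As an illustration, rgenoud's genoud() can minimize a smooth test function such
as the Rosenbrock function (the population size and other settings are
illustrative):

    # Genetic-algorithm optimization with rgenoud
    library(rgenoud)

    rosenbrock <- function(p) (1 - p[1])^2 + 100 * (p[2] - p[1]^2)^2

    out <- genoud(fn = rosenbrock, nvars = 2, max = FALSE,
                  pop.size = 200, print.level = 0)
    out$par      # should be close to the optimum c(1, 1)
    out$value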
Association Rules : Package arules provides both data structures for efficient handling
of sparse binary data as well as interfaces to implementations of Apriori and Eclat for
mining frequent itemsets, maximal frequent itemsets, closed frequent itemsets and
association rules.
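A minimal mining run with arules on its bundled Groceries transactions might
look as follows (the support and confidence thresholds are illustrative):

    # Association rule mining with arules
    library(arules)

    data(Groceries)
    rules <- apriori(Groceries,
                     parameter = list(supp = 0.01, conf = 0.5))
    inspect(head(sort(rules, by = "lift"), 3))   # strongest rules by lift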
Fuzzy Rule-based Systems : Package frbs implements a host of standard methods for
learning fuzzy rule-based systems from data for regression and classification. Package
RoughSets provides comprehensive implementations of the rough set theory (RST) and
the fuzzy rough set theory (FRST) in a single package.
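A rough sketch of learning a fuzzy rule base with frbs, assuming its documented
Wang-Mendel ("WM") learning method (the data layout puts predictors first and
the response in the last column; the number of fuzzy labels is illustrative):

    # Wang-Mendel fuzzy rule-based regression with frbs
    library(frbs)

    set.seed(5)
    x   <- runif(200, 0, 10)
    dat <- cbind(x, y = sin(x) + rnorm(200, sd = 0.1))

    fit  <- frbs.learn(dat, range.data = apply(dat, 2, range),
                       method.type = "WM",
                       control = list(num.labels = 7))
    pred <- predict(fit, matrix(seq(0, 10, by = 0.5), ncol = 1))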
Model selection and validation : Package e1071 has function tune() for
hyperparameter tuning, and function errorest() from package ipred can be used for
error rate estimation.
The cost parameter C for support vector machines can be chosen utilizing the
functionality of package svmpath. Functions for ROC analysis and other visualisation
techniques for comparing candidate classifiers are available from package ROCR.
Packages hdi and stabs implement stability selection for a range of models; hdi also
offers other inference procedures for high-dimensional models.
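For instance, tune() can grid-search SVM hyperparameters by cross-validation (the
grids below are illustrative):

    # Hyperparameter tuning with e1071's tune()
    library(e1071)

    set.seed(11)
    tuned <- tune(svm, Species ~ ., data = iris,
                  ranges = list(cost = 2^(-1:3), gamma = 2^(-2:1)))
    summary(tuned)
    tuned$best.parameters    # best (cost, gamma) pair found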
Meta packages : Package caret provides miscellaneous functions for building predictive
models, including parameter tuning and variable importance measures. The package
can be used with various parallel implementations (e.g., MPI or NWS). In a similar
spirit, package mlr offers a high-level interface to various statistical and machine
learning packages. Package SuperLearner implements a similar toolbox. The h2o
package implements a general purpose machine learning platform that has scalable
implementations of many popular algorithms such as random forest, GBM, GLM (with
elastic net regularization), and deep learning (feedforward multilayer networks), among
others.
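A minimal caret workflow, resampling and tuning a random forest through a single
train() call (the method and tuning settings are illustrative):

    # Unified model training and tuning with caret
    library(caret)

    set.seed(13)
    ctrl <- trainControl(method = "cv", number = 5)
    fit  <- train(Species ~ ., data = iris, method = "rf",
                  trControl = ctrl, tuneLength = 3)
    fit$bestTune             # tuning values selected by cross-validation
    predict(fit, head(iris))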
Elements of Statistical Learning : Data sets, functions and examples from the book The
Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor
Hastie, Robert Tibshirani and Jerome Friedman have been packaged and are available
as ElemStatLearn.
CORElearn implements a rather broad class of machine learning algorithms, such as nearest
neighbors, trees, random forests, and several feature selection methods. Similarly, package
rminer interfaces several learning algorithms implemented in other packages and computes
several performance measures.