Serverless: Computing For R
serveRless: computing for R
Agenda
01 The Problem
02 Serverless
03 Architecture
The Problem
How can we build a cost-effective data science pipeline that allows data scientists using R to easily put their models into production, that scales well and is easy to maintain?
[Diagram: model storage (store models) feeding batch scoring (trigger) and real-time scoring (REST)]
The Solution
Just like wireless internet has wires somewhere, serverless architectures still have servers somewhere.
No server provisioning and maintenance is necessary. Hardware and OS are abstracted away.
Scaling is automatic and part of the service.
Billing is based on actual compute resources used. No compute used, no costs.
Spinning up new environments is quick and allows for faster experimentation.
The Evolution of the Cloud
[Figure: four stacks side by side, each with the layers Applications, Scalability, Security, OS, Virtualization, Servers, Storage, Networking, Data Centers, split into customer-managed and provider-managed layers]
What now?
Configure: machine, storage, network, OS | servers, applications, scaling | run code when needed
Unit of cost: per VM per hour | per container per hour | per memory/second, per request
Terminology
Three ways to interact with Cloud Providers
Browser (GUI)
IaC (Infrastructure as Code)
REST calls (excellent tool: Postman)
Cost Comparison
Big lock-in potential!
You create a Linux container group with a 1 vCPU, 1 GB configuration once daily during a month (30 days). Each container group runs for 5 minutes (300 seconds).
Memory duration: number of container groups × memory duration (seconds) × GB × price per GB-s × number of days = total
vCPU duration: number of container groups × vCPU duration (seconds) × vCPU × price per vCPU-s × number of days = total
You create a Linux container group with a 1 vCPU, 2 GB configuration 50 times daily during a month (30 days). Each container group runs for 150 seconds.
Memory duration: number of container groups × memory duration (seconds) × GB × price per GB-s × number of days = total
vCPU duration: number of container groups × vCPU duration (seconds) × vCPU × price per vCPU-s × number of days = total
You create a Linux container group with a 4 vCPU, 8 GB configuration 2 times daily during a month (30 days). Each container group runs for 1 hour (3600 seconds).
Memory duration: number of container groups × memory duration (seconds) × GB × price per GB-s × number of days = total
vCPU duration: number of container groups × vCPU duration (seconds) × vCPU × price per vCPU-s × number of days = total
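The three scenarios above all follow the same multiplication, so they can be compared with one small function. The following is a minimal Python sketch, not part of the original slides. The two per-second prices are hypothetical placeholders, since the slides do not state them here; replace them with the current prices for your region before drawing any conclusions.

```python
def container_group_cost(groups_per_day, duration_s, vcpus, memory_gb, days,
                         price_per_gb_s, price_per_vcpu_s):
    """Return (memory cost, vCPU cost, total) for one usage scenario."""
    # groups x seconds x GB x price per GB-s x days
    memory_cost = groups_per_day * duration_s * memory_gb * price_per_gb_s * days
    # groups x seconds x vCPU x price per vCPU-s x days
    vcpu_cost = groups_per_day * duration_s * vcpus * price_per_vcpu_s * days
    return memory_cost, vcpu_cost, memory_cost + vcpu_cost


# Hypothetical placeholder prices, NOT taken from the slides or a price list.
PRICE_PER_GB_S = 0.000004
PRICE_PER_VCPU_S = 0.00001

# (groups per day, duration in seconds, vCPUs, memory in GB)
scenarios = {
    "once daily, 1 vCPU / 1 GB, 300 s": (1, 300, 1, 1),
    "50 times daily, 1 vCPU / 2 GB, 150 s": (50, 150, 1, 2),
    "twice daily, 4 vCPU / 8 GB, 3600 s": (2, 3600, 4, 8),
}

for label, (n, secs, vcpus, gb) in scenarios.items():
    mem, cpu, total = container_group_cost(
        n, secs, vcpus, gb, 30, PRICE_PER_GB_S, PRICE_PER_VCPU_S)
    print(f"{label}: memory {mem:.2f} + vCPU {cpu:.2f} = {total:.2f}")
```

Keeping the prices as parameters makes it easy to rerun the comparison when prices or workloads change.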
TRAINING SCORING
There are many ways to realize a serverless scoring architecture, each with different pros and cons
AWS Lambda
Philipp Schirmer
base layer (runtime.zip)
R/
├── bin/
├── lib/
├── library/
└── …
bootstrap
runtime.r
additional layers (runtime.zip)
R/
└── library/
    ├── package 1/
    ├── package 2/
    ├── package …/
    └── package n/
Compiled packages can be a headache…
A function can use up to 5 layers at a time. The total unzipped size of the function
and all layers can't exceed the unzipped deployment package size limit of 250MB.
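Before splitting R packages across layers, it can help to check a planned layout against the two limits quoted above. This is a minimal Python sketch under stated assumptions: the function name and the example sizes are hypothetical, and sizes are taken as unzipped megabytes that you measure yourself.

```python
MAX_LAYERS = 5          # layers a function can use at a time
MAX_UNZIPPED_MB = 250   # unzipped size limit for function plus all layers


def fits_lambda_limits(function_mb, layer_mbs):
    """Return True if the layer count and the total unzipped size are within limits."""
    total_mb = function_mb + sum(layer_mbs)
    return len(layer_mbs) <= MAX_LAYERS and total_mb <= MAX_UNZIPPED_MB


# Hypothetical sizes in MB: a small function plus an R base layer and one package layer.
print(fits_lambda_limits(1, [120, 90]))
print(fits_lambda_limits(1, [120, 140]))
```

The first layout totals 211 MB and passes; the second totals 261 MB and fails, which is the kind of result that pushes larger R package sets towards containers.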
Function as a Service
Azure Functions
[Diagram: the host (.NET, .NET Core) and the language worker process (C#, Java, …) that runs the function code communicate through a modern open source high performance RPC framework and Protocol Buffers]
Neal Fultz, Dirk Eddelbuettel
Containers give us maximum flexibility regarding the runtime and reduce vendor lock-in
PROS CONS
Pay-as-you-go
Docker Basics
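# Base image from the Rocker project: RStudio Server with R 3.5.3
FROM rocker/rstudio:3.5.3
# Set the working directory first so that COPY resolves inside it
WORKDIR /usr/src/app/TravisR
# Copy the build context (package sources) into the working directory
COPY . .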
Azure Container + Logic App
01 Write Code
Use RStudio or the IDE of your choice to write some arbitrary R code, ideally as a package
03 git Repo
Push your package and the Dockerfile to GitHub
06 Create Container Instance
Create an Azure Container Instance (ACI) that pulls the Docker image from the Azure Container Registry (ACR)
07 Manage it
Use a Logic App or individual REST calls to start and stop it
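To illustrate the "individual REST calls" option in step 07, here is a minimal Python sketch that builds (but does not send) the start and stop requests for a container group through the Azure Resource Manager endpoint. It is not the presenter's implementation. The subscription ID, resource group, container group name and token are hypothetical placeholders, and the `api-version` value is an assumption that should be checked against the current REST reference.

```python
from urllib.request import Request

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2023-05-01"  # assumption: verify against the current API reference


def container_group_request(subscription_id, resource_group, group_name,
                            action, token):
    """Build a POST request that starts or stops a container group (not sent here)."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    url = (
        f"{ARM_ENDPOINT}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.ContainerInstance/containerGroups/{group_name}"
        f"/{action}?api-version={API_VERSION}"
    )
    headers = {"Authorization": f"Bearer {token}"}
    # An empty body is enough; the action is encoded in the URL path.
    return Request(url, data=b"", headers=headers, method="POST")


# Hypothetical placeholder values for illustration only.
req = container_group_request(
    "00000000-0000-0000-0000-000000000000", "my-rg", "my-scoring-job",
    "start", "TOKEN")
print(req.get_method(), req.full_url)
```

Sending the request (for example with `urllib.request.urlopen(req)`) requires a valid bearer token. The same call can be explored interactively in Postman or wrapped in a Logic App that runs on a schedule.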
Logic App
IDEA STATUS
R Package serveRless
Many thanks to Hong Ooi for his awesome work supporting R in Azure!
And of course Christoph Bodner and Florian Schwendinger, who are not here today.
Short Demo
Questions?
Thank you
for your attention!
linkedin.com/in/thomas-laber
data-science-austria.at
data-analytics.netlify.com