
Kubernetes at CERN

Ricardo Rocha
CERN IT
Use Cases
Infrastructure Services: JIRA, WebLogic, EDH, …

Batch and Interactive Computing: Jupyter Notebooks, Spark, HTCondor, …

Machine Learning: Kubeflow

Reproducible Analysis: REANA

Experiment Tools: CMSWeb, Rucio, …

Kubernetes Grid Sites

Many others we don’t know about


Main Features
Heterogeneous Clusters with Node Groups

OpenStack cloud provider to interact with our private cloud

Identity, Cluster Auto Scaling, Load Balancing, …

CSI drivers for CVMFS (read-only filesystem) and CephFS (see the PVC sketch after this list)

EOS integration via a DaemonSet running eosxd (no CSI for now)

Central Logging and Metric collection, Alarming

Vulnerability Scanning and Image Signing, Security Reports
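
As an illustration of how the CSI storage integration surfaces to users, here is a minimal PersistentVolumeClaim sketch against the CephFS driver; the StorageClass name "csi-cephfs" and the requested size are assumptions, not the actual CERN configuration.

# Minimal sketch: claiming shared CephFS storage through the CSI driver.
# The StorageClass name and the size are assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: analysis-scratch
spec:
  accessModes:
    - ReadWriteMany            # CephFS supports shared read-write mounts
  storageClassName: csi-cephfs
  resources:
    requests:
      storage: 100Gi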


Helm and GitOps

$ helm install fluxcd/flux \
    --namespace flux --name flux --values flux-values.yaml \
    --set git.pollInterval=1m \
    --set git.url=https://gitlab.cern.ch/.../hub

$ cat flux-values.yaml
rbac:
  create: true
helmOperator:
  create: true
  chartsSyncInterval: 5m
  configureRepositories:
    enable: true
    repositories:
      - name: jupyterhub
        url: https://charts.cern.ch/jupyterhub
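
With this configuration Flux polls the Git repository every minute (git.pollInterval=1m) and the Helm operator syncs chart releases every five minutes (chartsSyncInterval: 5m), so rolling out a change is just a git push.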
[Diagram: container images are pushed to the Registry (docker push); manifests flow between Git and the cluster (git push / git pull); FluxCD's Helm operator reconciles HelmRelease CRDs into Helm releases, grouped under a meta chart]
Helm and GitOps

Repository layout:

|-- charts
|   |-- hub
|       |-- Chart.yaml
|       |-- requirements.yaml
|       |-- values.yaml
|       |-- templates
|           |-- custom-manifest.yaml
|-- namespaces
|   |-- prod.yaml
|   |-- stg.yaml
|-- releases
|   |-- prod
|   |   |-- hub.yaml
|   |-- stg
|       |-- hub.yaml
|-- secrets
    |-- prod
    |   |-- secrets.yaml
    |-- stg
        |-- secrets.yaml

Example release (releases/prod/hub.yaml):

apiVersion: flux.weave.works/v1beta1
kind: HelmRelease
metadata:
  name: hub
  namespace: prod
spec:
  releaseName: hub
  chart:
    git: https://gitlab.cern.ch/.../hub.git
    path: charts/hub
    ref: master
  valuesFrom:
    - secretKeyRef:        # this is how we plug in our encrypted values data
        name: hub-secrets
        key: values.yaml
  values:
    binderhub:
      ...
[Diagram: 70 TB dataset processed by a cluster on GKE; job results are aggregated into an interactive visualization]

Max 25000 cores

Single region, 3 zones

25000 Kubernetes jobs (see the Job sketch below)
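
A hypothetical sketch of how such a fan-out can be expressed as a single Kubernetes Job; the image, script, and counts are illustrative assumptions, not the actual setup:

# Hypothetical fan-out of the reprocessing as one Kubernetes Job.
# Image, script and counts are illustrative assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: reprocess
spec:
  parallelism: 1000        # pods running concurrently
  completions: 25000       # total worker pods to run to completion
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: worker
          image: registry.example.org/analysis/worker:latest  # placeholder image
          command: ["/run-worker.sh"]  # hypothetical script, one partition per pod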
CERN → NL Region (via Zurich link)

Data was initially transferred to Zurich, then retransferred to the NL region for its higher capacity

The transfer to NL still went via Zurich


CERN → NL Region (via Zurich link)

Cluster Creation: 5 min
Image Pre-Pull:   4 min
Data Stage-In:    4 min
Process:          90 sec


CERN → Zurich → NL Region

Cluster Creation: 5 min
Image Pre-Pull:   4 min
Data Stage-In:    4 min
Process:          90 sec


GCP Pricing
Billing is updated daily, though there are APIs to query for details

Considering a ~10 minute run (1/6 of an hour), compute table prices for the NL region imply:

$1.043 * 1530 / 6 ≈ $266 (~5x cheaper if using pre-emptibles)

Parking storage cost for the dataset (monthly cost, lots of room for creativity):

$0.020 * 70000 = $1400

Total for the run under $300 USD

Running on credits, no Committed Use or Sustained Use discounts


Ongoing Work
Use Case: Notebooks, ML Pipelines

1. User Notebook: build, validate model
2. Distributed Compute: train at scale (see the TFJob sketch below)
3. Persistent Storage for feedback
4. Serving
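
Step 2 maps naturally onto Kubeflow's training operators; a minimal TFJob sketch, where the replica count, image, and training script are assumptions for illustration:

# Hypothetical Kubeflow TFJob for the "train at scale" step.
# Replicas, image and script are illustrative assumptions.
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: train-at-scale
spec:
  tfReplicaSpecs:
    Worker:
      replicas: 8
      template:
        spec:
          containers:
            - name: tensorflow   # the tf-operator expects this container name
              image: tensorflow/tensorflow:latest-gpu
              command: ["python", "/train.py"]
              resources:
                limits:
                  nvidia.com/gpu: 1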
Cloud Bursting

Recently deployed in our staging cluster

Burst to public clouds when needed


[Diagram: CERN hub cluster bursting to an external Kubernetes endpoint through a Virtual Kubelet node exposing additional CPUs and GPUs]
Especially interesting for ML: GPUs, TPUs

Transparent to CERN users

On demand - only pay for actual use (testing in a workshop soon)

External endpoint can be any Kubernetes cluster (see the pod sketch after this list)

Trying with Google Cloud (GKE)
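
As a sketch of what scheduling onto burst capacity could look like, assuming Virtual Kubelet's conventional provider taint and an illustrative node label:

# Hypothetical pod targeting the Virtual Kubelet node; label, taint
# and image are assumptions, not the actual CERN configuration.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-training
spec:
  nodeSelector:
    type: virtual-kubelet              # assumed label on the virtual node
  tolerations:
    - key: virtual-kubelet.io/provider # conventional Virtual Kubelet taint
      operator: Exists
      effect: NoSchedule
  containers:
    - name: train
      image: tensorflow/tensorflow:latest-gpu
      resources:
        limits:
          nvidia.com/gpu: 1            # GPU provided by the external cloud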


Open Issues
Policy definition and enforcement for external resources

Accounting

Storage
    Parking costs
    Remote Access vs Replication (Hot Cache? Already done for CVMFS)
    Handling and cost of output data (egress)

Scale test for the network / gateway setup


Other Points of Interest
Kubeflow

Batch on GKE
    Quick call with the team
    Open Sourcing? Maybe. Tied to GKE? Yes, for now
    Fair Share relying on Budgets? They fit the unlimited-resources model better

KNative / Cloud Run

Anthos
