DIOS: Dynamic Instrumentation for (not so) Outstanding Scheduling
Blake Sutton & Chris Sosa

Motivation

Approach: Adaptive Distributed Scheduler
  • Centralized global scheduler and distributed local services
  • Hares monitor machines for “undesirable” events
  • Hares also gather application-specific info with Pin
  • Rhino schedules jobs and responds to events from Hares
      • Migrate
      • Pause / Resume
      • Kill / Restart

“Pinvolvement”: What it is
  • Insert new code into apps on the fly
      • No recompile
      • Operates on a copy
      • Code caching
  • Our Pintool
      • Routine-level
      • Instruction-level
  • pin -t mytool -- ./myprogram
  • Borrowed from Luk et al. 2005.

“Pinvolvement”: What it measures
  • No reliance on hardware-specific performance counters
  • Want to capture memory behavior over time
  • Gathered:
      • Ratio of malloc to free calls
      • Wall-clock time to execute 10,000,000 instructions
      • Number of memory ops in the last 2,000,000 instructions

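The three gathered quantities can be illustrated with a small bookkeeping sketch. This is not the authors' pintool and does not use the Pin API: the class `MemoryProfile`, its method names, and the idea of feeding it one event per instruction or routine call are all assumptions made for illustration. Only the two instruction counts (10,000,000 and 2,000,000) come from the slide.

```python
import time
from collections import deque
from dataclasses import dataclass, field


@dataclass
class MemoryProfile:
    """Hypothetical counters for the three metrics named on the slide."""
    window: int = 2_000_000       # trailing window for the memory-op count
    interval: int = 10_000_000    # instructions per wall-clock sample
    mallocs: int = 0
    frees: int = 0
    executed: int = 0
    interval_seconds: list = field(default_factory=list)
    _mem_op_positions: deque = field(default_factory=deque)
    _interval_start: float = field(default_factory=time.perf_counter)

    def on_instruction(self, is_memory_op: bool) -> None:
        """Called once per executed instruction (instruction-level hook)."""
        self.executed += 1
        if is_memory_op:
            self._mem_op_positions.append(self.executed)
        # Forget memory ops that have fallen out of the trailing window.
        cutoff = self.executed - self.window
        while self._mem_op_positions and self._mem_op_positions[0] <= cutoff:
            self._mem_op_positions.popleft()
        # Record how long the last `interval` instructions took.
        if self.executed % self.interval == 0:
            now = time.perf_counter()
            self.interval_seconds.append(now - self._interval_start)
            self._interval_start = now

    def on_call(self, routine: str) -> None:
        """Called once per routine entry (routine-level hook)."""
        if routine == "malloc":
            self.mallocs += 1
        elif routine == "free":
            self.frees += 1

    def malloc_free_ratio(self) -> float:
        if self.frees == 0:
            return float("inf") if self.mallocs else 0.0
        return self.mallocs / self.frees

    def recent_memory_ops(self) -> int:
        return len(self._mem_op_positions)
```

Keeping every memory-op position in a deque is simple but memory-hungry for a 2,000,000-instruction window; a real tool would more likely keep coarse per-bucket counts.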
Evaluation
  • Distributed scheduler
      • Rhino on realitytv13, Hares on realitytv13-16
      • heatedplate with modified parameters
      • Hares detect when less than 10% of memory is available and inform Rhino to take action
      • Rhino reschedules the youngest job at the Hare site
      • Baseline: Smallest Queues
  • Pintool
      • 2 applications from SPLASH-2
      • heatedplate

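The policy on this slide can be sketched in a few lines. This is a sketch under stated assumptions rather than the DIOS implementation: the class names `Job`, `HareSite`, and `Rhino`, the `start_time` field, and the choice to move the youngest job to the shortest other queue are hypothetical. The slide only says that Rhino "reschedules" the youngest job, that the threshold is 10% available memory, and that the baseline is smallest queues.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

LOW_MEMORY_THRESHOLD = 0.10  # "lower than 10% memory available" (from the slide)


@dataclass
class Job:
    name: str
    start_time: float  # larger value = started later = younger (assumption)


@dataclass
class HareSite:
    host: str
    jobs: List[Job] = field(default_factory=list)

    def low_memory(self, free_fraction: float) -> bool:
        """The "undesirable" event a Hare would report to Rhino."""
        return free_fraction < LOW_MEMORY_THRESHOLD


class Rhino:
    def __init__(self, sites: List[HareSite]) -> None:
        self.sites: Dict[str, HareSite] = {s.host: s for s in sites}

    def smallest_queue(self, exclude: Optional[str] = None) -> Optional[HareSite]:
        """Baseline placement: the site with the fewest queued jobs."""
        candidates = [s for s in self.sites.values() if s.host != exclude]
        return min(candidates, key=lambda s: len(s.jobs)) if candidates else None

    def submit(self, job: Job) -> str:
        site = self.smallest_queue()
        site.jobs.append(job)
        return site.host

    def on_low_memory(self, host: str) -> Optional[Tuple[str, str]]:
        """Move the youngest job away from the site that reported the event."""
        site = self.sites[host]
        target = self.smallest_queue(exclude=host)
        if not site.jobs or target is None:
            return None
        youngest = max(site.jobs, key=lambda j: j.start_time)
        site.jobs.remove(youngest)
        target.jobs.append(youngest)
        return youngest.name, target.host
```

Picking the youngest job is a reasonable default because it has the least accumulated work to lose if rescheduling means a restart.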
Results: The Good
  • Scheduler shows potential for improvement
  • Lower total runtime with simple policy

Results: The Bad
  • Overhead from Pintool is too high to realize gains
  • Pin isn’t designed for on-the-fly analysis
      • Could not reattach
      • Code caching isn’t enough

    application   native   pin    count only   malloc/free   # mems   latency
    lu            1.00     1.25   6.27         14.51         7.90     7.64
    ocean         1.00     1.48   2.87         7.84          6.04     5.81
    heatedplate   1.00     1.88   2.65         5.43          7.45     7.26

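To separate the cost of the instrumentation from the cost of running under Pin at all, the numbers on this slide can be divided by the bare-Pin column. The sketch below assumes the values are runtimes normalized to the native run (native = 1.00), which the slide implies but does not state; the function names are made up for illustration.

```python
# Values copied from the "Results: The Bad" slide. Assumption: each value is
# runtime normalized to the native (uninstrumented) run.
SLOWDOWN = {
    "lu":          {"native": 1.00, "pin": 1.25, "count only": 6.27,
                    "malloc/free": 14.51, "# mems": 7.90, "latency": 7.64},
    "ocean":       {"native": 1.00, "pin": 1.48, "count only": 2.87,
                    "malloc/free": 7.84, "# mems": 6.04, "latency": 5.81},
    "heatedplate": {"native": 1.00, "pin": 1.88, "count only": 2.65,
                    "malloc/free": 5.43, "# mems": 7.45, "latency": 7.26},
}
INSTRUMENTED = ("count only", "malloc/free", "# mems", "latency")


def relative_to_pin(app: str) -> dict:
    """Slowdown of each instrumented configuration over bare Pin."""
    row = SLOWDOWN[app]
    return {cfg: row[cfg] / row["pin"] for cfg in INSTRUMENTED}


def costliest_config(app: str) -> str:
    """Instrumented configuration with the largest slowdown for this app."""
    row = SLOWDOWN[app]
    return max(INSTRUMENTED, key=lambda cfg: row[cfg])


if __name__ == "__main__":
    for app in SLOWDOWN:
        factors = ", ".join(f"{cfg}: {f:.2f}x" for cfg, f in relative_to_pin(app).items())
        print(f"{app:12s} costliest={costliest_config(app):12s} vs bare Pin: {factors}")
```

Read this way, even the lightest instrumented configuration ("count only") multiplies the bare-Pin runtime several times over for lu, which is consistent with the slide's point that code caching alone does not hide the cost.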
Results: The “Interesting”
  • Pintool does capture intriguing info…

Other Issues
  • Condor
      • Process migration requires re-linking
      • Doesn’t support multithreaded applications
      • Other “user-level” process migration mechanisms have similar requirements
  • Pin
      • Unable to intersperse low and high overhead with Pintool
      • Even the smallest overhead was not negligible
          • Up to almost 2x slowdown just using Pin with heatedplate and no extra instrumentation
  • Scheduling decisions have a bigger impact for long-running jobs

Conclusion: the Future of DIOS
  • Overhead is prohibitive (for now)
      • Pin needs to support reattach
      • Lighter instrumentation framework
  • However, instrumentation can capture aspects of application-specific behavior
  • Future Work
      • Pin as a process migration mechanism

Questions?

Wait…hasn’t this been solved?
  • Condor
      • popular user-space distributed scheduler
      • process migration
      • tries to keep queues balanced
      • but jobs have different behavior over time from each other
  • LSF (Load Sharing Facility)
      • monitors system, moves processes around based on what they need
      • must input static job information (requires profiling etc. beforehand)
      • what if something about your job isn't captured by your input?
      • what if you end up giving it margins that are too large? too small? unnecessary inefficiencies?
      • it's not exactly hassle-free...
  • Hardware feedback
      • PAPI
      • Still not very portable (invasive kernel patch for install)
  • Wouldn't it be nice if the scheduler could just..."do the right thing"?



Editor's Notes

  • #3: Our project is about how to schedule jobs among a group of machines. Our implementation is at the user level, but the same idea could be applied in the kernel of a distributed operating system. Jobs may be long-running, short-running, memory-intensive, or CPU-bound, and we don’t know what kind to expect. So how can the scheduler put them where they should be if it doesn’t know these things? Transition: Wouldn’t it be nice if the scheduler could just “handle it”, without the user having to specify characteristics of their jobs in advance?
  • #4: Our approach to this problem is DIOS – an adaptive distributed scheduler. Describe diagram: local schedulers (Hare) run on each machine, with queues of jobs. Global scheduler (Rhino) receives events from the Hares and sends down actions, like migrate or pause. Transition: So you must be thinking…wait, how are you going to just “gather application-specific info”?
  • #5: The answer is – we’ll write a tool with Pin, a dynamic instrumentation framework. Describe diagram – as you can see from the diagram, and from this command up here, Pin is kind of like a miniature virtual machine. It takes in a pintool and the program binary, and runs the program in the context of Pin, inserting new code into the application as it runs – using the tool as the instructions for what code to execute and where to insert it. For example, a pintool to count the number of instructions executed in a program could insert code to increment a variable before every instruction. There are several points at which instrumentation can be introduced – our pintool uses routine-level and instruction-level.
  • #6: So we’ve established that Pin is a tool for what we want to do – dynamically instrument applications. But what code do we want to insert? What are we looking to get from our pintool? Since we are trying to detect and avoid memory contention between processes, it makes sense to study the memory behavior of the applications. To this end, we chose three things (describe them). The figure to the side there shows how the pintool fits into our overall plan – it would collect information for each application and report the results to Hare, the local scheduler. Then Hare, which is also monitoring the memory subsystem of the local machine, reports to Rhino, and Rhino decides what to do.
  • #7: Considering our motivation, it was important to try to evaluate it on a somewhat realistic workload. Since it seems like most long-running jobs on clusters are scientific applications, we wanted to use real scientific benchmarks. Describe benchmarks. To evaluate the scheduler, we measured the total runtime of groups of 100 jobs. We varied the parameters to the heatedplate program (dataset size and number of iterations) in order to vary the length of the jobs, and produced a set of jobs on a curve – a great many short-running jobs with a few long-running jobs. Past work indicates that this is a common job submission trend in batch systems. Then, to evaluate our pintool, we measured the overhead from running each application with our pintool and also tracked the information we collected over time to see if we could correlate it to interesting behavior or differences between programs.
  • #8: So here are our results from evaluating the distributed scheduler by itself. The good news is we saw potential for improvement – just from using a simple policy to react to the presence of memory contention, the total runtime goes down. Might be able to get even better results on long-running jobs, with better information on the running processes (like we could get from dynamic instrumentation!) So if you’re wondering why we’re showing you results for our scheduler with this simple policy – but not with our whole system, including application-specific information… well, that brings me to The Bad.
  • #9: Although our scheduler works perfectly well with the pintool, we discovered that the overhead introduced by Pin is just too much. Some of our overhead results are below – we show the time to run the application natively, with Pin (no pintool), with a tool that only counts instructions, and with our three metrics. The way we hoped to solve the overhead problem originally was to basically only instrument when we needed to – like when the scheduler decided the machine was performing badly. Then, the relatively high overhead to run the analysis wouldn’t have to make much of an impact overall. However, we were unable to get the performance gains we hoped for – Pin doesn’t offer the ability to completely attach and detach from a running program, only to attach, and we discovered when we tried to add and remove instrumentation dynamically that we lost the gains from code caching. So while this idea could work with another system or with a new Pin, we couldn’t manage to bring the overhead down.
  • #10: But on the bright side, we were able to collect some interesting information – this figure shows the variation over time of our memory instruction measurements – it shows the change in the number of memory instructions executed in a window over time – hence the negative numbers. Note how similar the patterns of LU and heatedplate are – talk about how that’s probably because they are tightly looped and very repetitive, whereas Ocean is obviously performing a more irregular and complex analysis with some possible distinct phases in it. Possibility of using the variation in a metric like this to “predict the predictability” - to separate applications that are better left alone from those that are more likely to be safely handled by common heuristics, etc.
  • #12: So – the future of DIOS.
  • #13: Questions?
  • #14: Kind of...but no comprehensive solution.
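
The windowed memory-instruction measurement described in note #10 can be sketched as follows. This is a minimal illustration, not the authors' pintool: the function names, the toy window size of 4 (the slides mention windows of 2,000,000 instructions), and the use of a standard deviation as a rough "predictability" score are all assumptions made for the example.

```python
from statistics import pstdev


def window_counts(is_mem_op, window):
    """Count memory operations in each full window of `window` instructions."""
    counts = []
    current = 0
    for i, flag in enumerate(is_mem_op, start=1):
        if flag:
            current += 1
        if i % window == 0:  # a full window has elapsed
            counts.append(current)
            current = 0
    return counts


def window_deltas(counts):
    """Change in memory-op count between consecutive windows (can be negative)."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


def variability(deltas):
    """Hypothetical predictability score: spread of the deltas (0.0 = perfectly regular)."""
    return pstdev(deltas) if deltas else 0.0


# A tightly looped, repetitive trace: every window looks the same.
regular = [True, False] * 6
# An irregular trace with distinct phases.
irregular = [True] * 4 + [False] * 4 + [True, False, True, False]

regular_deltas = window_deltas(window_counts(regular, 4))
irregular_deltas = window_deltas(window_counts(irregular, 4))
```

Under these assumptions, a trace whose deltas stay near zero would be a candidate for being left alone, while one with large swings is the kind of irregular behavior the note attributes to Ocean.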