Arghya Kusum Das, Ph.D.
Assistant Professor, University of Wisconsin-Platteville
In collaboration with
Radha Nagarajan, Ph.D.
Director, COSH, Marshfield Clinic Health System (Digital Health, Data Science, Bioinformatics, RWE)
Graphical Structure Learning Accelerated with POWER9
Overview
o Overview of Graphical Models
o Implementation
o Preliminary Findings
o Healthcare Applications
Graphs/Networks: Comprised of nodes and edges
nodes/vertices: represent the entities of interest
edges: represent the associations/relationships between the nodes.
Graphical models: Model the associations between the entities as a graph.
Example:
nodes: COVID subjects
edges: association between the COVID subjects (e.g. contact tracing)
© searchengineland.com
Why Graphical Models?
o system-level abstractions: Graphical models can reveal system-level properties and behavior not
apparent in the reductionist representation. System-level abstractions are especially critical in
developing targeted interventions.
e.g. model COVID spread in a given community from contact tracing1; use the model to assist in
targeted community-based interventions/policies
e.g. model the signaling mechanism initiated by the COVID spike protein; use the model to identify
potential target molecules for drugs to minimize disease severity/inflammation2
o in-silico models: Graphical models can be experimented on in a controlled and cost-effective manner. This
includes posing questions to these models (e.g. inference).
e.g. given the evidence that a subject has cough, fever, sore throat and shortness of breath,
determine the probability that the subject is COVID +ve
o causal associations: Graphical models may reveal causal associations3 under certain implicit assumptions
(Note: we are attempting to decipher causality from observational data!)
1https://www.cdc.gov/coronavirus/2019-ncov/daily-life-coping/contact-tracing.html
2https://www.cebm.net/covid-19/dexamethasone/
3Pearl, J [2009] Causality: Models, Reasoning and Inference.
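The inference query on this slide (probability of COVID +ve given observed symptoms) can be sketched with a naive Bayes model, the simplest graphical model in which every symptom depends only on the disease status. This is a minimal sketch, not the authors' model: the prior, the conditional probabilities and the symptom names are hypothetical illustration values, and the conditional-independence assumption is ours.

```python
# All probabilities are hypothetical illustration values, not clinical estimates.
PRIOR_COVID = 0.10

# P(symptom present | COVID status). Symptoms are assumed conditionally
# independent given the status (naive Bayes assumption).
P_SYMPTOM = {
    "cough": {True: 0.70, False: 0.20},
    "fever": {True: 0.80, False: 0.10},
    "sore_throat": {True: 0.40, False: 0.15},
    "shortness_of_breath": {True: 0.30, False: 0.05},
}


def posterior_covid(evidence):
    """Return P(COVID +ve | evidence) for evidence given as {symptom: bool}."""
    joint = {}
    for status in (True, False):
        p = PRIOR_COVID if status else 1.0 - PRIOR_COVID
        for symptom, present in evidence.items():
            p_present = P_SYMPTOM[symptom][status]
            p *= p_present if present else 1.0 - p_present
        joint[status] = p
    # Bayes' rule: normalise the joint over the two possible statuses.
    return joint[True] / (joint[True] + joint[False])


all_four = {symptom: True for symptom in P_SYMPTOM}
no_symptoms = {symptom: False for symptom in P_SYMPTOM}
print("all four symptoms:", round(posterior_covid(all_four), 3))
print("no symptoms:", round(posterior_covid(no_symptoms), 3))
```

With no evidence the function returns the prior; each observed symptom then shifts the posterior up or down. A learned Bayesian network would replace the hand-written tables with parameters estimated from data.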
Problem:
What we have: Data across an informed set of variables (D)
What we need: Graphical structure (G) representing the associations between these variables
Pair-wise dependencies:
Direct associations between a given pair of nodes determined using similarity measures
Note: Associations between a pair of variables may not be direct and can be mediated through a third
variable. Conclusions based on pair-wise dependencies, while helpful, may be incomplete.
e.g. Loss of Taste (L) and Disease Severity (D) may not be associated as such (i.e. marginally
independent). However, L and D may be associated given that the subject has COVID (C)
[Figure: graphs over the nodes L, D and C]
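The pattern described above (L and D marginally independent, yet associated once COVID status is known) can be reproduced with a small simulation. This is a sketch under an assumed mechanism: L and D are drawn independently and C is made more likely when either is present. The mechanism and its probabilities are hypothetical and chosen only to produce the behaviour in the example.

```python
import random


def simulate(n_subjects=50_000, seed=0):
    """Draw (L, D, C) triples under a hypothetical mechanism where C depends on L and D."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_subjects):
        loss_of_taste = rng.random() < 0.5
        severity = rng.random() < 0.5  # drawn independently of loss_of_taste
        p_covid = 0.9 if (loss_of_taste or severity) else 0.1
        covid = rng.random() < p_covid
        rows.append((loss_of_taste, severity, covid))
    return rows


def p_d_given_l(rows, l_value):
    """Estimate P(D = yes | L = l_value) from the rows."""
    selected = [d for l, d, _ in rows if l == l_value]
    return sum(selected) / len(selected)


rows = simulate()
# Pair-wise view: compare P(D | L = yes) with P(D | L = no) over all subjects.
marginal_gap = p_d_given_l(rows, True) - p_d_given_l(rows, False)
# Conditional view: the same comparison restricted to COVID-positive subjects.
covid_rows = [r for r in rows if r[2]]
conditional_gap = p_d_given_l(covid_rows, True) - p_d_given_l(covid_rows, False)
print("marginal gap:", round(marginal_gap, 3))
print("gap given COVID +ve:", round(conditional_gap, 3))
```

The marginal gap stays near zero while the gap among COVID-positive subjects is clearly non-zero, which is why a pair-wise similarity measure alone would miss the association.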
What we need: Graphical structure
Approach: Bayesian structure learning
- Models the joint probability distribution across the given informed set of variables
- Incorporates conditional dependencies between a given set of variables in an iterative manner
Data?
o multivariate: more than one variable is measured
o Can be longitudinal or cross-sectional
longitudinal:
a continuous process is sampled as a function of time, resulting in a time series
challenging to obtain as several factors have to be controlled
cross-sectional:
replicate measurements of a continuous process are sampled in a given time window (snapshot)
relatively easier to obtain
Note: The approaches to be discussed implicitly assume that the properties of the data are preserved across
the replicate realizations.
Question: Given the cross-sectional data on Loss of Taste (Yes/No), Disease Severity (Yes/No) and Result of
COVID test (+/-), can we model the association between them?
Three popular approaches for structure learning (static):
o Constraint-based: Learn the structure using conditional independence tests
o Search and score: Learn the structure that best fits the data using a greedy search with a scoring
criterion
o Hybrid: Learn the structure using a combination of constraint-based and search-and-score
approaches
Subject   C (+/-)   D (Y/N)   L (Y/N)
1         +         Y         Y
2         +         Y         N
3         -         Y         Y
4         -         N         Y
...       ...       ...       ...
[Figure: nodes C, D and L with the edges between them unknown (?)]
Bayesian network structure learning:
o Exhaustive Enumeration: The number of possible structures grows super-exponentially with the number
of nodes n (ref. 1).
$$a_n = \sum_{k=1}^{n} (-1)^{k-1} \binom{n}{k} 2^{k(n-k)} a_{n-k}, \qquad a_0 = 1$$
Note: Exhaustive enumeration in general is not computationally feasible from a practical standpoint.
1Robinson, R. W. "Counting Labeled Acyclic Digraphs." In New Directions in Graph Theory (Ed. F. Harary). New
Nodes DAGs
1 1
2 3
3 25
4 543
5 29281
. .
. .
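The recurrence above can be evaluated directly to reproduce the table. The sketch below is a plain transcription of the formula with a_0 = 1; the function name `count_dags` is ours.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labeled DAGs on n nodes, using the recurrence with a_0 = 1."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k - 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


for n in range(1, 6):
    print(n, count_dags(n))
```

Calling `count_dags` for larger n shows how quickly the count outgrows anything that could be enumerated in practice.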
Markov Equivalence Class: probabilistically indistinguishable graphical structures.
$$p(L, D, C) = p(L \mid C)\, p(C)\, p(D \mid C)$$
$$p(L, D, C) = p(L \mid C)\, p(D)\, p(C \mid D)$$
$$p(L, D, C) = p(L)\, p(C \mid L)\, p(D \mid C)$$
Note: Even if exhaustive enumeration were possible, structures can be learned only up to the Markov
equivalence class.
[Figure: three graphs over C, D and L, one per factorization]
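That the three factorizations describe the same joint distribution can be checked numerically. The sketch below builds a joint from hypothetical tables for p(C), p(L|C) and p(D|C), derives the marginals and conditionals needed by the other two factorizations from that joint, and compares all three products entry by entry. The table values are arbitrary illustration numbers.

```python
from itertools import product

# Hypothetical conditional probability tables for the first factorization.
p_c = {0: 0.7, 1: 0.3}
p_l_given_c = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # indexed [c][l]
p_d_given_c = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.4, 1: 0.6}}  # indexed [c][d]

# Joint distribution implied by p(L|C) p(C) p(D|C), keyed by (l, d, c).
joint = {
    (l, d, c): p_l_given_c[c][l] * p_c[c] * p_d_given_c[c][d]
    for l, d, c in product((0, 1), repeat=3)
}


def marginal(names):
    """Marginalise the joint onto the variables in `names`, e.g. "LC"."""
    position = {"L": 0, "D": 1, "C": 2}
    table = {}
    for assignment, p in joint.items():
        key = tuple(assignment[position[v]] for v in names)
        table[key] = table.get(key, 0.0) + p
    return table


pL, pD, pC = marginal("L"), marginal("D"), marginal("C")
pLC, pDC = marginal("LC"), marginal("DC")

max_gap = 0.0
for (l, d, c), p in joint.items():
    # p(L|C) p(C) p(D|C)
    first = (pLC[l, c] / pC[(c,)]) * pC[(c,)] * (pDC[d, c] / pC[(c,)])
    # p(L|C) p(D) p(C|D)
    second = (pLC[l, c] / pC[(c,)]) * pD[(d,)] * (pDC[d, c] / pD[(d,)])
    # p(L) p(C|L) p(D|C)
    third = pL[(l,)] * (pLC[l, c] / pL[(l,)]) * (pDC[d, c] / pC[(c,)])
    max_gap = max(max_gap, abs(first - p), abs(second - p), abs(third - p))

print("largest difference between factorizations:", max_gap)
```

Since the three products agree up to floating-point error, no amount of observational data generated from this joint could favour one of the three structures over the others.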
Search and Score (Hill Climbing):
$$P(G \mid D) \propto P(D \mid G)\, P(G)$$
where P(D|G) is the likelihood and P(G) is the prior.
Theoretical considerations on the complexity of greedy search under certain assumptions have been
investigated1
1Scutari, M et al. [2018] Learning Bayesian Networks from Big Data with Greedy Search, Statistics and Computing
Search and Score (Hill Climbing)
Hill-climbing is a sequential algorithm. The score of the present structure G* is obtained after modifying the
previous structure (G), as in Step 4, in an iterative manner
$$\text{BIC Score} = \sum_{i=1}^{n} \log P(X_i \mid \Pi_{X_i}) - \frac{d}{2} \log n$$
Opportunities for distributing the computation in the hill climbing approach
o The potential structures interrogated in Step 4(a) can be distributed
o The BIC score of a candidate structure is the sum of the scores of its local structures, hence can be
distributed
o The greedy aspect of hill-climbing in conjunction with Markov equivalence can result in locally optimal
convergence, encouraging repeating the procedure with multiple random restarts; this in turn can be
distributed
Regularization term: (d/2) log n, where d = #parameters
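A minimal sketch of greedy hill-climbing with a decomposable BIC score follows. It is not the authors' implementation: it assumes binary variables stored as a list of dicts, starts from the empty graph, considers single edge additions, deletions and reversals, and uses the number of rows as the sample size in the penalty. The helper names, the toy chain A -> B -> C and its flip probabilities are all hypothetical.

```python
import math
import random
from collections import Counter
from itertools import permutations


def local_bic(data, child, parents):
    """BIC contribution of one binary node given its parents (one term of the sum)."""
    n_rows = len(data)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    per_config = Counter(tuple(row[p] for p in parents) for row in data)
    loglik = sum(c * math.log(c / per_config[cfg]) for (cfg, _), c in joint.items())
    d = 2 ** len(parents)  # one free parameter per parent configuration
    return loglik - 0.5 * d * math.log(n_rows)


def is_acyclic(graph):
    """Repeatedly strip parentless nodes; a cycle leaves nodes that never become roots."""
    remaining = {v: set(ps) for v, ps in graph.items()}
    while remaining:
        roots = [v for v, ps in remaining.items() if not ps]
        if not roots:
            return False
        for r in roots:
            del remaining[r]
        for ps in remaining.values():
            ps.difference_update(roots)
    return True


def neighbours(graph):
    """Yield graphs one edge addition, deletion or reversal away from `graph`."""
    for x, y in permutations(graph, 2):
        if x in graph[y]:  # edge x -> y exists: try deleting and reversing it
            deleted = {v: set(ps) for v, ps in graph.items()}
            deleted[y].discard(x)
            yield deleted
            flipped = {v: set(ps) for v, ps in deleted.items()}
            flipped[x].add(y)
            yield flipped
        elif y not in graph[x]:  # no edge in either direction: try adding x -> y
            added = {v: set(ps) for v, ps in graph.items()}
            added[y].add(x)
            yield added


def hill_climb(data, variables, max_fan_in=2):
    """Greedy search from the empty graph; returns ({node: parents}, BIC score)."""
    cache = {}

    def score(graph):
        total = 0.0
        for v, ps in graph.items():
            key = (v, tuple(sorted(ps)))
            if key not in cache:  # local scores are reused across candidates
                cache[key] = local_bic(data, v, key[1])
            total += cache[key]
        return total

    current = {v: set() for v in variables}
    current_score = score(current)
    while True:
        candidates = [g for g in neighbours(current)
                      if all(len(ps) <= max_fan_in for ps in g.values()) and is_acyclic(g)]
        best = max(candidates, key=score, default=current)
        if score(best) <= current_score + 1e-9:  # no strict improvement: local optimum
            return current, current_score
        current, current_score = best, score(best)


def sample_chain(n_rows=2000, seed=1):
    """Toy data from a hypothetical chain A -> B -> C with 10% flip noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        a = int(rng.random() < 0.5)
        b = a if rng.random() < 0.9 else 1 - a
        c = b if rng.random() < 0.9 else 1 - b
        rows.append({"A": a, "B": b, "C": c})
    return rows


data = sample_chain()
graph, bic = hill_climb(data, ["A", "B", "C"])
skeleton = {frozenset((p, v)) for v, ps in graph.items() for p in ps}
print(sorted(tuple(sorted(e)) for e in skeleton), round(bic, 1))
```

Because the score is a sum of per-node terms, the cache only ever computes a local score once per (node, parent set) pair, and the per-candidate evaluations inside the list comprehension are independent of each other, which is what makes them candidates for distribution. The edge directions returned may differ from the generating chain, since only the Markov equivalence class is identifiable.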
Implementation: Architecture
*Image from IC922 Redbook
x86:
Server: HPE ProLiant DL580 servers
CPU Type: Intel Xeon EX-series
Cores per node: 16
DRAM: 512GB
POWER9:
Server: IC922
CPU Type: DD2.3 POWER9 processor modules
Cores per node: 160 virtual cores
Access up to 32 DIMMs
Sustained bandwidth 28.8 GB
Implementation:
o Data description: HEPMASS1,2 (10.5 × 10^6 samples comprising 28 variables; Baldi et al., 2016 [1]). All
continuous normalized features were discretized into binary categorical variables by
thresholding about their mean.
o Python Implementation:
Bayesian network using Pandas, NetworkX
1Baldi P, et al. [2016] Parameterized Neural Networks for High-Energy Physics. The Eur. Phys. J. C 76(235).
2Scutari, M et al. [2018] Learning Bayesian Networks from Big Data with Greedy Search, Statistics and Computing.
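The discretization step described above (thresholding each continuous feature about its mean) can be sketched as follows. The column names and values are hypothetical, plain Python lists stand in for a Pandas DataFrame, and mapping values equal to the mean to 0 is our assumption since the slide does not specify tie handling.

```python
from statistics import fmean


def binarize_about_mean(columns):
    """Map each continuous column to 0/1: 1 where the value exceeds the column mean."""
    binary = {}
    for name, values in columns.items():
        threshold = fmean(values)
        binary[name] = [int(v > threshold) for v in values]
    return binary


# Hypothetical toy columns, not HEPMASS values.
toy = {"f0": [-1.2, 0.3, 0.9, 2.0], "f1": [5.0, 5.0, 7.0, 3.0]}
binary_toy = binarize_about_mean(toy)
print(binary_toy)
```

With binary variables every node has a single free parameter per parent configuration, which keeps the conditional probability tables small during structure search.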
Multiple Cores Architecture: Dask Distributed
Python/Dask APIs
Parallel Restart
SHA-256 hash confirms uniqueness of visited graph
[Figure: Data; spawning multiple Hill Climbing instances over graphs on nodes A, B, C, D and E]
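The two ideas on this slide, hashing each visited graph with SHA-256 and spawning independent restarts in parallel, can be sketched as below. Several pieces are stand-ins rather than the authors' code: `concurrent.futures.ThreadPoolExecutor` from the standard library replaces Dask Distributed (whose `Client.map` and `Client.gather` follow the same map pattern), `toy_score` is a hypothetical placeholder for BIC or K2, and each restart searches only over edges that respect a random node ordering so that every candidate is acyclic by construction.

```python
import hashlib
import random
from concurrent.futures import ThreadPoolExecutor

NODES = ["A", "B", "C", "D", "E"]
# Hypothetical target used only by the placeholder score below.
TARGET = {("A", "C"), ("B", "C"), ("C", "D"), ("C", "E")}


def graph_hash(edges):
    """SHA-256 digest of a canonical (sorted) edge list, used as a graph identity."""
    canonical = ";".join(f"{u}>{v}" for u, v in sorted(edges))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def toy_score(edges):
    """Placeholder objective: minus the number of edges that differ from TARGET."""
    return -len(set(edges) ^ TARGET)


def restart(seed):
    """One greedy search from a random start; returns (score, edges, graphs visited)."""
    rng = random.Random(seed)
    order = NODES[:]
    rng.shuffle(order)
    # Only edges pointing forward in the ordering are allowed, so no cycle can form.
    allowed = [(order[i], order[j])
               for i in range(len(order)) for j in range(i + 1, len(order))]
    edges = frozenset(e for e in allowed if rng.random() < 0.3)
    visited = {graph_hash(edges)}
    while True:
        # Toggle one allowed edge at a time; skip graphs whose hash was already seen.
        fresh = [g for g in (edges ^ {e} for e in allowed)
                 if graph_hash(g) not in visited]
        visited.update(graph_hash(g) for g in fresh)
        best = max(fresh, key=toy_score, default=edges)
        if toy_score(best) <= toy_score(edges):
            return toy_score(edges), sorted(edges), len(visited)
        edges = best


def run_restarts(n_restarts=8, workers=4):
    """Map independent restarts over a worker pool and keep the best result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(restart, range(n_restarts)))
    return max(results, key=lambda r: r[0])


best_score, best_edges, n_visited = run_restarts()
print(best_score, best_edges, n_visited)
```

Sorting the edge list before hashing makes the digest independent of the order in which edges were added, so two searches that reach the same graph by different routes produce the same key. Threads are used here only to show the map pattern; for CPU-bound scoring, a process-based or distributed scheduler is what would actually spread the work across cores.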
Performance of structure learning on POWER and x86:
Mean and standard deviation of the computational time across 5 runs on the HEPMASS data with
Hill-Climbing. A two-sample t-test with unequal variance was used to compare the times between the x86 and
POWER architectures (# implies significant difference).
The computational times differed significantly (p < 0.001) between the x86 and the POWER
architectures, with the POWER architecture taking considerably less time than x86. As expected, the BIC
score takes less computational time than the K2 score and these scores
[Figure: Performance of x86 and POWER 9 on HEPMASS (BIC Score). x-axis: Max Fan in (1, 2, 3); y-axis: Time (Seconds), 0 to 50000; series: x86, POWER; # marked at each max fan-in]
[Figure: Performance of x86 and POWER 9 on HEPMASS (K2 Score). x-axis: Max Fan in (1, 2, 3); y-axis: Time (Seconds), 0 to 50000; series: x86, POWER; # marked at each max fan-in]
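The two-sample t-test with unequal variance used for these comparisons is Welch's t-test. The sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom with the standard library only. The timing values are hypothetical placeholders in arbitrary units, not the measurements behind the figures.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means (sample variance / n).
    va, vb = variance(sample_a) / na, variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical placeholder timings for 5 runs per architecture (arbitrary units).
slower_runs = [10.2, 9.8, 10.5, 10.1, 9.9]
faster_runs = [5.1, 4.9, 5.3, 5.0, 5.2]
t_stat, dof = welch_t(slower_runs, faster_runs)
print(round(t_stat, 2), round(dof, 2))
```

Turning the statistic into a p-value needs the Student t distribution, which the standard library does not provide; `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the p-value directly.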
Performance of structure learning on POWER and x86 with varying Map Tasks:
Mean and standard deviation of the computational time across 5 runs on the HEPMASS data with
Hill-Climbing. A two-sample t-test with unequal variance was used to compare the times between the x86 and
POWER architectures (# implies significant difference).
There was a statistically significant difference in the computational time between the x86 and the POWER
architectures when random restarts were distributed as map task jobs. As the number of map tasks was
increased, the computation time decreased across both POWER and x86, and the separation in the average
time between x86 and POWER increased.
# corresponds to p < 0.05; * corresponds to p < 0.0001
[Figure: Performance of POWER and x86 with varying Map Tasks (BIC Score). x-axis: Map Tasks (2, 4, 8, 16, 32, 64, 128); y-axis: Time (Seconds), 0 to 50000; series: x86, POWER; significance markers in axis order: # # # # # * *]
Healthcare – current trends:
o Explosion in Digital Healthcare Data:
- Source Systems: Continued digitization from multiple sources (EHR, Claims, Registries, IoT) and multiple types
(Text, Image, Signals)
- Multiscale Profiles: Emphasis on capturing the complete description of patients.
- Common Data Models: Develop approaches for sharing observational healthcare data (OMOP/OHDSI) across
multiple organizations and research networks (e.g. HIE, PCORNet)
- High-throughput: molecular data (e.g. Next Generation Sequencing)
- FHIR: Development of Fast Healthcare Interoperability Resources (FHIR) for enhanced interoperability across systems
and devices
o Explosion in Analytics Adoption:
- Descriptive, Predictive, Prescriptive Analytics
- Shift from storage to analytics, and from consensus-based to evidence-based/data-driven approaches, to impact
outcomes/KPIs.
- Surge in the adoption of Machine Learning (ML) and Artificial Intelligence (AI) approaches.
Healthcare Applications:
Graphical Models – where do they fit in
o Healthcare data sets are inherently multivariate and noisy, which can be attributed to
several factors. Probabilistic graphical models are especially suited to
handle noisy data.
o Associations in multivariate healthcare data may be unknown.
Graphical models can discover novel associations (hypothesis
generation) in addition to validating known associations (hypothesis
testing). Deciphering these associations is critical in prescribing
targeted interventions.
o Graphical models fall under ML and AI1. They can be used for descriptive,
predictive and prescriptive analytics (e.g. Naïve Bayes Classifier). AI
aspect of graphical models: answer queries posed from the evidence
provided about a disease.
o Healthcare applications of graphical models include: Diagnostic
Reasoning, Prognostic Reasoning and Treatment Selection, Discovering
functional associations2
o Emphasis on inferring causal associations from observational
healthcare data, with potential to complement classical approaches (e.g.
RCT3), RCTs being idealizations.
o Interpretable and easily visualized for critical evaluation in healthcare
settings.
Need: Architectures and programming environment that can implement
1Russell, S., Norvig, P. [2020] Artificial Intelligence: A Modern Approach, 4th ed.
2Lucas PJF et al. [2004] Bayesian networks in biomedicine and health-care. Artif. Intell. Med. 30(3):201-14
3Berwick, D [2008] The Science of Improvement, JAMA, 1182-1184
4Mclachlan, S et al. [2020] Bayesian networks in healthcare: Distribution by medical condition. Artificial Intelligence in Medicine. 107, 101912
Summary
o Structure learning is computationally intensive, especially across large data sets and large numbers of variables
o Preliminary findings revealed marked improvement in performance using POWER architectures in
addressing computational challenges of structure learning approaches such as hill-climbing
o Need for a more detailed investigation using a battery of data sets and across distinct graphical model
algorithms
o Graphical modeling approaches in general have considerable healthcare applications. Their ability to reason
under uncertainty makes them especially well suited to healthcare analytics.
o https://onstituteacademy.herokuapp.com
Acknowledgements
Marco Scutari, Ph.D. Senior Researcher, Istituto Dalle Molle di Studi sull'Intelligenza Artificiale (IDSIA),
Switzerland
Terry Leatherland, Trish Froeschle, Thomas Prokop, IBM, USA
Ganesan Narayanswami, OpenPOWER leader in Education and Research


Graphical Structure Learning accelerated with POWER9

  • 1. Arghya Kusum Das, Ph.D. Assistant Professor, University of Wisconsin-Platteville In collaboration with Radha Nagarajan, Ph.D. Director, COSH, Marshfield Clinic Health System (Digital Health, Data Science, Bioinformatics, RWE) RWE) Graphical Structure Learning Accelerated with POWER9
  • 2. o Overview of Graphical Models o Implementation o Preliminary Findings o Healthcare Applications Overview
  • 3. Graphs/Networks: Comprised of nodes and edges nodes/vertex: represent the entities of interest edges: represent the associations/relationships between the nodes. Graphical models: Model the associations between the entities as a graph. Example: nodes: COVID subjects edges: association between the COVID subjects (e.g. contact tracing) © searchengineland.com
  • 4. Why Graphical Models? o system-level abstractions: Graphical models can reveal system-level properties and behavior not apparent in the reductionist representation. System-level abstractions is especially critical in developing developing targeted intervention. e.g. model COVID spread in a given community from contact tracing1; use the model to assist in assist in targeted community-based interventions/policies e.g. model the signaling mechanism initiated by COVID spike protein; use the model to identify identify potential target molecules for drugs to minimize disease severity/inflammation2 o in-silico models: Graphical models can be experimented in a controlled and cost-effective manner. This includes posing questions to these models (e.g. inference). e.g. given the evidence that a subject has cough, fever, sore throat and shortness of breath determine the probability that the subject is COVID +ve o causal associations: Graphical models may reveal causal association3 under certain implicit assumptions (Note: we are attempting decipher causality from observational data!) 1https://www.cdc.gov/coronavirus/2019-ncov/daily-life-coping/contact-tracing.html 2https://www.cebm.net/covid-19/dexamethasone/ 3Pearl, J [2009] Causality: Models, Reasoning and Inference.
  • 5. Problem: What we have: Data across an informed set of variables (D) What we need: Graphical structure (G) representing the associations between these variables Pair-wise dependencies: Direct associations between a given pair of nodes determined using similarity measures Note: Associations between a pair of variables may not be direct and can mediated through a third variable.Conclusions based on pair-wise dependencies while helpful may be incomplete. e.g. Loss of Taste (L) and Disease Severity (D) may not be associated as such (i.e. marginally marginally independent). However, L and D may be associated given that the subject has COVID L D C D L
  • 6. What we need: Graphical structure Approach: Bayesian structure learning - Models the joint probability distribution across the given informed set of variables - Incorporates conditional dependencies between a given set of variables in an iterative manner C D L
  • 7. Data? o multivariate: more than one variable is measured o Can be longitudinal or cross-sectional longitudinal: a continuous process is sampled as a function of time resulting in time series challenging to obtain as the several factors have to be controlled cross-sectional: replicate measurements of a continuous process is sampled in a given time window (snapshot) (snapshot) relatively easier to obtain Note: The approaches to be discussed implicitly assumes that the properties of the data is preserved across the replicate realizations.
  • 8. Question: Given the cross-sectional data on the loss of taste (Yes/No), Disease Severity (Yes/No), Result of COVID test (+/-) can we model the association between them Three popular approaches for structure learning (static): o Constraint-based Learn the structure using conditional independence tests o Search and score Learn the structure that best fits the data using a greedy search with a scoring criteria o Hybrid Learn the structure using a combination of constraint-based and search-score approaches Subject C (+/-) D (Y/N) L (Y/N) 1 + Y Y 2 + Y N 3 - Y Y 4 - N Y . . . . . . . . . . . . C D L ? ?
  • 9. Bayesian network structure learning: o Exhaustive Enumeration: Number of possible structures grows super-exponentially with the number of nodes n1. 𝑎𝑛 = 𝑘=1 𝑛 (−1)𝑘−1 𝑛 𝑘 2𝑘(𝑛−𝑘) 𝑎𝑛−𝑘 𝑎0 = 1 Note: Exhaustive enumeration in general is not computationally feasible from a practical standpoint. 1Robinson, R. W. "Counting Labeled Acyclic Digraphs." In New Directions in Graph Theory (Ed. F. Harary). New Nodes DAGs 1 1 2 3 3 25 4 543 5 29281 . . . .
  • 10. Markov Equivalence Class: probabilistically indistinguishable graphical structures. 𝑝 𝐿, 𝐷, 𝐶 = 𝑝(𝐿/𝐶). 𝑝 𝐶 . 𝑝(𝐷/𝐶) 𝑝 𝐿, 𝐷, 𝐶 = 𝑝(𝐿/𝐶). 𝑝 𝐷 . 𝑝(𝐶/𝐷) 𝑝 𝐿, 𝐷, 𝐶 = 𝑝 𝐿 . 𝑝(𝐶/𝐿). 𝑝(𝐷/𝐶) Note: Even if exhaustive enumeration were possible, structures can be learned only up to the Markov equivalence class. C D L C D L C D L
  • 11. Search and Score (Hill Climbing): 𝑃 𝐺|𝐷 α 𝑃 𝐷|𝐺 . 𝑃(𝐺) Theoretical consideration on the complexity of Greedy search under certain assumptions have been been investigated1 1Scutari, M et al. [2018] Learning Bayesian Networks from Big Data with Greedy Search, Statistics and Computing Likelihood Prior
  • 12. Search and Score (Hill Climbing) Hill-climbing is a sequential algorithm. Score of the present structure G* is generated by modifying the modifying the previous structure (G) as in Step 4 in an iterative manner BIC Score = 𝑖=1 𝑛 log[𝑃(𝑋𝑖/Π𝑋𝑖 )] − 𝑑 2 log 𝑛 Opportunities for distributing the computation in the hill climbing approach o The potential structures interrogated in Step 4(a) can be distributed o BIC score of a candidate structure is the sum of the scores of its local structures, hence can be distributed o Greedy aspect of hill-climbing in conjunction Markov equivalence can result in locally optimal convergence encouraging repeating the procedure with multiple random restarts, this in turn can be can be distributed Regularization term d = #parameters
  • 13. Implementation: Architecture *Image from IC922 Redbook x86: Server: HPE ProLiant DL580 servers CPU Type: Intel Xeon EX-series Cores per node: 16 DRAM: 512GB POWER 9: Server: IC922 CPU Type: DD2.3 POWER9 processor modules Cores per node: 160 virtual cores Access up to 32DIMM Sustained bandwidth 28.8 GB
  • 14. Implementation: o Data description: HEPMASS1,2 (10.5 x 106 samples comprising of 28 variables , Baldi et al., 20161). All continuous normalized features were discretized into binary categorical variables by thresholding thresholding about their mean. o Python Implementation: Bayesian network using Pandas, NetworkX 1Baldi P, et al. [2016] Parameterized Neural Networks for High-Energy Physics. The Eur. Phys. J. C 76(235). 2Scutari, M et al. [2018] Learning Bayesian Networks from Big Data with Greedy Search, Statistics and Computing. A C D B E A C D B E A C D B E A C D B E A C D B E A C D B E
  • 15. Multiple Cores Architecture: Dask Distributed Python/Dask APIs Parallel Restart SHA-256 Hash confirms uniqueness of visited graph A C D B E A C D B E A C D B E A C D B E A C D B E A C D B E Spawning multiple Hill Climbing instances Data
  • 16. Performance of structure learning on POWER and x86: Mean, standard distribution of the computational time across 5 runs of the HEPMASS data with Hill- Climbing. A two-sample ttest with unequal variance was used to compared the times between x86 and POWER architectures (# implies significant difference). The computational time were statistically significant (p < 0.001) between the x86 and the POWER architectures, with the POWER architectures taken considerably lesser time than x86. As expected, BIC score takes less computational time than K2 score and these scores 0 10000 20000 30000 40000 50000 1 2 3 Time (Seconds) Max Fan in x86 POWER Performance of x86 and POWER 9 on HEPMASS (BIC Score) 0 10000 20000 30000 40000 50000 1 2 3 Time (Seconds) Max Fan in x86 POWER Performance of x86 and POWER 9 on HEPMASS (K2 Score) # # # # # #
  • 17. Performance of structure learning on POWER and x86 with varying Map Tasks: Mean, standard distribution of the computational time across 5 runs of the HEPMASS data with Hill- Climbing. A two-sample ttest with unequal variance was used to compared the times between x86 and POWER architectures (# implies significant difference). There was statistically significant difference in the computational time between the x86 and the POWER architectures when random restarts were distributed as map task jobs. As the number of map tasks were increased the computations time decreased across both POWER and x86 and the separation in the average time increased between x86 and POWER. # Corresponds to p < 0.05; * Corresponds to p < 0.0001 0 5000 10000 15000 20000 25000 30000 35000 40000 45000 50000 1 2 3 4 5 6 7 Time (Seconds) Map Tasks x86 POWER # # # # # * * Performance of POWER and x86 with varying Map Tasks (BIC Score) 2 4 8 16 32 64 128
  • 18. Healthcare – current trends: o Explosion in Digital Healthcare Data: - Source Systems: Continued digitization from multiple sources (EHR, Claims, Registries, IoT) and of multiple types (Text, Image, Signals) - Multiscale Profiles: Emphasis on capturing the complete description of patients. - Common Data Models: Develop approaches for sharing observational healthcare data (OMOP/OHDSI) across multiple organizations and research networks (e.g. HIE, PCORNet) - High-throughput: molecular data (e.g. Next Generation Sequencing) - FHIR: Development of Fast Healthcare Interoperability Resources (FHIR) for enhanced interoperability across systems and devices o Explosion in Analytics Adoption: - Descriptive, Predictive, Prescriptive Analytics - Shift from storage to analytics and from consensus-based to evidence-based/data-driven approaches to impact outcomes/KPIs. - Surge in the adoption of Machine Learning (ML) and Artificial Intelligence (AI) approaches.
  • 19. Healthcare Applications: Graphical Models – where do they fit in o Healthcare data sets are inherently multivariate and noisy, owing to several factors. Probabilistic graphical models are especially suited to handling noisy data. o Associations in multivariate healthcare data may be unknown. Graphical models can discover novel associations (hypothesis generation) in addition to validating known associations (hypothesis testing). Deciphering these associations is critical in prescribing targeted interventions. o Graphical models fall under ML and AI [1]. They can be used for descriptive, predictive and prescriptive analytics (e.g. Naïve Bayes Classifier). AI aspect of graphical models: answer queries posed from the evidence provided about a disease. o Healthcare applications of graphical models include: Diagnostic Reasoning, Prognostic Reasoning and Treatment Selection, Discovering Functional Associations [2] o Emphasis on inferring causal associations from observational healthcare data, with potential to complement classical approaches (e.g. RCT [3]), RCTs being idealizations. o Interpretable and easily visualized for critical evaluation in healthcare settings. Need: Architectures and programming environment that can implement References: [1] Russell, S., Norvig, P. [2020] Artificial Intelligence: A Modern Approach, 4th ed. [2] Lucas, P.J.F. et al. [2004] Bayesian networks in biomedicine and health-care. Artif. Intell. Med. 30(3):201-14. [3] Berwick, D. [2008] The Science of Improvement. JAMA, 1182-1184. [4] McLachlan, S. et al. [2020] Bayesian networks in healthcare: Distribution by medical condition. Artificial Intelligence in Medicine 107, 101912.
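Answering a query from evidence with a Naïve Bayes classifier, as mentioned on this slide, can be sketched as follows. The prior and the symptom likelihoods are hypothetical numbers chosen for illustration only, not clinical estimates, and the function name is an assumption introduced here.

```python
def naive_bayes_posterior(prior, likelihoods, evidence):
    """P(disease | evidence), assuming symptoms are independent given disease status.

    likelihoods[s] = (P(s present | disease), P(s present | no disease))
    evidence[s]    = True if symptom s is observed present, False if absent
    """
    p_disease, p_healthy = prior, 1.0 - prior
    for symptom, present in evidence.items():
        p_given_d, p_given_h = likelihoods[symptom]
        p_disease *= p_given_d if present else 1.0 - p_given_d
        p_healthy *= p_given_h if present else 1.0 - p_given_h
    # Normalise the two joint probabilities to obtain the posterior.
    return p_disease / (p_disease + p_healthy)


# Hypothetical parameters (illustration only).
PRIOR = 0.1
LIKELIHOODS = {"fever": (0.8, 0.1), "cough": (0.7, 0.2)}

posterior = naive_bayes_posterior(PRIOR, LIKELIHOODS, {"fever": True, "cough": True})
```

Under these made-up parameters, observing both symptoms raises the probability of disease well above the prior, while observing fever without cough raises it less; the same query interface extends to richer network structures once the independence assumption is relaxed.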
  • 20. Summary o Structure learning is computationally intensive, especially across large data sets and large numbers of variables o Preliminary findings revealed marked improvement in performance using POWER architectures in addressing the computational challenges of structure learning approaches such as hill-climbing o Need for a more detailed investigation using a battery of data sets and across distinct graphical model algorithms o Graphical modeling approaches in general have considerable healthcare applications. Their ability to reason under uncertainty makes them especially well suited to healthcare analytics. o https://onstituteacademy.herokuapp.com Acknowledgements: Marco Scutari, Ph.D., Senior Researcher, Istituto Dalle Molle di Studi sull'Intelligenza Artificiale (IDSIA), Switzerland; Terry Leatherland, Trish Froeschle, Thomas Prokop, IBM, USA; Ganesan Narayanswami, OpenPOWER leader in Education and Research