Introduction to the IBM DataOps Methodology and Practice
Julie Lockner
Director, Portfolio Optimization
and Offering Management
IBM Data and AI
Steven Eliuk
VP, Deep Learning &
Governance Automation
IBM Global CDO
There is no AI without IA (information architecture)
81% do not understand the data required for AI
AI pioneers are 8X more likely to have a robust data architecture
AI
ANALYZE - Build and scale AI with trust & explainability
ORGANIZE - Create a business-ready analytics foundation
MODERNIZE - Unlock the value of data for an AI and multicloud world
COLLECT - Make data simple and accessible
Months - Quarters
“Our study shows that 95% of organizations see negative impacts from
poor data quality, resulting in wasted resources and additional costs.”
https://www.experian.co.uk/assets/data-quality/experian-global-data-management-report-jan-2019.pdf
Gartner
Hours - Days
Months - Quarters
200,000 2 Hour
85% 90%
ROI
DataOps Impact
80%
Data Prep
1 - Single iteration, Months-Quarters: One outcome, costly if wrong
3 - Multiple iterations, Days-Weeks: Multiple outcomes, more chances for success
IBM Watson / © 2020 IBM Corporation
DataOps requires Automation and Multicloud Architecture
Business-ready data
Automated master data management
On-Prem
No DataOps
• Know: Spreadsheets
• Trust: Emails
• Use: Hand coding
Enterprise Ops & Services
VP Finance, Controller
Global Chief Data Office
CIO
CAO
Enterprise Data & AI Platform
Enterprise Data Governance
Adoption & Value Creation
Client & Product Master Data
Deep Learning
Production Platform & Solutions
Enterprise Governance Workflow
Modernization & Transformation
Platform Adoption
Engineering
Delivery automation leveraging Enterprise Data & AI Platform
Business Controls, Support & Operations
Data Acquisition (M&A, 3rd Party, Public)
AI Accelerator
It can take DAYS for SMEs to review/approve a business term
Users can easily find, understand and trust the data they need to drive business insights WITH SPEED
• A complex series of organic Deep Learning models was developed for CEDP metadata classifications
• Backed by micro-services: Can be installed anywhere (cloud, container)
Lack of data for model training impacts the performance
Local restrictions related to processing of the business information within the limits of certain jurisdictions
95% reduction in cycle time: targeted at full automation in 18 months with regulatory & governance checks
Up to ~$27 million in productivity savings
Dramatically enhanced Data Quality
Unified.
Classifying terabytes of data to make it easily discoverable while providing
the data stewardship, lineage, and impact analysis to assure it is trustworthy
Small Tag Set as a Product
Up to 96% accuracy on holdout data
Up to 70% accuracy on data that was once inaccessible
Business terms can differ across the different groups in an organization. To address this: AMG's classifications in the current release use an "umbrella" set of 25 terms defined to cover the varying cases we see at the GCDO
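As a rough illustration of the umbrella-term idea and of how accuracy on holdout data is typically measured, here is a minimal Python sketch. It is not the AMG implementation: the tag names, the synonym sets, and the functions `to_umbrella` and `holdout_accuracy` are all hypothetical, and a simple lookup stands in for the deep learning classifiers mentioned in the slides.

```python
from typing import Optional, Sequence

# Hypothetical umbrella tags with group-specific synonyms (illustrative only;
# the actual set of 25 terms is not listed in the slides).
UMBRELLA_TERMS = {
    "customer_name": {"client name", "account holder", "customer"},
    "email_address": {"e-mail", "contact email", "email"},
}


def to_umbrella(term: str) -> Optional[str]:
    """Map a group-specific business term to its umbrella tag, or None."""
    key = term.strip().lower()
    for tag, synonyms in UMBRELLA_TERMS.items():
        if key == tag or key in synonyms:
            return tag
    return None


def holdout_accuracy(predicted: Sequence[Optional[str]],
                     expected: Sequence[Optional[str]]) -> float:
    """Fraction of holdout examples whose predicted tag equals the label."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


if __name__ == "__main__":
    # Column names as different groups might write them (assumed examples).
    columns = ["Client Name", "E-mail", "Account Holder", "Region"]
    labels = ["customer_name", "email_address", "customer_name", "region"]
    predictions = [to_umbrella(c) for c in columns]
    print(predictions)
    print(holdout_accuracy(predictions, labels))
```

A small, shared tag set keeps the mapping tractable across groups; anything that falls outside it (such as "Region" here) surfaces as an unclassified term for a steward to review.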
Q1 2018 Q4 2019