SlideShare a Scribd company logo
Apache Atlas & Open Metadata
Dataworks Sydney 2017
Nigel Jones,
Software Architect
IBM
Ferd Scheepers
Chief Information Architect
ING
2
Open Metadata and Governance will allow…
… metadata to be captured when the data is created, moved with the data and
be augmented and processed by any of the vendor tools.
Open Metadata and Governance consists of:
1. Standardized, extensible set of metadata types
2. Metadata exchange APIs and notifications
3. Frameworks for automated governance
Open Metadata and Governance will allow you to have:
1. An enterprise data catalogue that lists all of your data, where it is located, its origin (lineage),
owner, structure, meaning, classification and quality
2. New data tools (from any vendor) connect to your data catalogue out of the box
3. Metadata being added automatically to the catalogue as new data is created and analysed
4. Subject matter experts collaborating around the data
5. Automated governance processes protect and manage your data
3
What is Open Metadata and Governance?
4
Positioning of Apache Atlas for Open Metadata
Open and
Unified Metadata
Metadata
repository
Apache Atlas
Metadata
repository
IBM
Metadata
repository
SAS
Open Metadata Repository Service
OMRS
Open Metadata Access Service
OMAS
Components defined
and being developed
by Open Metadata &
Governance project
Metadata
highway
• Apache Atlas provides an open community for developing the reference implementation
for open metadata and governance. In essence Apache Atlas delivers 2 main
capabilities:
• it plays a role of a metadata repository (Graph Database) for a metadata end-user tool
• and, it plays the important role of delivering the federated/unified metadata layer
across the entire landscape of an enterprise
• The software development governance from the Apache Software Foundation (ASF)
creates confidence that the technology will be maintained and enhanced as appropriate
in an equitable manner.
Role of Apache Atlas
5
… because Apache is mostly focused on development and we are missing a governance
body for managing the adoption of and compliance to the Open Metadata and Governance
standards. We envision the following roles for ODPI:
1. Be an advocate of the Open Metadata and Governance standards, make them visible
and their value understood.
2. Facilitate discussions around the Open Metadata and Governance standards evolution,
maintenance and development.
3. Test and sign-off compliance of vendor offerings to the Open Metadata and Governance
standards.
6
Doing all of this under Apache Atlas flag is not enough…
1. Hands-on Community members:
• ING
• IBM
• HortonWorks
2. Companies we have had conversations with:
• CIBC
• SAS
• Microsoft
• Oracle
• Informatica
• Waterline
• RBC
• DBS
7
Who is in ?
1. Ambition level:
• End of September 2017: Open Metadata working demo.
• Mid-December November 2017: first version of user access.
• Google for Data
2. Next steps:
• End of Q2 2018: production ready version of Virtual Data
Connector.
8
Timeline and next steps
About Me
https://www.linkedin.com/in/nigelljones
https://www.twitter.com/planetf1
jonesn@uk.ibm.com@
Objective
Why
How
Excite &
Engage
Apache Atlas
Open
Metadata
Atlas has graduated!
DOB: 2015-05-05R: 0.8.1
Atlas Architecture
Storage Repository
Graph
 Type System
 REST API
 Models
 UI & Apps
Hooks &
Bridges
https://cwiki.apache.org/confluence/display/ATLAS/Open+Metadata+and+Governance
A reminder of our problem.. And solution
Open and
Unified Metadata
Extend beyond Hadoop
++
Common Core Data model
Data Assets Governance Lineage
Glossary Collaboration
Models &
Reference
Data
Base Types,
Systems &
Infrastructure
Metadata
Discovery
https://cwiki.apache.org/confluence/display/ATLAS/Building+out+the+Open+Metadata+Typesystem
Open Metadata and Governance with Apache Atlas
Open APIs - OMRS
Metadata Highway
Adapter
Plugin
Open Connector
Framework
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258803
Open APIs - OMAS
OMRS
Governance
Engine
OMAS
Glossary
OMAS
Asset OMAS
Information
View OMAS
++......
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258799
OMAS – detail
Project List
Metadata Service
Data/Asset
Community Metadata
Service
Landscape Definition
Metadata Service
Asset Catalog
Metadata Service
Classification and Mapping
Metadata Service
Information View
Metadata Service
Connector Directory
Metadata Service
Governance Definitions
Metadata Service
Information Process
Metadata Service
Glossary and Taxonomy
Metadata Service
Asset
Metadata Service
Discovery
Metadata Service
Governance Action
Metadata Service
Roles and Access
Metadata Service
Models and Schema
Metadata Service
Connector
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258799
Business
metadata
Structural
metadata for
a data store
New glossary function for semantic processing
EMPNAME EMPNO JOBCODE SALARY
EMPLOYEE
RECORD
Employee
Work Location
Annual Salary
Job Title
Employee Id
Employee Name
Hourly Pay Rate
Manager Compensation Plan
HAS-A
HAS-A
HAS-A
HAS-A
HAS-A
HAS-A
IS-A IS-A
Sensitive
IS-A
Data
00 3809890 6 7 Lemmie Stage 818928 3082 4 New York 4 27 DataStage Expert 1 45324 300 27 Code St Harlem NY 1 3
https://cwiki.apache.org/confluence/display/ATLAS/Area+3+-+Glossary
Replacing v1
Taxonomy (tech
preview)
Categories
Terms
hierarchies
Rich
Relationships
Classifications
Glossary
https://cwiki.apache.org/confluence/display/ATLAS/Area+3+-+Glossary
Open Discovery Framework
Open
Framework
Plugins
characterize data
& relationships
Updates
metadata
with results
Initial implementation
in master
https://cwiki.apache.org/confluence/display/ATLAS/Automated+metadata+discovery
Governance Action Framework
metadata drives enforcement
Classification (tag) based – scalable, glossary
driven
Access, Masking, Filtering
Supports Apache Ranger but open APIs for others
Audit,Rights - Exception management, Rights,
Privacy (to look at in future)
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258801
Open ecosystem
https://cwiki.apache.org/confluence/display/ATLAS/Open+Metadata+and+Governance
Summary
Open
Metadata
Enterprise
Catalog
Discovery
Multi
Vendor
Open,
Layered
APIs
Metadata
store
integration
Open
Source &
Governance
ubiquitous
Standard
Models
How can I get involved?
Discuss: Mailing List
Document, Explain: Wiki
Report, Design: Jira
Face to face
Code
Vendors!
https://cwiki.apache.org/confluence/display/ATLAS/Getting+Involved
Governance & Security BOF
Thursday 18:00
C4.7
Open Metadata and Governance with Apache Atlas
Backup
VDC End to End
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=69407333
Ad

More Related Content

What's hot (20)

Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
Databricks
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
James Serra
 
Microsoft Fabric Intro D Koutsanastasis
Microsoft Fabric Intro D KoutsanastasisMicrosoft Fabric Intro D Koutsanastasis
Microsoft Fabric Intro D Koutsanastasis
Uni Systems S.M.S.A.
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
Adam Doyle
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and Debezium
Tathastu.ai
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka Streams
Guido Schmutz
 
Big Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingBig Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb Sharding
Araf Karsh Hamid
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform System
James Serra
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
Flink Forward
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
Databricks
 
Webinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafkaWebinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafka
Jeffrey T. Pollock
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
C4Media
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0
Databricks
 
Introduction to Azure Databricks
Introduction to Azure DatabricksIntroduction to Azure Databricks
Introduction to Azure Databricks
James Serra
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
James Serra
 
What's New in Apache Hive
What's New in Apache HiveWhat's New in Apache Hive
What's New in Apache Hive
DataWorks Summit
 
Data engineering
Data engineeringData engineering
Data engineering
Suman Debnath
 
Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data Mesh
LibbySchulze
 
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
HostedbyConfluent
 
Making Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta LakeMaking Apache Spark Better with Delta Lake
Making Apache Spark Better with Delta Lake
Databricks
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
James Serra
 
Microsoft Fabric Intro D Koutsanastasis
Microsoft Fabric Intro D KoutsanastasisMicrosoft Fabric Intro D Koutsanastasis
Microsoft Fabric Intro D Koutsanastasis
Uni Systems S.M.S.A.
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
Adam Doyle
 
Building robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and DebeziumBuilding robust CDC pipeline with Apache Hudi and Debezium
Building robust CDC pipeline with Apache Hudi and Debezium
Tathastu.ai
 
Spark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka StreamsSpark (Structured) Streaming vs. Kafka Streams
Spark (Structured) Streaming vs. Kafka Streams
Guido Schmutz
 
Big Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb ShardingBig Data Redis Mongodb Dynamodb Sharding
Big Data Redis Mongodb Dynamodb Sharding
Araf Karsh Hamid
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform SystemModern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform System
James Serra
 
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and HudiHow to build a streaming Lakehouse with Flink, Kafka, and Hudi
How to build a streaming Lakehouse with Flink, Kafka, and Hudi
Flink Forward
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
Databricks
 
Webinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafkaWebinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafka
Jeffrey T. Pollock
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
C4Media
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0
Databricks
 
Introduction to Azure Databricks
Introduction to Azure DatabricksIntroduction to Azure Databricks
Introduction to Azure Databricks
James Serra
 
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solutionDifferentiate Big Data vs Data Warehouse use cases for a cloud solution
Differentiate Big Data vs Data Warehouse use cases for a cloud solution
James Serra
 
Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data Mesh
LibbySchulze
 
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
Standing on the Shoulders of Open-Source Giants: The Serverless Realtime Lake...
HostedbyConfluent
 

Viewers also liked (6)

Bulletproof Jobs: Patterns For Large-Scale Spark Processing
Bulletproof Jobs: Patterns For Large-Scale Spark ProcessingBulletproof Jobs: Patterns For Large-Scale Spark Processing
Bulletproof Jobs: Patterns For Large-Scale Spark Processing
Spark Summit
 
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams APIuser Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
confluent
 
Introducing Exactly Once Semantics To Apache Kafka
Introducing Exactly Once Semantics To Apache KafkaIntroducing Exactly Once Semantics To Apache Kafka
Introducing Exactly Once Semantics To Apache Kafka
Apurva Mehta
 
Avro Tutorial - Records with Schema for Kafka and Hadoop
Avro Tutorial - Records with Schema for Kafka and HadoopAvro Tutorial - Records with Schema for Kafka and Hadoop
Avro Tutorial - Records with Schema for Kafka and Hadoop
Jean-Paul Azar
 
Intro to Pinot (2016-01-04)
Intro to Pinot (2016-01-04)Intro to Pinot (2016-01-04)
Intro to Pinot (2016-01-04)
Jean-François Im
 
Pinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastorePinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastore
Kishore Gopalakrishna
 
Bulletproof Jobs: Patterns For Large-Scale Spark Processing
Bulletproof Jobs: Patterns For Large-Scale Spark ProcessingBulletproof Jobs: Patterns For Large-Scale Spark Processing
Bulletproof Jobs: Patterns For Large-Scale Spark Processing
Spark Summit
 
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams APIuser Behavior Analysis with Session Windows and Apache Kafka's Streams API
user Behavior Analysis with Session Windows and Apache Kafka's Streams API
confluent
 
Introducing Exactly Once Semantics To Apache Kafka
Introducing Exactly Once Semantics To Apache KafkaIntroducing Exactly Once Semantics To Apache Kafka
Introducing Exactly Once Semantics To Apache Kafka
Apurva Mehta
 
Avro Tutorial - Records with Schema for Kafka and Hadoop
Avro Tutorial - Records with Schema for Kafka and HadoopAvro Tutorial - Records with Schema for Kafka and Hadoop
Avro Tutorial - Records with Schema for Kafka and Hadoop
Jean-Paul Azar
 
Pinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastorePinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastore
Kishore Gopalakrishna
 
Ad

Similar to Open Metadata and Governance with Apache Atlas (20)

Apache atlas sydney 2017-v4
Apache atlas   sydney 2017-v4Apache atlas   sydney 2017-v4
Apache atlas sydney 2017-v4
Nigel Jones
 
The rise of big data governance: insight on this emerging trend from active o...
The rise of big data governance: insight on this emerging trend from active o...The rise of big data governance: insight on this emerging trend from active o...
The rise of big data governance: insight on this emerging trend from active o...
DataWorks Summit
 
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
DataWorks Summit
 
ING- CoreIntel- Collect and Process Network Logs Across Data Centers in Real ...
ING- CoreIntel- Collect and Process Network Logs Across Data Centers in Real ...ING- CoreIntel- Collect and Process Network Logs Across Data Centers in Real ...
ING- CoreIntel- Collect and Process Network Logs Across Data Centers in Real ...
DataWorks Summit/Hadoop Summit
 
Unleashing the power of apache atlas with apache - virtual dataconnector
Unleashing the power of apache atlas with apache  - virtual dataconnectorUnleashing the power of apache atlas with apache  - virtual dataconnector
Unleashing the power of apache atlas with apache - virtual dataconnector
Nigel Jones
 
Building Enterprise-Ready Knowledge Graph Applications in the Cloud
Building Enterprise-Ready Knowledge Graph Applications in the CloudBuilding Enterprise-Ready Knowledge Graph Applications in the Cloud
Building Enterprise-Ready Knowledge Graph Applications in the Cloud
Peter Haase
 
Analysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summit
Analysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summitAnalysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summit
Analysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summit
Slim Baltagi
 
Analysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data AnalyticsAnalysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data Analytics
DataWorks Summit/Hadoop Summit
 
Analysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data AnalyticsAnalysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data Analytics
DataWorks Summit/Hadoop Summit
 
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo
 
Myth Busters II: BI Tools and Data Virtualization are Interchangeable
Myth Busters II: BI Tools and Data Virtualization are InterchangeableMyth Busters II: BI Tools and Data Virtualization are Interchangeable
Myth Busters II: BI Tools and Data Virtualization are Interchangeable
Denodo
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
Provectus
 
Creating Interactive Olap Applications With My Sql Enterprise And Mondrian Pr...
Creating Interactive Olap Applications With My Sql Enterprise And Mondrian Pr...Creating Interactive Olap Applications With My Sql Enterprise And Mondrian Pr...
Creating Interactive Olap Applications With My Sql Enterprise And Mondrian Pr...
Indus Khaitan
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Rittman Analytics
 
Analyti x mapping manager product overview presentation
Analyti x mapping manager product overview presentationAnalyti x mapping manager product overview presentation
Analyti x mapping manager product overview presentation
AnalytixDataServices
 
Master Meta Data
Master Meta DataMaster Meta Data
Master Meta Data
Digikrit
 
Technical Challenges in Open Metadata
Technical Challenges in Open MetadataTechnical Challenges in Open Metadata
Technical Challenges in Open Metadata
All Things Open
 
LinkedInSaxoBankDataWorkbench
LinkedInSaxoBankDataWorkbenchLinkedInSaxoBankDataWorkbench
LinkedInSaxoBankDataWorkbench
Sheetal Pratik
 
Archonnex at ICPSR
Archonnex at ICPSRArchonnex at ICPSR
Archonnex at ICPSR
Harshakumar Ummerpillai
 
Become an data driven organization through unified metadata using ODPi Egeria
Become an data driven organization through unified metadata using ODPi EgeriaBecome an data driven organization through unified metadata using ODPi Egeria
Become an data driven organization through unified metadata using ODPi Egeria
Data Con LA
 
Apache atlas sydney 2017-v4
Apache atlas   sydney 2017-v4Apache atlas   sydney 2017-v4
Apache atlas sydney 2017-v4
Nigel Jones
 
The rise of big data governance: insight on this emerging trend from active o...
The rise of big data governance: insight on this emerging trend from active o...The rise of big data governance: insight on this emerging trend from active o...
The rise of big data governance: insight on this emerging trend from active o...
DataWorks Summit
 
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
The Rise of Big Data Governance: Insight on this Emerging Trend from Active O...
DataWorks Summit
 
ING- CoreIntel- Collect and Process Network Logs Across Data Centers in Real ...
ING- CoreIntel- Collect and Process Network Logs Across Data Centers in Real ...ING- CoreIntel- Collect and Process Network Logs Across Data Centers in Real ...
ING- CoreIntel- Collect and Process Network Logs Across Data Centers in Real ...
DataWorks Summit/Hadoop Summit
 
Unleashing the power of apache atlas with apache - virtual dataconnector
Unleashing the power of apache atlas with apache  - virtual dataconnectorUnleashing the power of apache atlas with apache  - virtual dataconnector
Unleashing the power of apache atlas with apache - virtual dataconnector
Nigel Jones
 
Building Enterprise-Ready Knowledge Graph Applications in the Cloud
Building Enterprise-Ready Knowledge Graph Applications in the CloudBuilding Enterprise-Ready Knowledge Graph Applications in the Cloud
Building Enterprise-Ready Knowledge Graph Applications in the Cloud
Peter Haase
 
Analysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summit
Analysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summitAnalysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summit
Analysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summit
Slim Baltagi
 
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo DataFest 2016: Comparing and Contrasting Data Virtualization With Data...
Denodo
 
Myth Busters II: BI Tools and Data Virtualization are Interchangeable
Myth Busters II: BI Tools and Data Virtualization are InterchangeableMyth Busters II: BI Tools and Data Virtualization are Interchangeable
Myth Busters II: BI Tools and Data Virtualization are Interchangeable
Denodo
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
Provectus
 
Creating Interactive Olap Applications With My Sql Enterprise And Mondrian Pr...
Creating Interactive Olap Applications With My Sql Enterprise And Mondrian Pr...Creating Interactive Olap Applications With My Sql Enterprise And Mondrian Pr...
Creating Interactive Olap Applications With My Sql Enterprise And Mondrian Pr...
Indus Khaitan
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Rittman Analytics
 
Analyti x mapping manager product overview presentation
Analyti x mapping manager product overview presentationAnalyti x mapping manager product overview presentation
Analyti x mapping manager product overview presentation
AnalytixDataServices
 
Master Meta Data
Master Meta DataMaster Meta Data
Master Meta Data
Digikrit
 
Technical Challenges in Open Metadata
Technical Challenges in Open MetadataTechnical Challenges in Open Metadata
Technical Challenges in Open Metadata
All Things Open
 
LinkedInSaxoBankDataWorkbench
LinkedInSaxoBankDataWorkbenchLinkedInSaxoBankDataWorkbench
LinkedInSaxoBankDataWorkbench
Sheetal Pratik
 
Become an data driven organization through unified metadata using ODPi Egeria
Become an data driven organization through unified metadata using ODPi EgeriaBecome an data driven organization through unified metadata using ODPi Egeria
Become an data driven organization through unified metadata using ODPi Egeria
Data Con LA
 
Ad

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
DataWorks Summit
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
DataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
DataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
DataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
DataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
DataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
DataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
DataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
DataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
DataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
DataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
DataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
DataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
DataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
DataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
DataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
DataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
DataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
DataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
DataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
DataWorks Summit
 

Recently uploaded (20)

2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
Image processinglab image processing image processing
Image processinglab image processing  image processingImage processinglab image processing  image processing
Image processinglab image processing image processing
RaghadHany
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
"PHP and MySQL CRUD Operations for Student Management System"
"PHP and MySQL CRUD Operations for Student Management System""PHP and MySQL CRUD Operations for Student Management System"
"PHP and MySQL CRUD Operations for Student Management System"
Jainul Musani
 
Automation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From AnywhereAutomation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From Anywhere
Lynda Kane
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Automation Dreamin' 2022: Sharing Some Gratitude with Your Users
Automation Dreamin' 2022: Sharing Some Gratitude with Your UsersAutomation Dreamin' 2022: Sharing Some Gratitude with Your Users
Automation Dreamin' 2022: Sharing Some Gratitude with Your Users
Lynda Kane
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Buckeye Dreamin' 2023: De-fogging Debug Logs
Buckeye Dreamin' 2023: De-fogging Debug LogsBuckeye Dreamin' 2023: De-fogging Debug Logs
Buckeye Dreamin' 2023: De-fogging Debug Logs
Lynda Kane
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
Salesforce AI Associate 2 of 2 Certification.docx
Salesforce AI Associate 2 of 2 Certification.docxSalesforce AI Associate 2 of 2 Certification.docx
Salesforce AI Associate 2 of 2 Certification.docx
José Enrique López Rivera
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Datastucture-Unit 4-Linked List Presentation.pptx
Datastucture-Unit 4-Linked List Presentation.pptxDatastucture-Unit 4-Linked List Presentation.pptx
Datastucture-Unit 4-Linked List Presentation.pptx
kaleeswaric3
 
2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx2025-05-Q4-2024-Investor-Presentation.pptx
2025-05-Q4-2024-Investor-Presentation.pptx
Samuele Fogagnolo
 
Image processinglab image processing image processing
Image processinglab image processing  image processingImage processinglab image processing  image processing
Image processinglab image processing image processing
RaghadHany
 
Big Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur MorganBig Data Analytics Quick Research Guide by Arthur Morgan
Big Data Analytics Quick Research Guide by Arthur Morgan
Arthur Morgan
 
"PHP and MySQL CRUD Operations for Student Management System"
"PHP and MySQL CRUD Operations for Student Management System""PHP and MySQL CRUD Operations for Student Management System"
"PHP and MySQL CRUD Operations for Student Management System"
Jainul Musani
 
Automation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From AnywhereAutomation Dreamin': Capture User Feedback From Anywhere
Automation Dreamin': Capture User Feedback From Anywhere
Lynda Kane
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Automation Dreamin' 2022: Sharing Some Gratitude with Your Users
Automation Dreamin' 2022: Sharing Some Gratitude with Your UsersAutomation Dreamin' 2022: Sharing Some Gratitude with Your Users
Automation Dreamin' 2022: Sharing Some Gratitude with Your Users
Lynda Kane
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
tecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdftecnologias de las primeras civilizaciones.pdf
tecnologias de las primeras civilizaciones.pdf
fjgm517
 
Buckeye Dreamin' 2023: De-fogging Debug Logs
Buckeye Dreamin' 2023: De-fogging Debug LogsBuckeye Dreamin' 2023: De-fogging Debug Logs
Buckeye Dreamin' 2023: De-fogging Debug Logs
Lynda Kane
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
Salesforce AI Associate 2 of 2 Certification.docx
Salesforce AI Associate 2 of 2 Certification.docxSalesforce AI Associate 2 of 2 Certification.docx
Salesforce AI Associate 2 of 2 Certification.docx
José Enrique López Rivera
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...What is Model Context Protocol(MCP) - The new technology for communication bw...
What is Model Context Protocol(MCP) - The new technology for communication bw...
Vishnu Singh Chundawat
 
Semantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AISemantic Cultivators : The Critical Future Role to Enable AI
Semantic Cultivators : The Critical Future Role to Enable AI
artmondano
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
AI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global TrendsAI and Data Privacy in 2025: Global Trends
AI and Data Privacy in 2025: Global Trends
InData Labs
 
Datastucture-Unit 4-Linked List Presentation.pptx
Datastucture-Unit 4-Linked List Presentation.pptxDatastucture-Unit 4-Linked List Presentation.pptx
Datastucture-Unit 4-Linked List Presentation.pptx
kaleeswaric3
 

Open Metadata and Governance with Apache Atlas

  • 1. Apache Atlas & Open Metadata Dataworks Sydney 2017 Nigel Jones, Software Architect IBM Ferd Scheepers Chief Information Architect ING
  • 2. 2 Open Metadata and Governance will allow… … metadata to be captured when the data is created, moved with the data and be augmented and processed by any of the vendor tools.
  • 3. Open Metadata and Governance consists of: 1. Standardized, extensible set of metadata types 2. Metadata exchange APIs and notifications 3. Frameworks for automated governance Open Metadata and Governance will allow you to have: 1. An enterprise data catalogue that lists all of your data, where it is located, its origin (lineage), owner, structure, meaning, classification and quality 2. New data tools (from any vendor) connect to your data catalogue out of the box 3. Metadata being added automatically to the catalogue as new data is created and analysed 4. Subject matter experts collaborating around the data 5. Automated governance processes protect and manage your data 3 What is Open Metadata and Governance?
  • 4. 4 Positioning of Apache Atlas for Open Metadata Open and Unified Metadata Metadata repository Apache Atlas Metadata repository IBM Metadata repository SAS Open Metadata Repository Service OMRS Open Metadata Access Service OMAS Components defined and being developed by Open Metadata & Governance project Metadata highway
  • 5. • Apache Atlas provides an open community for developing the reference implementation for open metadata and governance. In essence Apache Atlas delivers 2 main capabilities: • it plays a role of a metadata repository (Graph Database) for a metadata end-user tool • and, it plays the important role of delivering the federated/unified metadata layer across the entire landscape of an enterprise • The software development governance from the Apache Software Foundation (ASF) creates confidence that the technology will be maintained and enhanced as appropriate in an equitable manner. Role of Apache Atlas 5
  • 6. … because Apache is mostly focused on development and we are missing a governance body for managing the adoption of and compliance to the Open Metadata and Governance standards. We envision the following roles for ODPI: 1. Be an advocate of the Open Metadata and Governance standards, make them visible and their value understood. 2. Facilitate discussions around the Open Metadata and Governance standards evolution, maintenance and development. 3. Test and sign-off compliance of vendor offerings to the Open Metadata and Governance standards. 6 Doing all of this under Apache Atlas flag is not enough…
  • 7. 1. Hands-on Community members: • ING • IBM • HortonWorks 2. Companies we have had conversations with: • CIBC • SAS • Microsoft • Oracle • Informatica • Waterline • RBC • DBS 7 Who is in ?
  • 8. 1. Ambition level: • End of September 2017: Open Metadata working demo. • Mid-December November 2017: first version of user access. • Google for Data 2. Next steps: • End of Q2 2018: production ready version of Virtual Data Connector. 8 Timeline and next steps
  • 11. Atlas has graduated! DOB: 2015-05-05R: 0.8.1
  • 12. Atlas Architecture Storage Repository Graph  Type System  REST API  Models  UI & Apps Hooks & Bridges https://cwiki.apache.org/confluence/display/ATLAS/Open+Metadata+and+Governance
  • 13. A reminder of our problem.. And solution Open and Unified Metadata
  • 15. Common Core Data model Data Assets Governance Lineage Glossary Collaboration Models & Reference Data Base Types, Systems & Infrastructure Metadata Discovery https://cwiki.apache.org/confluence/display/ATLAS/Building+out+the+Open+Metadata+Typesystem
  • 17. Open APIs - OMRS Metadata Highway Adapter Plugin Open Connector Framework https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258803
  • 18. Open APIs - OMAS OMRS Governance Engine OMAS Glossary OMAS Asset OMAS Information View OMAS ++...... https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258799
  • 19. OMAS – detail Project List Metadata Service Data/Asset Community Metadata Service Landscape Definition Metadata Service Asset Catalog Metadata Service Classification and Mapping Metadata Service Information View Metadata Service Connector Directory Metadata Service Governance Definitions Metadata Service Information Process Metadata Service Glossary and Taxonomy Metadata Service Asset Metadata Service Discovery Metadata Service Governance Action Metadata Service Roles and Access Metadata Service Models and Schema Metadata Service Connector https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258799
  • 20. Business metadata Structural metadata for a data store New glossary function for semantic processing EMPNAME EMPNO JOBCODE SALARY EMPLOYEE RECORD Employee Work Location Annual Salary Job Title Employee Id Employee Name Hourly Pay Rate Manager Compensation Plan HAS-A HAS-A HAS-A HAS-A HAS-A HAS-A IS-A IS-A Sensitive IS-A Data 00 3809890 6 7 Lemmie Stage 818928 3082 4 New York 4 27 DataStage Expert 1 45324 300 27 Code St Harlem NY 1 3 https://cwiki.apache.org/confluence/display/ATLAS/Area+3+-+Glossary
  • 22. Open Discovery Framework Open Framework Plugins characterize data & relationships Updates metadata with results Initial implementation in master https://cwiki.apache.org/confluence/display/ATLAS/Automated+metadata+discovery
  • 23. Governance Action Framework metadata drives enforcement Classification (tag) based – scalable, glossary driven Access, Masking, Filtering Supports Apache Ranger but open APIs for others Audit,Rights - Exception management, Rights, Privacy (to look at in future) https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70258801
  • 26. How can I get involved? Discuss: Mailing List Document, Explain: Wiki Report, Design: Jira Face to face Code Vendors! https://cwiki.apache.org/confluence/display/ATLAS/Getting+Involved
  • 27. Governance & Security BOF Thursday 18:00 C4.7
  • 30. VDC End to End https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=69407333