Data Vault Modeling – An Insight
Nishant Gupta
Bangalore, September 27th
#CWIN17
Session’s Title | Date
Copyright © 2017 Capgemini and Sogeti. All rights reserved. 2
Table of Contents
• Introduction
• Data Vault Modeling Components
• Building a Data Vault Model
• DVM – An Answer
• DVM 2.0 – An Agile Way
Introduction
What is Data Vault Modeling?
The Data Vault is a detail oriented, historical tracking and uniquely linked set of
normalized tables that support one or more functional areas of business.
It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF)
and star schema. The design is flexible, scalable, consistent and adaptable to the needs
of the enterprise.
It is a data model that is architected specifically to meet the needs of enterprise data
warehouses.
--Dan Linstedt
The Data Vault is functionally based, not subject-oriented in the sense defined
by Bill Inmon.
Why Data Vault Modeling?
Challenges with DW Data Modeling Architectures
• 3NF has complex PKs when cascading snapshot dates (time-driven PKs)
• Real-time loading is a challenge
• Drill-down analysis and queries are complex and tedious
• Top-down approach results in unavoidable flexibility and scalability issues
• Star Schema: fact tables are difficult to implement or re-engineer for granularity changes
• Data redundancy and helper tables isolate subject-related information
• Inconsistent query linkages due to incompatible grains
• Synchronization issues during real-time data load
• Limited enterprise views and data mining capabilities
Data Vault Modeling Place in Evolution
• 1960s - Codd, Date et al.: Normal Forms
• 1970s - Peter Chen created E-R diagramming
• 1980s - Normal Forms adapted to DWs, courtesy of Bill Inmon
• 1985+ - Star Schema for OLAP by Ralph Kimball
• 1990s - Data Vault concept developed by Dan Linstedt
• 2000+ - Data Vault concept published by Dan Linstedt
Data Vault Modeling Evolution
Where does a Data Vault Fit?
Data Vault Modeling Components
Data Vault Modeling - Components
Data Vault Modeling has 3 key components, summarized below with their attributes:
• Hub Entities
  - Candidate Keys + Load Time + Source
• Link Entities
  - FKs from Hubs + Load Time + Source
• Satellite Entities
  - Descriptive Data + Load Time + Source + End Time
Compared with a dimensional model:
• Dimension = Hub + Satellite
• Fact = Satellite + Link [+ Hub]
DVM Components - Hubs
• Hub Entities
  - Hub Entities, or Hubs, are a single table carrying at a minimum a unique list of business keys
  - For example: invoice number, employee number, customer number, part number and VIN
  - Hubs allow multiple source systems to be integrated and should therefore be source-system agnostic
  - A Hub contains:
    • A Hub PK
    • The Business Key column(s)
    • The Load Date (LOAD_DTS)
    • The Source for the record (REC_SRC)
  - New in DV 2.0: the Hub PK is a calculated field (such as an MD5 hash) used to link with Hadoop/NoSQL
Hub = Business Key
This is the most important aspect of Data Vault modeling.
The business key has to be identified correctly to build an integrated enterprise data warehouse for an organization.
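A minimal sketch of how a DV 2.0 Hub row might be assembled, assuming the Hub PK is an MD5 hash over a trimmed, upper-cased business key. The column names HUB_CUSTOMER_HK and CUSTOMER_NUMBER, the "||" delimiter and the normalization rule are illustrative assumptions, not taken from the slides; LOAD_DTS and REC_SRC follow the slide.

```python
import hashlib
from datetime import datetime, timezone


def hub_hash_key(*business_key_parts: str) -> str:
    """Return an MD5 hex digest of the normalized business key."""
    # Assumption: trim and upper-case each part so the same key arriving
    # from different source systems produces the same hash.
    normalized = "||".join(part.strip().upper() for part in business_key_parts)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def hub_row(business_key: str, record_source: str) -> dict:
    """Build one Hub row: hash key, business key, load date, record source."""
    return {
        "HUB_CUSTOMER_HK": hub_hash_key(business_key),  # hypothetical column name
        "CUSTOMER_NUMBER": business_key,                # hypothetical column name
        "LOAD_DTS": datetime.now(timezone.utc).isoformat(),
        "REC_SRC": record_source,
    }


# The same customer number delivered by two source systems.
row_crm = hub_row("c-1001 ", "CRM")
row_erp = hub_row("C-1001", "ERP")
```

Because the hash depends only on the normalized business key, both rows resolve to the same Hub PK, which is one way to keep the Hub source-system agnostic.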
DVM Components – Hubs contd…
DVM Components - Links
• Link Entities
  - Link Entities, or Links, are a physical representation of a many-to-many 3NF relationship
  - A Link is therefore an intersection of business keys. A Link must connect at least two Hubs, and may connect many Hubs
  - A Link table’s grain is defined by the number of parent keys it contains, similar to a Fact table in dimensional modeling
  - A Link contains:
    • A Link PK (Hash Key)
    • The PKs from the parent Hubs, used for lookups
    • The Load Date (LOAD_DTS)
    • The Source for the record (REC_SRC)
  - New in DV 2.0: the Business Key column(s) may also be carried, for faster queries
Links = Associations
A Link captures and records the relationship between business elements at the lowest possible grain, which should not
change over time. Links include transactions and hierarchies.
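A hedged sketch of a Link table and its load, using an in-memory SQLite database. The table LINK_CUSTOMER_ORDER, its column names and the md5_key helper are hypothetical; the structure (Link hash key, parent Hub keys, LOAD_DTS, REC_SRC) follows the slide.

```python
import hashlib
import sqlite3


def md5_key(*parts: str) -> str:
    """Hash one or more business key parts (assumed normalization: trim + upper)."""
    joined = "||".join(p.strip().upper() for p in parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE LINK_CUSTOMER_ORDER (
        LINK_CUSTOMER_ORDER_HK TEXT PRIMARY KEY,  -- hash of both business keys
        HUB_CUSTOMER_HK        TEXT NOT NULL,     -- PK of parent Hub 1
        HUB_ORDER_HK           TEXT NOT NULL,     -- PK of parent Hub 2
        LOAD_DTS               TEXT NOT NULL,
        REC_SRC                TEXT NOT NULL
    )
""")


def load_link(customer_no: str, order_no: str, load_dts: str, rec_src: str) -> None:
    """Insert the relationship once; later sightings of the same pair are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO LINK_CUSTOMER_ORDER VALUES (?, ?, ?, ?, ?)",
        (md5_key(customer_no, order_no), md5_key(customer_no),
         md5_key(order_no), load_dts, rec_src),
    )


load_link("C-1001", "SO-1", "2017-09-27T00:00:00", "ERP")
load_link("C-1001", "SO-2", "2017-09-27T00:00:00", "ERP")
load_link("C-1001", "SO-1", "2017-09-28T00:00:00", "ERP")  # same pair again
link_count = conn.execute("SELECT COUNT(*) FROM LINK_CUSTOMER_ORDER").fetchone()[0]
```

The Link stores only the association, so one customer can relate to many orders (and the reverse) without any structural change.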
DVM Components – Links contd…
Modeling Links – 1:M or M:M with a Link in the Data Vault
• Benefits of many-to-many relationships:
  - Flexibility
  - Granularity
  - Dynamic adaptability
  - Scalability
• No need to change the EDW structure
  - Existing data is fine
  - New data is added
If a Link structure is compromised, then the flexibility
of the data vault model is immediately compromised.
DVM Components – Links contd…
DVM Components - Satellites
• Satellite Entities
  - Satellite Entities, or Satellites, hold the context (descriptive) information for a Hub key
  - A Satellite is a time-dimensional table housing detailed information about the Hub’s or Link’s business keys at a point in time or over a time period
  - Change Data Capture (CDC) is performed and history is stored
  - Its concept and structure are similar to a Type 2 Slowly Changing Dimension
  - A Satellite contains:
    • Satellite Primary Key: Hub or Link Primary Key & Load Date Time Stamp
    • Satellite Optional Primary Key: Sequence Surrogate Number
    • Other attributes, including End Date
    • The Source for the record (REC_SRC)
  - New in DV 2.0: HASH_DIFF columns for Change Data Capture (CDC)
Satellite = Descriptors
Satellites are typically arranged by type or classification of data and by rate of change. As a result,
groups of fields that change more quickly than others are split away into their own Satellites.
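One way the HASH_DIFF idea could work is sketched below, assuming an MD5 hash over the descriptive attributes in sorted column order and an end-dating convention similar to a Type 2 dimension. The in-memory list, the HUB_HK and LOAD_END_DTS names and the sample attributes are illustrative assumptions.

```python
import hashlib


def hash_diff(attributes: dict) -> str:
    """MD5 over the descriptive attributes, in a fixed (sorted) column order."""
    payload = "||".join(f"{name}={attributes[name]}" for name in sorted(attributes))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


satellite_rows = []  # stands in for a Satellite table such as SAT_CUSTOMER


def load_satellite(hub_hk: str, attributes: dict, load_dts: str, rec_src: str) -> bool:
    """Insert a new Satellite row only when the attributes changed (delta load)."""
    new_diff = hash_diff(attributes)
    current = next(
        (r for r in satellite_rows
         if r["HUB_HK"] == hub_hk and r["LOAD_END_DTS"] is None),
        None,
    )
    if current is not None and current["HASH_DIFF"] == new_diff:
        return False  # nothing changed, so no duplicate row is written
    if current is not None:
        current["LOAD_END_DTS"] = load_dts  # end-date the previous version
    satellite_rows.append({
        "HUB_HK": hub_hk, "LOAD_DTS": load_dts, "LOAD_END_DTS": None,
        "HASH_DIFF": new_diff, "REC_SRC": rec_src, **attributes,
    })
    return True


first = load_satellite("hk1", {"NAME": "Asha", "CITY": "Pune"}, "2017-09-01", "CRM")
repeat = load_satellite("hk1", {"NAME": "Asha", "CITY": "Pune"}, "2017-09-02", "CRM")
changed = load_satellite("hk1", {"NAME": "Asha", "CITY": "Bangalore"}, "2017-09-20", "CRM")
```

Comparing a single hash column avoids comparing every descriptive column, and the unchanged second delivery produces no new row.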
DVM Components – Satellites contd…
Building Data Vault
Building a Data Vault
The Data Vault should be built as follows:
• Model the Hubs. This requires an understanding of business keys and their usage across the designated scope.
• Model the Links. This forms the relationships between the keys and builds an understanding of how the business operates today in the context of each business key.
• Model the Satellites. These provide context to each of the business keys as well as to the transactions (Links) that connect the Hubs together. This begins to provide the complete picture of the business.
• Model the point-in-time tables. This is a Satellite derivative, of which the structure and definition is outside the scope of this document (due to space constraints).
Reference Rules for Data Vault Modeling
• Hub keys cannot migrate into other Hubs (no parent/child-like Hubs).
• Hubs must be connected through Links.
• More than two Hubs can be connected through Links.
• Links can be connected to other Links.
• Links must have at least two Hubs associated with them in order to be instantiated.
• Surrogate keys may be utilized for Hubs and Links.
• Surrogate keys may not be utilized for Satellites.
• Hub keys always migrate outward.
• Hub business keys never change; Hub primary keys never change.
• Satellites may be connected to Hubs or Links.
• Satellites always contain either a load date-time stamp, or a numeric reference to a stand-alone load date-time stamp sequence table.
• Stand-alone tables such as calendars, time, code and description tables may be utilized.
• Links may have a surrogate key.
• If a Hub has two or more Satellites, a point-in-time table may be constructed for ease of joins.
• Satellites are always delta driven; duplicate rows should not appear.
• Data is separated into Satellite structures based on: 1) type of information 2) rate of change
- Business keys (BKs) with a low propensity for change become Hub keys
- Transactions and integrated keys become Link tables
- Descriptive data always fits in a Satellite
Data Vault Loading Sequence
Typically, data loading for a Data Vault follows this sequence:
• Hubs for Dimensions
• Links for Dimensions
• Satellites for Dimensions
• Hubs for Facts (if any)
• Links for Facts
• Satellites for Facts
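The loading sequence can be sketched as a simple ordering function. The table names and the (name, group, kind) tuple layout below are hypothetical; only the order (dimension Hubs, Links, Satellites, then the same for facts) comes from the slide.

```python
# Rank of each group and table kind in the load sequence.
GROUP_ORDER = {"dimension": 0, "fact": 1}
KIND_ORDER = {"hub": 0, "link": 1, "satellite": 2}


def plan_loads(tables):
    """Return table names in load order: by group first, then Hub, Link, Satellite."""
    ordered = sorted(tables, key=lambda t: (GROUP_ORDER[t[1]], KIND_ORDER[t[2]]))
    return [name for name, _group, _kind in ordered]


# Hypothetical tables, deliberately listed out of order.
tables = [
    ("SAT_ORDER_LINE", "fact", "satellite"),
    ("HUB_CUSTOMER", "dimension", "hub"),
    ("LINK_ORDER_LINE", "fact", "link"),
    ("SAT_CUSTOMER", "dimension", "satellite"),
    ("LINK_CUSTOMER_REGION", "dimension", "link"),
    ("HUB_ORDER", "fact", "hub"),
]
load_plan = plan_loads(tables)
```

Loading Hubs before Links and Satellites means the parent keys already exist when the dependent rows arrive.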
Sample Data Vault Model
World’s Smallest Data Vault
• The Data Vault doesn’t have to be “BIG”
• A Data Vault can be built incrementally
• Reverse engineering one component is possible
• Building one section of the Data Vault and then connecting it to a mart is the right strategy
• The smallest EDW consists of two tables:
  - One Hub
  - One Satellite
DVM – An Answer to EDW Architecture Pain
EDW Needs Today!
In today’s world, where we receive large volumes of data from multiple sources and channels, we
need a data model that is architected specifically to meet the EDW needs stated below:
• Extensive possibilities for data attribution.
• All data relationships are key driven.
• Relationships can be dropped and created on the fly.
• Data mining can discover new relationships between elements.
• Artificial intelligence can be utilized to rank the relevancy of the relationships to the user-configured outcome.
Data Vault Modeling Benefits – An Answer!
The Data Vault Model is a data integration architecture: a series of standards and
definitional elements or methods by which information is connected within an RDBMS data
store in order to make sense of it.
Business Benefits:
• Manage and enforce compliance with various regulations in your Enterprise Data Warehouse
• Spot business problems that were never visible previously
• Rapidly reduce business cycle time for implementing changes
• Merge new business units into the organization rapidly
• Rapid ROI and delivery of information to new Star Schemas
• Consolidate disparate data stores, e.g. Master Data Management
• SEI CMM Level 5 compliant (repeatable, consistent, redundant architecture)
• Trace all data back to the source systems
Data Vault Modeling Benefits – An Answer!
Technical Benefits:
• Near-Real-Time Loads
• Traditional Batch Loads
• In-Database Data Mining
• Terabytes to Petabytes of information (Big Data)
• Incremental Build Out
• Seamless integration of unstructured data (NoSQL)
• Dynamic Model Adaptation – self healing
• Business Rule Changes (with Ease)
DVM 2.0 – An Agile way !
Agile manifesto for DW
• User stories instead of detailed requirements
• Time-boxed iterations
• Each iteration has a standard length
• Select user stories that are doable
• Rework is included in the overall scheme of things
• No missed requirements, only:
  - Not Delivered
  - Not Discovered
Agile Data Vault Modeling
• Model Iteratively
  - Use DVM to create basic components
  - Add more in due course of time
• Virtualize the Access Layer
  - No fact and dimension creation upfront
  - ETL and testing can take long
  - Create database views on top of the pattern-based DV model
  - User sees reports with live data
  - Plan for performance in a later iteration
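A minimal sketch of a virtualized access layer, assuming a dimension is exposed as a database view over a Hub and the current rows of its Satellite. The table, column and view names (HUB_CUSTOMER, SAT_CUSTOMER, DIM_CUSTOMER, LOAD_END_DTS) and the sample rows are illustrative assumptions, shown here with an in-memory SQLite database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HUB_CUSTOMER (
        HUB_CUSTOMER_HK TEXT PRIMARY KEY,
        CUSTOMER_NUMBER TEXT NOT NULL,
        LOAD_DTS        TEXT NOT NULL,
        REC_SRC         TEXT NOT NULL
    );
    CREATE TABLE SAT_CUSTOMER (
        HUB_CUSTOMER_HK TEXT NOT NULL,
        LOAD_DTS        TEXT NOT NULL,
        LOAD_END_DTS    TEXT,            -- NULL marks the current version
        NAME            TEXT,
        CITY            TEXT,
        REC_SRC         TEXT NOT NULL,
        PRIMARY KEY (HUB_CUSTOMER_HK, LOAD_DTS)
    );
    -- Dimension = Hub + Satellite, exposed as a view instead of a loaded table.
    CREATE VIEW DIM_CUSTOMER AS
    SELECT h.CUSTOMER_NUMBER, s.NAME, s.CITY
    FROM HUB_CUSTOMER h
    JOIN SAT_CUSTOMER s ON s.HUB_CUSTOMER_HK = h.HUB_CUSTOMER_HK
    WHERE s.LOAD_END_DTS IS NULL;

    INSERT INTO HUB_CUSTOMER VALUES ('hk1', 'C-1001', '2017-09-01', 'CRM');
    INSERT INTO SAT_CUSTOMER VALUES ('hk1', '2017-09-01', '2017-09-20', 'Asha', 'Pune', 'CRM');
    INSERT INTO SAT_CUSTOMER VALUES ('hk1', '2017-09-20', NULL, 'Asha', 'Bangalore', 'CRM');
""")

# A report querying the view sees live data without a separate dimension load.
dim_rows = conn.execute("SELECT * FROM DIM_CUSTOMER").fetchall()
```

Since the view is only a query definition, it can be replaced by a materialized table in a later iteration if performance requires it.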
What’s New in DV2.0?
• Modeling Structure Includes…
  - NoSQL and non-relational DB systems, hybrid systems
  - Minor structure changes to support NoSQL
• New ETL Implementation Standards
  - For true real-time support
  - For NoSQL support
• New Architecture Standards
  - Support for NoSQL data management systems
• New Methodology Components
  - CMMI, Six Sigma and TQM
  - Project planning and tracking
  - Agile delivery mechanisms
  - Standards and templates for projects
The Model is fully compliant with Hadoop and needs
NO changes to work properly
References
• http://tdan.com/data-vault-series
• https://www.slideshare.net/kgraziano/introduction-to-data-vault-modeling
• Building a Scalable Data Warehouse with Data Vault 2.0 by Michael Olschimke and Dan Linstedt
• Super Charge Your Data Warehouse by Kent Graziano and Dan Linstedt
Thank You!
Phone: +91 8884400312
name.nishgupt@capgemini.com
Name Nishant Gupta
Title Manager
Role Technical Manager
Appendix
www.capgemini.com
The information contained in this presentation is proprietary.
Copyright © 2017 Capgemini and Sogeti. All rights reserved.
Rightshore® is a trademark belonging to Capgemini.
www.sogeti.com
About Capgemini and Sogeti
With more than 190,000 people in over 40 countries, Capgemini is one of the world's foremost
providers of consulting, technology and outsourcing services. The Group reported
2016 global revenues of EUR 12.5 billion. Together with its clients, Capgemini
creates and delivers business, technology and digital solutions that fit their needs,
enabling them to achieve innovation and competitiveness. A deeply multicultural
organization, Capgemini has developed its own way of working, the Collaborative
Business Experience™, and draws on Rightshore®, its worldwide delivery model.
Sogeti is a leading provider of technology and software testing, specializing in
Application, Infrastructure and Engineering Services. Sogeti offers cutting-edge
solutions around Testing, Business Intelligence & Analytics, Mobile, Cloud and
Cyber Security. Sogeti brings together more than 23,000 professionals in 15
countries and has a strong local presence in over 100 locations in Europe, USA
and India. Sogeti is a wholly-owned subsidiary of Cap Gemini S.A., listed on the
Paris Stock Exchange.
Ad

More Related Content

What's hot (20)

Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Kent Graziano
 
Making Sense of Schema on Read
Making Sense of Schema on ReadMaking Sense of Schema on Read
Making Sense of Schema on Read
Kent Graziano
 
Introduction To Data Vault - DAMA Oregon 2012
Introduction To Data Vault - DAMA Oregon 2012Introduction To Data Vault - DAMA Oregon 2012
Introduction To Data Vault - DAMA Oregon 2012
Empowered Holdings, LLC
 
Shorter time to insight more adaptable less costly bi with end to end modelst...
Shorter time to insight more adaptable less costly bi with end to end modelst...Shorter time to insight more adaptable less costly bi with end to end modelst...
Shorter time to insight more adaptable less costly bi with end to end modelst...
Daniel Upton
 
Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...
Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...
Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...
Roland Bouman
 
Data vault modeling et retour d'expérience
Data vault modeling et retour d'expérienceData vault modeling et retour d'expérience
Data vault modeling et retour d'expérience
Swiss Data Forum Swiss Data Forum
 
Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)
Kent Graziano
 
Data Vault: Data Warehouse Design Goes Agile
Data Vault: Data Warehouse Design Goes AgileData Vault: Data Warehouse Design Goes Agile
Data Vault: Data Warehouse Design Goes Agile
Daniel Upton
 
Data vault: What's Next
Data vault: What's NextData vault: What's Next
Data vault: What's Next
Empowered Holdings, LLC
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on Snowflake
Kent Graziano
 
Demystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWDemystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFW
Kent Graziano
 
Guru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best PracticesGuru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best Practices
CGI
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)
Kent Graziano
 
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile ApproachUsing OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Kent Graziano
 
Agile Methods and Data Warehousing
Agile Methods and Data WarehousingAgile Methods and Data Warehousing
Agile Methods and Data Warehousing
Kent Graziano
 
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Cloudera, Inc.
 
HOW TO SAVE PILEs of $$$ BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...
HOW TO SAVE  PILEs of $$$BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...HOW TO SAVE  PILEs of $$$BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...
HOW TO SAVE PILEs of $$$ BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...
Kent Graziano
 
Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013
Jonathan Seidman
 
Delivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeDelivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with Snowflake
Kent Graziano
 
Actionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data ScienceActionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data Science
Harald Erb
 
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Agile Data Engineering: Introduction to Data Vault 2.0 (2018)
Kent Graziano
 
Making Sense of Schema on Read
Making Sense of Schema on ReadMaking Sense of Schema on Read
Making Sense of Schema on Read
Kent Graziano
 
Introduction To Data Vault - DAMA Oregon 2012
Introduction To Data Vault - DAMA Oregon 2012Introduction To Data Vault - DAMA Oregon 2012
Introduction To Data Vault - DAMA Oregon 2012
Empowered Holdings, LLC
 
Shorter time to insight more adaptable less costly bi with end to end modelst...
Shorter time to insight more adaptable less costly bi with end to end modelst...Shorter time to insight more adaptable less costly bi with end to end modelst...
Shorter time to insight more adaptable less costly bi with end to end modelst...
Daniel Upton
 
Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...
Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...
Roland bouman modern_data_warehouse_architectures_data_vault_and_anchor_model...
Roland Bouman
 
Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)
Kent Graziano
 
Data Vault: Data Warehouse Design Goes Agile
Data Vault: Data Warehouse Design Goes AgileData Vault: Data Warehouse Design Goes Agile
Data Vault: Data Warehouse Design Goes Agile
Daniel Upton
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on Snowflake
Kent Graziano
 
Demystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFWDemystifying Data Warehousing as a Service - DFW
Demystifying Data Warehousing as a Service - DFW
Kent Graziano
 
Guru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best PracticesGuru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best Practices
CGI
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)
Kent Graziano
 
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile ApproachUsing OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Kent Graziano
 
Agile Methods and Data Warehousing
Agile Methods and Data WarehousingAgile Methods and Data Warehousing
Agile Methods and Data Warehousing
Kent Graziano
 
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Cloudera, Inc.
 
HOW TO SAVE PILEs of $$$ BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...
HOW TO SAVE  PILEs of $$$BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...HOW TO SAVE  PILEs of $$$BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...
HOW TO SAVE PILEs of $$$ BY CREATING THE BEST DATA MODEL THE FIRST TIME (Ksc...
Kent Graziano
 
Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013
Jonathan Seidman
 
Delivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with SnowflakeDelivering Data Democratization in the Cloud with Snowflake
Delivering Data Democratization in the Cloud with Snowflake
Kent Graziano
 
Actionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data ScienceActionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data Science
Harald Erb
 

Similar to CWIN 17 / sessions data vault modeling - f2-f - nishat gupta (20)

Agile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data ModelingAgile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Kent Graziano
 
Nw2008 tips tricks_edw_v10
Nw2008 tips tricks_edw_v10Nw2008 tips tricks_edw_v10
Nw2008 tips tricks_edw_v10
Harsha Gowda B R
 
IRM UK - 2009: DV Modeling And Methodology
IRM UK - 2009: DV Modeling And MethodologyIRM UK - 2009: DV Modeling And Methodology
IRM UK - 2009: DV Modeling And Methodology
Empowered Holdings, LLC
 
Experiences from a Data Vault Pilot Exploiting the Internet of Things
Experiences from a Data Vault Pilot Exploiting the Internet of ThingsExperiences from a Data Vault Pilot Exploiting the Internet of Things
Experiences from a Data Vault Pilot Exploiting the Internet of Things
USGProfessionalsBelgium
 
Experiences from a Data Vault Pilot Exploiting the Internet of Things
Experiences from a Data Vault Pilot Exploiting the Internet of ThingsExperiences from a Data Vault Pilot Exploiting the Internet of Things
Experiences from a Data Vault Pilot Exploiting the Internet of Things
GuyVanderSande
 
Logical Data Fabric and Data Mesh – Driving Business Outcomes
Logical Data Fabric and Data Mesh – Driving Business OutcomesLogical Data Fabric and Data Mesh – Driving Business Outcomes
Logical Data Fabric and Data Mesh – Driving Business Outcomes
Denodo
 
Data vault what's Next: Part 2
Data vault what's Next: Part 2Data vault what's Next: Part 2
Data vault what's Next: Part 2
Empowered Holdings, LLC
 
Improving Python and Spark Performance and Interoperability with Apache Arrow
Improving Python and Spark Performance and Interoperability with Apache ArrowImproving Python and Spark Performance and Interoperability with Apache Arrow
Improving Python and Spark Performance and Interoperability with Apache Arrow
Julien Le Dem
 
Agile Data Engineering - Intro to Data Vault Modeling (2016)
Agile Data Engineering - Intro to Data Vault Modeling (2016)Agile Data Engineering - Intro to Data Vault Modeling (2016)
Agile Data Engineering - Intro to Data Vault Modeling (2016)
Kent Graziano
 
DBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptxDBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptx
Hong Ong
 
DWH Concepts
DWH ConceptsDWH Concepts
DWH Concepts
Samikkumar Shah
 
Government GraphSummit: And Then There Were 15 Standards
Government GraphSummit: And Then There Were 15 StandardsGovernment GraphSummit: And Then There Were 15 Standards
Government GraphSummit: And Then There Were 15 Standards
Neo4j
 
Comparison of control plane deployment architectures in the scope of hypercon...
Comparison of control plane deployment architectures in the scope of hypercon...Comparison of control plane deployment architectures in the scope of hypercon...
Comparison of control plane deployment architectures in the scope of hypercon...
Miroslav Halas
 
Maharshi_Amin_416
Maharshi_Amin_416Maharshi_Amin_416
Maharshi_Amin_416
mamin1411
 
Cloud sim report
Cloud sim reportCloud sim report
Cloud sim report
Jiachen Yang
 
EXPLORING WOMEN SECURITY BY DEDUPLICATION OF DATA
EXPLORING WOMEN SECURITY BY DEDUPLICATION OF DATAEXPLORING WOMEN SECURITY BY DEDUPLICATION OF DATA
EXPLORING WOMEN SECURITY BY DEDUPLICATION OF DATA
IRJET Journal
 
KeyAchivementsMimecast
KeyAchivementsMimecastKeyAchivementsMimecast
KeyAchivementsMimecast
Vera Ekimenko
 
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
IEEEMEMTECHSTUDENTSPROJECTS
 
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
IEEEMEMTECHSTUDENTPROJECTS
 
Single view with_mongo_db_(lo)
Single view with_mongo_db_(lo)Single view with_mongo_db_(lo)
Single view with_mongo_db_(lo)
MongoDB
 
Agile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data ModelingAgile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Agile Data Warehouse Modeling: Introduction to Data Vault Data Modeling
Kent Graziano
 
Nw2008 tips tricks_edw_v10
Nw2008 tips tricks_edw_v10Nw2008 tips tricks_edw_v10
Nw2008 tips tricks_edw_v10
Harsha Gowda B R
 
IRM UK - 2009: DV Modeling And Methodology
IRM UK - 2009: DV Modeling And MethodologyIRM UK - 2009: DV Modeling And Methodology
IRM UK - 2009: DV Modeling And Methodology
Empowered Holdings, LLC
 
Experiences from a Data Vault Pilot Exploiting the Internet of Things
Experiences from a Data Vault Pilot Exploiting the Internet of ThingsExperiences from a Data Vault Pilot Exploiting the Internet of Things
Experiences from a Data Vault Pilot Exploiting the Internet of Things
USGProfessionalsBelgium
 
Experiences from a Data Vault Pilot Exploiting the Internet of Things
CWIN 17 / sessions data vault modeling - f2-f - nishat gupta

  • 1. Data Vault Modeling – An Insight Nishant Gupta Bangalore, September 27th #CWIN17
  • 2. Session’s Title | Date Copyright © 2017 Capgemini and Sogeti. All rights reserved. 2 Table of Contents
      - Introduction
      - Data Vault Modeling Components
      - Building Data Vault model
      - DVM – An Answer
      - DVM 2.0 – An Agile way
  • 3. Introduction
  • 4. What is Data Vault Modeling?
      "The Data Vault is a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent and adaptable to the needs of the enterprise. It is a data model that is architected specifically to meet the needs of enterprise data warehouses." -- Dan Linstedt
      The Data Vault is functionally based, not subject oriented as defined by Bill Inmon.
  • 5. Why Data Vault Modeling? Challenges with DW Data Modeling Architectures
      - 3NF has complex PKs when cascading snapshot dates (time-driven PKs)
      - Real-time loading is a challenge
      - Drill-down analysis and queries are complex and tedious
      - The top-down approach results in unavoidable flexibility and scalability issues
      - Star Schema: it is difficult to implement or re-engineer fact tables for granularity changes
      - Data redundancy and helper tables isolate subject-related information
      - Inconsistent query linkages due to incompatible grains
      - Synchronization issues during real-time data load
      - Limited enterprise views and data mining capabilities
  • 6. Data Vault Modeling Place in Evolution
      - 1960s - Codd, Date et al.: Normal Forms
      - 1970s - Peter Chen created E-R diagramming
      - 1980s - Normal Forms adapted to DWs, courtesy of Bill Inmon
      - 1985+ - Star Schema for OLAP by Ralph Kimball
      - 1990s - Data Vault concept developed by Dan Linstedt
      - 2000+ - Data Vault concept published by Dan Linstedt
  • 7. Data Vault Modeling Evolution
  • 8. Where does a Data Vault Fit?
  • 9. Data Vault Modeling Components
  • 10. Data Vault Modeling - Components
      Data Vault Modeling has three key components, summarized below with their attributes:
      - Hub Entities: Candidate Keys + Load Time + Source
      - Link Entities: FKs from Hub + Load Time + Source
      - Satellite Entities: Descriptive Data + Load Time + Source + End Time
      Relation to dimensional modeling:
      - Dimension = Hub + Satellite
      - Fact = Satellite + Link [+ Hub]
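      The three components and their attributes can be pictured as plain record types. This is only an illustrative sketch: the class names, field names and sample values are assumptions made for the example, not part of the deck.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class HubRow:
    business_key: str              # candidate (business) key
    load_dts: datetime             # load time
    rec_src: str                   # record source


@dataclass(frozen=True)
class LinkRow:
    hub_keys: Tuple[str, ...]      # foreign keys from two or more Hubs
    load_dts: datetime
    rec_src: str


@dataclass(frozen=True)
class SatelliteRow:
    parent_key: str                # key of the Hub (or Link) being described
    attributes: Tuple[Tuple[str, str], ...]  # descriptive data as (name, value) pairs
    load_dts: datetime
    rec_src: str
    end_dts: Optional[datetime] = None       # stays empty while the row is current


def as_dimension(hub: HubRow, sat: SatelliteRow) -> dict:
    """Dimension = Hub + Satellite: business key plus descriptive attributes."""
    if sat.parent_key != hub.business_key:
        raise ValueError("satellite does not describe this hub")
    return {"business_key": hub.business_key, **dict(sat.attributes)}


# Hypothetical sample data.
customer = HubRow("CUST-001", datetime(2017, 9, 27), "CRM")
customer_details = SatelliteRow(
    "CUST-001",
    (("name", "Acme"), ("city", "Bangalore")),
    datetime(2017, 9, 27),
    "CRM",
)
dimension = as_dimension(customer, customer_details)
```

      Here the Satellite points at the Hub by business key for brevity; a real model would normally use the Hub's primary key instead.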
  • 11. DVM Components - Hubs
      Hub = Business Key. This is the most important aspect of Data Vault modeling. It has to be identified correctly to build an integrated enterprise data warehouse for an organization.
      - Hub Entities, or Hubs, are single tables carrying at a minimum a unique list of business keys, for example Invoice number, Employee number, Customer number, Part number, VIN, etc.
      - Hubs allow multiple source systems to be integrated and should therefore be source system agnostic.
      - A Hub contains:
          - A Hub PK
          - The Business Key column(s)
          - The Load Date (LOAD_DTS)
          - The Source for the record (REC_SRC)
      - New in DV 2.0: the Hub PK is a calculated field that includes an MD5 hash, used to link with Hadoop/NoSQL.
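      A minimal sketch of a calculated Hub PK of the kind mentioned for DV 2.0, using MD5 from the Python standard library. The function name, the "|" delimiter and the trim/upper-case normalization are assumptions chosen for illustration; the deck does not specify them.

```python
import hashlib


def hub_hash_key(*business_key_parts: str, delimiter: str = "|") -> str:
    """Derive a Hub PK from one or more business key columns.

    Normalizing (trim + upper-case) before hashing is an assumed convention
    so that the same key arriving from different sources hashes identically.
    """
    normalized = delimiter.join(part.strip().upper() for part in business_key_parts)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Hypothetical Hub row with the four columns listed on the slide.
hub_customer_row = {
    "customer_hk": hub_hash_key(" cust-001 "),  # Hub PK (calculated)
    "customer_bk": "CUST-001",                  # Business Key
    "load_dts": "2017-09-27T00:00:00",          # LOAD_DTS
    "rec_src": "CRM",                           # REC_SRC
}
```

      Because the key is computed from the business key alone, any system that knows the business key can derive the same Hub PK without a lookup.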
  • 12. DVM Components – Hubs contd…
  • 13. DVM Components – Hubs contd…
  • 14. DVM Components - Links
      Links = Associations. A Link captures and records the relationship between business elements at the lowest possible grain, which should not change over time. This includes transactions and hierarchies.
      - Link Entities, or Links, are a physical representation of a many-to-many 3NF relationship.
      - A Link is therefore an intersection of business keys. A Link must have at least two Hubs, but it may be composed of many Hubs.
      - A Link table's grain is defined by the number of parent keys it contains, similar to a Fact table in dimensional modeling.
      - A Link contains:
          - A Link PK (Hash Key)
          - The PKs from the parent Hubs, used for lookups
          - The Load Date (LOAD_DTS)
          - The Source for the record (REC_SRC)
      - New in DV 2.0: the Business Key column(s), for faster queries.
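      The Link columns listed above can be sketched as follows. The helper names and the way the Link hash key is derived (MD5 over the parent business keys, ordered by Hub name) are assumptions for illustration only.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def link_row(parent_business_keys: dict, load_dts: str, rec_src: str) -> dict:
    """Build one Link row from {hub_name: business_key}.

    A Link must reference at least two Hubs, so fewer parents is an error.
    """
    if len(parent_business_keys) < 2:
        raise ValueError("a Link needs at least two parent Hubs")
    ordered = sorted(parent_business_keys.items())  # stable order by Hub name
    row = {"link_hk": md5_hex("|".join(bk for _, bk in ordered))}  # Link PK
    for hub_name, bk in ordered:
        row[f"{hub_name}_hk"] = md5_hex(bk)  # PK of the parent Hub
        row[f"{hub_name}_bk"] = bk           # business key column (DV 2.0)
    row["load_dts"] = load_dts
    row["rec_src"] = rec_src
    return row


# Hypothetical association between a customer and an invoice.
customer_invoice = link_row(
    {"customer": "CUST-001", "invoice": "INV-42"}, "2017-09-27", "ERP"
)
```

      Adding a third parent Hub later changes only the dictionary passed in, which mirrors the point that Links may be composed of many Hubs.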
  • 15. DVM Components – Links contd… Modeling Links – 1:M or M:M
      With a Link in the Data Vault, many-to-many relationships bring these benefits:
      - Flexibility
      - Granularity
      - Dynamic adaptability
      - Scalability
      - No need to change the EDW structure
      - Existing data is fine
      - New data is added
      If a Link structure is compromised, then the flexibility of the Data Vault model is immediately compromised.
  • 16. DVM Components – Links contd…
  • 17. DVM Components - Satellites
      Satellite = Descriptors. Satellites are typically arranged by type or classification of data and by rate of change, so groups of fields that change more quickly than others are split away into their own Satellite.
      - Satellite Entities, or Satellites, hold the context (descriptive) information for a Hub key.
      - A Satellite is a time-dimensional table housing detailed information about the Hub's or Link's business keys at a point in time or over a time period.
      - Change Data Capture (CDC) is performed and history is stored.
      - Its concept and structure are similar to a Type 2 Slowly Changing Dimension.
      - A Satellite contains:
          - Satellite Primary Key: Hub or Link Primary Key and Load Date Time Stamp
          - Satellite Optional Primary Key: Sequence Surrogate Number
          - Other attributes, including End Date
          - The Source for the record (REC_SRC)
      - New in DV 2.0: HASH_DIFF columns for Change Data Capture (CDC).
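      A small sketch of how a HASH_DIFF column can drive change data capture when loading a Satellite. The function names, the in-memory list standing in for the table, and the choice to end-date the previous row in place are assumptions for the example, not a prescribed implementation.

```python
import hashlib


def hash_diff(attributes: dict) -> str:
    """Hash all descriptive attributes in a fixed (sorted) column order."""
    payload = "|".join(f"{name}={attributes[name]}" for name in sorted(attributes))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def load_satellite(satellite: list, parent_key: str, attributes: dict,
                   load_dts: str, rec_src: str) -> bool:
    """Insert a new Satellite row only when the attributes changed.

    Returns True when a row was added, False when the delta was empty.
    """
    current = next(
        (row for row in reversed(satellite)
         if row["parent_key"] == parent_key and row["end_dts"] is None),
        None,
    )
    new_diff = hash_diff(attributes)
    if current is not None:
        if current["hash_diff"] == new_diff:
            return False               # no change, so no duplicate row
        current["end_dts"] = load_dts  # close the previous version
    satellite.append({
        "parent_key": parent_key, "load_dts": load_dts, "end_dts": None,
        "rec_src": rec_src, "hash_diff": new_diff, **attributes,
    })
    return True


# Hypothetical loads for one customer over three days.
sat_customer = []
first = load_satellite(sat_customer, "CUST-001", {"city": "Bangalore"}, "2017-09-27", "CRM")
repeat = load_satellite(sat_customer, "CUST-001", {"city": "Bangalore"}, "2017-09-28", "CRM")
moved = load_satellite(sat_customer, "CUST-001", {"city": "Pune"}, "2017-09-29", "CRM")
```

      Comparing one hash instead of every column keeps the change check cheap, and the history that remains reads like a Type 2 Slowly Changing Dimension.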
  • 18. DVM Components – Satellites contd…
  • 19. DVM Components – Satellites contd…
  • 20. Building Data Vault
  • 21. Building a Data Vault
      The Data Vault should be built as follows:
      - Model the Hubs. This requires an understanding of business keys and their usage across the designated scope.
      - Model the Links. This forms the relationships between the keys and builds an understanding of how the business operates today in the context of each business key.
      - Model the Satellites. These provide context to each of the business keys as well as to the transactions (Links) that connect the Hubs together. This begins to provide the complete picture of the business.
      - Model the point-in-time tables. This is a Satellite derivative, of which the structure and definition is outside the scope of this document (due to space constraints).
  • 22. Reference Rule for Data Vault Modeling
      - Hub keys cannot migrate into other Hubs (no parent/child like Hubs).
      - Hubs must be connected through Links.
      - More than two Hubs can be connected through Links.
      - Links can be connected to other Links.
      - Links must have at least two Hubs associated with them in order to be instantiated.
      - Surrogate keys may be utilized for Hubs and Links.
      - Surrogate keys may not be utilized for Satellites.
      - Hub keys always migrate outward.
      - Hub business keys never change, Hub primary keys never change.
      - Satellites may be connected to Hubs or Links.
      - Satellites always contain either a load date-time stamp, or a numeric reference to a stand-alone load date-time stamp sequence table.
      - Stand-alone tables such as calendars, time, code and description tables may be utilized.
      - Links may have a surrogate key.
      - If a Hub has two or more Satellites, a point-in-time table may be constructed for ease of joins.
      - Satellites are always delta driven, duplicate rows should not appear.
      - Data is separated into Satellite structures based on: 1) type of information 2) rate of change.
      Business keys with a low propensity for change become Hub keys. Transactions and integrated keys become Link tables. Descriptive data always fits in a Satellite.
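      A few of the structural rules above can be checked mechanically. The sketch below covers only three of them (Links need at least two Hubs, Links may also reference other Links, Satellites attach to a Hub or a Link); the function name and the dictionary-based model description are hypothetical.

```python
def validate_model(hubs: set, links: dict, satellites: dict) -> list:
    """Return a list of rule violations for a small model description.

    hubs:       set of Hub names
    links:      {link_name: [parent Hub or Link names]}
    satellites: {satellite_name: parent Hub or Link name}
    """
    errors = []
    for link, parents in links.items():
        unknown = [p for p in parents if p not in hubs and p not in links]
        if unknown:
            errors.append(f"{link}: unknown parents {unknown}")
        hub_parents = [p for p in parents if p in hubs]
        if len(hub_parents) < 2:
            errors.append(f"{link}: a Link must have at least two Hubs")
    for sat, parent in satellites.items():
        if parent not in hubs and parent not in links:
            errors.append(f"{sat}: a Satellite must hang off a Hub or a Link")
    return errors


# Hypothetical models: one that follows the rules and one that does not.
good_model = validate_model(
    {"customer", "invoice"},
    {"customer_invoice": ["customer", "invoice"]},
    {"customer_details": "customer", "invoice_terms": "customer_invoice"},
)
bad_model = validate_model(
    {"customer"},
    {"lonely": ["customer"]},
    {"orphan": "missing"},
)
```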
  • 23. Data Vault Loading Sequence
      Typically, data loading for a Data Vault is in the following sequence:
      - Hubs for Dimensions
      - Links for Dimensions
      - Satellites for Dimensions
      - Hubs for Fact (if any)
      - Links for Fact
      - Satellites for Fact
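      As a rough illustration, the sequence above can be encoded as an ordering over load jobs. The job dictionaries, table names and the "area"/"kind" labels are assumptions made for this sketch.

```python
# Loading sequence from the slide, expressed as (area, kind) steps.
LOAD_ORDER = [
    ("dimension", "hub"),
    ("dimension", "link"),
    ("dimension", "satellite"),
    ("fact", "hub"),
    ("fact", "link"),
    ("fact", "satellite"),
]


def order_loads(jobs: list) -> list:
    """Sort load jobs into the Data Vault loading sequence."""
    rank = {step: position for position, step in enumerate(LOAD_ORDER)}
    return sorted(jobs, key=lambda job: rank[(job["area"], job["kind"])])


# Hypothetical jobs supplied in arbitrary order.
jobs = [
    {"name": "sat_sale", "area": "fact", "kind": "satellite"},
    {"name": "sat_customer", "area": "dimension", "kind": "satellite"},
    {"name": "link_sale", "area": "fact", "kind": "link"},
    {"name": "hub_customer", "area": "dimension", "kind": "hub"},
    {"name": "link_customer_region", "area": "dimension", "kind": "link"},
]
ordered_names = [job["name"] for job in order_loads(jobs)]
```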
  • 24. Sample Data Vault Model
  • 25. World’s Smallest Data Vault
      - The Data Vault doesn’t have to be “BIG”
      - A Data Vault can be built incrementally
      - Reverse engineering one component is possible
      - Building one section of the Data Vault and then connecting it with a mart is the right strategy
      - The smallest EDW consists of two tables:
          - One Hub
          - One Satellite
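      The two-table case can be tried out with SQLite from the Python standard library. The table and column names below are assumptions for the sketch; only the one-Hub-plus-one-Satellite shape comes from the slide.

```python
import sqlite3

vault = sqlite3.connect(":memory:")
vault.executescript("""
CREATE TABLE hub_customer (
    customer_hk TEXT PRIMARY KEY,      -- Hub PK
    customer_bk TEXT NOT NULL UNIQUE,  -- business key
    load_dts    TEXT NOT NULL,
    rec_src     TEXT NOT NULL
);
CREATE TABLE sat_customer (
    customer_hk TEXT NOT NULL REFERENCES hub_customer (customer_hk),
    load_dts    TEXT NOT NULL,
    end_dts     TEXT,                  -- NULL while the row is current
    rec_src     TEXT NOT NULL,
    name        TEXT,
    city        TEXT,
    PRIMARY KEY (customer_hk, load_dts)
);
""")

# Hypothetical sample rows: the Hub first, then its descriptive Satellite row.
vault.execute(
    "INSERT INTO hub_customer VALUES (?, ?, ?, ?)",
    ("hk-001", "CUST-001", "2017-09-27", "CRM"),
)
vault.execute(
    "INSERT INTO sat_customer VALUES (?, ?, ?, ?, ?, ?)",
    ("hk-001", "2017-09-27", None, "CRM", "Acme", "Bangalore"),
)

table_names = [
    row[0]
    for row in vault.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```

      More Hubs, Links and Satellites can be added next to these two tables later without altering them, which is the incremental build the slide describes.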
  • 26. DVM – An Answer to EDW Architecture Pain
  • 27. EDW Needs Today!
      In today’s world, where we receive loads of data across multiple sources and channels, we need a data model that is architected specifically to meet the following EDW needs:
      - Extensive possibilities for data attribution.
      - All data relationships are key driven.
      - Relationships can be dropped and created on-the-fly.
      - Data Mining can discover new relationships between elements.
      - Artificial Intelligence can be utilized to rank the relevancy of the relationships to the user configured outcome.
  • 28. Data Vault Modeling Benefits – An Answer!
      The Data Vault Model is a data integration architecture: a series of standards and definitional elements or methods by which information is connected within an RDBMS data store in order to make sense of it.
      Business Benefits:
      - Manage and enforce compliance with various regulations in your Enterprise Data Warehouse
      - Spot business problems that were never visible previously
      - Rapidly reduce business cycle time for implementing changes
      - Merge new business units into the organization rapidly
      - Rapid ROI and delivery of information to new Star Schemas
      - Consolidate disparate data stores, i.e. Master Data Management
      - SEI CMM Level 5 compliant (repeatable, consistent, redundant architecture)
      - Trace all data back to the source systems
  • 29. Data Vault Modeling Benefits – An Answer!
      Technical Benefits:
      - Near-Real-Time Loads
      - Traditional Batch Loads
      - In-Database Data Mining
      - Terabytes to Petabytes of information (Big Data)
      - Incremental Build Out
      - Seamless integration of unstructured data (NoSQL)
      - Dynamic Model Adaptation – self healing
      - Business Rule Changes (with Ease)
  • 30. DVM 2.0 – An Agile way!
  • 31. Agile manifesto for DW
      - User stories instead of detailed requirements
      - Time-boxed iterations
      - Each iteration has a standard length
      - Select user stories that are doable
      - Rework is included in the overall scheme of things
      - No missed requirements, only:
          - Not Delivered
          - Not Discovered
  • 32. Agile Data Vault Modeling
      - Model iteratively
          - Use DVM to create basic components
          - Add more in due course of time
      - Virtualize the access layer
          - No Fact and Dimension creation upfront
          - ETL and testing can take long
          - Create database views on top of the pattern-based DV model
          - User sees reports with live data
      - Plan for performance in a later iteration
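      A virtualized access layer can be sketched as a database view over the Hub and Satellite tables, so that no physical dimension table is built upfront. The example uses SQLite from the Python standard library; the table, column and view names and the sample rows are assumptions for illustration.

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")
warehouse.executescript("""
CREATE TABLE hub_customer (
    customer_hk TEXT PRIMARY KEY,
    customer_bk TEXT NOT NULL UNIQUE,
    load_dts    TEXT NOT NULL,
    rec_src     TEXT NOT NULL
);
CREATE TABLE sat_customer (
    customer_hk TEXT NOT NULL REFERENCES hub_customer (customer_hk),
    load_dts    TEXT NOT NULL,
    end_dts     TEXT,
    rec_src     TEXT NOT NULL,
    name        TEXT,
    city        TEXT,
    PRIMARY KEY (customer_hk, load_dts)
);

INSERT INTO hub_customer VALUES ('hk-001', 'CUST-001', '2017-09-27', 'CRM');
INSERT INTO sat_customer VALUES ('hk-001', '2017-09-27', '2017-09-29', 'CRM', 'Acme', 'Bangalore');
INSERT INTO sat_customer VALUES ('hk-001', '2017-09-29', NULL, 'CRM', 'Acme', 'Pune');

-- Virtual dimension: Hub + current Satellite row, no separate ETL step.
CREATE VIEW dim_customer AS
SELECT h.customer_bk, s.name, s.city
FROM hub_customer AS h
JOIN sat_customer AS s ON s.customer_hk = h.customer_hk
WHERE s.end_dts IS NULL;
""")

current_customers = warehouse.execute(
    "SELECT customer_bk, name, city FROM dim_customer ORDER BY customer_bk"
).fetchall()
```

      Reports that query dim_customer see live data straight from the vault; if the view becomes slow, it can be materialized in a later iteration.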
  • 33. What’s New in DV 2.0?
      - Modeling structure includes:
          - NoSQL and non-relational DB systems, hybrid systems
          - Minor structure changes to support NoSQL
      - New ETL implementation standards
          - For true real-time support
          - For NoSQL support
      - New architecture standards
          - Support for NoSQL data management systems
      - New methodology components
          - CMMI, Six Sigma and TQM
          - Project planning and tracking
          - Agile delivery mechanisms
          - Standards and templates for projects
      The model is fully compliant with Hadoop and needs NO changes to work properly.
  • 34. References
      - http://tdan.com/data-vault-series
      - https://www.slideshare.net/kgraziano/introduction-to-data-vault-modeling
      - Building a Scalable Data Warehouse with Data Vault 2.0 by Michael Olschimke and Dan Linstedt
      - Super Charge your Data Warehouse by Kent Graziano and Dan Linstedt
  • 35. Thank You! Name: Nishant Gupta. Title: Manager. Role: Technical Manager. Phone: +91 8884400312
  • 36. Appendix
  • 37. www.capgemini.com www.sogeti.com The information contained in this presentation is proprietary. Copyright © 2017 Capgemini and Sogeti. All rights reserved. Rightshore® is a trademark belonging to Capgemini.
      About Capgemini and Sogeti
      With more than 190,000 people in over 40 countries, Capgemini is one of the world's foremost providers of consulting, technology and outsourcing services. The Group reported 2016 global revenues of EUR 12.5 billion. Together with its clients, Capgemini creates and delivers business, technology and digital solutions that fit their needs, enabling them to achieve innovation and competitiveness. A deeply multicultural organization, Capgemini has developed its own way of working, the Collaborative Business Experience™, and draws on Rightshore®, its worldwide delivery model.
      Sogeti is a leading provider of technology and software testing, specializing in Application, Infrastructure and Engineering Services. Sogeti offers cutting-edge solutions around Testing, Business Intelligence & Analytics, Mobile, Cloud and Cyber Security. Sogeti brings together more than 23,000 professionals in 15 countries and has a strong local presence in over 100 locations in Europe, USA and India. Sogeti is a wholly-owned subsidiary of Cap Gemini S.A., listed on the Paris Stock Exchange.