Chapter 2
Data Models
1
Learning Objectives
●
Discuss data modeling and why data models are important
●
Describe the basic data-modeling building blocks
●
Define what business rules are and how they influence database
design
●
Understand how the major data models evolved
●
List emerging alternative data models and the needs they fulfill
●
Explain how data models can be classified by their level of
abstraction
2
Definitions
●
Data modeling
– The process of creating a specific data model for a
determined problem domain.
●
Data model
– A representation, usually graphic, of a complex “real-world”
data structure. Data models are used in the database design
phase of the Database Life Cycle.
3
Data Modeling Process
●
Iterative
●
Start with simple understanding
●
Continually add as understanding increases
●
Use experience
●
Talk to stakeholders - collaborate
●
Final model is a “blueprint” for the database
4
Importance of Data Models
●
Communication
●
Communication
●
Communication
●
A good data model communicates the various
views of the data
5
Components of Data Model
●
Entity
●
Attribute
●
Relationship
●
Constraints
6
Entities
●
A person, place, or thing about which data will
be collected and stored
– Think nouns
●
Store data by occurrence
– Student, customer, employee, product
●
Occurrences are distinguishable (unique)
7
Attributes
●
Characteristic of an entity
●
Student
– Name
– Address
– Etc.
8
Relationships
●
How entities are related
– An order is placed by one customer
– A customer can place many orders
●
Two-way street - bidirectional
●
Three types
– One-to-one: 1:1
– One-to-many: 1:M
– Many-to-many: M:N
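The customer/order example above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table and column names (customer, orders, customer_id) and the sample rows are assumptions, not part of the slides. The foreign key in orders is what makes the relationship 1:M: each order points to exactly one customer, while one customer may appear in many orders.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Hypothetical schema: one customer, many orders (1:M).
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
# A customer can place many orders ...
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 1)])
orders_for_ada = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id = 1"
).fetchone()[0]

# ... but an order must be placed by one existing customer.
try:
    conn.execute("INSERT INTO orders VALUES (12, 99)")  # customer 99 does not exist
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```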
9
Constraint
●
Restriction placed on the data
– LHU student number: 6 numeric digits
– SSN: ###-##-####
– The number of items ordered cannot be less than 0
●
Constraint can be a relationship
– A customer is handled by one sales representative
– One sales representative handles many customers
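One way such restrictions might be expressed is as CHECK constraints, sketched here with Python's sqlite3 module. The table names, column names, and the helper function accepted are hypothetical; the two rules (a six-digit student number, a quantity that cannot be less than 0) come from the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables; each CHECK clause encodes one constraint from the slide.
conn.executescript("""
CREATE TABLE student (
    student_number TEXT PRIMARY KEY
        -- exactly 6 characters, none of which is a non-digit
        CHECK (length(student_number) = 6
               AND student_number NOT GLOB '*[^0-9]*')
);
CREATE TABLE order_line (
    line_id  INTEGER PRIMARY KEY,
    quantity INTEGER NOT NULL CHECK (quantity >= 0)
);
""")


def accepted(sql: str, params: tuple) -> bool:
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


good_student = accepted("INSERT INTO student VALUES (?)", ("123456",))
bad_student = accepted("INSERT INTO student VALUES (?)", ("12A456",))
zero_quantity = accepted("INSERT INTO order_line (quantity) VALUES (?)", (0,))
negative_quantity = accepted("INSERT INTO order_line (quantity) VALUES (?)", (-1,))
```

Putting the rule in the table definition means the DBMS enforces it for every program that writes to the table.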
10
Back to Data Modeling
●
Data modeling is the process of identifying all the
building blocks
●
What entities are needed to support the project?
●
What entity attributes are needed?
●
How are the entities related?
●
What other constraints are imposed on the data?
11
Think About Inputs and Outputs
●
Inputs
– Data that comes into the system
●
Outputs
– Data/information that goes out of the system
●
Think functionality
●
Example: registering for classes
12
Business Rules
●
The answers to the previous questions lead to
business rules
●
Business rule
– A brief unambiguous description of a policy, procedure,
or principle within an organization
●
Business rules help to identify entities, attributes,
relationships, and constraints
13
Identifying Business Rules
●
Discussions and communications with:
– Company managers
– Policy makers
– Department managers
– End users
●
Written documentation
14
Modeling Business Rules
●
Nouns are entities and/or attributes
●
Verbs translate to relationships
●
Ask:
– How many instances of A are related to one instance of
B?
– How many instances of B are related to one instance of
A?
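The two questions above are enough to classify a relationship. A minimal sketch, assuming each answer is recorded as the string "one" or "many" (the function name and parameter names are made up for illustration):

```python
def connectivity(a_per_b: str, b_per_a: str) -> str:
    """Classify the relationship between entities A and B.

    a_per_b: how many instances of A are related to one instance of B
    b_per_a: how many instances of B are related to one instance of A
    Each answer is either "one" or "many".
    """
    for answer in (a_per_b, b_per_a):
        if answer not in ("one", "many"):
            raise ValueError(f"expected 'one' or 'many', got {answer!r}")

    a_side = "1" if a_per_b == "one" else "M"
    if b_per_a == "one":
        b_side = "1"
    else:
        # Use N for the second "many" so a many-to-many reads M:N.
        b_side = "N" if a_side == "M" else "M"
    return f"{a_side}:{b_side}"


# A = customer, B = order: one customer per order, many orders per customer.
customer_order = connectivity("one", "many")
# A = student, B = class: many students per class, many classes per student.
student_class = connectivity("many", "many")
```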
15
Naming Conventions
●
Entity name requirements
– Be descriptive of the objects in the business environment
– Use terminology that is familiar to the users
●
Attribute name
– Required to be descriptive of the data represented by the attribute
●
Proper naming
– Facilitates communication between parties
– Promotes self-documentation
16
Data Models Through the Years
17
Hierarchical Model
●
Supports 1:M relationships
●
Parent (also called a segment) can have many children
●
Children can only have one parent
●
IBM Information Management System (IMS)
– Used by COBOL programs
– LHU registration system
18
Network Model
●
Improvement on hierarchical
●
Allows children to have more than one parent
●
Introduced terminology still in use today
19
Terminology
●
Schema and subschema
– Schema: a logical grouping of database objects
– Subschema: the part of the schema available to an application program
●
Data Manipulation Language (DML)
– Commands used to view, insert, delete and change data
●
Data Definition Language (DDL)
– Commands used to create, delete, and change database objects
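These terms carried over into SQL, so the DDL/DML split can be shown with a short sketch using Python's sqlite3 module. The student table and its rows are hypothetical; the point is only which statements define objects and which manipulate data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: create a database object.
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")

# DML: insert, change, delete, and view data.
conn.execute("INSERT INTO student VALUES (1, 'Ada')")
conn.execute("INSERT INTO student VALUES (2, 'Grace')")
conn.execute("UPDATE student SET name = 'Ada L.' WHERE student_id = 1")
conn.execute("DELETE FROM student WHERE student_id = 2")
rows = conn.execute("SELECT student_id, name FROM student").fetchall()

# DDL again: delete the database object itself.
conn.execute("DROP TABLE student")
```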
20
Relational Model
●
Developed by E. F. Codd (IBM)
●
Treated the database mathematically
●
Basic idea: a mathematical relation
– Two dimensional structure of intersecting rows and columns
●
A relation can be thought of as a table
– Rows represent occurrences of an entity (tuple)
– Columns are attributes of the entity
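The idea of a relation as a set of tuples can be sketched in plain Python. The Student type and its three attributes are assumptions made for illustration. Because a mathematical relation is a set, a repeated tuple does not create a second row.

```python
from typing import NamedTuple


class Student(NamedTuple):
    """One tuple (row) of the hypothetical STUDENT relation."""
    student_id: int
    name: str
    major: str


# A relation is a set of tuples: the duplicate below collapses into one row.
student_relation = {
    Student(1, "Ada", "CS"),
    Student(2, "Grace", "Math"),
    Student(1, "Ada", "CS"),
}

attributes = Student._fields          # the columns
row_count = len(student_relation)     # the number of distinct rows
```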
21
Relational Model
●
A mathematical algebra is defined using
relations and operators
– Allows one to prove characteristics of the relational
model
●
More in Chapter 3
22
RDBMS
●
Relational Database Management System
●
Implements the relational concepts
●
Users only see the database as a collection of
tables
●
Tables are related through a common attribute
23
Linking Tables
24
Relational Diagram
25
Advantages of RDBMS
●
Physical storage is hidden from the user
– Structural independence
●
Query language
– Allows users to easily work with data and database
objects
– Structured Query Language (SQL) is the common
language used
26
Components of an RDBMS
●
End-user interface
– Allows end user to interact with the data
●
Collection of tables stored in the database
– Each table is independent of the others
– Rows in different tables are related based on common values in
common attributes
●
SQL engine
– Executes all queries
27
Entity Relationship Model (ERM)
●
Graphical representation of a data model
●
Closely related to relational model
●
Entity Relationship Diagram (ERD)
– Graphical representation of the database
components
28
ERM Definitions
●
Entity instance (entity occurrence)
– A row in a relational table
●
Entity set
– A collection of like entities
●
Connectivity
– The type of relationship between entities
– Classifications include 1:1, 1:M, and M:N
29
ERD Styles
●
Chen notation
– A representation of the entity relationship diagram that uses shapes to
identify entities, attributes, and relationships
●
Crow’s foot notation
– A representation of the entity relationship diagram that uses a three-pronged
symbol to represent the “many” sides of the relationship
●
Class diagram notation
– The set of symbols used in the creation of class diagrams
– Part of UML
30
ERD Styles
31
For COMP255
●
We will be using Crow’s Foot notation
32
Object-Oriented Data Model
●
Applies Object-Oriented concepts to the database
●
Instead of tables there are objects
●
Objects include facts(data), relationships between the
facts, relationships with other objects
●
Applies classes, inheritance, etc. to the data model
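A rough sketch of these ideas using Python dataclasses; it shows the object-oriented concepts only, not an actual object database. All class names, fields, and the course title are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Course:
    code: str
    title: str


@dataclass
class Person:
    name: str

    def describe(self) -> str:
        return f"Person: {self.name}"


@dataclass
class Student(Person):
    """Inherits name and describe() from Person and adds its own data."""
    student_number: str = ""
    # Relationship with other objects: the courses this student takes.
    courses: List[Course] = field(default_factory=list)


ada = Student(name="Ada", student_number="123456")
ada.courses.append(Course("COMP255", "Hypothetical course title"))
```

Here the object holds its facts (name, student_number), its relationships to other objects (courses), and behavior inherited from a parent class.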
33
Two More Models
●
Extended relational data model (ERDM)
– Supports OO features, extensible data types based on classes,
and inheritance
– Object/relational database management system (O/R DBMS):
based on ERDM
●
Extensible Markup Language (XML)
– Manages unstructured data for efficient and effective exchange
of structured, semistructured, and unstructured data
34
Big Data and NoSQL
●
Goals of Big Data
– Find new and better ways to manage large amounts of web and
sensor-generated data
– Provide high performance at a reasonable cost
●
Characteristics of Big Data
– Volume
– Velocity
– Variety
35
Big Data and NoSQL
●
Challenges of Big Data
– Volume doesn’t allow usage of conventional structures
– Expensive
– OLAP tools proved inconsistent dealing with unstructured data
●
New technologies of Big Data
– Hadoop
– Hadoop Distributed File System (HDFS)
– MapReduce
– NoSQL
36
Big Data and NoSQL
●
NoSQL (Not Only SQL) databases
– Not based on the relational model
– Support distributed database architectures
– Provide high scalability, high availability, and fault tolerance
– Support large amounts of sparse data
– Geared toward performance rather than transaction consistency
– Provide a broad umbrella for data storage and manipulation
37
Data Models
38
Advantages/Disadvantages
39
For COMP255
●
Focus on:
– Relational Data Model
– Entity Relationship Data Model
●
Talk more about NoSQL databases at the end
of the semester
40
Data Abstraction
41
External Model
●
What the end user sees
– Based on what they are allowed to access
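In a relational DBMS, an external view of this kind is often implemented as a SQL view. A minimal sketch with Python's sqlite3 module, assuming a hypothetical employee table whose salary column should be hidden from most end users:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The full table, as seen at the conceptual level.
CREATE TABLE employee (
    emp_id     INTEGER PRIMARY KEY,
    name       TEXT,
    department TEXT,
    salary     REAL
);
INSERT INTO employee VALUES
    (1, 'Ada', 'Engineering', 120000),
    (2, 'Grace', 'Research', 130000);

-- What one group of end users is allowed to see: no salary column.
CREATE VIEW employee_directory AS
    SELECT emp_id, name, department FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
visible_columns = [col[0] for col in cursor.description]
directory_rows = cursor.fetchall()
```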
42
Conceptual Model
●
The whole database
43
Internal Model
●
The DBMS representation
●
Implementation of the conceptual model
44
Physical Model
●
Operates at lowest level of abstraction
●
Describes the way data are saved on storage
media such as magnetic, solid state, or optical
media
45