COMP5320 2025 Wk2 L2 Data Modelling 1 - Tagged
Modelling 1
COMP5320: DATABASE SYSTEMS
Stefan Marr (s.marr@kent.ac.uk)
Vanessa Bonthuys (v.bonthuys@kent.ac.uk)
Kemi Ademoye (k.ademoye@kent.ac.uk)
Acknowledgements: Many thanks to Stefan Marr & Peter Rodgers for lecture notes from previous years.
Stages in Database Development
Problem Analysis:
• Identify and understand requirements:
◦ Data to be collected
◦ Constraints
◦ Facts and enterprise rules
◦ Etc.
Database Design:
• Produce data models
• Identify entities, attributes, relationships
• Select DBMS
Database Implementation:
• Structure data in the physical database.
• E.g. tables, columns, rows, … in a relational database
Database Monitoring/Tuning:
• Monitor DB usage
• Restructure and optimise database
Database Design
Phase: 1. Conceptual database design
Product: Conceptual data model
Dependence on DBMS type: No
Dependence on specific DBMS: No
Conceptual Database Design
• Analyse the requirements.
• Map real-world facts into a high-level description of the data.
• Identify entities, attributes, and the relationships between entities.
• Draw the conceptual data model.
• Independent of DBMS and physical considerations.
Select Type of DBMS
Logical Database Design
• Refine and map conceptual data model to logical data model.
• Check and validate the structure is correct for the type of DBMS selected.
• Need to know the type of DBMS (i.e., relational, hierarchical, NoSQL, etc.), not which specific DBMS.
Select Specific DBMS
Database Design: Today’s Focus
Phase: 1. Conceptual database design (today)
Product: Conceptual data model
Dependence on DBMS type: No
Dependence on specific DBMS: No
Conceptual Data Modelling
HIGHEST LEVEL, CONCERNED WITH PROBLEM DOMAIN
Data Modelling
•Concerned with understanding the problem domain.
◦ Identify, classify, and structure elements in the design.
Entity-Relationship (ER)
Modelling
•A technique for conceptual data modelling
•Entity-Relationship (ER) modelling is a top-down approach that begins with
identifying the important data: entities and relationships.
•More details about these entities and relationships that we want to store are
then added, i.e., attributes and constraints.
•We will use the UML (Unified Modelling Language) notation in the module.
◦ Various other diagrammatic notations are available for ER diagrams; please do not use them!
ER Modelling Example
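[Figure: An ER model for a Course Administration System]
• Student (student_id {PK}, name, …) with subclasses PG (research_topic) and UG (degree, stage); coverage {Total, OR}.
• Module (module_code {PK}, title).
• Staff.
• UG 0..* Registers On 1..* Module, with relationship attributes year, coursework_mark, exam_mark.
• Tutors and the other Staff relationships carry the remaining labels shown in the figure: multiplicities 0..*, 0..*, 0..* and 1..1, and attributes year and is_convenor.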
Attributes
•Attributes refer to properties of an entity or a relationship.
•For example:
◦ Entity: Student
◦ Attributes: student_id, name, dob, address
•An attribute domain is the set of allowable values for an attribute. For
example:
◦ Qty is an integer between 1 and 10
◦ DOB is a date that fits within a range, …
◦ Name is a character string, …
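One way to picture an attribute domain is as a membership test for each attribute. The following is a minimal Python sketch under stated assumptions: the function names, the date bounds, and the maximum name length are invented for illustration and do not come from the module material.

```python
from datetime import date

# Hypothetical domain checks, one per example attribute above.
def in_qty_domain(value):
    """Qty: an integer between 1 and 10 (inclusive)."""
    return isinstance(value, int) and not isinstance(value, bool) and 1 <= value <= 10

def in_dob_domain(value, earliest=date(1900, 1, 1), latest=date(2025, 12, 31)):
    """DOB: a date that fits within a range (these bounds are assumed)."""
    return isinstance(value, date) and earliest <= value <= latest

def in_name_domain(value, max_length=100):
    """Name: a non-empty character string (the max length is assumed)."""
    return isinstance(value, str) and 0 < len(value) <= max_length

print(in_qty_domain(5), in_qty_domain(11))  # True False
```

A value outside the domain (such as a Qty of 11) is simply rejected; the domain says nothing about which entity instance the value belongs to.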
Attribute Types
Attributes may be:
•Single-valued
◦ Attribute contains a single value for each instance of an entity
◦ E.g., date_of_birth
•Multi-valued
◦ Attribute may contain more than one value for an entity instance
◦ E.g., hobby, telephone number
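The difference between the two attribute types can be sketched with a small Python class. This is only an illustration: the `Person` class and the sample values are hypothetical, and a list is just one possible way to hold a multi-valued attribute.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Person:
    # Single-valued: exactly one value per entity instance.
    date_of_birth: date
    # Multi-valued: zero or more values per entity instance.
    hobbies: list[str] = field(default_factory=list)
    telephone_numbers: list[str] = field(default_factory=list)

p = Person(date_of_birth=date(2001, 3, 14))
p.hobbies.extend(["chess", "cycling"])   # two hobbies for the same person
p.telephone_numbers.append("555-0100")   # made-up sample number
print(p)
```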
Primary Key (PK)
•The candidate key selected to uniquely identify each instance of an entity.
•Every instance of an entity is uniquely identified by a primary key.
No foreign keys in conceptual model! They are specific to the relational model.
Primary Key (PK) contd.
•Every instance of an entity is uniquely identified by a primary key.
•The PK is the candidate key selected to uniquely identify each instance
(occurrence) of an entity.
Student
student_id {PK}
name
dob
address
•Primary Key may consist of more than one attribute (known as a composite key).
•We do identify candidate keys and primary keys in the conceptual model.
No foreign keys in conceptual model! They are specific to the relational model.
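To make the idea of a candidate key concrete, the sketch below searches a small set of made-up Student instances for minimal attribute combinations whose values are unique. The helper names and the sample data are assumptions for illustration. Note the limitation: uniqueness in one sample only suggests a candidate key, while a real key must follow from the business rules.

```python
from itertools import combinations

def is_unique(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, attributes):
    """Minimal attribute sets that are unique in this sample."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(k) <= set(attrs) for k in found):
                continue
            if is_unique(rows, attrs):
                found.append(attrs)
    return found

# Made-up sample instances of a Student entity.
students = [
    {"student_id": 1, "name": "Ana", "dob": "2003-01-05"},
    {"student_id": 2, "name": "Ben", "dob": "2003-01-05"},
    {"student_id": 3, "name": "Ana", "dob": "2004-07-19"},
]
print(candidate_keys(students, ["student_id", "name", "dob"]))
# [('student_id',), ('name', 'dob')]
```

Here (name, dob) happens to be unique in the sample, but two students could share a name and birthday, so student_id remains the sensible primary key.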
Question 1: Candidate and
Primary Keys
Consider the following Staff entity:
•Identify all candidate keys.
•Select a primary key for the table.
Staff
name
staff_no
dob
address
ni_number
login
salary
Homework 1
READ THE NEXT TWO SLIDES ON PRIMARY KEY GUIDELINES
Primary Key (PK) Guidelines (1)
PK Characteristic: Guideline and Rationale
•Unique values: The PK attribute (or combination of attributes) must uniquely identify each entity instance (a row in the table).
•No NULLs: PK attribute(s) cannot contain nulls.
•Should not change over time (should be permanent and unchangeable): Attributes with semantic meaning may be subject to updates in future. E.g., if you make the full name a primary key (Vickie J. Smith), what if the person changes their name? All foreign keys referencing it would need to be updated too.
•Preferably single-attribute: PKs should have the minimum number of attributes possible. Single-attribute keys are desirable but not required. You may need to use a composite key (a multi-attribute key). Composite keys are useful when decomposing many-to-many relationships (more on this later).
Primary Key (PK) Guidelines (2)
PK Characteristic: Guideline and Rationale
•Preferably numeric: Numeric values can be easier to manage, e.g., the DBMS can generate them automatically for you using autoincrement values.
•Security-compliant: PK attribute(s) should not include any attribute that raises GDPR concerns or may pose a security risk. E.g., using the NI Number as the PK in an employee table is not a good idea.
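The "preferably numeric" guideline can be tried out with Python's built-in `sqlite3` module. This is a sketch that looks ahead to the implementation stage, not part of the conceptual model; the table layout, column names, and sample rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE staff (
        staff_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate numeric PK
        name     TEXT NOT NULL,
        login    TEXT NOT NULL UNIQUE                -- unique, but not the PK
    )
    """
)
# No staff_id is supplied: the DBMS generates it.
conn.execute("INSERT INTO staff (name, login) VALUES (?, ?)", ("Ana Example", "ae1"))
conn.execute("INSERT INTO staff (name, login) VALUES (?, ?)", ("Ben Example", "be2"))

rows = conn.execute("SELECT staff_id, login FROM staff ORDER BY staff_id").fetchall()
print(rows)  # [(1, 'ae1'), (2, 'be2')]

# A duplicate PK value is rejected, which enforces the "unique values" guideline.
try:
    conn.execute("INSERT INTO staff (staff_id, name, login) VALUES (1, 'Dup', 'dup')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```

The sensitive NI Number is deliberately left out of the key; if it had to be stored, it would be an ordinary attribute rather than the PK.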
Relationships
Aim to give relationships meaningful, unique names.
[Figure labels: Location; Ternary Relation]
Relationships: Multiplicity (1)
•Multiplicities help us to define the constraints on entities participating in a relationship, as
perceived in the real world (i.e., business rules).
•Entities can participate to varying degrees in relationships with other entities.
•Multiplicity defines the number, or range (i.e., min..max), of possible occurrences
of an entity that may participate in a relationship with another entity type.
◦ How many possible (min and max) occurrences of entity “B” may relate to a single occurrence of
entity “A”, and vice versa?
Multiplicities Explained (2)
Staff Teaches Module Relationship
•Consider the ‘Teaches’ relationship.
•How:
◦ Look at each entity participating in the relationship one at a time.
◦ Important: multiplicities are based on the business rules of the company you are designing
the database for.
Multiplicities Explained (3):
Staff Entity
Staff Teaches Module Relationship
•Looking at the first entity: Staff
◦ For each member of ‘Staff’, how many occurrences of ‘Module’ can they be associated with?
How many possible modules can a single member of staff teach?
◦ One member of Staff may teach 0 or more (many) modules
•Add this multiplicity for the Staff entity to the diagram first:
◦ Multiplicities are specified at the “far end” (opposite side) of the entity being considered.
Staff --- Teaches --- 0..* Module
Multiplicities Explained (4): Module
Entity
Staff Teaches Module Relationship
•Our diagram with multiplicity constraints so far:
Staff --- Teaches --- 0..* Module
Multiplicities Explained (5):
Staff Teaches Module Relationship
•Adding in the multiplicity constraints for the Module entity gives the following:
◦ Multiplicities are specified at the “far end” (opposite side) of the entity being considered.
•The multiplicity constraints for the ‘Teaches’ relationship are now complete
◦ Both entities participating in the relationship have been considered.
Multiplicity Constraints: Business
Rules
•Multiplicity constraints are determined by the business rules of the organisation
that owns the data you are trying to model.
•In the Staff ‘Teaches’ Module relationship, the business rules are:
◦ A staff member may not yet teach any module (e.g., they are a new member of staff and have
not yet been given a module to teach) – min: 0
◦ A staff member may teach multiple modules (e.g., Kemi teaches COMP3280, COMP5820,
COMP5830, COMP5320) – max: * (many, more than 1)
◦ A module must be taught by a member of staff (i.e., someone must teach the module) - min: 1
◦ A module may be taught by 1 or more (max of 5) members of staff – max: 5 (or you can also use
* for many, more than 1, and create a more flexible solution)
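The four business rules above can be read as two (min, max) ranges and checked against sample relationship instances. The sketch below is one possible way to do that in Python; the function name, the identifiers S1 to S3 and M1 to M3, and the sample data are all hypothetical.

```python
from collections import Counter

def check_teaches(pairs, staff, modules,
                  modules_per_staff=(0, None), staff_per_module=(1, 5)):
    """List violations of the declared ranges; a max of None stands for '*'."""
    def out_of_range(instances, counts, bounds, label):
        low, high = bounds
        problems = []
        for instance in instances:
            n = counts[instance]  # a Counter returns 0 for missing keys
            if n < low or (high is not None and n > high):
                upper = "*" if high is None else high
                problems.append(f"{label} {instance}: {n} outside {low}..{upper}")
        return problems

    staff_counts = Counter(s for s, _ in pairs)
    module_counts = Counter(m for _, m in pairs)
    return (out_of_range(staff, staff_counts, modules_per_staff, "Staff")
            + out_of_range(modules, module_counts, staff_per_module, "Module"))

staff = ["S1", "S2", "S3"]      # S3 is new and teaches nothing yet (allowed)
modules = ["M1", "M2", "M3"]    # M3 has nobody teaching it (not allowed)
teaches = {("S1", "M1"), ("S1", "M2"), ("S2", "M1")}

problems = check_teaches(teaches, staff, modules)
print(problems)  # ['Module M3: 0 outside 1..5']
```

Changing `staff_per_module` to `(1, None)` gives the more flexible 1..* variant mentioned in the last rule.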
Question 2: Multiplicity
Constraints
•The University of Kent has Academic Schools,
e.g., KBS, Computing, etc.
•Schools offer courses to students, e.g.:
◦ In Computing we offer 5 undergraduate courses.
◦ KBS offers 11 undergraduate courses.
Relationships: Multiple
Associations
•There may be more than one association between two entities.
◦ Will depend on the business information requirements.
•For example:
◦ Person Owns Car: a person owns a car.
◦ Person Drives Car: a person drives a car.
Relationships: Instance
Diagrams
•Instance diagrams are used to visualise properties of a relationship.
•For example, the relationship, Owns, between Person and Car.
[Instance diagram: Person entity (P1 to P5) linked to Car entity (C1 to C5) by Owns relationship instances]
We can see that:
• Person P1 Owns Car C1
• Person P2 Owns Cars C2, C3
• Person P3 Owns Car C4
• Person P4 does not Own any car.
• Person P5 Owns Car C5
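The same instance diagram can be written down as a set of (person, car) pairs, from which the observed participation ranges follow directly. This is a small Python sketch; the helper name `observed_range` is an assumption, and the ranges it reports describe only this snapshot, not the business rules.

```python
# The Owns relationship instances from the diagram.
owns = {("P1", "C1"), ("P2", "C2"), ("P2", "C3"), ("P3", "C4"), ("P5", "C5")}
persons = ["P1", "P2", "P3", "P4", "P5"]
cars = ["C1", "C2", "C3", "C4", "C5"]

def observed_range(instances, pairs, position):
    """(min, max) number of pairs each instance takes part in."""
    counts = [sum(1 for pair in pairs if pair[position] == inst) for inst in instances]
    return min(counts), max(counts)

cars_per_person = observed_range(persons, owns, 0)
persons_per_car = observed_range(cars, owns, 1)
print(cars_per_person)  # (0, 2): P4 owns no car, P2 owns two
print(persons_per_car)  # (1, 1): every car has exactly one owner
```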
Data Abstraction
SPECIALISATION/GENERALISATION
Abstraction
•Represents a very high-level approach to data modelling
•The following types of abstractions are the building blocks for all data models:
◦ Classification (grouping concepts): done
◦ Aggregation (composing)
◦ Specialisation/generalisation (hierarchy): we are interested in this next
Specialisation/Generalisation
•Defines a hierarchical class (superclass) for a collection of classes (subclasses)
with common attributes.
•Attributes of the superclass are inherited by subclasses.
•A subclass is associated with the superclass by the is-a relationship. E.g.:
◦ UG is-a Student
◦ PG is-a Student
[Figure: Student (student_id, name, address) with subclasses UG (degree) and PG (research_topic)]
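Class inheritance in a programming language behaves much like the is-a relationship, so the hierarchy can be sketched in Python. The attribute types and sample values below are assumptions; only the class and attribute names come from the diagram.

```python
from dataclasses import dataclass

@dataclass
class Student:            # superclass: the common attributes
    student_id: int
    name: str
    address: str

@dataclass
class UG(Student):        # UG is-a Student
    degree: str

@dataclass
class PG(Student):        # PG is-a Student
    research_topic: str

ug = UG(student_id=1, name="Ana", address="Canterbury", degree="BSc Computer Science")
pg = PG(student_id=2, name="Ben", address="Canterbury", research_topic="Query optimisation")

# Subclasses inherit student_id, name and address from the superclass.
print(ug.name, isinstance(ug, Student))  # Ana True
print(pg.name, isinstance(pg, Student))  # Ben True
```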
Specialisation/Generalisation
In Data Models
•The basic ER model has some limitations.
◦ These hierarchical relationships cannot be directly implemented in a relational database.
◦ That is, they cannot be used in logical data models.
Specialisation/Generalisation
Constraints
The criterion for a specialisation/generalisation abstraction is described by two
important coverage properties:
•Participation constraint
•Disjoint/overlapping constraint
Specialisation/Generalisation
Coverage Properties: Participation
Constraint
Two types of participation constraints:
•Mandatory: each instance of the superclass must also be a member of a
subclass (i.e., Total Participation).
•Optional: each instance of a superclass need not (may not) be a member of a
subclass (i.e., Partial Participation)
[Figure: Student (student_id, name, address), {Total}, with subclasses UG (degree) and PG (research_topic)]
[Figure: IT_Specialist, {Partial}, with subclasses Databases, Java, UNIX, PHP]
Specialisation/Generalisation
Coverage Properties: Disjoint/Overlapping
Constraint
Disjoint/overlapping constraint is defined as follows:
•Disjoint: A superclass instance can be a member of only one subclass (OR).
•Overlapping: A superclass instance can be a member of more than one
subclass (AND).
[Figure: Student (studentID, name, address), {Total, OR}, with subclasses UG (degree) and PG (researchTopic)]
[Figure: IT_Specialist, {Partial, AND}, with subclasses Databases, Java, UNIX, PHP]
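Both coverage properties can be illustrated by looking at which subclasses each superclass instance belongs to. The Python sketch below classifies a hierarchy from sample memberships; the function name and the instance identifiers are hypothetical, and in practice the properties are decided by the business rules rather than inferred from data.

```python
def coverage(superclass_instances, subclasses):
    """Return (participation, disjointness), e.g. ("Total", "OR").

    subclasses maps each subclass name to the set of its member instances.
    """
    membership = {
        inst: [name for name, members in subclasses.items() if inst in members]
        for inst in superclass_instances
    }
    # Total: every superclass instance appears in at least one subclass.
    participation = "Total" if all(membership.values()) else "Partial"
    # OR (disjoint): no instance appears in more than one subclass.
    disjointness = "OR" if all(len(m) <= 1 for m in membership.values()) else "AND"
    return participation, disjointness

students = coverage({"s1", "s2", "s3"}, {"UG": {"s1", "s2"}, "PG": {"s3"}})
specialists = coverage(
    {"i1", "i2", "i3"},
    {"Databases": {"i1"}, "Java": {"i1", "i2"}, "UNIX": set(), "PHP": set()},
)
print(students)     # ('Total', 'OR')
print(specialists)  # ('Partial', 'AND')
```

In the second example i3 belongs to no subclass (partial) and i1 belongs to two (overlapping).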
Specialisation/Generalisation
Coverage Properties: Example (1) -
Student
•Total (Mandatory): Every instance of the superclass (Student) must belong to
one of the student subclasses (UG or PG)
◦ E.g. If I am a student, I must be either a PG OR UG student (must belong to a subclass)
[Figure: subclasses UG (degree) and PG (research_topic)]
Specialisation/Generalisation
Coverage Properties: Example (2) - IT
Specialist
•Partial (Optional): An instance of the superclass (IT Specialist) need not
belong to any of the subclasses (i.e., Databases, Java, UNIX, or PHP specialist)
◦ E.g. I may be an IT specialist but not a specialist in any of Databases, Java, PHP, or UNIX (I do
not belong to a subclass).
Homework 2
READ THE CASE STUDY ON THE FOLLOWING SLIDES
TRY THE PRACTICE QUESTIONS
Course Administration System
Case Study: Requirements
Basic Facts
•Staff
◦ Each staff has a unique staff ID, name, date of birth, etc.
◦ They may teach modules, tutor undergraduate students, and supervise post-graduate students.
•Students
◦ Each student has a unique student ID, name, date of birth, etc.
◦ Undergraduates are on a degree program and at a stage of study, and registered on modules
◦ Get coursework and exam marks for each of their registered modules in the year they studied the module.
◦ Each postgraduate works on a specific research topic.
•Modules
◦ Each module has a unique module code, title, and is taught by one or more staff.
◦ Staff teaching a module, and the convenor, may change each year.
Constraints: On various attribute domains
Such a description corresponds to a requirements analysis.
ER Modelling Tips
The following are some tips to help you identify the data elements.
•Entities: nouns and noun phrases identified during requirements gathering for the database (the
case study description in this case) are usually your entities.
◦ Remember they represent a group of real-world (physical) or abstract objects we are interested in storing
data about.
◦ E.g., person, place, thing, concept, event.
•Attributes: Think of attributes as qualities or characteristics an entity has – these are also often
nouns.
•Entity Occurrence: remember to consider how many possible minimum and maximum occurrences
of the entity will exist: zero (0), one (1), or many (*), or fixed (e.g. 5).
•Relationships: How are the different entities connected – what is the relationship between them?
Think of relationships as verbs (what the entity can do or experience).
Course Administration System
Case Study: Identifying Entities, etc.
Legend: bold underline: entities; italics: attributes; red italics: relationships
Basic Facts
•Staff
◦ Each staff has a unique staff ID, name, date of birth, etc.
◦ They may teach modules, tutor undergraduate students and supervise post-graduate students.
•Students
◦ Each student has a unique student ID, name, date of birth, etc.
◦ Undergraduates are on a degree program and at a stage of study, and registered on modules
◦ Get coursework and exam marks for each of their registered modules in the year they studied the module.
◦ Each postgraduate works on a specific research topic.
•Modules
◦ Each module has a unique module code, title, and is taught by one or more staff.
◦ Staff teaching a module, and the convenor, may change each year.
Course Administration System
Case Study: Entities
•Identify the major entities
Staff Student Module
Course Administration System
Case Study: Relationships
•Establish relationships between the entities
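[Figure: relationships between the entities]
• Student has subclasses PG and UG.
• UG Registers On Module (relationship attributes: year, coursework_mark, exam_mark).
• Staff Supervises PG.
• Staff Tutors UG.
• Staff Teaches Module (relationship attributes: year, is_convenor).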
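Attributes such as year, coursework_mark and is_convenor describe a relationship instance rather than either entity on its own. One way to picture this is a small Python record per relationship instance, as sketched below. The class names `Registration` and `Teaching`, the attribute types, and the sample values are assumptions; the id fields merely stand in for the link between two instances and are not a statement about foreign keys in the conceptual model.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Registration:
    """One instance of 'UG Registers On Module'."""
    student_id: int
    module_code: str
    year: int
    coursework_mark: Optional[float] = None  # unknown until marked
    exam_mark: Optional[float] = None

@dataclass(frozen=True)
class Teaching:
    """One instance of 'Staff Teaches Module'."""
    staff_id: int
    module_code: str
    year: int
    is_convenor: bool = False

registrations = [
    Registration(1, "COMP5320", 2024, coursework_mark=62.0, exam_mark=58.0),
    Registration(2, "COMP5320", 2025),  # marks not yet available
]
teaching = [
    Teaching(10, "COMP5320", 2024, is_convenor=True),
    Teaching(11, "COMP5320", 2025, is_convenor=True),  # convenor changed this year
]

# The marks belong to (student, module, year), not to the student alone.
print(registrations[0].exam_mark)
print([t.staff_id for t in teaching if t.is_convenor])
```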
Course Administration System
Case Study: Entity Attributes
•Identify attributes associated with each entity
• Staff: staff_id, name, dob, …
• Student: student_id, name, dob, …
◦ UG: degree, stage
◦ PG: research_topic
• Module: module_code, title
Course Administration System
Case Study: ER Data Model
(Conceptual)
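[Figure: An ER model for a Course Administration System]
• Student (student_id {PK}, name, …) with subclasses PG (research_topic) and UG (degree, stage); coverage {Total, OR}.
• Module (module_code {PK}, title).
• Staff.
• UG 0..* Registers On 1..* Module, with relationship attributes year, coursework_mark, exam_mark.
• Tutors and the other Staff relationships carry the remaining labels shown in the figure: multiplicities 0..*, 0..*, 0..* and 1..1, and attributes year and is_convenor.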
Practice Questions
•What is the difference between the candidate key and primary key of a table
(entity)?
•What are the two main requirements of a primary key?
•A staff entity includes the unique attributes: NI Number and Staff ID.
◦ Which attribute is better to use as the primary key, and why?
◦ Why is the other attribute not a good choice?
Practice Question: May 2021
Exam
•Consider the following conceptual data model
diagram, which shows a hierarchical relationship.
•It is designed for the personal database of a
collector of coins and banknotes.
•Identify the appropriate coverage properties of
this hierarchy and explain why.
[Figure: Currency superclass with subclasses Banknotes and Coins]
Summary
•Conceptual data modelling involves examining properties of a system to
determine:
◦ What data objects are relevant?
◦ How are they related to each other?
◦ How to describe an object?
◦ What constraints must always be satisfied?
Next Lecture
MORE DATA MODELLING