DBMS Notes 1-8
1. What is Data?
a. Data is a collection of raw, unorganized facts and details like text, observations, figures, symbols,
and descriptions of things etc.
In other words, data does not carry any specific purpose and has no significance by itself.
Moreover, data is measured in terms of bits and bytes – which are basic units of information in the
context of computer storage and processing.
b. Data can be recorded and doesn’t have any meaning unless processed.
2. Types of Data
a. Quantitative
i. Numerical form
ii. Weight, volume, cost of an item.
b. Qualitative
i. Descriptive, but not numerical.
ii. Name, gender, hair color of a person.
3. What is Information?
a. Information is processed, organized, and structured data.
b. It provides context of the data and enables decision making.
c. It is processed data that makes sense to us.
d. Information is extracted from the data, by analyzing and interpreting pieces of data.
e. E.g., suppose you have records of all the people living in your locality; that is data. When you
analyze and interpret those records, you might conclude that:
i. There are 100 senior citizens.
ii. The sex ratio is 1.1.
iii. There are 100 newborn babies.
These conclusions are information.
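The locality example can be sketched in a few lines of Python. This is only an illustration: the sample records, the `summarize` helper, the senior-citizen age threshold of 60, and the definition of sex ratio as males per female are all assumptions, not part of the notes.

```python
# Raw data: one record per resident (hypothetical sample values).
residents = [
    {"name": "Asha", "age": 72, "sex": "F"},
    {"name": "Ravi", "age": 35, "sex": "M"},
    {"name": "Meera", "age": 0, "sex": "F"},
    {"name": "John", "age": 64, "sex": "M"},
    {"name": "Kiran", "age": 68, "sex": "M"},
]


def summarize(people, senior_age=60):
    """Turn raw records into information by counting and comparing."""
    seniors = sum(1 for p in people if p["age"] >= senior_age)
    newborns = sum(1 for p in people if p["age"] == 0)
    males = sum(1 for p in people if p["sex"] == "M")
    females = sum(1 for p in people if p["sex"] == "F")
    return {
        "senior_citizens": seniors,
        "newborns": newborns,
        # Assumed definition: males per female.
        "sex_ratio": males / females if females else None,
    }


info = summarize(residents)
print(info)
```

The list `residents` is data; the dictionary `info` is information, because it answers questions a decision-maker would actually ask.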
4. Data vs Information
a. Data is a collection of facts, while information puts those facts into context.
b. While data is raw and unorganized, information is organized.
c. Data points are individual and sometimes unrelated. Information maps out that data to provide a
big-picture view of how it all fits together.
d. Data, on its own, is meaningless. When it’s analyzed and interpreted, it becomes meaningful
information.
e. Data does not depend on information; however, information depends on data.
f. Data typically comes in the form of graphs, numbers, figures, or statistics. Information is typically
presented through words, language, thoughts, and ideas.
g. Data isn’t sufficient for decision-making, but you can make decisions based on information.
5. What is Database?
a. Database is an electronic place/system where data is stored in a way that it can be easily accessed,
managed, and updated.
b. To make real use of data, we need database management systems (DBMS).
6. What is DBMS?
a. A database-management system (DBMS) is a collection of interrelated data and a set of
programs to access those data. The collection of data, usually referred to as the database,
contains information relevant to an enterprise. The primary goal of a DBMS is to provide a way to
store and retrieve database information that is both convenient and efficient.
b. A DBMS is the database itself, along with all the software and functionality. It is used to perform
different operations, like addition, access, updating, and deletion of the data.
7. File-processing systems vs DBMS
a. File-processing systems have major disadvantages:
i. Data Redundancy and inconsistency
ii. Difficulty in accessing data
iii. Data isolation
iv. Integrity problems
v. Atomicity problems
vi. Concurrent-access anomalies
vii. Security problems
b. The seven points above are also the advantages of a DBMS (the answer to “Why use a DBMS?”), since a
DBMS addresses each of them.
LEC-2: DBMS Architecture
e. Logical level / Conceptual level:
i. The conceptual schema describes the design of a database at the conceptual level,
describes what data are stored in DB, and what relationships exist among those data.
ii. A user at the logical level does not need to be aware of physical-level structures.
iii. The DBA, who must decide what information to keep in the DB, uses the logical level of
abstraction.
iv. Goal: ease of use.
f. View level / External level:
i. The highest level of abstraction; it aims to simplify users’ interaction with the system by
providing a different view to each group of end users.
ii. Each view schema describes the part of the database that a particular user group is interested
in and hides the rest of the database from that group.
iii. At the external level, a database has several schemas, sometimes called subschemas. Each
subschema describes a different view of the database.
iv. Views also provide a security mechanism to prevent users from accessing certain parts
of the DB.
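A view that hides part of the database can be sketched with Python's built-in `sqlite3` module. The `employee` table, its columns, and the `employee_directory` view are hypothetical names chosen for illustration. SQLite itself has no user accounts, so in a multi-user DBMS the security effect would come from granting a user access to the view but not to the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full table, including a sensitive column.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 90000.0), (2, "Ravi", 65000.0)],
)

# View level: a subschema that exposes only the non-sensitive columns.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name FROM employee"
)

cur = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
view_columns = [d[0] for d in cur.description]
view_rows = cur.fetchall()
print(view_columns, view_rows)
```

A user who queries only `employee_directory` never sees the `salary` column.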
2. Instances and Schemas
a. The collection of information stored in the DB at a particular moment is called an instance of DB.
b. The overall design of the DB is called the DB schema.
c. A schema is the structural description of the data. The schema doesn’t change frequently; the data may
change frequently.
d. DB schema corresponds to the variable declarations (along with type) in a program.
e. We have 3 types of Schemas: Physical, Logical, several view schemas called subschemas.
f. Logical schema is most important in terms of its effect on application programs, as programmers
construct apps by using logical schema.
g. Physical data independence: a change to the physical schema should not affect the logical
schema or application programs.
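The difference between a schema and an instance can be seen in a small `sqlite3` sketch. The `student` table and the helper names `schema` and `instance` are assumptions made for this example. The schema is read back with SQLite's `PRAGMA table_info`, and the instance is simply the set of rows present at that moment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)


def schema(connection):
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    rows = connection.execute("PRAGMA table_info(student)")
    return [(row[1], row[2]) for row in rows]


def instance(connection):
    # The instance is whatever rows are stored right now.
    query = "SELECT student_id, name FROM student ORDER BY student_id"
    return connection.execute(query).fetchall()


schema_before = schema(conn)
instance_before = instance(conn)

conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("INSERT INTO student VALUES (2, 'Ravi')")

schema_after = schema(conn)
instance_after = instance(conn)

print(schema_before == schema_after)    # the design did not change
print(instance_before, instance_after)  # the stored data did
```

Like a variable declaration, the schema stays fixed while the values held under it change.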
3. Data Models:
a. Provides a way to describe the design of a DB at logical level.
b. Underlying the structure of the DB is the Data Model; a collection of conceptual tools for describing
data, data relationships, data semantics & consistency constraints.
c. E.g., ER model, Relational Model, object-oriented model, object-relational data model etc.
4. Database Languages:
a. Data definition language (DDL) to specify the database schema.
b. Data manipulation language (DML) to express database queries and updates.
c. Practically, both language features are present in a single DB language, e.g., SQL language.
d. DDL
i. We specify consistency constraints, which must be checked every time the DB is updated.
e. DML
i. Data manipulation involves
1. Retrieval of information stored in DB.
2. Insertion of new information into DB.
3. Deletion of information from the DB.
4. Updating existing information stored in DB.
ii. Query language: the part of DML used to specify statements requesting the retrieval of
information.
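The DDL and DML roles described above can be tried out with SQL sent through Python's `sqlite3` module. The `account` table and the rule that a balance may not be negative are assumptions chosen for illustration; the point is only to show one DDL statement carrying a consistency constraint, followed by the four kinds of DML.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema, including a consistency constraint.
conn.execute(
    """
    CREATE TABLE account (
        account_no INTEGER PRIMARY KEY,
        balance    REAL NOT NULL CHECK (balance >= 0)
    )
    """
)

# DML: insertion of new information.
conn.execute("INSERT INTO account VALUES (101, 500.0)")
conn.execute("INSERT INTO account VALUES (102, 50.0)")

# DML: updating existing information.
conn.execute("UPDATE account SET balance = balance + 100 WHERE account_no = 101")

# DML: deletion of information.
conn.execute("DELETE FROM account WHERE account_no = 102")

# DML (query language): retrieval of information.
rows = conn.execute("SELECT account_no, balance FROM account").fetchall()
print(rows)

# The constraint declared in the DDL is checked on every update.
try:
    conn.execute("UPDATE account SET balance = -1 WHERE account_no = 101")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True
print(constraint_enforced)
```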
5. How is Database accessed from Application programs?
a. Apps (written in host languages such as C/C++ or Java) interact with the DB.
b. E.g., a banking system’s payroll-generating module accesses the DB by executing DML statements from
the host language.
c. An API is provided to send DML/DDL statements to the DB and retrieve the results.
i. Open Database Connectivity (ODBC), from Microsoft, for C.
ii. Java Database Connectivity (JDBC), for Java.
6. Database Administrator (DBA)
a. A person who has central control of both the data and the programs that access those data.
b. Functions of DBA
i. Schema Definition
ii. Storage structure and access methods.
iii. Schema and physical organization modifications.
iv. Authorization control.
v. Routine maintenance
1. Periodic backups.
2. Security patches.
3. Any upgrades.
7. DBMS Application Architectures: Client machines, on which remote DB users work, and server machines
on which DB system runs.
a. T1 Architecture
i. The client, server & DB are all present on the same machine.
b. T2 Architecture
i. The app is partitioned into two components.
ii. The client machine invokes DB system functionality at the server end through query
language statements.
iii. API standards like ODBC & JDBC are used for interaction between client and server.
c. T3 Architecture
i. App is partitioned into 3 logical components.
ii. The client machine is just a frontend and doesn’t contain any direct DB calls.
iii. The client machine communicates with the App server, and the App server communicates with the DB
system to access data.
iv. Business logic (what action to take under which condition) resides in the App server itself.
v. T3 architecture is best suited for WWW applications.
vi. Advantages:
1. Scalability due to distributed application servers.
2. Data integrity: the App server acts as a middle layer between client and DB, which
minimizes the chances of data corruption.
3. Security, client can’t directly access DB, hence it is more secure.
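The separation of the three tiers can be mimicked inside a single Python process. This is a sketch under loud assumptions: in a real T3 deployment the tiers run on separate machines and talk over a network, whereas here the hypothetical `AppServer` class and `client` function are ordinary objects, and an in-memory SQLite database stands in for the DB server.

```python
import sqlite3

# --- Data tier: the DB system (an in-memory SQLite DB stands in for it).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE account (account_no INTEGER PRIMARY KEY, balance REAL NOT NULL)"
)
db.execute("INSERT INTO account VALUES (101, 500.0)")
db.commit()


# --- Application tier: holds the business logic and all DB calls.
class AppServer:
    def __init__(self, connection):
        self._db = connection

    def balance(self, account_no):
        row = self._db.execute(
            "SELECT balance FROM account WHERE account_no = ?", (account_no,)
        ).fetchone()
        return row[0] if row else None

    def withdraw(self, account_no, amount):
        # Business rules are decided here, not on the client.
        if amount <= 0:
            return "rejected: amount must be positive"
        current = self.balance(account_no)
        if current is None:
            return "rejected: unknown account"
        if current < amount:
            return "rejected: insufficient funds"
        self._db.execute(
            "UPDATE account SET balance = balance - ? WHERE account_no = ?",
            (amount, account_no),
        )
        self._db.commit()
        return "ok"


# --- Client tier: a frontend with no SQL and no DB connection.
def client(server):
    return [
        server.withdraw(101, 200.0),
        server.withdraw(101, 1000.0),
        server.balance(101),
    ]


result = client(AppServer(db))
print(result)
```

Because the client can only call `withdraw` and `balance`, it cannot bypass the insufficient-funds rule or touch the table directly, which is the integrity and security argument made above.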
LEC-3: Entity-Relationship Model
1. Data Model: Collection of conceptual tools for describing data, data relationships, data semantics, and consistency
constraints.
2. ER Model
1. It is a high-level data model based on a perception of the real world as a collection of basic objects, called
entities, and of relationships among these objects.
2. Graphical representation of ER Model is ER diagram, which acts as a blueprint of DB.
3. Entity: An Entity is a “thing” or “object” in the real world that is distinguishable from all other objects.
1. It has physical existence.
2. Each student in a college is an entity.
3. Entity can be uniquely identified. (By a primary attribute, aka Primary Key)
4. Strong Entity: Can be uniquely identified.
5. Weak Entity: Can’t be uniquely identified on its own; it depends on some other strong entity.
1. It doesn’t have sufficient attributes to form a uniquely identifying attribute (key).
2. E.g., Loan is a strong entity and Payment is a weak entity: instalment numbers are a sequential counter
generated separately for each loan, so a payment number alone doesn’t identify a payment.
3. A weak entity depends on a strong entity for its existence.
4. Entity set
1. It is a set of entities of the same type that share the same properties, or attributes.
2. E.g., Student is an entity set.
3. E.g., Customer of a bank
5. Attributes
1. An entity is represented by a set of attributes.
2. Each entity has a value for each of its attributes.
3. For each attribute, there is a set of permitted values, called the domain, or value set, of that attribute.
4. E.g., Student Entity has following attributes
A. Student_ID
B. Name
C. Standard
D. Course
E. Batch
F. Contact number
G. Address
5. Types of Attributes
1. Simple
1. Attributes which can’t be divided further.
2. Unary: only one entity set participates, e.g., Employee manages Employee.
3. Binary: two entity sets participate, e.g., Student takes Course.
4. Ternary: three entity sets participate, e.g., Employee works-on Branch, Employee works-on Job.
5. Binary relationships are the most common.
7. Relationships Constraints
1. Mapping Cardinality / Cardinality Ratio
1. The number of entities to which another entity can be associated via a relationship.
2. One to one: an entity in A is associated with at most one entity in B, where A & B are entity sets, and an entity
of B is associated with at most one entity of A.
1. E.g., Citizen has Aadhar Card.
3. One to many: an entity in A is associated with N entities in B, while an entity in B is associated with at most one
entity in A.
1. E.g., Citizen has Vehicle.
4. Many to one: an entity in A is associated with at most one entity in B, while an entity in B can be associated with
N entities in A.
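Mapping cardinalities become concrete once they are enforced somewhere. One common way, sketched here with `sqlite3`, is to express them as table constraints; this borrows from the relational model rather than the ER model itself. The table and column names are assumptions: a plain foreign key gives one-to-many (Citizen has Vehicle), and adding `UNIQUE` to the foreign key gives one-to-one (Citizen has Aadhar Card).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.execute("CREATE TABLE citizen (citizen_id INTEGER PRIMARY KEY, name TEXT)")

# One to one: UNIQUE on the FK allows at most one card per citizen.
conn.execute(
    """
    CREATE TABLE aadhar_card (
        card_no    INTEGER PRIMARY KEY,
        citizen_id INTEGER NOT NULL UNIQUE REFERENCES citizen(citizen_id)
    )
    """
)

# One to many: a plain FK allows many vehicles per citizen.
conn.execute(
    """
    CREATE TABLE vehicle (
        reg_no     TEXT PRIMARY KEY,
        citizen_id INTEGER NOT NULL REFERENCES citizen(citizen_id)
    )
    """
)

conn.execute("INSERT INTO citizen VALUES (1, 'Asha')")
conn.execute("INSERT INTO aadhar_card VALUES (1001, 1)")
conn.execute("INSERT INTO vehicle VALUES ('KA-01', 1)")
conn.execute("INSERT INTO vehicle VALUES ('KA-02', 1)")

vehicles_owned = conn.execute(
    "SELECT COUNT(*) FROM vehicle WHERE citizen_id = 1"
).fetchone()[0]

# A second card for the same citizen violates the one-to-one mapping.
try:
    conn.execute("INSERT INTO aadhar_card VALUES (1002, 1)")
    second_card_allowed = True
except sqlite3.IntegrityError:
    second_card_allowed = False

print(vehicles_owned, second_card_allowed)
```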
1. The basic ER features studied in LEC-3 can model most DB features, but as complexity increases it is
better to use some Extended ER features to model the DB schema.
2. Specialisation
1. In the ER model, we may need to subgroup an entity set into other entity sets that are distinct in some way from
the rest of the entity set.
2. Specialisation is splitting an entity set into sub entity sets on the basis of their functionalities,
specialities and features.
3. It is a Top-Down approach.
4. e.g., Person entity set can be divided into customer, student, employee. Person is superclass and other specialised
entity sets are subclasses.
1. We have “is-a” relationship between superclass and subclass.
2. Depicted by triangle component.
5. Why Specialisation?
1. Certain attributes may only be applicable to a few entities of the parent entity set.
2. The DB designer can show the distinctive features of the sub entities.
3. To group such entities we apply Specialisation, which refines the overall DB blueprint.
3. Generalisation
1. It is just a reverse of Specialisation.
2. The DB designer may find that certain properties of two entities overlap, and may then create a
new generalised entity set. That generalised entity set will be a superclass.
3. An “is-a” relationship is present between subclass and superclass.
4. E.g., Car, Jeep and Bus all have some common attributes. To avoid repeating the common attributes, the DB
designer may generalise them into a new entity set “Vehicle”.
5. It is a Bottom-up approach.
6. Why Generalisation?
1. Makes DB more refined and simpler.
2. Common attributes are not repeated.
4. Attribute Inheritance
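The “is-a” relationship and the way a subclass picks up its superclass's attributes resemble class inheritance in a programming language. The sketch below uses Python dataclasses for the Vehicle example; the attribute names (`registration_no`, `manufacturer`, `seats`, `standing_capacity`) and the sample values are assumptions made for illustration, not attributes given in the notes.

```python
from dataclasses import dataclass, fields


@dataclass
class Vehicle:
    # Common attributes, stated once in the generalised (superclass) entity set.
    registration_no: str
    manufacturer: str


@dataclass
class Car(Vehicle):
    # Attribute that applies only to this specialised entity set.
    seats: int


@dataclass
class Bus(Vehicle):
    standing_capacity: int


car = Car("KA-01-1234", "Acme", seats=5)
bus = Bus("KA-02-9876", "Acme", standing_capacity=20)

# Subclass entities carry the superclass attributes plus their own.
car_attributes = [f.name for f in fields(Car)]
print(car_attributes)
print(isinstance(car, Vehicle), isinstance(bus, Vehicle))
```

Reading the class hierarchy from `Vehicle` downwards corresponds to specialisation (top-down); arriving at `Vehicle` from `Car` and `Bus` corresponds to generalisation (bottom-up).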
Steps to make an ER diagram
1. Identify entity sets.
2. Identify attributes and their types.
3. Identify relationships and constraints.
a. Mapping
b. Participation
ER model of a Banking System
1. Requirements
a. The banking system has branches.
b. The bank has customers.
c. Accounts: saving a/c and current a/c.
d. A loan is associated with one or more customers (loan ≥ 1 customers).
e. Loans have payment schedules.
2. Entity sets
a. Branch
b. Customer
c. Employee
d. Saving a/c
e. Current a/c
f. Loan
g. Payment (loan), a weak entity
3. Attributes
a. Customer: cust-id, name, address (composite), contact no. (multivalued), DOB, age (derived).
b. Employee: emp-id, name, contact no., dependent name (multivalued), years of service (derived), start date
(single valued).
c. Saving a/c: interest rate, daily withdrawal limit.
d. Current a/c: per-transaction charges, overdraft amount.
e. Account (generalised entity): acc-no, balance.
f. Loan: loan-no, amount.
g. Payment (weak entity): payment no., payment date, amount.
4. Relationships & constraints
a. Customer borrow Loan, M : N.
b. Loan [relationship name illegible] Branch, N : 1.
c. Loan loan-payment Payment, 1 : N.
d. Customer deposit Account, M : N.
e. Customer banker Employee, N : 1.
f. Employee managed by Employee, N : 1.
[Entries a, b, e and f survive only as cardinalities in the list; their names are read from the diagram labels.]
[ER diagram of the Banking System: Branch, Customer, Employee, Loan, Payment and Account, with Account
related to Saving a/c and Current a/c by “is-a”.]
[A second exercise, University, is named in the source but its details are not recoverable.]
LEC-6: ER Diagram
1. Features & use case
a. Profile: user_profiles, friends.
b. Post: like, comment.
2. Identify entity sets
a. user_profile
b. user_post
c. post_comment
d. post_like
3. Attributes
a. user_profile: name (composite), username, email, password, contact no., DOB, age (derived).
[Two of these attributes are marked multivalued in the source; which two is not legible.]
b. user_post: post_id, text content, image (multivalued), videos (multivalued), created_timestamp,
modified_timestamp.
c. post_comment: post_comment_id, text content, timestamp.
d. post_like: post_like_id, timestamp.
4. Relationships & constraints
a. user_profile friendship user_profile, M : N.
b. user_profile posts user_post, 1 : N.
c. user_profile can post_like, 1 : N.
d. [entry illegible], 1 : N.
e. user_post has post_comment, 1 : N.
f. user_post has post_like, 1 : N.
[ER diagram: user_profile, user_post, post_comment and post_like, connected by the friendship, posts, comments,
like and has relationships.]
LEC-7: Relational Model
1. Relational Model (RM) organises the data in the form of relations (tables).
2. A relational DB consists of collection of tables, each of which is assigned a unique name.
3. A row in a table represents a relationship among a set of values, and table is collection of such relationships.
4. Tuple: A single row of the table representing a single data point / a unique record.
5. Columns: represent the attributes of the relation. For each attribute there is a set of permitted values, called the
domain of the attribute.
6. Relation Schema: defines the design and structure of the relation, contains the name of the relation and all the
columns/attributes.
7. Common RM-based DBMS systems, aka RDBMS: Oracle, IBM, MySQL, MS Access.
8. Degree of table: number of attributes/columns in a given table/relation.
9. Cardinality: Total no. of tuples in a given relation.
10. Relational Key: A set of attributes which can uniquely identify each tuple.
11. Important properties of a Table in Relational Model
1. The name of each relation is distinct from that of every other relation.
2. The values have to be atomic; they can’t be broken down further.
3. The name of each attribute/column must be unique.
4. Each tuple must be unique in a table.
5. The order of rows and columns has no significance.
6. Tables must follow integrity constraints, which help maintain data consistency across the tables.
12. Relational Model Keys
1. Super Key (SK): Any combination of attributes present in a table which can uniquely identify each tuple.
2. Candidate Key (CK): A minimal super key, i.e., a super key none of whose proper subsets can uniquely identify
each tuple. It contains no redundant attribute.
1. CK value shouldn’t be NULL.
3. Primary Key (PK):
1. Selected out of the CK set; typically the one with the least no. of attributes.
4. Alternate Key (AK)
1. All CK except PK.
5. Foreign Key (FK)
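The super key and candidate key definitions can be checked mechanically on a small table. The sketch below is an illustration only: the relation, its sample rows, and the helper names are assumptions. It also tests uniqueness on one instance, whereas a real key is a property of the schema and must hold for every legal instance.

```python
from itertools import combinations

# Hypothetical relation: attribute names plus a sample instance.
attributes = ("student_id", "email", "name")
rows = [
    (1, "asha@example.com", "Asha"),
    (2, "ravi@example.com", "Ravi"),
    (3, "asha2@example.com", "Asha"),
]


def is_super_key(attr_subset):
    """True if the chosen attributes take distinct values in every row."""
    positions = [attributes.index(a) for a in attr_subset]
    projected = [tuple(row[i] for i in positions) for row in rows]
    return len(set(projected)) == len(rows)


def super_keys():
    """Every non-empty combination of attributes that identifies each tuple."""
    return [
        set(combo)
        for size in range(1, len(attributes) + 1)
        for combo in combinations(attributes, size)
        if is_super_key(combo)
    ]


def candidate_keys():
    """Super keys with no redundant attribute (no smaller super key inside)."""
    keys = super_keys()
    return [key for key in keys if not any(other < key for other in keys)]


print(len(super_keys()))
print(candidate_keys())
```

Picking one of the candidate keys, say `student_id`, as the primary key leaves `email` as an alternate key.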
primary key must contain unique as well as not null values.
6. FOREIGN KEY: Whenever there is a relationship between two entities, there must be some common
attribute between them. This common attribute must be the primary key of one entity set and becomes the
foreign key of the other entity set. This key prevents every action that would result in loss of connection
between the tables.
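How a foreign key blocks actions that would break the connection between tables can be observed with `sqlite3`. The `department` and `employee` tables and the `attempt` helper are hypothetical. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, so that line is needed for this sketch to behave as described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)  -- foreign key
    )
    """
)
conn.execute("INSERT INTO department VALUES (10, 'Loans')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 10)")


def attempt(sql):
    """Run a statement and report whether the integrity check let it through."""
    try:
        conn.execute(sql)
        return "allowed"
    except sqlite3.IntegrityError:
        return "blocked"


# Referencing a department that does not exist would create a dangling link.
orphan_insert = attempt("INSERT INTO employee VALUES (2, 'Ravi', 99)")
# Deleting a department that is still referenced would orphan an employee.
parent_delete = attempt("DELETE FROM department WHERE dept_id = 10")
# Referencing an existing department is fine.
valid_insert = attempt("INSERT INTO employee VALUES (3, 'Meera', 10)")

print(orphan_insert, parent_delete, valid_insert)
```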
LEC-8: Transform - ER Model to Relational Model
1. Both the ER Model and the Relational Model are abstract logical representations of real-world enterprises. Because
the two models employ similar design principles, we can convert an ER design into a Relational design.
2. Converting a DB representation from an ER diagram to a table format is the way we arrive at a Relational DB design
from an ER diagram.
3. ER diagram notations to relations:
1. Strong Entity
1. Becomes an individual table named after the entity; its attributes become columns of the relation.
2. The entity’s Primary Key (PK) is used as the relation’s PK.
3. FKs are added to establish relationships with other relations.
2. Weak Entity
1. A table is formed with all the attributes of the entity.
2. PK of its corresponding Strong Entity will be added as FK.
3. PK of the relation will be a composite PK, {FK + Partial discriminator Key}.
3. Single-Valued Attributes
1. Represented as columns directly in the tables/relations.
4. Composite Attributes
1. Handled by creating a separate column in the original relation for each component of the composite attribute.
2. E.g., Address: {street-name, house-no} is a composite attribute in the customer relation; we add
address-street-name & address-house-no as new columns in the relation and ignore “address” itself as an attribute.
5. Multivalued Attributes
1. A new table (named after the original attribute) is created for each multivalued attribute.
2. The PK of the entity is used as an FK column in the new table.
3. A column named after the multivalued attribute is added to hold the multiple values.
4. The PK of the new table is {FK + multivalued name}.
5. E.g., for the strong entity Employee, dependent-name is a multivalued attribute.
1. A new table named dependent-name will be formed with columns emp-id and dname.
2. PK: {emp-id, dname}
3. FK: {emp-id}
6. Derived Attributes: Not considered in the tables.
7. Generalisation
1. Method-1: Create a table for the higher level entity set. For each lower-level entity set, create a table that
includes a column for each of the attributes of that entity set plus a column for each attribute of the primary key
of the higher-level entity set.
E.g., the Banking System generalisation of Account into saving & current:
1. Table 1: account (account-number, balance)
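The mapping rules for weak entities and multivalued attributes can be tried out as table definitions in `sqlite3`. This is a sketch under assumptions: the column names and sample values are illustrative, and underscores replace the hyphens used in the notes (`emp-id` becomes `emp_id`) because hyphens are not valid in unquoted SQL identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Strong entity: its own table, with the entity's PK as the relation's PK.
conn.execute("CREATE TABLE loan (loan_no INTEGER PRIMARY KEY, amount REAL NOT NULL)")

# Weak entity: composite PK = {FK to the strong entity + partial discriminator}.
conn.execute(
    """
    CREATE TABLE payment (
        loan_no      INTEGER NOT NULL REFERENCES loan(loan_no),
        payment_no   INTEGER NOT NULL,
        payment_date TEXT,
        amount       REAL,
        PRIMARY KEY (loan_no, payment_no)
    )
    """
)

# Multivalued attribute dependent-name of Employee: a table of its own.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE dependent_name (
        emp_id INTEGER NOT NULL REFERENCES employee(emp_id),
        dname  TEXT NOT NULL,
        PRIMARY KEY (emp_id, dname)
    )
    """
)

conn.execute("INSERT INTO loan VALUES (1, 10000.0)")
conn.execute("INSERT INTO loan VALUES (2, 5000.0)")
# payment_no restarts at 1 for each loan, so it is unique only within a loan.
conn.executemany(
    "INSERT INTO payment VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2024-01-05", 500.0),
        (1, 2, "2024-02-05", 500.0),
        (2, 1, "2024-01-10", 250.0),
    ],
)
conn.execute("INSERT INTO employee VALUES (7, 'Asha')")
conn.executemany("INSERT INTO dependent_name VALUES (?, ?)", [(7, "Ravi"), (7, "Meera")])

payments_for_loan_1 = conn.execute(
    "SELECT payment_no FROM payment WHERE loan_no = 1 ORDER BY payment_no"
).fetchall()
dependents = conn.execute(
    "SELECT dname FROM dependent_name WHERE emp_id = 7 ORDER BY dname"
).fetchall()

# Repeating a (loan_no, payment_no) pair violates the composite PK.
try:
    conn.execute("INSERT INTO payment VALUES (1, 1, '2024-03-05', 500.0)")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

print(payments_for_loan_1, dependents, duplicate_allowed)
```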