CSE_CSPC403_DBMS-1-14
UNIT – I Introduction
File System vs. DBMS – Views of data – Data Models – Database Languages – Database Management System Services
– Overall System Architecture – Data Dictionary – Entity – Relationship (E-R) – Enhanced Entity – Relationship Model.
File System vs. DBMS
The main differences between a file system and a DBMS are as follows:
• File system: A file system is a collection of data. In this system, the user has to write the procedures
  for managing the database.
  DBMS: A DBMS is a collection of data. In a DBMS, the user is not required to write the procedures.
• File system: It exposes the details of data representation and storage.
  DBMS: It gives an abstract view of data that hides the details.
• File system: It has no crash recovery mechanism, i.e., if the system crashes while some data is being
  entered, the content of the file is lost.
  DBMS: It provides a crash recovery mechanism, i.e., the DBMS protects the user from system failure.
• File system: It is very difficult to protect a file under the file system.
  DBMS: It provides a good protection mechanism.
• File system: It cannot store and retrieve data efficiently.
  DBMS: It contains a wide variety of sophisticated techniques to store and retrieve data.
• File system: Concurrent access causes many problems, such as one user accessing a file while another is
  deleting or updating information in it.
  DBMS: It takes care of concurrent access to data using some form of locking.
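The crash-recovery point can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS; the table marks, the helper load_marks and the fail_after parameter are hypothetical names chosen for this example, and the "crash" is only simulated by raising an exception inside the transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marks (student_id INTEGER PRIMARY KEY, score INTEGER)")
conn.commit()


def load_marks(connection, rows, fail_after=None):
    """Insert rows in one transaction; optionally simulate a failure part-way."""
    try:
        with connection:  # commits on success, rolls back on any exception
            for count, (student_id, score) in enumerate(rows, start=1):
                connection.execute(
                    "INSERT INTO marks (student_id, score) VALUES (?, ?)",
                    (student_id, score),
                )
                if fail_after is not None and count == fail_after:
                    raise RuntimeError("simulated crash during data entry")
    except RuntimeError:
        return False
    return True


def row_count(connection):
    return connection.execute("SELECT COUNT(*) FROM marks").fetchone()[0]


rows = [(1, 70), (2, 85), (3, 90)]
interrupted = load_marks(conn, rows, fail_after=2)
count_after_failure = row_count(conn)  # the two partial inserts were undone
completed = load_marks(conn, rows)
count_after_success = row_count(conn)
```

With a plain file, the two rows written before the failure would remain as a half-finished update; here the transaction leaves the table either fully loaded or untouched.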
Views of data
The view of data in a DBMS describes how the data is visualized at each level of data abstraction. Data
abstraction allows developers to keep complex data structures away from the users. The developers achieve
this by hiding the complex data structures through levels of abstraction.
One more feature should be kept in mind: data independence. A change to the data schema at one level of the
database must not force a change to the data schema at the next higher level. In this section, we discuss
the view of data in a DBMS in terms of data abstraction, data independence, and data schema.
Content: View of Data in DBMS
1. Data Abstraction
2. Data Independence
3. Instance and Schema
4. Key Takeaways
1. Data Abstraction
Data abstraction means hiding the complex data structures in order to simplify the user’s interface to the
system. It is needed because many of the users who interact with the database system do not have enough
computer training to understand its complex data structures.
To achieve data abstraction, we discuss the three-schema architecture, which abstracts the database
at the three levels described below:
Three-Schema Architecture:
The main objective of this architecture is to have an effective separation between the user interface and
the physical database. As a result, the user never has to be concerned with the internal storage of the
database and interacts with the database system in a simplified way.
The three-schema architecture defines the view of data at three levels:
i. Physical level (internal level)
ii. Logical level (conceptual level)
iii. View level (external level)
i. Physical Level/ Internal Level
The physical or the internal level schema describes how the data is stored in the hardware. It also
describes how the data can be accessed. The physical level shows the data abstraction at the lowest level
and it has complex data structures. Only the database administrator operates at this level.
ii. Logical Level/ Conceptual Level
It is the level above the physical level. Here, the data is described in terms of entity sets, entities,
their data types, the relationships among the entity sets, the user operations performed to retrieve or
modify the data, and certain constraints on the data. Adding constraints to the view of data also adds
security, because users are restricted to accessing only particular parts of the database.
The developers and the database administrator operate at the logical or conceptual level.
iii. View Level/ User level/ External level
It is the highest level of data abstraction and exhibits only a part of the whole database, namely the
data in which the user is interested. The view level can describe many views of the same data. Here, the
user retrieves information from the database through different applications.
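A minimal sketch of the logical and view levels, assuming SQLite (through Python's sqlite3 module) as the DBMS. The table employee, the view employee_directory and the sample rows are hypothetical; the physical level is whatever storage layout SQLite chooses and is not visible in the code.

```python
import sqlite3

# Physical level: how rows are laid out in storage is left entirely to SQLite.
conn = sqlite3.connect(":memory:")

# Logical level: the full schema with data types and constraints.
conn.execute(
    """
    CREATE TABLE employee (
        emp_id     INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        department TEXT NOT NULL,
        salary     INTEGER NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 52000), (2, "Ravi", "Accounts", 61000)],
)

# View level: one of possibly many views; this one hides the salary column.
conn.execute(
    "CREATE VIEW employee_directory AS "
    "SELECT emp_id, name, department FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
directory_columns = [column[0] for column in cursor.description]
directory_rows = cursor.fetchall()
```

An application that reads only employee_directory never sees the salary column, which is how a view restricts a user to the part of the database that concerns them.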
The figure below describes the three-schema architecture of the database:
Data Models
1. Relational Data Model: This type of model organizes the data in the form of rows and columns within a
table. Thus, a relational model uses tables to represent both data and the relationships among the data.
Tables are also called relations. This model was initially described by Edgar F. Codd in 1969. The
relational data model is the most widely used model, primarily in commercial data processing applications.
2. Semistructured Data Model: This type of data model differs from the other three data models in this
list. The semistructured data model permits data specifications in which individual data items of the
same type may have different sets of attributes. The Extensible Markup Language, also known as XML, is
widely used for representing semistructured data. Although XML was initially designed for adding markup
information to text documents, it gained importance because of its application in the exchange of data.
3. Entity-Relationship Data Model: An ER model is the logical representation of data as objects and
relationships among them. These objects are known as entities, and a relationship is an association among
these entities. This model was designed by Peter Chen and published in a 1976 paper. It is widely used
in database design. A set of attributes describes each entity. For example, student_name and student_id
describe the 'student' entity. A set of entities of the same type is known as an 'entity set', and a set of
relationships of the same type is known as a 'relationship set'.
4. Object-based Data Model: This model is an extension of the ER model with the notions of functions,
encapsulation, and object identity. It supports a rich type system that includes structured and collection
types. In the 1980s, various database systems following the object-oriented approach were developed.
Here, an object is simply the data together with its properties.
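The contrast between the relational and the semistructured model can be sketched as follows. The sample students, the email element and the variable names are hypothetical, and the XML is parsed with Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Relational model: every row of the relation has exactly the same columns.
student_columns = ("student_id", "student_name")
student_rows = [(1, "Asha"), (2, "Ravi")]
rows_are_uniform = all(len(row) == len(student_columns) for row in student_rows)

# Semistructured model: items of the same type may carry different attributes.
xml_text = """
<students>
  <student student_id="1">
    <student_name>Asha</student_name>
  </student>
  <student student_id="2">
    <student_name>Ravi</student_name>
    <email>ravi@example.org</email>
  </student>
</students>
""".strip()

root = ET.fromstring(xml_text)
# Collect the set of child element names present for each student.
attribute_sets = [
    sorted(child.tag for child in student) for student in root.findall("student")
]
```

Both student elements are of the same type, yet only the second one has an email, which a fixed-column relation could express only with an extra column holding a null for the first student.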
Database Languages
• A DBMS has appropriate languages and interfaces to express database queries and updates.
• Database languages can be used to read, store and update the data in the database.
Types of Database Language
Overall System Architecture
The architecture of a database system is greatly influenced by the underlying computer system on which the
database runs. Accordingly, a database system can be:
i. Centralized
ii. Client-server
iii. Parallel (multi-processor)
iv. Distributed
Database Users:
Users are differentiated by the way they expect to interact with the system:
• Application programmers:
o Application programmers are computer professionals who write application programs.
Application programmers can choose from many tools to develop user interfaces.
o Rapid application development (RAD) tools are tools that enable an application programmer
to construct forms and reports without writing a program.
• Sophisticated users:
o Sophisticated users interact with the system without writing programs. Instead, they form their
requests in a database query language.
o They submit each such query to a query processor, whose function is to break down DML
statements into instructions that the storage manager understands.
• Specialized users:
o Specialized users are sophisticated users who write specialized database applications that do
not fit into the traditional data-processing framework.
o Among these applications are computer-aided design systems, knowledge base and expert
systems, systems that store data with complex data types (for example, graphics data and audio
data), and environment-modeling systems.
• Naive users:
o Naive users are unsophisticated users who interact with the system by invoking one of the
application programs that have been written previously.
o For example, a bank teller who needs to transfer $50 from account A to account B invokes a
program called transfer. This program asks the teller for the amount of money to be transferred,
the account from which the money is to be transferred, and the account to which the money is
to be transferred.
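A program such as transfer could look like the sketch below. It assumes SQLite through Python's sqlite3 module; the table account, its columns and the starting balances are hypothetical. The teller supplies only the two accounts and the amount, and never writes a query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()


def transfer(connection, source, target, amount):
    """Move amount from source to target; both updates succeed or neither does."""
    with connection:  # one transaction around the two updates
        connection.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, source),
        )
        connection.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, target),
        )


def balance(connection, name):
    row = connection.execute(
        "SELECT balance FROM account WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


transfer(conn, "A", "B", 50)
balances = (balance(conn, "A"), balance(conn, "B"))
```

A transfer that would overdraw the source account violates the CHECK constraint, so sqlite3 raises IntegrityError and the balances stay as they were.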
Database Administrator:
• The database administrator (DBA) coordinates all the activities of the database system and has a good
understanding of the enterprise’s information resources and needs.
• Database administrator's duties include:
o Schema definition: The DBA creates the original database schema by executing a set of data
definition statements in the DDL.
o Storage structure and access method definition.
o Schema and physical organization modification: The DBA carries out changes to the
schema and physical organization to reflect the changing needs of the organization, or to alter
the physical organization to improve performance.
o Granting user authority to access the database: By granting different types of
authorization, the database administrator can regulate which parts of the database various
users can access.
o Specifying integrity constraints.
o Monitoring performance and responding to changes in requirements.
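Some of these duties can be sketched with SQLite through Python's sqlite3 module. The table department, its columns and the index name are hypothetical. Granting authorization is left out because SQLite has no GRANT statement; in a server DBMS that duty is normally carried out with SQL authorization statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema definition and integrity constraints, written in the DDL.
conn.execute(
    """
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL UNIQUE,
        building  TEXT
    )
    """
)

# Access method definition: an index to speed up lookups by building.
conn.execute("CREATE INDEX idx_department_building ON department (building)")

# Schema modification to reflect a new requirement.
conn.execute("ALTER TABLE department ADD COLUMN budget INTEGER DEFAULT 0")

conn.execute(
    "INSERT INTO department (dept_name, building) VALUES (?, ?)", ("CSE", "Block A")
)
try:
    # Violates the UNIQUE constraint on dept_name.
    conn.execute(
        "INSERT INTO department (dept_name, building) VALUES (?, ?)",
        ("CSE", "Block B"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

column_names = [row[1] for row in conn.execute("PRAGMA table_info(department)")]
```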
Query Processor:
The query processor accepts a query from the user and answers it by accessing the database.
Parts of Query processor:
• DDL interpreter
Interprets DDL statements and records the definitions in the data dictionary.
• DML compiler
a. Translates DML statements in a query language into low-level instructions that the query
evaluation engine understands.
b. A query can usually be translated into any of a number of alternative evaluation plans that give the
same result. The DML compiler also performs query optimization, that is, it selects the best plan.
• Query evaluation engine
Executes the low-level instructions generated by the DML compiler.
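The choice among alternative evaluation plans can be observed in SQLite, whose EXPLAIN QUERY PLAN statement reports the plan selected for a query. The sketch below uses Python's sqlite3 module; the table account, the index idx_owner and the helper plan_for are hypothetical names, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acc_id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)"
)


def plan_for(connection, sql, params=()):
    """Return the plan SQLite selected for sql as one line of text."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of every row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


query = "SELECT balance FROM account WHERE owner = ?"

# Without an index, the only available plan is a scan of the whole table.
plan_without_index = plan_for(conn, query, ("Asha",))

# Once an index exists, a second plan becomes available and is selected.
conn.execute("CREATE INDEX idx_owner ON account (owner)")
plan_with_index = plan_for(conn, query, ("Asha",))
```

The query text is identical in both calls; only the plan selected for it changes.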
Storage Manager/Storage Management:
• A storage manager is a program module that acts as an interface between the data stored in a
database and the application programs and queries submitted to the system.
• Thus, the storage manager is responsible for storing, retrieving and updating data in the database.
• The storage manager components include:
o Authorization and integrity manager: Checks integrity constraints and the authority of
users to access data.
o Transaction manager: Ensures that the database remains in a consistent state despite
system failures.
o File manager: Manages the allocation of space on disk storage and the data structures used
to represent information stored on disk.
o Buffer manager: It is responsible for fetching data from disk storage into main memory. It
enables the database to handle data sizes that are much larger than the size of main memory.
• The storage manager implements the following data structures:
o Data files: Store the database itself.
o Data dictionary: Stores metadata about the structure of the database.
o Indices: Provide fast access to data items.
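The role of the buffer manager can be sketched with a small in-memory pool. The class BufferPool, the dictionary standing in for the disk, and the least-recently-used replacement policy are all assumptions made for this illustration; the notes do not say which policy a DBMS uses.

```python
from collections import OrderedDict


class BufferPool:
    """Keeps at most capacity pages in memory; evicts the least recently used."""

    def __init__(self, disk_pages, capacity):
        self.disk_pages = disk_pages  # stand-in for disk storage
        self.capacity = capacity
        self.frames = OrderedDict()   # page_id -> page contents, oldest first
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.frames:
            # Hit: the page is already in memory; mark it as recently used.
            self.frames.move_to_end(page_id)
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:
            # Pool is full: drop the least recently used page.
            self.frames.popitem(last=False)
        # Miss: fetch the page from disk into a free frame.
        self.disk_reads += 1
        self.frames[page_id] = self.disk_pages[page_id]
        return self.frames[page_id]


disk = {page_id: f"contents of page {page_id}" for page_id in range(10)}
pool = BufferPool(disk, capacity=3)
for requested in [0, 1, 2, 0, 3, 0, 1]:
    pool.get_page(requested)
```

Ten pages are served through only three frames, which is the sense in which the buffer manager lets the database work with more data than fits in main memory.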
Data Dictionary
Data Dictionary consists of database metadata. It has records about objects in the database.
Data Dictionary consists of the following information:
1. Name of the tables in the database
2. Constraints of a table i.e. keys, relationships, etc.
3. Columns of the tables that are related to each other
4. Owner of the table
5. Last accessed information of the object
6. Last updated information of the object
An example of Data Dictionary can be personal details of a student:
Example
<StudentPersonalDetails>
Student_ID Student_Name Student_Address Student_City
The following is the data dictionary for the above fields:
Field Name        Datatype   Field Length   Constraint    Description
Student_ID        Number     5              Primary Key   Student id
Student_Name      Varchar    20             Not Null      Name of the student
Student_Address   Varchar    30             Not Null      Address of the student
Student_City      Varchar    20             Not Null      City of the student
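In a DBMS with an active data dictionary, the same information can be read back from the system catalog. The sketch below assumes SQLite through Python's sqlite3 module, where the catalog is exposed through the sqlite_master table and the PRAGMA table_info command. INTEGER is used in place of Number 5, since SQLite has no fixed-length number type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE StudentPersonalDetails (
        Student_ID      INTEGER PRIMARY KEY,
        Student_Name    VARCHAR(20) NOT NULL,
        Student_Address VARCHAR(30) NOT NULL,
        Student_City    VARCHAR(20) NOT NULL
    )
    """
)

# Names of the tables in the database, read from the catalog.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# One dictionary entry per column: name, declared type and constraints.
dictionary = [
    {
        "field": name,
        "datatype": column_type,
        "not_null": bool(not_null),
        "primary_key": bool(pk),
    }
    for _cid, name, column_type, not_null, _default, pk in conn.execute(
        "PRAGMA table_info(StudentPersonalDetails)"
    )
]
```

Nothing was written to the catalog by hand: the DBMS recorded the metadata when the CREATE TABLE statement ran.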
Types of Data Dictionary
Here are the two types of data dictionary:
Active Data Dictionary
The DBMS software manages the active data dictionary automatically. It is updated automatically whenever
the database structure changes, and most RDBMSs have an active data dictionary. It is also known as an
integrated data dictionary.
Passive Data Dictionary
It is managed by the users and is modified manually when the database structure changes. It is also known
as a non-integrated data dictionary.
The ER model defines the conceptual view of a database. It works around real-world entities and the
associations among them. At view level, the ER model is considered a good option for designing databases.
Entity
An entity is a real-world object, either animate or inanimate, that is easily identifiable. For example,
in a school database, students, teachers, classes, and courses offered can be considered as entities. All these
entities have some attributes or properties that give them their identity.
An entity set is a collection of entities of a similar type. An entity set may contain entities whose
attributes share similar values. For example, a Students set may contain all the students of a school; likewise,
a Teachers set may contain all the teachers of a school from all faculties. Entity sets need not be disjoint.
An entity is a real-world thing, living or non-living, whether or not it is easily recognizable. It is anything
in the enterprise that is to be represented in our database. It may be a physical thing, a fact about the
enterprise, or an event that happens in the real world.
An entity can be a place, person, object, event, or concept about which data is stored in the database. Every
entity must have at least one attribute and a unique key. Every entity is made up of 'attributes' that
describe that entity.
Examples of entities:
• Person: Employee, Student, Patient
• Place: Store, Building
• Object: Machine, Product, Car
• Event: Sale, Registration, Renewal
• Concept: Account, Course
An entity may be any object, class, person or place. In an ER diagram, an entity is represented by a
rectangle.
Consider an organization as an example: manager, product, employee, department, etc. can each be taken as
an entity.
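Entities, attributes and entity sets can be sketched with plain Python data classes. The classes Student and Teacher and their attributes are hypothetical examples in the spirit of the school database mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    student_id: int    # key attribute: identifies one student
    student_name: str  # descriptive attribute


@dataclass(frozen=True)
class Teacher:
    teacher_id: int
    teacher_name: str


# An entity set is a collection of entities of the same type.
students = {Student(1, "Asha"), Student(2, "Ravi")}
teachers = {Teacher(10, "Meena")}

# Adding an entity that is already present does not create a second copy.
students.add(Student(1, "Asha"))

student_ids = sorted(student.student_id for student in students)
```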
a. Weak Entity
An entity that depends on another entity is called a weak entity. A weak entity does not have a key attribute
of its own. A weak entity is represented by a double rectangle.
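One common way to store a weak entity is to make its key a combination of the owner's key and a partial key. The sketch below assumes SQLite through Python's sqlite3 module; the owner entity employee and the weak entity dependent are hypothetical examples not taken from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

# Strong (owner) entity with its own key.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT NOT NULL)"
)

# Weak entity: dependent_name alone is not a key, so the owner's key is part
# of the primary key, and a dependent cannot exist without its employee.
conn.execute(
    """
    CREATE TABLE dependent (
        emp_id         INTEGER NOT NULL
                       REFERENCES employee (emp_id) ON DELETE CASCADE,
        dependent_name TEXT NOT NULL,
        PRIMARY KEY (emp_id, dependent_name)
    )
    """
)

conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
# The same dependent name is allowed under two different owners.
conn.executemany("INSERT INTO dependent VALUES (?, ?)", [(1, "Kiran"), (2, "Kiran")])

try:
    conn.execute("INSERT INTO dependent VALUES (?, ?)", (99, "Orphan"))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True  # there is no employee 99 to depend on

# Removing the owner removes its dependents as well.
conn.execute("DELETE FROM employee WHERE emp_id = 1")
remaining_dependents = conn.execute(
    "SELECT emp_id, dependent_name FROM dependent"
).fetchall()
```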
Relationship (E-R)
A relationship describes an association between entities. A diamond (rhombus) is used to represent a
relationship.
2. One-to-many relationship
When only one instance of the entity on the left and more than one instance of the entity on the right
are associated with the relationship, it is known as a one-to-many relationship.
For example, a scientist can make many inventions, but each invention is made by only one specific
scientist.
3. Many-to-one relationship
When more than one instance of the entity on the left and only one instance of the entity on the right
are associated with the relationship, it is known as a many-to-one relationship.
For example, a student enrolls in only one course, but a course can have many students.
4. Many-to-many relationship
When more than one instance of the entity on the left and more than one instance of the entity on the right
are associated with the relationship, it is known as a many-to-many relationship.
For example, an employee can be assigned to many projects, and a project can have many employees.
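These cardinalities are commonly mapped to tables as sketched below, assuming SQLite through Python's sqlite3 module. A one-to-many relationship becomes a foreign key on the "many" side, and a many-to-many relationship becomes a separate table. The table and column names, including the table assignment, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- One-to-many: each invention row points at exactly one scientist.
    CREATE TABLE scientist (scientist_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE invention (
        invention_id INTEGER PRIMARY KEY,
        title        TEXT NOT NULL,
        scientist_id INTEGER NOT NULL REFERENCES scientist (scientist_id)
    );

    -- Many-to-many: a separate table holds one row per (employee, project) pair.
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE project (project_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE assignment (
        emp_id     INTEGER NOT NULL REFERENCES employee (emp_id),
        project_id INTEGER NOT NULL REFERENCES project (project_id),
        PRIMARY KEY (emp_id, project_id)
    );

    INSERT INTO scientist VALUES (1, 'Scientist one');
    INSERT INTO invention VALUES (1, 'First invention', 1), (2, 'Second invention', 1);
    INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO project VALUES (1, 'Payroll'), (2, 'Inventory');
    INSERT INTO assignment VALUES (1, 1), (1, 2), (2, 1);
    """
)

inventions_of_scientist_1 = conn.execute(
    "SELECT COUNT(*) FROM invention WHERE scientist_id = 1"
).fetchone()[0]
projects_of_employee_1 = [
    row[0]
    for row in conn.execute(
        "SELECT project_id FROM assignment WHERE emp_id = 1 ORDER BY project_id"
    )
]
employees_on_project_1 = [
    row[0]
    for row in conn.execute(
        "SELECT emp_id FROM assignment WHERE project_id = 1 ORDER BY emp_id"
    )
]
```

A many-to-one relationship uses the same foreign-key mapping as one-to-many, read from the other side.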
Aggregation –
A basic ER diagram cannot represent a relationship between an entity and a relationship, which may be
required in some scenarios. In those cases, a relationship together with its participating entities is
aggregated into a higher-level entity. For example, an employee working for a project may require some
machinery. So, a REQUIRE relationship is needed between the relationship WORKS_FOR and the entity MACHINERY.
Using aggregation, the WORKS_FOR relationship with its entities EMPLOYEE and PROJECT is aggregated into a
single entity, and the relationship REQUIRE is created between the aggregated entity and MACHINERY.
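One possible table mapping of this aggregation is sketched below, assuming SQLite through Python's sqlite3 module. The names EMPLOYEE, PROJECT, MACHINERY, WORKS_FOR and REQUIRE come from the example above; the column names and sample keys are hypothetical. The table require refers to a row of works_for rather than to employee or project directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE employee  (emp_id INTEGER PRIMARY KEY);
    CREATE TABLE project   (project_id INTEGER PRIMARY KEY);
    CREATE TABLE machinery (machine_id INTEGER PRIMARY KEY);

    -- WORKS_FOR: the relationship that gets aggregated.
    CREATE TABLE works_for (
        emp_id     INTEGER NOT NULL REFERENCES employee (emp_id),
        project_id INTEGER NOT NULL REFERENCES project (project_id),
        PRIMARY KEY (emp_id, project_id)
    );

    -- REQUIRE: links the aggregated (employee, project) pair to machinery.
    CREATE TABLE require (
        emp_id     INTEGER NOT NULL,
        project_id INTEGER NOT NULL,
        machine_id INTEGER NOT NULL REFERENCES machinery (machine_id),
        PRIMARY KEY (emp_id, project_id, machine_id),
        FOREIGN KEY (emp_id, project_id)
            REFERENCES works_for (emp_id, project_id)
    );

    INSERT INTO employee VALUES (1), (2);
    INSERT INTO project VALUES (10);
    INSERT INTO machinery VALUES (100);
    INSERT INTO works_for VALUES (1, 10);
    INSERT INTO require VALUES (1, 10, 100);
    """
)

try:
    # Employee 2 does not work for project 10, so there is no aggregated
    # entity for this requirement to attach to.
    conn.execute("INSERT INTO require VALUES (2, 10, 100)")
    unrelated_pair_rejected = False
except sqlite3.IntegrityError:
    unrelated_pair_rejected = True

requirements = conn.execute(
    "SELECT emp_id, project_id, machine_id FROM require"
).fetchall()
```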