Module 1-Csc 222 Database Management Systems Note 2024
COURSE OUTLINE
MODULE 1: BASICS OF DATABASE
MODULE 2: DATA MODELS AND ARCHITECTURE OF DBMS
MODULE 3: RELATIONAL DATABASE MANAGEMENT SYSTEM
MODULE 4: DEVELOPING ENTITY-RELATIONSHIP DIAGRAM
MODULE 5: NORMALIZATION
MODULE 6: MANAGING DATA USING STRUCTURED QUERY LANGUAGE (SQL)
MODULE 7: INTRODUCTION TO PL/SQL
MODULE 1: BASICS OF DATABASE
MODULE 1 OBJECTIVES
Understanding the meaning of data and information.
Knowing how databases and database management systems are useful in organizations
to keep records.
Examples of database management systems.
Components of a database system.
Characteristics of data and DBMS.
Differences between file-based management systems and database management systems.
Limitations of DBMS.
1.1 Introduction
In the current era, people of all ages use databases in one way or another. For example,
school children use the databases behind e-mail programs and mobile phones, youngsters use
online movie and railway ticket booking databases to book tickets, housewives use book
databases to order books online or access the databases of various community sites,
businessmen use airline databases to book their trips, academicians use online journal
databases to do research work, and so on. Nowadays, computers are used everywhere. We may
rephrase the proverb ‘Where there is a will, there is a way!’ as ‘Where there is a computer,
there is a database.’ Computerized databases have made our lives easier and more comfortable.
We can search for any place, product, area, thing, etc., with the help of stored data in a
fraction of a second. A database management system processes the stored data and extracts
the desired information every time. Let us understand the database in some more detail.
1.2.2 Information
When we process related data, we obtain information. Information has meaning, is useful for
making decisions, and can be stored for future use. To obtain information, we need data. For
example, when we process students’ attendance data, we can get a list of students with low
attendance, students who attend lectures regularly, students who come to college only to
attend particular lectures, the pattern of absences for each student, etc. On the basis of this
information, the college may decide the attendance policy, reschedule the time-table to
improve attendance, decide whether to inform parents or not, determine which students
should be allowed to sit for an examination, etc. This information could also be stored for
future use. For instance, when students need a transcript, this information can be used to fill
in lecture-wise attendance details of each student or to generate attendance certificates,
which may be required along with migration certificates when students change universities.
Data can be stored manually or electronically. Similarly, stored data may be processed
manually or electronically. The relationship between data and information is shown in
Figure 1.1, and Figure 1.2 shows an example of data and information. Table 1.1 shows some
examples of data, the processes which should be applied to the stored data, and the information
which could be obtained after processing that data. Table 1.2 shows a student’s examination
result data, which can be processed as per the following condition to obtain a grade-wise
result analysis.
The above information may be stored and processed further to represent the result analysis
graphically using bar charts, as shown in Figure 1.3. The X-axis contains the class code and
grades, and the Y-axis contains the total number of students.
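As a rough illustration of how processing raw data yields information, the following Python
sketch takes the attendance example from this section and derives a list of students with low
attendance. The student names, the record layout and the 75% threshold are assumptions made
for this example only; they are not taken from Table 1.1 or Table 1.2.

```python
# Raw data: (student, lectures_attended, lectures_held). All values are made up.
attendance_data = [
    ("Asha", 38, 40),
    ("Bola", 22, 40),
    ("Chidi", 31, 40),
    ("Dayo", 12, 40),
]

LOW_ATTENDANCE_THRESHOLD = 75.0  # assumed policy value, in percent


def attendance_percentage(attended, held):
    """Return attendance as a percentage of the lectures held."""
    return 100.0 * attended / held


def low_attendance_list(records, threshold=LOW_ATTENDANCE_THRESHOLD):
    """Process raw attendance data into information: who is below the threshold."""
    return [
        (name, round(attendance_percentage(attended, held), 1))
        for name, attended, held in records
        if attendance_percentage(attended, held) < threshold
    ]


low_attendance = low_attendance_list(attendance_data)
for name, pct in low_attendance:
    print(f"{name}: {pct}%")
```

The list `attendance_data` is the data; the list returned by `low_attendance_list` is the
information on which a college could base its attendance decisions.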
1.3 Database
As the name suggests, a database is a collection of data, i.e., a storage area where we can
store all related data and process them. To understand the concept of a database, let us take
some real-life examples of databases (storage). One logical database which we carry with us
all the time is our brain. The brain stores all the thoughts, ideas and things which we learn,
view, etc., and it relates them. We can retrieve, change or remove these stored ideas and
thoughts at any time. An example of a real-life physical database is a grain warehouse. When
it is the season for some grains/pulses, we store them and use them later as per the process
requirements. When we process the grains/pulses, we obtain information in the form of
flour, sprouts, etc., which could be used in further processing to cook food. The pulses/grains
which we find useless could be removed from the warehouse and replaced (updated) with
fresh stock. In real life, we use the concepts of data, information and database everywhere.
Figure 1.4 shows an example of a real-life database: a child’s schoolbag. It is a stationery
database which contains entities such as notebook, textbook, compass box, geometry box,
etc. Entity Notebook has distinct notebooks for various subjects; Entity Textbook has
distinct textbooks for various subjects; Entity Compass box has pencils, erasers, sharpeners,
a ruler, etc.; and Entity Geometry box has common mathematical tools.
A database contains data stored in a computer. To process the stored data, we need application
programs. The processed data could again be stored in the database for future use. The data
on which we can do some processing is known as operational data. Every organization has
operational data. Table 1.3 contains some examples of organizations and the operational data
of each organization. A database stores data of various entities. These entities can be related
using relationships. A database also contains a description of its data, which is known as
metadata. Along with the data, one can keep constraints on its data types. A cylindrical
shape, as shown in Figure 1.5, is used to represent a physical database. The physical database
is useful for the computer (i.e., how a machine sees data), while the logical database is useful
for the user (i.e., how a human being sees data). The database shown is that of a university,
which contains various related entities, such as course, college, student, class, attendance,
exam, etc. There are many colleges in a university; each college contains many students in
different courses and classes. Students attend lectures, appear in exams and get results. The
‘University’ database contains interrelated data which could be shared by different
application programs to obtain meaningful information.
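The ideas of entities, relationships and metadata can be sketched with a very small version
of the ‘University’ database. The example below uses Python's built-in sqlite3 module purely
for convenience; the table names, column names and sample rows are assumptions made for this
illustration and do not come from Figure 1.5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")        # throwaway in-memory database
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces relationships only when enabled

# Two entities (college, student) related through student.college_id
conn.executescript("""
CREATE TABLE college (
    college_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    college_id INTEGER NOT NULL REFERENCES college(college_id)
);
""")

with conn:  # commit the sample rows as one unit
    conn.executemany("INSERT INTO college VALUES (?, ?)",
                     [(1, "College of Science"), (2, "College of Arts")])
    conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                     [(101, "Asha", 1), (102, "Bola", 1), (103, "Chidi", 2)])

# Information obtained from interrelated data: number of students per college
students_per_college = conn.execute("""
    SELECT c.name, COUNT(s.student_id)
    FROM college AS c
    LEFT JOIN student AS s ON s.college_id = c.college_id
    GROUP BY c.college_id, c.name
    ORDER BY c.college_id
""").fetchall()

# Metadata: the database's own description of the student table (column name, type)
student_columns = [(row[1], row[2])
                   for row in conn.execute("PRAGMA table_info(student)")]

print(students_per_college)
print(student_columns)
```

Any other application program could open the same database and reuse these tables, which is
the sense in which the data are shared.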
a. Naive User, or End-user, or Layman: The clerk of the university uses the ‘university’
database to enter the data of applicants who have applied for various courses, and the same
data are retrieved to generate a merit list. The clerk does not know anything about the
technical features of the database or the language in which data is entered or retrieved.
He/she is completely unaware of the technology. Therefore, he/she is known as an end-user,
layman or naive user. Table 1.4 shows some examples of databases and the end-users of
each database.
b. Software Programmer, or Application Programmer, or Application Developer: A software
programmer is a person who writes application programs or logic in some specific language
to insert, delete, update or fetch data to/from the database. An application programmer has
basic knowledge of the database and of the query language which is used for writing
programs. A query language is a general-purpose data language which is available with most
database systems. A programmer may or may not have a deep understanding of database
concepts, but he/she is able to operate on data stored in the database.
c. Database Designer: A database designer decides which entities (data files) should be
stored within the database, the constraints to be applied on data, data types, formats and other
specifications regarding data. The database designer is responsible for designing the data files.
d. Database Administrator: A database administrator (DBA) is the person who is in overall
charge of a database. He/she assigns authorization to users, writes validation procedures,
decides backup and recovery policies, and manages users and privileges. In short, the DBA
keeps control over the database.
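To make the application programmer's role more concrete, the sketch below shows the kind of
small program such a person might write to insert, update, delete and fetch applicant data,
for instance to produce the merit list that the clerk uses. It relies on Python's built-in
sqlite3 module; the `applicant` table, its columns and the function names are hypothetical
and chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE applicant (
        app_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        score  REAL NOT NULL
    )
""")


def add_applicant(app_id, name, score):
    """Insert one applicant record."""
    with conn:  # commits on success
        conn.execute("INSERT INTO applicant VALUES (?, ?, ?)", (app_id, name, score))


def update_score(app_id, score):
    """Update the score of an existing applicant."""
    with conn:
        conn.execute("UPDATE applicant SET score = ? WHERE app_id = ?", (score, app_id))


def remove_applicant(app_id):
    """Delete an applicant record."""
    with conn:
        conn.execute("DELETE FROM applicant WHERE app_id = ?", (app_id,))


def merit_list():
    """Fetch applicants ordered from highest to lowest score."""
    return conn.execute(
        "SELECT name, score FROM applicant ORDER BY score DESC, name"
    ).fetchall()


add_applicant(1, "Asha", 71.5)
add_applicant(2, "Bola", 88.0)
add_applicant(3, "Chidi", 64.0)
update_score(3, 90.0)
remove_applicant(1)
print(merit_list())
```

The end-user only sees the resulting merit list; the query language statements stay hidden
inside the program.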
2. Hardware: Hardware includes the permanent storage where the database is stored. It may be
a hard disk or any other secondary memory. A single database may be stored on more than one
storage device, depending on the volume of data stored within the database. For security
purposes, a copy of the database could be kept on some other storage device. Besides storage
devices, other hardware, such as computers, peripherals, etc., is also required to perform
database-oriented operations.
3. Software (data dictionary management, database schema management, SQL): Software
consists of the programs or applications which are used to access data from the database.
These applications reside in the DBMS, or they may be separate applications which are
interfaced with the DBMS to manage data. For example, programming languages are used to
display data on the monitor. Some software programs, which are part of the DBMS, manage
the data dictionary or metadata, define the schema for the database objects, and are used to
write queries on the database. The common language available with most relational databases
is known as Structured Query Language, which is popularly known as SQL and sometimes
pronounced as ‘Sequel’.
4. Data: Data is the most important component of a database system. Data is discussed in
detail in Section 1.2. When data is stored in a database, it should be stored along with its
definition, data type and size, constraints (such as whether duplicate values are allowed, the
possible range of values, and the formula if it is derived from some other data), display
format, the format in which it should be entered, validation rules, etc. Some examples of data
files/entities (tables) and the data stored within each entity are given in Tables 1.5, 1.6 and 1.7.
These are inter-related data files which are part of the playschool’s database.
When data are entered into the tables Kindergarten, Class and Kindergarten Details (Tables 1.5,
1.6 and 1.7, respectively), the correctness of the data is checked. Invalid data cannot be
entered into the data files. Tables 1.8, 1.9 and 1.10 contain some valid data values for the
tables Kindergarten, Class and Kindergarten Details, respectively.
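The way a DBMS rejects invalid data can be sketched as follows. The example stores a data
definition (data type, allowed range, whether a value is mandatory, whether duplicates are
allowed) together with the table, and then tries a few inserts. The table
`kindergarten_child`, its columns and its rules are hypothetical; they are not the actual
definitions from Tables 1.5 to 1.10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Hypothetical definition: type, mandatory values, allowed range and allowed codes
conn.execute("""
    CREATE TABLE kindergarten_child (
        child_id INTEGER PRIMARY KEY,                           -- no duplicates
        name     TEXT    NOT NULL,                              -- mandatory
        age      INTEGER NOT NULL CHECK (age BETWEEN 2 AND 6),  -- range of values
        gender   TEXT    NOT NULL CHECK (gender IN ('M', 'F'))  -- allowed codes
    )
""")


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # rolls back automatically if the insert fails
            conn.execute("INSERT INTO kindergarten_child VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert((1, "Asha", 4, "F")),   # valid row
    try_insert((2, "Bola", 9, "M")),   # age outside the allowed range
    try_insert((1, "Chidi", 5, "M")),  # duplicate child_id
    try_insert((3, None, 3, "F")),     # missing name
]
stored_rows = conn.execute("SELECT COUNT(*) FROM kindergarten_child").fetchall()[0][0]
print(results, stored_rows)
```

Only the first row reaches the table; the other three are refused by the constraints, so the
application program does not need its own checking code for these rules.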
The data in a database must have the following characteristics:
● The same data should be shared between different applications. For example, if there are
two departments, namely ‘accounts’ and ‘examination’, in a university, then data related to
students should be shared by these two departments. There should be no need to create a copy
of the same data.
● When data are shared, there is a question of integration. Integration means that changes in
one data file should also be reflected in the related data file. For example, if a clerk in the
accounts department deletes the record of a student, then it should also be deleted from the
‘member data file’ used by the ‘library’ department of that university.
● When data are properly integrated, there is minimal chance of inconsistent data. Data
will be consistent if they are integrated properly.
● Data should be non-redundant: To avoid duplication of data in different files, data should,
if possible, be stored in one file and referenced from the original file whenever required. It is
not possible to remove redundancy entirely, but we should try to minimize it. Redundant data
causes inconsistency within a database. For example, if a student’s address is stored in the
‘enrolment’ file as well as in the ‘alumni’ file, then the ‘address’ entry for the same student
is redundant. Now, when the student’s address changes, the clerk changes the ‘address’ value
in the ‘enrolment’ file but forgets to change the address in the ‘alumni’ file. The database
will then show different addresses for the same student, which is conflicting. This is called
‘data inconsistency’, which occurs due to redundant data.
● Data should represent complete details. For example, entering only a customer’s first name
in the name field represents incomplete detail. The field should contain at least the first
name of the customer along with the surname.
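The characteristics of integration and non-redundancy can be sketched in a few lines. In the
example below, a student's address is stored once in a `student` table, and a
`library_member` table refers to the student instead of keeping its own copy. The table and
column names, the sample values and the use of Python's sqlite3 module are assumptions made
for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")        # throwaway in-memory database
conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE in SQLite

conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    address    TEXT NOT NULL             -- the only stored copy of the address
);
CREATE TABLE library_member (
    member_id  INTEGER PRIMARY KEY,
    student_id INTEGER NOT NULL
               REFERENCES student(student_id) ON DELETE CASCADE
);
""")

with conn:
    conn.execute("INSERT INTO student VALUES (1, 'Asha', '12 Old Road')")
    conn.execute("INSERT INTO library_member VALUES (500, 1)")


def member_address(member_id):
    """Look up a library member's address by referencing the student table."""
    row = conn.execute("""
        SELECT s.address
        FROM library_member AS m
        JOIN student AS s ON s.student_id = m.student_id
        WHERE m.member_id = ?
    """, (member_id,)).fetchone()
    return row[0] if row else None


# Non-redundancy: one update is enough, because the library refers to the same value
with conn:
    conn.execute("UPDATE student SET address = '7 New Street' WHERE student_id = 1")
address_after_update = member_address(500)

# Integration: deleting the student also removes the related library member record
with conn:
    conn.execute("DELETE FROM student WHERE student_id = 1")
members_left = conn.execute("SELECT COUNT(*) FROM library_member").fetchall()[0][0]

print(address_after_update, members_left)
```

Because the address exists in only one place, the clerk cannot forget to change a second
copy, so the inconsistency described above cannot arise for this field.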
1.6 Need for a Database
The following are some reasons why a database is needed:
● A database is required for efficient and easy storage, retrieval, updating and deletion of
data records.
● Interrelated data should be grouped in one named storage area for easy access. This storage
area, which resides in the computer, may be physical or logical.
● A database is required for avoiding unnecessary repetition of data values, checking the
correctness of data by applying validation rules, and searching for the required information
faster, thus saving time and effort.
● A database is required for flexibility, i.e., as and when required, we can connect the database
with different front-ends.
● Once a database is created, it can be shared by many users. Hence, to share data with many
applications, a database is required.
● A database is needed for storing high-volume and complex data, such as document files,
photographs or images, multimedia data, mobile users’ data, and audio and video files.
● A database is needed for managing multi-dimensional data.
● A database is required for proper transaction management or transaction handling.
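Transaction handling, the last reason in the list above, means that a group of related changes
is either completed as a whole or not applied at all. The following sketch illustrates this
with a hypothetical `account` table and a `transfer` function, using Python's sqlite3 module;
the example scenario is an assumption and is not taken from these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("""
    CREATE TABLE account (
        acc_id  INTEGER PRIMARY KEY,
        balance REAL NOT NULL CHECK (balance >= 0)  -- no overdrafts allowed
    )
""")
with conn:
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])


def transfer(src, dst, amount):
    """Move money between accounts as one transaction; return True on success."""
    try:
        with conn:  # commits if both updates succeed, rolls back if either fails
            conn.execute("UPDATE account SET balance = balance + ? WHERE acc_id = ?",
                         (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE acc_id = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Return the current balances as {acc_id: balance}."""
    return dict(conn.execute("SELECT acc_id, balance FROM account ORDER BY acc_id"))


ok_small = transfer(1, 2, 30.0)   # succeeds
ok_large = transfer(1, 2, 500.0)  # debit would go negative, so the credit is undone too
print(ok_small, ok_large, balances())
```

In the failed transfer, the credit to account 2 has already been executed when the debit is
rejected, yet the rollback removes it, so the data never show money that came from nowhere.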
● Complex data structures, such as pointers, cannot be handled easily by a file-based system.
● When the same data file is required by different programs at the same time, data sharing is
not possible. To use the same file at the same time, a copy of that data file must be created
and used. When there are two or more copies of the same data file, it may result in
inconsistent and redundant data, because changes made in one file may not be carried out in
the other files.
● In a file-based system, the programs have to be written in a structured manner.
● It is not possible to set relationships between data files. Programs must be written to
relate them.
● Security settings cannot be applied on data files.
● A set of data files created in a specific file-based system cannot be used with other
file-based systems, as the storage formats of different file-based systems vary.
A database system is required to overcome the limitations of the file-based management system.
The traditional database system contains data files which are used to store data. Examples of
simple database management systems are dBASE and FoxPro. These DBMSs provide a CUI
(Character-based User Interface), which gives faster access to data using commands. There is
no need to create data files manually. In a simple DBMS, data files with data field names and
their data types can be created. However, a simple DBMS does not provide the facility to
define keys.
As keys cannot be defined, it is not possible to define relationships between data files either.
If the user wants to relate data files, then he/she has to write programs to relate two or more
files. An example of such a program is given in Figure 1.8. The advantage of a simple DBMS
over a file-based system, however, is that we can share data files between applications. Simple
commands can be used to search, insert, update, delete and view data.
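Independently of Figure 1.8, the following Python sketch gives a feel for the kind of logic a
user has to write by hand when data files cannot be related through keys. It is not a dBASE or
FoxPro program; the two "files" are small in-memory stand-ins with made-up contents, and the
field names are assumptions.

```python
import csv
import io

# Stand-ins for two separate data files; contents are made up for the example
STUDENT_FILE = "roll_no,name\n1,Asha\n2,Bola\n3,Chidi\n"
RESULT_FILE = "roll_no,subject,marks\n1,Maths,72\n2,Maths,58\n1,Physics,65\n"


def read_records(text):
    """Read a comma-separated data file into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def relate(students, results):
    """Hand-written relation: match each result to its student by roll_no."""
    name_by_roll = {s["roll_no"]: s["name"] for s in students}
    related = []
    for r in results:
        name = name_by_roll.get(r["roll_no"])
        if name is not None:  # skip results that belong to no known student
            related.append((name, r["subject"], int(r["marks"])))
    return related


related_rows = relate(read_records(STUDENT_FILE), read_records(RESULT_FILE))
for row in related_rows:
    print(row)
```

Every program that needs the two files together has to repeat this matching logic, which is
the burden that key-based relationships remove.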
● Validation rules can be applied on data before the data is entered in the database. This
prevents wrong data inputs.
● Changing the data file structure becomes very easy.
● Security can be enforced on data by assigning privileges to different users.
● Appropriate backup procedures are available to avoid loss of data in adverse
circumstances, such as power failure, server failure or hardware crash. In case of failure, the
data can be recovered using recovery procedures.
● A DBMS provides an import and export facility, using which data files can be imported from
one DBMS and exported to another.
Table 1.12 shows the differences between a file-based management system and a database
management system.
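The backup and recovery advantage listed above can be sketched with the backup facility of
Python's sqlite3 module (`Connection.backup`, available from Python 3.7). The `student`
table, the sample rows and the simulated data loss are assumptions made for this example;
real DBMS products have their own backup tools and policies.

```python
import sqlite3

# The "live" database with some data in it
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
with live:
    live.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Asha"), (2, "Bola")])

# Backup: copy the whole database into a second database
backup_copy = sqlite3.connect(":memory:")
live.backup(backup_copy)

# Simulated failure: the live data are lost
with live:
    live.execute("DELETE FROM student")
rows_after_loss = len(live.execute("SELECT student_id FROM student").fetchall())

# Recovery: rebuild a working database from the backup copy
restored = sqlite3.connect(":memory:")
backup_copy.backup(restored)
rows_after_recovery = restored.execute(
    "SELECT student_id, name FROM student ORDER BY student_id"
).fetchall()

print(rows_after_loss, rows_after_recovery)
```

In practice the backup would be written to a different storage device rather than to memory,
in line with the hardware discussion earlier in this module.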
Nothing is 100% perfect. Advantages also bring limitations along with them. A database
management system also has some limitations, which can be described as follows:
● The cost of a database management system can be very high. As the number of users
increases, we need to pay more.
● To install a database in a network, high-end hardware and skilled personnel to manage the
network and the database are required.
● As data can be shared through a DBMS, it is difficult to control and keep track of the data
accessed by users. Proper encryption and decryption techniques are required to secure data
over a network.
● Efficient employees are required to handle users and decide policies about data access,
which requires considerable and constant training.
● If the data volume is very high, performance may be poor. Also, when too many users are
using the database at the same time, it may generate traffic on the network and slow down the
response time.
● It will be more complex when a DBMS contains many databases within it. This may
reduce the speed of data access.
Summary
● Data means raw facts. It may be any values, such as integer numbers, float numbers,
characters, dates, images or Boolean values.
● Examples of integer type of data are roll number, form number, order number; float type of
data are salary, balance amount, fees, product price; character type of data are person’s name,
address, qualification, product name; date type of data are birth date, admission date,
retirement date, order date; image type of data are person’s photo, image of property location,
image of property; Boolean type of data are customer status, payment status, gender.
● Interrelated data represent an entity, i.e., data are characteristics of an entity. For example,
student name, student birth date and student gender are data (characteristics) related to the
student entity. An entity is a distinguishable object of the real world.
● Data related to an entity are kept together in a data file, i.e., a data file is a collection of
related data.
● Data may be stored manually or electronically. When we apply any process on stored data,
it gives some valuable information. Data stored electronically can be processed by writing
application programs.
● The data on which we do some operation is known as operational data. Operational data
belongs to an organization. For example, students’ data is operational data for the
‘University’ organization. By processing students’ data, we can generate information such as
a student’s mark sheet, a list of the college-wise total number of students, etc.
● A database is a collection of data files or tables which contain data. Relationships can
be set to access data from different files.
● The process of managing data within a database is called database management.
● A database system contains the components data, users, hardware and software.
● Using a database, we can share and integrate data between applications.
● A database management system is a collection of software programs through which a
database can be managed.
● A file-based management system requires manual creation of data files, which are very
difficult to handle. Within a file-based management system, independent programs must be
written to do operations such as insert, delete, update and view data.
● A database management system provides Structured Query Language to store and access
data from the database. There is no need to write long programs to access data. Data
redundancy and data inconsistency problems can be avoided using a database management
system. A database management system provides automatic transaction management, backup
and recovery facility, export and import facility, user management and other functionalities.
● The limitations of database management systems are: they are complex and expensive, they
require knowledge to use, data control is difficult, performance may suffer because of high
data volume, etc.