1. Introduction to DBMS
Course Overview
Literature
Data, Database, DBMS
Data: Known facts that can be recorded and have an implicit meaning; raw
Database: A highly organized, interrelated, and structured set of data about a
particular enterprise
Controlled by a database management system (DBMS)
DBMS:
A collection of programs that enables users to create and maintain a database
Set of programs to access the data
An environment that is both convenient and efficient to use
A software package/system to facilitate the creation and maintenance of a computerized
database
Database System:
The DBMS software together with the data itself. Sometimes, the applications are also
included.
Mini-world:
Some part of the real world about which data is stored in a database. For example,
student grades and transcripts at a university.
Data warehouses
Mobile databases
Maintenance of the database and associated programs over the lifetime of the
database application
Application Programs and DBMS
Transactions: may read some data and “update” certain values, or generate new
data and store it in the database
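The read-then-update behavior of a transaction can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the original slides: the GRADE_REPORT table layout, the sample values, and the function name change_grade are assumptions chosen for the example.

```python
import sqlite3

# Hypothetical table and sample row, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE GRADE_REPORT ("
    "Student_number INTEGER, Section_identifier INTEGER, Grade TEXT)"
)
conn.execute("INSERT INTO GRADE_REPORT VALUES (17, 112, 'B')")
conn.commit()


def change_grade(conn, student_number, section_identifier, new_grade):
    """Read the current grade, then update it, as one transaction.

    Returns the grade that was stored before the update.
    """
    # The connection used as a context manager commits if the block
    # succeeds and rolls back if an exception is raised inside it.
    with conn:
        row = conn.execute(
            "SELECT Grade FROM GRADE_REPORT "
            "WHERE Student_number = ? AND Section_identifier = ?",
            (student_number, section_identifier),
        ).fetchone()
        if row is None:
            raise LookupError("no such grade report")
        conn.execute(
            "UPDATE GRADE_REPORT SET Grade = ? "
            "WHERE Student_number = ? AND Section_identifier = ?",
            (new_grade, student_number, section_identifier),
        )
        return row[0]


old_grade = change_grade(conn, 17, 112, "A")
```

The application program only states what to read and what to change; committing or rolling back the change is left to the DBMS.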
Simplified database system environment
Example of a Database
Mini-world for the example:
Part of a UNIVERSITY environment
COURSEs
(Academic) DEPARTMENTs
INSTRUCTORs
Example of a Database (continued)
Some mini-world relationships:
SECTIONs are of specific COURSEs
The above entities and relationships are typically expressed in a conceptual data
model, such as the entity-relationship (ER) data model or the UML class model.
Example of a Database (continued)
The relational model
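As a rough sketch of how part of the UNIVERSITY mini-world might look in the relational model, the following uses Python's sqlite3 module to express the relationship "SECTIONs are of specific COURSEs" as a foreign key. The column names and sample values are assumptions made for illustration; the slides do not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical column names; only COURSE and SECTION are modeled here.
conn.executescript("""
CREATE TABLE COURSE (
    Course_number TEXT PRIMARY KEY,
    Course_name   TEXT NOT NULL
);
CREATE TABLE SECTION (
    Section_identifier INTEGER PRIMARY KEY,
    Course_number      TEXT NOT NULL REFERENCES COURSE(Course_number)
);
""")

conn.execute("INSERT INTO COURSE VALUES ('CS1310', 'Intro to Computer Science')")
conn.execute("INSERT INTO SECTION VALUES (85, 'CS1310')")


def section_without_course_is_rejected(conn):
    """Try to store a SECTION that refers to a COURSE that does not exist."""
    try:
        conn.execute("INSERT INTO SECTION VALUES (99, 'NO_SUCH_COURSE')")
    except sqlite3.IntegrityError:
        return True
    return False


def courses_with_sections(conn):
    """Join each SECTION to the COURSE it belongs to."""
    return conn.execute(
        "SELECT s.Section_identifier, c.Course_name "
        "FROM SECTION AS s JOIN COURSE AS c USING (Course_number) "
        "ORDER BY s.Section_identifier"
    ).fetchall()
```

Here the DBMS, rather than each application program, checks that every section belongs to an existing course.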
Main Characteristics of the Database Approach
A number of characteristics distinguish the database approach from the much older approach of programming with files. In
traditional file processing, each user defines and implements the files needed for a specific software application as part of
programming the application. For example, one user, the grade reporting office, may keep files on students and their grades.
Programs to print a student’s transcript and to enter new grades are implemented as part of the application. A second user,
the accounting office, may keep track of students’ fees and their payments. Although both users are interested in data about
students, each user maintains separate files—and programs to manipulate these files—because each requires some data not
available from the other user’s files. This redundancy in defining and storing data results in wasted storage space and in
redundant efforts to maintain common up-to-date data.
In the database approach, a single repository maintains data that is defined once and then accessed by various users. In file
systems, each application is free to name data elements independently. In contrast, in a database, the names or labels of
data are defined once, and used repeatedly by queries, transactions, and applications.
Main Characteristics of the Database Approach (continued)
The main characteristics of the database approach versus the file-processing
approach are the following:
Self-describing nature of a database system
In traditional file processing, the structure of the data files is embedded in the application programs that access them.
If we want to add another piece of data to each STUDENT record, say the Birth_date, such a program will no
longer work and must be changed. By contrast, in a DBMS environment, we only need to change the
description of STUDENT records in the catalog to reflect the inclusion of the new data item Birth_date; no
programs are changed. The next time a DBMS program refers to the catalog, the new structure of STUDENT
records will be accessed and used.
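The Birth_date scenario can be tried out with Python's sqlite3 module. This is a minimal sketch under stated assumptions: the STUDENT columns other than Birth_date, the sample row, and the helper names student_columns and find_name are hypothetical, and SQLite's PRAGMA table_info stands in for the catalog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (Name TEXT, Student_number INTEGER)")
conn.execute("INSERT INTO STUDENT VALUES ('Smith', 17)")


def student_columns(conn):
    """Ask the DBMS catalog which columns STUDENT currently has."""
    # Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute("PRAGMA table_info(STUDENT)")]


def find_name(conn, student_number):
    """An 'application program' that names only the data items it needs."""
    row = conn.execute(
        "SELECT Name FROM STUDENT WHERE Student_number = ?",
        (student_number,),
    ).fetchone()
    return row[0] if row else None


columns_before = student_columns(conn)
name_before = find_name(conn, 17)

# Change only the description of STUDENT records; find_name is not touched.
conn.execute("ALTER TABLE STUDENT ADD COLUMN Birth_date TEXT")

columns_after = student_columns(conn)
name_after = find_name(conn, 17)
```

The program find_name gives the same answer before and after the change, while the catalog reports the new structure of STUDENT records.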
Main Characteristics of the Database Approach (continued)
Data Abstraction
The characteristic that allows program-data independence is called data abstraction.
A DBMS provides users with a conceptual representation of data that does not include
many of the details of how the data is stored or how the operations are implemented.
A data model is a type of data abstraction that is used to provide this conceptual
representation.
Data model hides storage and implementation details that are not of interest to most
database users.
A data model is used to hide storage details and present the users with a conceptual
view of the database.
The data model uses logical concepts, such as objects, their properties, and their
interrelationships, that may be easier for most users to understand than computer
storage concepts.
Programs refer to the data model constructs rather than data storage details.
Main Characteristics of the Database Approach (continued)
End-users
End users are the people whose jobs require access to the database for querying,
updating, and generating reports; the database primarily exists for their use.
They use the data for queries and reports, and some of them update the database content.
System Analysts and Application Programmers
System analysts: System analysts determine the requirements of end users and develop
specifications for standard transactions that meet these requirements.
Application programmers: Application programmers implement these specifications
(developed by analysts) as programs and test and debug them before deployment.
Advantages of Using the Database Approach
Controlling redundancy in data storage and in development and maintenance efforts
Sharing of data among multiple users.
In traditional software development utilizing file processing, every user group maintains its own files for
handling its data-processing applications. For example, consider the UNIVERSITY database example. Here, two
groups of users might be the course registration personnel and the accounting office. In the traditional
approach, each group independently keeps files on students. The accounting office keeps data on registration
and related billing information, whereas the registration office keeps track of student courses and grades.
Other groups may further duplicate some or all of the same data in their own files.
This redundancy in storing the same data multiple times leads to several problems. First, there is the need to
perform a single logical update - such as entering data on a new student - multiple times: once for each file
where student data is recorded. This leads to duplication of effort. Second, storage space is wasted when the
same data is stored repeatedly, and this problem may be serious for large databases. Third, files that represent
the same data may become inconsistent. This may happen because an update is applied to some of the files
but not to others. Even if an update - such as adding a new student - is applied to all the appropriate files, the
data concerning the student may still be inconsistent because the updates are applied independently by each
user group. For example, one user group may enter a student’s birth date erroneously as ‘JAN-19-1988’,
whereas the other user groups may enter the correct value of ‘JAN-29-1988’.
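The inconsistency problem described above can be illustrated with a small Python sketch. The two dictionaries stand in for the separate files kept by two user groups; the student number, the name, and the function names are assumptions made for the example, while the two birth dates are the ones quoted in the text.

```python
# Each user group keeps its own copy of the student data
# (hypothetical in-memory stand-ins for separate files).
registration_file = {}
accounting_file = {}


def enter_new_student(file, student_number, name, birth_date):
    """Record a new student in one group's file."""
    file[student_number] = {"Name": name, "Birth_date": birth_date}


# One logical update has to be performed once per file, independently.
enter_new_student(registration_file, 17, "Smith", "JAN-29-1988")
enter_new_student(accounting_file, 17, "Smith", "JAN-19-1988")  # typing error


def inconsistent_students(*files):
    """Return the student numbers whose records differ between the files,
    including students that are missing from some of the files."""
    numbers = set().union(*(f.keys() for f in files))
    return sorted(
        n for n in numbers
        if any(f.get(n) != files[0].get(n) for f in files)
    )


conflicts = inconsistent_students(registration_file, accounting_file)
```

Nothing in the file-based setup prevents the two copies from diverging; the mismatch is only found if someone compares the files afterwards.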
Advantages of Using the Database Approach (continued)
In the database approach, the views of different user groups are integrated during database design. Ideally, we should
have a database design that stores each logical data item - such as a student’s name or birth date - in only one place in
the database. This is known as data normalization, and it ensures consistency and saves storage space. However, in
practice, it is sometimes necessary to use controlled redundancy to improve the performance of queries. For example, we
may store Student_name and Course_number redundantly in a GRADE_REPORT file because whenever we retrieve a
GRADE_REPORT record, we want to retrieve the student name and course number along with the grade, student number,
and section identifier. By placing all the data together, we do not have to search multiple files to collect this data. This is
known as denormalization. In such cases, the DBMS should have the capability to control this redundancy in order to
prohibit inconsistencies among the files.
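One possible way for a DBMS to control such redundancy is a trigger that propagates changes to the redundant copy. The sketch below uses Python's sqlite3 module; the STUDENT table layout, the trigger name sync_student_name, and the sample values are assumptions for illustration, and only the Student_name copy is kept in sync here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (
    Student_number INTEGER PRIMARY KEY,
    Name           TEXT NOT NULL
);
CREATE TABLE GRADE_REPORT (
    Student_number     INTEGER NOT NULL,
    Student_name       TEXT NOT NULL,   -- redundant copy of STUDENT.Name
    Section_identifier INTEGER NOT NULL,
    Course_number      TEXT NOT NULL,   -- redundant copy, not synced in this sketch
    Grade              TEXT
);
-- Whenever a student's name changes, update every redundant copy of it.
CREATE TRIGGER sync_student_name
AFTER UPDATE OF Name ON STUDENT
BEGIN
    UPDATE GRADE_REPORT
       SET Student_name = NEW.Name
     WHERE Student_number = NEW.Student_number;
END;
""")

conn.execute("INSERT INTO STUDENT VALUES (17, 'Smith')")
conn.execute("INSERT INTO GRADE_REPORT VALUES (17, 'Smith', 112, 'MATH2410', 'B')")

# A single logical update; the trigger keeps GRADE_REPORT consistent.
conn.execute("UPDATE STUDENT SET Name = 'Smith-Jones' WHERE Student_number = 17")

report_names = [
    row[0]
    for row in conn.execute(
        "SELECT Student_name FROM GRADE_REPORT WHERE Student_number = 17"
    )
]
```

A grade report can still be read from one table without a join, but the redundant name is no longer maintained by hand.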