1-Introduction To DBMS

Data base in management system notes.

Uploaded by

flagpro077
Copyright
© © All Rights Reserved

CHAPTER I: INTRODUCTION TO DATABASES AND TRANSACTIONS

Topic Covered:
Introduction to Databases and Transactions
What is a database system, purpose of database systems, view of data, relational databases, database architecture,
transaction management.

Data: Raw facts that are processed by a computer.
Database: A collection of tables that contain data.

Database Management System (DBMS):


A database-management system (DBMS) is a collection of interrelated data and a set of programs to access those data.
---------------------------------------------------------------------------------------------------------
Q. What is Database Management System (DBMS)?

 A database-management system (DBMS) is a collection of interrelated data and a set of programs to access those data.
The collection of data, usually referred to as the database, contains information relevant to an enterprise.
 The primary goal of a DBMS is to provide a way to store and retrieve database information that is both convenient and
efficient.
 Database systems are designed to manage large bodies of information. Management of data involves both defining
structures for storage of information and providing mechanisms for the manipulation of information. In addition, the
database system must ensure the safety of the information stored, despite system crashes or attempts at unauthorized
access.
 If data are to be shared among several users, the system must avoid possible anomalous results.
 A DBMS is a software program that is used to access the data stored in a database in an
organized manner.
 A database is a structured collection of data that can be stored in digital form. Before the actual data is
stored in the database, we should clearly specify the schema of the database and the techniques used to manipulate
the data stored in it.
 A DBMS should not care only about the insertion and modification of data in the database. It should also
protect the data stored in the database from unauthorized access.
 A DBMS must provide efficient techniques to protect the data from accidental system crashes.

Q. What are the Applications of Database Management System?

 Banking: For customer information, accounts, loans, and banking transactions.


 Airlines: For reservations and schedule information. Airlines were among the first to use databases in a
geographically distributed manner.
 Universities: For student information, course registrations, and grades.
 Credit card transactions: For purchases on credit cards and generation of monthly statements.
 Telecommunication: For keeping records of calls made, generating monthly bills, maintaining balances on prepaid calling
cards, and storing information about the communication networks.
 Finance: For storing information about holdings, sales, and purchases of financial instruments such as stocks and bonds;
also for storing real-time market data to enable on-line trading by customers and automated trading by the firm.
 Sales: For customer, product, and purchase information.
 On-line retailers: For the sales data noted above plus on-line order tracking, generation of recommendation lists, and
maintenance of on-line product evaluations.
 Manufacturing: For management of the supply chain and for tracking production of items in factories, inventories of
items in warehouses and stores, and orders for items.
 Human resources: For information about employees, salaries, payroll, taxes, and benefits, and for generation of paychecks.

Q. Explain the purpose of Database Management Systems. OR


Q. What are the Drawbacks of File Processing System?
----------------------------------------------------------------------
• Data redundancy and inconsistency:
 Since different programmers create the files and application programs over a long period, the various files are likely
to have different structures and the programs may be written in several programming languages.
 Therefore, the same information may be duplicated in several places (files).
 This redundancy leads to higher storage and access cost.
 In addition, it may lead to data inconsistency; that is, the various copies of the same data may no longer agree.
• Difficulty in accessing data:
 The conventional file-processing environments do not allow needed data to be retrieved in a convenient and efficient
manner.
 For example, suppose that one of the university clerks needs to find out the names of all students who live within a
particular postal-code area. The clerk asks the data-processing department to generate such a list.
 Because the designers of the original system did not anticipate this request, there is no application program on hand
to meet it. There is, however, an application program to generate the list of all students.
 The university clerk now has two choices: either obtain the list of all students and extract the needed information
manually, or ask a programmer to write the necessary application program. Both alternatives are obviously
unsatisfactory.
 Therefore, extraction of the required data is difficult.
• Data isolation:
 Because data are scattered in various files, and files may be in different formats, writing new application programs
to retrieve the appropriate data is difficult.
• Integrity problems:
 The data values stored in the database must satisfy certain types of consistency constraints.
 Suppose the university maintains an account for each department, and records the balance amount in each account.
 Suppose also that the university requires that the account balance of a department may never fall below zero.
 Developers enforce these constraints in the system by adding appropriate code in the various application
programs.
 However, when new constraints are added, it is difficult to change the programs to enforce them.
 The problem is compounded when constraints involve several data items from different files.
• Atomicity problems:
 A computer system, like any other device, is subject to failure.
 In many applications, it is crucial that, if a failure occurs, the data be restored to the consistent state that
existed prior to the failure.
 For example, a transfer of funds from one account to another must happen in its entirety or not at all. This
all-or-nothing requirement is called atomicity.
 It is difficult to ensure atomicity in a conventional file-processing system.
• Concurrent-access anomalies:
 To improve the overall performance of the system and obtain faster response, many systems allow multiple users to update
the data simultaneously.
 In such an environment, interaction of concurrent updates is possible and may result in inconsistent data.
• Security problems:
 Not every user of the database system should be able to access all the data.
 Since application programs are added to the file-processing system in an ad hoc manner, enforcing such security
constraints is difficult.
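
The clerk's postal-code request above shows what a DBMS changes: the request becomes a one-line query rather than a new application program. The following is a minimal sketch using Python's built-in sqlite3 module; the `student` table, its columns, and the sample rows are assumptions invented for illustration and do not come from these notes.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, postal_code TEXT)"
)
conn.executemany(
    "INSERT INTO student (id, name, postal_code) VALUES (?, ?, ?)",
    [(1, "Asha", "411001"), (2, "Ravi", "411002"), (3, "Meena", "411001")],
)


def students_in_postal_code(connection, postal_code):
    """Return the names of students living in the given postal-code area."""
    rows = connection.execute(
        "SELECT name FROM student WHERE postal_code = ? ORDER BY name",
        (postal_code,),
    ).fetchall()
    return [name for (name,) in rows]


# The unanticipated request is answered without writing a new file-scanning program.
names = students_in_postal_code(conn, "411001")
print(names)
```

The same table could answer many other unanticipated questions by changing only the query text.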
------------------------------------------------------------------------------------------------------
Q. What are the advantages of Database Management System?
The different advantages of DBMS are as follows.
1. Improved data sharing.
The DBMS helps create an environment in which end users have better access to more and better-managed data. Such access makes it
possible for end users to respond quickly to changes in their environment.
2. Improved data security.
The more users access the data, the greater the risks of data security breaches. Corporations invest considerable amounts of
time, effort, and money to ensure that corporate data are used properly. A DBMS provides a framework for better enforcement
of data privacy and security policies.
3. Better data integration.
Wider access to well-managed data promotes an integrated view of the organization's operations and a clearer view of the big
picture. It becomes much easier to see how actions in one segment of the company affect other segments.
4. Minimized data inconsistency.
Data inconsistency exists when different versions of the same data appear in different places.
The probability of data inconsistency is greatly reduced in a properly designed database.
5. Improved data access.
The DBMS makes it possible to produce quick answers to ad hoc queries. From a database perspective, a query is a specific
request issued to the DBMS for data manipulation—for example, to read or update the data. Simply put, a query is a question,
and an ad hoc query is a spur-of-the-moment question. The DBMS sends back an answer (called the query result set) to the
application.
6. Improved decision making.
Better-managed data and improved data access make it possible to generate better-quality information, on which better
decisions are based. The quality of the information generated depends on the quality of the underlying data. Data quality is a
comprehensive approach to promoting the accuracy, validity, and timeliness of the data. While the DBMS does not guarantee
data quality, it provides a framework to facilitate data quality initiatives.
7. Increased end-user productivity.
The availability of data, combined with the tools that transform data into usable information, empowers end users to make quick,
informed decisions that can make the difference between success and failure in the global economy.

Q. What are the Disadvantages of Database Management System?


1. Increased costs.
Database systems require sophisticated hardware and software and highly skilled personnel. The cost of maintaining the
hardware, software, and personnel required to operate and manage a database system can be substantial. Training, licensing, and
regulation compliance costs are often overlooked when database systems are implemented.
2. Management complexity.
Database systems interface with many different technologies and have a significant impact on a company's resources and culture.
The changes introduced by the adoption of a database system must be properly managed to ensure that they help advance the
company's objectives. Given the fact that database systems hold crucial company data that are accessed from multiple sources,
security issues must be assessed constantly.
3. Maintaining currency.
To maximize the efficiency of the database system, you must keep your system current. Therefore, you must perform frequent
updates and apply the latest patches and security measures to all components. Because database technology advances rapidly,
personnel training costs tend to be significant.
4. Vendor dependence.
Given the heavy investment in technology and personnel training, companies might be reluctant to change database vendors.
As a consequence, vendors are less likely to offer pricing point
advantages to existing customers, and those customers might be limited in their choice of database system components.
5. Frequent upgrade/replacement cycles.
DBMS vendors frequently upgrade their products by adding new functionality. Such new features often come bundled in new
upgrade versions of the software. Some of these versions require hardware upgrades. Not only do the upgrades themselves cost
money, but it also costs money to train database users and administrators to properly use and manage the new features.

Q. Differentiate between DBMS and File System.


No. | File processing system | DBMS
1 | A high rate of data redundancy exists in a typical file processing system. | Redundancy is reduced.
2 | There is inconsistency of data. | Inconsistency of data is reduced.
3 | No provision for data security. | Provision for data security is made.
4 | No standard representation of data. | Standard representation of data is achieved using the relational data model.
5 | Data integrity is not there. | Data integrity is there.
6 | Data cannot be accessed easily. | Data can be accessed easily through a table structure (rows and columns).
7 | Data cannot be shared. | Data can be shared.
8 | Retrieval of data is time consuming. | Retrieval of data is easy.

View of Data
A major purpose of a database system is to provide users with an abstract view of the data.
 Data Abstraction
Since many database-system users are not computer trained, developers hide the complexity from users through several levels of
abstraction, to simplify users' interactions with the system:

Figure 1.1 The three levels of data abstraction.

• Physical level:
 The lowest level of abstraction describes how the data are actually stored.
 The physical level describes complex low-level data structures in detail.
• Logical level:
 The next-higher level of abstraction describes what data are stored in the database, and what relationships exist
among those data.
 The logical level thus describes the entire database in terms of a small number of relatively simple structures.
 Although implementation of the simple structures at the logical level may involve complex physical-level
structures, the user of the logical level does not need to be aware of this complexity. This is referred to as
physical data independence.
 Database administrators, who must decide what information to keep in the database, use the logical level of
abstraction.
• View level:
 The highest level of abstraction describes only part of the entire database. Even though the logical level uses
simpler structures, complexity remains because of the variety of information stored in a large database.
 Many users of the database system do not need all this information; instead, they need to access only a part of
the database. The view level of abstraction exists to simplify their interaction with the system. The system may
provide many views for the same database.
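
The view level can be sketched with an SQL view. This is a minimal illustration using Python's built-in sqlite3 module; the `instructor` table, the `instructor_public` view, and the rows are hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical table, view, and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")

# Logical level: the full relation, including a sensitive column.
conn.execute(
    "CREATE TABLE instructor ("
    "id INTEGER PRIMARY KEY, name TEXT, dept_name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Physics", 72000.0), (2, "Ravi", "History", 65000.0)],
)

# View level: expose only the part of the database that a group of users needs.
conn.execute(
    "CREATE VIEW instructor_public AS SELECT id, name, dept_name FROM instructor"
)

cursor = conn.execute("SELECT * FROM instructor_public ORDER BY id")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns)
print(view_rows)
```

Users of `instructor_public` never see the `salary` column, and several such views can be defined over the same table.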

Instances and Schemas

Schemas and instances are similar to types and variables in programming languages.
• Schema:
 The overall design of the database is called the database schema; it describes the logical structure of the database.
 Example: The database consists of information about a set of customers and accounts and the
relationship between them.
 A schema is analogous to the type information of a variable in a program.
 Physical schema: the database design at the physical level.
 Logical schema: the database design at the logical level.
• Instance:
 The collection of information stored in the database at a particular moment is called an
instance of the database; it is the actual content of the database at a particular point in time.
 An instance is analogous to the value of a variable.
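
The distinction can be made concrete with a small sketch, assuming a hypothetical `account` table in SQLite (via Python's built-in sqlite3 module). Inserting a row changes the instance but leaves the schema untouched.

```python
import sqlite3

# Hypothetical table and row, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY, balance REAL)")


def schema_of(connection, table):
    """Return (column name, declared type) pairs: the schema of the table."""
    # PRAGMA table_info yields one row per column: (cid, name, type, notnull, default, pk).
    return [(row[1], row[2]) for row in connection.execute(f"PRAGMA table_info({table})")]


def instance_of(connection, table):
    """Return the rows currently stored: the instance at this moment."""
    return connection.execute(f"SELECT * FROM {table} ORDER BY 1").fetchall()


schema_before = schema_of(conn, "account")
instance_before = instance_of(conn, "account")

conn.execute("INSERT INTO account VALUES ('A-101', 500.0)")

schema_after = schema_of(conn, "account")      # unchanged, like a variable's type
instance_after = instance_of(conn, "account")  # changed, like a variable's value
```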

Data Independence
Data independence can be classified into two types
1. Logical data independence
 It is the ability to modify the conceptual schema without affecting the existing external schemas.
 In logical data independence, the users are shielded from changes in the logical structure of the data or changes in the
choice of relations to be stored.
 The changes to the conceptual schema, such as the addition and deletion of entities, addition and deletion of attributes, or
addition and deletion of relationships must be possible without changing existing external schemas or having to rewrite
application programs.
 Only the view definition and the mapping need be changed in a DBMS that supports logical data independence.
2. Physical data independence
 The ability to modify the internal schema without having to change the conceptual or external schemas is called physical
data independence.
 In physical data independence, the conceptual schema insulates the users from changes in the physical storage of the data.
 The changes to the internal schema, such as using different file organizations or storage structures, using different storage
devices, or modifying indexes or hashing algorithms, must be possible without changing the conceptual or external schemas.
 In other words, physical data independence indicates that the physical storage structures or devices used for storing the data
could be changed without necessitating a change in the conceptual view or any of the external views.
 Note:
Logical data independence is more difficult to achieve than physical data independence, because it requires flexibility in the
design of the database and the programmer has to anticipate future requirements or modifications in the design of the
database.
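
Both kinds of independence can be sketched in SQLite through Python's built-in sqlite3 module. The table, view, and index names below are hypothetical. The application query is written once against a view (the external schema) and keeps returning the same rows after a physical change (a new index) and after a logical change (a new attribute).

```python
import sqlite3

# Hypothetical table, view, index, and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, dept_name TEXT)")
conn.executemany(
    "INSERT INTO instructor (id, name, dept_name) VALUES (?, ?, ?)",
    [(1, "Asha", "Physics"), (2, "Ravi", "History"), (3, "Meena", "Physics")],
)

# External schema: the view that application programs query.
conn.execute(
    "CREATE VIEW physics_staff AS "
    "SELECT id, name FROM instructor WHERE dept_name = 'Physics'"
)

QUERY = "SELECT name FROM physics_staff ORDER BY name"
baseline = conn.execute(QUERY).fetchall()

# Physical-level change: add an index, which is a new access path on disk.
conn.execute("CREATE INDEX idx_instructor_dept ON instructor (dept_name)")
after_physical_change = conn.execute(QUERY).fetchall()

# Logical-level change: add an attribute to the conceptual schema.
conn.execute("ALTER TABLE instructor ADD COLUMN salary REAL")
after_logical_change = conn.execute(QUERY).fetchall()
```

Adding an attribute is the easy case of a logical change. Removing or splitting a relation would require redefining the view, which is the "view definition and mapping" work mentioned above.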

Database Languages
 A database system provides a data-definition language to specify the database
schema and a data-manipulation language to express database queries and updates.
 In practice, the data-definition and data-manipulation languages are not two separate languages; instead they
simply form parts of a single database language, such as the widely used SQL language.
 Data-Manipulation Language
A data-manipulation language (DML) is a language that enables users to access or manipulate data as organized by the
appropriate data model. The types of access are:
• Retrieval of information stored in the database
• Insertion of new information into the database
• Deletion of information from the database
• Modification of information stored in the database
There are basically two types:
• Procedural DMLs require a user to specify what data are needed and how to get those data.
• Declarative DMLs (also referred to as nonprocedural DMLs) require a user to specify what
data are needed without specifying how to get those data.
Declarative DMLs are usually easier to learn and use than are procedural DMLs.
A query is a statement requesting the retrieval of information. The portion of a DML that involves information retrieval is called
a query language.
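
The difference between the two styles can be sketched as follows. The procedural version spells out how to scan the records, while the declarative version states only what is wanted and leaves the "how" to the database system. The `instructor` data and the function names are assumptions made for this illustration; plain Python stands in for a procedural DML and SQL (through the built-in sqlite3 module) for a declarative one.

```python
import sqlite3

# Hypothetical rows: (id, name, salary), invented for illustration only.
RECORDS = [(1, "Asha", 72000.0), (2, "Ravi", 65000.0), (3, "Meena", 90000.0)]


def high_earners_procedural(records, threshold):
    """Procedural style: say what is needed AND how to get it (scan, test, collect)."""
    result = []
    for _id, name, salary in records:
        if salary > threshold:
            result.append(name)
    return sorted(result)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?)", RECORDS)


def high_earners_declarative(connection, threshold):
    """Declarative style: say only what is needed; the DBMS decides how to get it."""
    rows = connection.execute(
        "SELECT name FROM instructor WHERE salary > ? ORDER BY name", (threshold,)
    )
    return [name for (name,) in rows]


procedural_result = high_earners_procedural(RECORDS, 70000)
declarative_result = high_earners_declarative(conn, 70000)
```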
 Data-Definition Language
 We specify a database schema by a set of definitions expressed by a special language called a data-definition language
(DDL). The DDL is also used to specify additional properties of the data.
 We specify the storage structure and access methods used by the database system by a set of statements in a special type
of DDL called a data storage and definition language. These statements define the implementation details of the
database schemas, which are usually hidden from the users.
 The data values stored in the database must satisfy certain consistency constraints.
 For example, suppose the university requires that the account balance of a department must never be negative.
 The DDL provides facilities to specify such constraints. The database system checks these constraints every time the
database is updated.
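
As a minimal sketch of the balance rule above, the constraint can be declared once in the DDL with a CHECK clause, after which the database system rejects any update that violates it. The example uses SQLite through Python's built-in sqlite3 module; the `department` table and its values are hypothetical.

```python
import sqlite3

# Hypothetical table and values, invented for illustration only.
conn = sqlite3.connect(":memory:")

# DDL: the consistency constraint is part of the schema definition.
conn.execute(
    "CREATE TABLE department ("
    "dept_name TEXT PRIMARY KEY, "
    "balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO department VALUES ('Physics', 100.0)")
conn.commit()

# An update that would make the balance negative is checked and rejected.
try:
    conn.execute(
        "UPDATE department SET balance = balance - 250 WHERE dept_name = 'Physics'"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

balance = conn.execute(
    "SELECT balance FROM department WHERE dept_name = 'Physics'"
).fetchone()[0]
```

Because the rule lives in the schema, no application program has to repeat the check, which addresses the integrity problem described for file-processing systems.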

Relational Databases
 A relational database is based on the relational model and uses a collection of tables to represent both data and the
relationships among those data. It also includes a DML and DDL.

Fig. Relational database terminology.

Fig. Example of tabular data in the relational model.

Advantages:
1. Ease of use: The representation of information as tables consisting of rows and columns is much easier to understand.
2. Flexibility: Different tables from which information has to be linked and extracted can be easily manipulated by
operators such as project and join to give information in the form in which it is desired.

3. Precision: The usage of relational algebra and relational calculus in the manipulation of the relations between the tables
ensures that there is no ambiguity, which may otherwise arise in establishing the linkages in a complicated network type
database.

4. Security: Security control and authorization can also be implemented more easily by moving sensitive attributes in a
given table into a separate relation with its own authorization controls. If authorization requirement permits, a particular
attribute could be joined back with others to enable full information retrieval.
5. Data Independence: Data independence is achieved more easily with the normalized structure used in a relational
database than in the more complicated tree or network structure.
6. Data Manipulation Language: Responding to queries by means of a language based on relational
algebra and relational calculus, e.g., SQL, is easy in the relational database approach. For data organized in other structures,
the query language either becomes complex or is extremely limited in its capabilities.
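
The join and project operators mentioned under Flexibility can be sketched in SQL as below, run through Python's built-in sqlite3 module. The `customer` and `account` tables and their rows are hypothetical, chosen only to show two tables being linked on a shared column.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute(
    "CREATE TABLE account ("
    "account_number TEXT PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customer (customer_id), "
    "balance REAL)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Mumbai")],
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("A-101", 1, 500.0), ("A-102", 1, 1200.0), ("A-201", 2, 300.0)],
)

# Join: link the two tables on customer_id.
# Project: keep only the name and balance columns in the result.
joined = conn.execute(
    "SELECT c.name, a.balance "
    "FROM customer AS c JOIN account AS a ON a.customer_id = c.customer_id "
    "ORDER BY c.name, a.account_number"
).fetchall()
print(joined)
```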

Disadvantages:
1. Performance: A major constraint, and therefore disadvantage, in the use of a relational database system is machine
performance. If the number of tables between which relationships must be established is large, and the tables themselves
are large, performance in responding to SQL queries suffers.

2. Physical Storage Consumption: With an interactive system, the cost of an operation such as a join also depends upon the
physical storage layout. It is therefore common to tune relational databases, choosing the physical
data layout so as to give good performance in the most frequently run operations. A natural consequence is that
the less frequently run operations tend to become even slower.
3. Slow extraction of meaning from data: If the data is naturally organized in a hierarchical manner and stored as such,
the hierarchical approach may give quick meaning for that data, whereas a relational representation of the same data may
need several joins to recover it.
Q. Write a Note on Database Architecture.
 Database Architecture

Data Storage and Querying

A database system is partitioned into modules that deal with each of the responsibilities
of the overall system. The functional components of a database system can be broadly divided into the storage manager and the
query processor components.
The storage manager is important because databases typically require a large amount of storage space. Corporate databases range in
size from hundreds of gigabytes to, for the largest databases, terabytes of data. A gigabyte is approximately 1000 megabytes
(more precisely 1024 megabytes, or about 1 billion bytes), and a terabyte is approximately 1 million megabytes (about 1 trillion bytes). Since the main memory of computers
cannot store this much information, the information is stored on disks. Data are moved between disk
storage and main memory as needed. Since the movement of data to and from disk is slow relative to the speed of the central
processing unit, it is imperative that the database system structure the data so as to minimize the need
to move data between disk and main memory.
The query processor is important because it helps the database system to simplify and facilitate access to data. The query processor
allows database users to obtain good performance while being able to work at the view level and not be burdened with
understanding the physical-level details of the implementation of the system. It is the job of the database system to translate
updates and queries written in a nonprocedural language, at the logical level, into an efficient sequence of operations at the
physical level.
1.7.1 Storage Manager
 The storage manager is the component of a database system that provides the interface between the low-level data
stored in the database and the application programs and queries submitted to the system.
 The storage manager is responsible for the interaction with the file manager. The raw data are stored on the disk
using the file system provided by the operating system. The storage manager translates the various DML statements
into low-level file-system commands.
 Thus, the storage manager is responsible for storing, retrieving, and updating data in the database.
 The storage manager components include:
• Authorization and integrity manager, which tests for the satisfaction of integrity constraints and
checks the authority of users to access data.
• Transaction manager, which ensures that the database remains in a consistent (correct) state despite
system failures, and that concurrent transaction executions proceed without conflicting.
• File manager, which manages the allocation of space on disk storage and the data structures used to
represent information stored on disk.
• Buffer manager, which is responsible for fetching data from disk storage into main memory, and
deciding what data to cache in main memory. The buffer manager is a critical part of the database
system, since it enables the database to handle data sizes that are much larger than the size of main
memory.
 The storage manager implements several data structures as part of the physical system implementation:
• Data files, which store the database itself.
• Data dictionary, which stores metadata about the structure of the database, in particular the schema of
the database.
• Indices, which can provide fast access to data items. Like the index in this textbook, a database index
provides pointers to those data items that hold a particular value. For example, we could use an index
to find the instructor record with a particular ID, or all instructor records with a particular name.
 Hashing is an alternative to indexing that is faster in some but not all cases.
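
The idea of an index as "pointers to those data items that hold a particular value" can be sketched in a few lines of plain Python. This is only an in-memory illustration, not how a storage manager lays out indices on disk. The record fields, sample values, and function names are hypothetical, and because a Python dict is a hash table, the sketch is closer in spirit to hashing than to an ordered index.

```python
from collections import defaultdict

# Hypothetical instructor records, invented for illustration only.
INSTRUCTORS = [
    {"id": "101", "name": "Asha"},
    {"id": "102", "name": "Ravi"},
    {"id": "103", "name": "Asha"},
]


def build_index(records, key):
    """Map each value of `key` to the positions of the records holding that value."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[key]].append(position)
    return index


def lookup(records, index, value):
    """Follow the index pointers instead of scanning every record."""
    return [records[position] for position in index.get(value, [])]


name_index = build_index(INSTRUCTORS, "name")
id_index = build_index(INSTRUCTORS, "id")

by_name = lookup(INSTRUCTORS, name_index, "Asha")  # all records with a given name
by_id = lookup(INSTRUCTORS, id_index, "102")       # the record with a given ID
```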
1.7.2 The Query Processor
The query processor components include:
o DDL interpreter, which interprets DDL statements and records the definitions in the data dictionary.
o DML compiler, which translates DML statements in a query language into an evaluation plan
consisting of low-level instructions that the query evaluation engine understands.
A query can usually be translated into any of a number of alternative evaluation plans that all give the same
result. The DML compiler also performs query optimization; that is, it picks the lowest cost evaluation plan
from among the alternatives.
o Query evaluation engine, which executes low-level instructions generated by the DML compiler.
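
One way to observe evaluation plans is SQLite's `EXPLAIN QUERY PLAN` statement, used here through Python's built-in sqlite3 module. This is a sketch of one particular system's behaviour, not a description of every DBMS. The table and index names are hypothetical, and the exact wording of the plan text varies between SQLite versions, so the code only looks for the index name in it.

```python
import sqlite3

# Hypothetical table and index names, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, dept_name TEXT)")

QUERY = "SELECT id FROM instructor WHERE name = 'Asha'"


def plan_details(connection, sql):
    """Return the human-readable detail column of each EXPLAIN QUERY PLAN row."""
    return [row[-1] for row in connection.execute("EXPLAIN QUERY PLAN " + sql)]


# With no index available, the only plan is a scan of the whole table.
plan_without_index = plan_details(conn, QUERY)

# Once an index exists, the optimizer can choose a cheaper plan for the same query.
conn.execute("CREATE INDEX idx_instructor_name ON instructor (name)")
plan_with_index = plan_details(conn, QUERY)

uses_index_before = any("idx_instructor_name" in detail for detail in plan_without_index)
uses_index_after = any("idx_instructor_name" in detail for detail in plan_with_index)
```

The query text is identical in both cases; only the plan chosen for it changes.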
Database Users:
Users are differentiated by the way they expect to interact with the system.
• Application programmers: interact with the system through DML calls.
• Sophisticated users: form requests in a database query language.
• Specialized users: write specialized database applications that do not fit into the traditional data-processing
framework.
• Naive users: invoke one of the permanent application programs that have been written previously.
 Examples: people accessing a database over the web, bank tellers, clerical staff.
• Database administrator: coordinates all the activities of the database system; the database administrator has a good
understanding of the enterprise's information resources and needs.
 The database administrator's duties include:
 Schema definition
 Storage structure and access method definition
 Schema and physical organization modification
 Granting user authority to access the database
 Specifying integrity constraints
 Acting as liaison with users
 Monitoring performance and responding to changes in requirements

Transaction Management
 A transaction is a collection of operations that performs a single logical function in a database application. Each transaction
is a unit of both atomicity and consistency. Thus, we require that transactions do not violate any database consistency
constraints. That is, if the database was consistent when a transaction started, the database must be consistent when the
transaction successfully terminates.

Fig. SaleCo database relational diagram


Transaction Properties
Each individual transaction must display atomicity, consistency, isolation, and durability. These properties are sometimes referred to
as the ACID test. In addition, when executing multiple transactions, the DBMS must schedule the concurrent execution of the
transactions' operations.
The schedule of such transactions' operations must exhibit the property of serializability. Let's look briefly at each of the properties.
 Atomicity requires that all operations (SQL requests) of a transaction be completed; if not, the transaction is aborted. If a
transaction T1 has four SQL requests, all four requests must be successfully completed; otherwise, the entire transaction is
aborted. In other words, a transaction is treated as a single, indivisible, logical unit of work.
 Consistency indicates the permanence of the database's consistent state. A transaction takes a database from one consistent
state to another consistent state. When a transaction is completed, the database must be in a consistent state; if any of the
transaction parts violates an integrity constraint, the entire transaction is aborted.
 Isolation means that the data used during the execution of a transaction cannot be used by a second transaction until the first
one is completed. In other words, if a transaction T1 is being executed and is using the data item X, that data item cannot be
accessed by any other transaction (T2 ... Tn) until T1 ends. This property is particularly useful in multiuser database
environments because several users can access and update the database at the same time.
 Durability ensures that once transaction changes are done (committed), they cannot be undone or lost, even in the event of a
system failure.
 Serializability ensures that the schedule for the concurrent execution of the transactions yields consistent results. This property
is important in multiuser and distributed databases, where multiple transactions are likely to be executed concurrently. Naturally,
if only a single transaction is executed, serializability is not an issue.
 A single-user database system automatically ensures serializability and isolation of the database because only one transaction is
executed at a time. The atomicity, consistency, and durability of transactions must be guaranteed by the single-user DBMSs.
(Even a single-user DBMS must manage recovery from errors created by operating-system-induced interruptions, power
interruptions, and improper application execution.)
 Multiuser databases are typically subject to multiple concurrent transactions. Therefore, the multiuser DBMS must implement
controls to ensure serializability and isolation of transactions, in addition to atomicity and durability, to guard the database's
consistency and integrity. For example, if several concurrent transactions are executed over the same data set and the second
transaction updates the database before the first transaction is finished, the isolation property is violated and the database is no
longer consistent. The DBMS must manage the transactions by using concurrency control techniques to avoid such undesirable
situations.
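
Atomicity and consistency can be sketched with a funds transfer in SQLite, using Python's built-in sqlite3 module. The `account` table, the account numbers, and the `transfer` function are hypothetical. The transfer credits one account and debits another; if the debit would violate the non-negative balance constraint, the whole transaction is rolled back, so the credit is undone as well.

```python
import sqlite3

# Hypothetical table and accounts, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "account_number TEXT PRIMARY KEY, "
    "balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100.0), ("B", 50.0)])
conn.commit()


def transfer(connection, source, target, amount):
    """Move `amount` from source to target as one all-or-nothing unit of work."""
    try:
        connection.execute(
            "UPDATE account SET balance = balance + ? WHERE account_number = ?",
            (amount, target),
        )
        connection.execute(
            "UPDATE account SET balance = balance - ? WHERE account_number = ?",
            (amount, source),
        )
        connection.commit()    # both updates become permanent together
        return True
    except sqlite3.IntegrityError:
        connection.rollback()  # undo the credit that already happened
        return False


def balances(connection):
    """Return the current balance of every account as a dict."""
    return dict(connection.execute("SELECT account_number, balance FROM account"))


succeeded = transfer(conn, "A", "B", 30.0)   # allowed: A has enough funds
failed = transfer(conn, "A", "B", 500.0)     # rejected: A would go negative
final_balances = balances(conn)
```

The sketch runs on a single connection, so it does not exercise isolation or serializability, which only arise with concurrent transactions.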
Transaction Management with SQL
The American National Standards Institute (ANSI) has defined standards that govern SQL database transactions.
Transaction support is provided by two SQL statements: COMMIT and ROLLBACK. The ANSI standards require that when a
transaction sequence is initiated by a user or an application program, the sequence must continue through all succeeding SQL
statements until one of the following four events occurs:
 A COMMIT statement is reached, in which case all changes are permanently recorded within the database. The COMMIT
statement automatically ends the SQL transaction.
 A ROLLBACK statement is reached, in which case all changes are aborted and the database is rolled back to its previous
consistent state.
 The end of a program is successfully reached, in which case all changes are permanently recorded within the database. This
action is equivalent to COMMIT.
 The program is abnormally terminated, in which case the changes made in the database are aborted and the database is
rolled back to its previous consistent state. This action is equivalent to ROLLBACK.
The use of COMMIT is illustrated in the following simplified sales example, which updates a product's quantity on hand.
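
A minimal sketch of such an update, using SQLite through Python's built-in sqlite3 module, is given below. The `product` table, its `p_code` and `p_qoh` (quantity on hand) columns, and the product code are assumptions made for this illustration. In this module's default mode the UPDATE implicitly opens a transaction, `conn.commit()` issues COMMIT, and `conn.rollback()` issues ROLLBACK.

```python
import sqlite3

# Hypothetical PRODUCT table and values; the column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (p_code TEXT PRIMARY KEY, p_qoh INTEGER NOT NULL)")
conn.execute("INSERT INTO product VALUES ('P-100', 25)")
conn.commit()


def quantity_on_hand(connection, code):
    """Return the current quantity on hand for one product."""
    return connection.execute(
        "SELECT p_qoh FROM product WHERE p_code = ?", (code,)
    ).fetchone()[0]


# A sale of 2 units: the UPDATE starts a transaction, COMMIT makes it permanent.
conn.execute("UPDATE product SET p_qoh = p_qoh - 2 WHERE p_code = 'P-100'")
open_before_commit = conn.in_transaction
conn.commit()
open_after_commit = conn.in_transaction
qoh_after_commit = quantity_on_hand(conn, "P-100")

# A second sale that is abandoned: ROLLBACK restores the previous consistent state.
conn.execute("UPDATE product SET p_qoh = p_qoh - 5 WHERE p_code = 'P-100'")
conn.rollback()
qoh_after_rollback = quantity_on_hand(conn, "P-100")
```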
