
DBMS Unit I

The document discusses the differences between data and information, and describes some of the limitations of a manual database system and file processing system for storing and organizing data. It notes that a database is a collection of related information organized for easy access, retrieval, and analysis. A database management system (DBMS) helps address the problems with manual and file-based systems by allowing storage of data in one place with defined relationships and providing tools for consistent, flexible access and administration of the database.

Uploaded by

Anubhav Gupta
Copyright
© Attribution Non-Commercial (BY-NC)

DBMS/ Unit I

Data & Information


Data is the name given to basic facts and entities such as names and numbers. "Data" is the plural of "datum", which means a single piece of information. Examples are weights, prices, costs, numbers of items sold, employee names, addresses and marks.

Information is data that has been converted into a more useful or intelligible form. It is a set of data that has been organized for direct use, because information helps human beings in their decision-making process. Examples are a time table, merit list, report card, printed document, pay slip, receipt or report. Information is obtained by assembling items of data into a meaningful form.

Comparison between data and information:

Criteria | Data | Information
Definition | Data is raw facts and figures. | Information is the processed form of data.
Example | 23 is data. | Age 23 is information.
Atomicity | Data holds atomic-level pieces of information. | Information is a collection of data, e.g. "Age" and "23" collected together form information.
Significance in business | Data is not significant to a business in and of itself. | Information is significant to a business in and of itself, e.g. 23 alone is insignificant, but age 23 is significant for a business.
Vital in decision making | Data does not help in decision making. | Information helps in decision making.
Knowledge base | E.g. in the healthcare industry, much activity surrounds data collection. Nurses collect data every day, such as weights and vital signs. | Information provides answers to questions that guide clinicians to change their practices.

Figure: Data (Input) -> Process -> Information (Output)

Field, Records & Files

Field is the smallest unit of data (represented as a value in the database), e.g. the value of a field like Employee Name, Address or Date of Birth.
Record is a collection of logically related data / fields.
File is a collection or sequence of records.

Emp No | Emp Name | DOB | City
E001 | Smith | 23-08-1978 | Berlin
E002 | Mathew | 12-03-1977 | Moscow
E003 | Martin | 12-12-1965 | Paris

Each row of the table is one record; each column holds one data field.
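The field / record / file hierarchy can be sketched in a few lines of Python. This is only an illustration: storing the employee table as CSV text, and the helper name `load_records`, are assumptions made for the example, not something the notes prescribe.

```python
import csv
import io

# The employee table from the text, written out as a small CSV "file".
FILE_CONTENTS = """Emp No,Emp Name,DOB,City
E001,Smith,23-08-1978,Berlin
E002,Mathew,12-03-1977,Moscow
E003,Martin,12-12-1965,Paris
"""

def load_records(text):
    """Read the file: each row becomes one record (a dict of field name -> value)."""
    return list(csv.DictReader(io.StringIO(text)))

records = load_records(FILE_CONTENTS)   # the file: a sequence of records
first_record = records[0]               # one record: logically related fields
city_field = first_record["City"]       # one field: the smallest unit of data

print(len(records), "records")
print(first_record)
print(city_field)
```

Reading top down, the file contains records and each record contains fields, which mirrors the definitions above.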

Prepared by Sushila Gupta

Database
A database is an organized collection of related information. Organizing the data is necessary because unorganized information has no meaning. A database is designed, built and populated with data for a specific purpose.
Examples of databases: address book, dictionary, telephone directory, student record register etc. In each of these the data is stored in some particular order, i.e. in an organized form. In a dictionary the words are placed in alphabetical order, so that one word can easily be found among, say, 10,000 words.

A database can be defined as:
- A collection of interrelated data stored together without unnecessary redundancy.
- A store that serves multiple applications, in which each user has his own view of the data.
- Data that is protected from unauthorized access by a security mechanism, with concurrent access to the data provided.
- Structured data that can be used for adding new data, modifying existing data and performing other operations.

Database Types
- Flat-file
- Hierarchical
- Network
- Relational
- Object-oriented
- Object-relational

Operations performed with a database:
- Viewing the stored information (e.g. looking up the address of some person in an address book).
- Adding new information (e.g. adding a new person's address).
- Modifying existing information.
- Deleting unwanted information.
- Arranging information in a particular order (say alphabetically, or in ascending order of marks).

A database consists of the following 4 components:
- Data Item
- Relationship
- Constraints
- Schema

Figure: the four components (Data Items, Relationships, Constraints, Schema) and the Physical Database.
Data item: a distinct piece of information.
Relationship: represents a correspondence / correlation between various data elements.
Constraints: rules that define the correct database states.
Schema: describes the organization of the data and the relationships within the database.
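As a rough sketch, the four components and the basic operations listed earlier can be seen together in a small relational example. It uses Python's built-in `sqlite3` module; the `person` and `address` tables and all values are hypothetical, chosen only to echo the address-book example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Schema: describes how the data items are organized and related.
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL              -- data item
);
CREATE TABLE address (
    address_id INTEGER PRIMARY KEY,
    person_id  INTEGER NOT NULL REFERENCES person(person_id),  -- relationship
    city       TEXT NOT NULL CHECK (length(city) > 0)          -- constraint
);
""")

# Add new information.
conn.execute("INSERT INTO person VALUES (1, 'Smith')")
conn.execute("INSERT INTO person VALUES (2, 'Martin')")
conn.execute("INSERT INTO address VALUES (1, 1, 'Berlin')")
conn.execute("INSERT INTO address VALUES (2, 2, 'Paris')")

# Modify existing information.
conn.execute("UPDATE address SET city = 'Moscow' WHERE person_id = 1")

# Delete unwanted information.
conn.execute("DELETE FROM address WHERE person_id = 2")
conn.execute("DELETE FROM person WHERE person_id = 2")

# View the stored information, arranged alphabetically by name.
rows = conn.execute("""
    SELECT p.name, a.city
    FROM person p JOIN address a ON a.person_id = p.person_id
    ORDER BY p.name
""").fetchall()
print(rows)

# A constraint rejects a state the rules do not allow: an address for a missing person.
try:
    conn.execute("INSERT INTO address VALUES (3, 99, 'Rome')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("invalid row rejected:", rejected)
```

The final insert fails because no person with id 99 exists, which is the sense in which constraints define the correct database states.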

1.3 Manual Database & Its Problems


A manual database is a record keeping system in which human beings manage the whole database without the support of computers.

Problems with a manual database:
1. Adding new information: for example, when the page for, say, M is full in an address book, one has to buy a new, larger address book and copy the entire data from the previous one, or add some blank pages at the end of the address book, which disturbs the (alphabetical) order.
2. Calculations such as gross salary from BASIC, DA, HRA etc. using formulae, and calculating income tax and insurance at the end of every month, are tedious and error prone.
3. Maintaining a list of subscribers for a magazine, renewing them, or searching for one name in a list of 10,000 names is slow.
4. Changes of address and other updates for the shareholders of a limited company, and a number of other such requests, must be handled by hand, along with calculations such as dividend amounts.
5. The database becomes large in size and difficult to manage.

Database & Computers
Computerized databases are better suited than manual databases because a computer offers:
- Large storage capacity, to store thousands of records.
- High speed for searching records.
- Arranging records in a particular order (numbers ascending, names alphabetical).
- Calculations on the data.
- Accuracy of the data and the calculations.

There are 2 approaches for storing data in computers:
1. Traditional File Processing System
2. Database Approach

1. Traditional File Processing System
The file processing system was an attempt to computerize the manual filing system. A file system is a method for storing and organizing computer files and the data they contain so that they are easy to find and access. A file system may use a storage device such as a hard disk or CD-ROM and involves the physical location of files. The manual filing system works well when the number of items to be stored is small. It breaks down when there is a large amount of data and the information in the files has to be cross-referenced or processed.
The file-based system was developed in response to the needs of industry for more efficient data access.

Characteristics of File Processing System
- It is a group of files for storing the data of an organization.
- Each file is independent of the others.
- Each file is called a flat file.
- Each file contains and processes information for one specific purpose, i.e. accounts, inventory etc.
- Files are designed by using programs written in programming languages like COBOL, C, C++.
- For more complex systems, a file processing system offers little flexibility and has many limitations, so it is not suited to complex systems.

Figure: a file processing system. User 1 and User 2 each work through their own file definition & handling routines, and each department keeps its own file:

Office: Roll No, Name, Address, Class, Percentage
Accounts: Roll No, Name, Address, Course, Fees
Hostel: Roll No, Name, Address, Room-Type, Rent
Library: Roll No, Name, Address

Limitations / Disadvantages of File Processing System

1. Separated & Isolated Data: To make a decision, a user might need data from separate files. First the files had to be evaluated by analysts and programmers to determine the specific data required from each file and the relationships between the data; only then could applications be written in a programming language to process and extract the needed data.

2. Duplication of Data: Once the data is stored in separate files, there is always a possibility of duplication of data, which has several disadvantages:
a. Duplication is wasteful. It costs time and money to enter the data more than once.
b. It takes additional storage space, again with associated costs.
c. Duplication leads to loss of data integrity and to inconsistency. Consider a student database with separate copies of the files required at Office, Accounts, Hostel and Library. A change made at Office is not shown at Accounts, so the data for the same student differs between Office and Accounts, which is wrong.

3. No Standard Maintained / Incompatible File Formats: The structure of a file is embedded in the application programs and so depends on the application programming language. Users also have different preferences: for example, someone who is comfortable in MS-Excel will use that, whereas accounts people will be happy to work in TALLY. The use of different software leads to inconsistency, and no inference can be drawn across the files. The direct incompatibility of such files makes them difficult to process jointly.

4. Difficulty in Representing Data from the User's View: To create useful applications for the user, data from various files must often be combined. In file processing it is difficult to determine the relationships between isolated data in order to meet user requirements.

5. Data Inflexibility: Program-data interdependency and data isolation lead to inflexibility in answering ad-hoc information requests.

6. Poor Data Control: Because a file processing system is decentralized in nature, it is very common for the same data field to have multiple names, defined by the various departments of an organization and depending on the file.

7. Limited or No Data Sharing: Each application has its own private files, and users have little or no opportunity to share data outside their own application.

8. Inadequate Data Manipulation Capabilities: Calculations based on the data are difficult, and users find them hard to implement.

9. Excessive Programming Effort: Each new application essentially requires programmers to start from scratch, designing new file formats and descriptions and then writing the file access logic for each new program.

10. Security Problems: Each user should be allowed to access only the data concerning his own area of application. Since application programs were added to the file-oriented system in an ad-hoc manner, it was difficult to enforce such a security system.
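The duplication problem (limitation 2) can be made concrete with a toy model. Here two departments each keep a private copy of the same student; the dictionaries, the roll number 101 and the helper names are all hypothetical stand-ins for separate flat files.

```python
# Each department keeps its own private copy of the student data (hypothetical values).
office_file = {101: {"name": "Rohit", "address": "Rohini"}}
accounts_file = {101: {"name": "Rohit", "address": "Rohini", "fees": 5000}}

def update_address(department_file, roll_no, new_address):
    """Update one department's private file; no other file hears about it."""
    department_file[roll_no]["address"] = new_address

def find_mismatches(file_a, file_b, field):
    """Return roll numbers whose value for `field` differs between two files."""
    return [roll for roll in file_a
            if roll in file_b and file_a[roll][field] != file_b[roll][field]]

# The student reports a new address to the Office only.
update_address(office_file, 101, "Pitampura")

mismatches = find_mismatches(office_file, accounts_file, "address")
print(mismatches)
```

After the update, the two files disagree about the same student, and nothing in a file-based system detects or repairs this on its own.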

1.6 Database Approach


The database approach was needed in order to remove the anomalies / limitations of the file-based approach. Unlike the file-oriented system, with its many separate and unrelated files, the database system consists of logically related data stored in a single logical repository. The database approach represents a change in the way end-user data is stored, accessed and managed. It emphasizes the integration and sharing of data throughout the organization. The database is a single, large repository of data which can be used simultaneously by many departments and users. Instead of disconnected files with redundant data, all data items are integrated with a minimum amount of duplication. The database is no longer owned by one department but is a shared corporate resource. The database holds not only the organizational data but also a description of that data (the data dictionary). The data dictionary / metadata gives the database its self-describing nature, which provides program-data independence.
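The self-describing nature of a database can be observed directly in a real system. The sketch below uses SQLite through Python's `sqlite3` module as one example of a DBMS catalog; the `student` and `fees` tables are hypothetical, and other DBMS products expose their data dictionary through different catalog tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, address TEXT)")
conn.execute("CREATE TABLE fees (roll_no INTEGER, course TEXT, amount INTEGER)")

# SQLite keeps its catalog in the built-in sqlite_master table.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, dflt_value, pk).
student_columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]

print(tables)
print(student_columns)
```

Because a program can ask the database for its own structure, the structure does not have to be hard-coded into every application, which is the basis of program-data independence.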

Figure: A Database System. Office users and Accounts users work through their own application programs (data entry & reports). All application programs reach a single physical database, holding the Office, Accounts, Hostel and Library data, through the Database Management System.

DBMS
A Database Management System (DBMS) is a set of computer programs that allows users to define, create and maintain a database and provides controlled access to the data. It provides facilities for controlling data access, enforcing data integrity, managing concurrency control, and restoring the database. The DBMS is an intermediate layer between the programs and the data: a program accesses the DBMS, which then accesses the data. There are different types of DBMS, ranging from small systems that run on personal computers to huge systems that run on mainframes. Examples of applications:
- Computerized library system
- Automated teller machines
- Flight reservation system
- Computerized inventory system

A DBMS is a piece of software that provides services for accessing a database while maintaining all the required features of the data. Commercially available database systems in the market are: DBASE, FOXPRO, IMS, ACCESS, Oracle, Sybase, MySQL etc.

Advantages of DBMS
A Database Management System (DBMS) is a software package that allows data to be effectively stored, retrieved and manipulated. The data stored in a DBMS package (such as SQL Server, Oracle or MS-Access) can be accessed by multiple users and by multiple application programs. The DBMS is preferred over the conventional file processing system due to the following advantages:

1. Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
- Duplication of the same data in different files.
- Wastage of storage space, since duplicated data is stored.
- Errors, due to updating the same data in different files.
- Time wasted in entering data again and again.
- Needless use of computer resources.
- Difficulty in combining information.
Example: in a college database there may be a number of applications, like Office, Accounts, Library and Hostel, that have private files:

Office: Roll No, Name, Class, Father Name, DOB, Address, Phone No, Previous Record, Attendance, Marks, etc.
Library: Roll No, Name, Class, DOB, Address, Phone No, Books_Issued, Fine, etc.
Hostel: Roll No, Name, Class, Father Name, DOB, Address, Phone No, Room No, Mess Bill, etc.
Accounts: Roll No, Name, Class, Address, Phone No, Fee, Installments, Discount, Balance, Total, etc.

In this case some common student data, like Roll No, Name, Class, Phone No and Address, is repeated in every file, which leads to redundancy, and if some data changes in one section, like Office, the change is not reflected in the other sections. E.g. student Rohit changes his phone number and informs the Office; the change is not reflected in Accounts.

2. Elimination of Inconsistency - When data is duplicated and a change made at one site is not propagated to the other sites, the data becomes inconsistent, as the two copies no longer agree. So we need to remove this duplication of data in multiple files to eliminate inconsistency. For example, in Office the student with Roll No = 5 lives at Rohini, but in Library the same person is shown as living at, say, Pitampura. This is a state where entries for the same object do not agree with each other (one is updated and the other is not). At such a time the database is said to be inconsistent. On centralizing the database, the duplication is controlled and hence the inconsistency is removed.
Data inconsistency is often encountered in everyday life. Consider another example: we have all come across situations where a new address is communicated to an organization that we deal with (e.g. a telecom company, gas company or bank). We find that some of the communications from that organization are received at the new address while others continue to be mailed to the old address.
Let us again consider the example of the result system. Suppose that a student with Roll No 201 changes his course from 'Computer' to 'Arts'. The change is made in the SUBJECT file but not in the RESULT file. This leads to inconsistency of the data. So we need to centralize the database so

that changes once made are reflected in all the tables where a particular field is stored. The update is thus carried out automatically; this is known as propagating updates.

3. Data Can Be Shared - Data sharing is one of the good aspects of a DBMS: different users access the same database.

4. Better Service to the Users - Centralizing the data in the database also means that users can easily obtain new and combined information that would have been impossible to obtain otherwise. A DBMS should also allow users who do not know programming to interact with the data more easily, unlike a file processing system, where a programmer may need to write new programs to meet every new demand.

5. Flexibility of the System Is Improved - Changes are often necessary to the contents of the data stored in any system, and these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.

6. Integrity Can Be Improved - Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints. Integrity ensures that the data in the database is always accurate, so that no incorrect information can be stored in the database. A DBMS should provide capabilities for defining and enforcing constraints. For example, in the result system already discussed, where multiple files have to be maintained, a value may be entered for a course that does not exist. Suppose course can have the values (Computer, Accounts, Economics, Arts) but we enter the value 'Hindi' for it. This leads to inconsistent data, and so to a lack of integrity.
Even a centralized database may still contain incorrect data. For example:
- The salary of a full-time employee may be entered as Rs. 500 rather than Rs. 5000.
- A student may be shown to have borrowed books but have no enrollment.
- A list of employee numbers for a given department may include a number of non-existent employees.

7. Standards Can Be Enforced - Since all access to the database must go through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.

8. Security Can Be Improved - In conventional systems, applications are developed in an ad-hoc / temporary manner. Often different systems of an organization access different components of the operational data, and in such an environment enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to what parts of the database. Different checks can be established for each type of access (retrieve, modify, delete etc.) to each piece of information in the database.
Consider the example of a bank in which employees at different levels may be given access to different types of data in the database. A clerk may be given the authority to know only the names of all the customers who have a loan with the bank, but not the details of each loan a customer may have. This can be accomplished by giving the appropriate privileges to each employee.

9. Organization's Requirements Can Be Identified - All organizations have sections and departments, and each of these units often considers the work of its unit the most important and therefore considers its needs the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. So it may become necessary to ignore some requests for information if they conflict with higher-priority needs of the organization.

It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization. For example, a DBA must choose the best file structure and access method to give a fast response for highly critical applications as compared to less critical applications.

10. Overall Cost of Developing and Maintaining Systems Is Lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and developing and maintaining application programs to be far lower than for a similar service using conventional systems, since the productivity of programmers can be higher using the non-procedural languages that have been developed with DBMS than using procedural languages.

11. Data Model Must Be Developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems it is more likely that files will be designed according to the needs of particular applications, and the overall view is often not considered. Building an overall view of an organization's data is usually cost effective in the long term.

12. Provides Backup and Recovery - Centralizing a database makes it possible to provide schemes for backup and for recovery from failures, including disk crashes, power failures and software errors. These help the database to recover from an inconsistent state to the state that existed prior to the occurrence of the failure, though the methods are very complex.

The disadvantages are as follows:
- Complexity: database developers, the DBA and end users must understand the functionality of the DBMS.
- Size: a DBMS is a large piece of software, occupying many megabytes of disk space and needing substantial memory to run.
- High cost of software and hardware: the initial cost plus the recurring annual maintenance cost.
- Technical expertise is required.
- Power dependency.
- Overhead for providing security, concurrency, recovery and integrity functions.
- Cost of conversion from a manual / file system to a DBMS.
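Advantage 6 (integrity) and advantage 2 (propagating updates) can be sketched with declared constraints. The example assumes SQLite via Python's `sqlite3` module; the table names `student` and `book_issue` and the helper `try_execute` are hypothetical, while the list of allowed courses is the one given in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE student (
    roll_no INTEGER PRIMARY KEY,
    course  TEXT NOT NULL
            CHECK (course IN ('Computer', 'Accounts', 'Economics', 'Arts'))
);
CREATE TABLE book_issue (
    issue_id INTEGER PRIMARY KEY,
    roll_no  INTEGER NOT NULL REFERENCES student(roll_no)
);
""")

def try_execute(sql, params=()):
    """Run one statement; return True if accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

ok_valid = try_execute("INSERT INTO student VALUES (?, ?)", (201, "Computer"))
ok_hindi = try_execute("INSERT INTO student VALUES (?, ?)", (202, "Hindi"))
ok_orphan = try_execute("INSERT INTO book_issue VALUES (?, ?)", (1, 999))

# Propagating updates: the course is stored once, so one UPDATE changes it everywhere.
ok_change = try_execute("UPDATE student SET course = ? WHERE roll_no = ?", ("Arts", 201))
course_now = conn.execute("SELECT course FROM student WHERE roll_no = 201").fetchone()[0]

print(ok_valid, ok_hindi, ok_orphan, ok_change, course_now)
```

The course 'Hindi' and the book issued to an unenrolled student are both rejected. A plausible but wrong value, such as a salary of 500 instead of 5000, would still pass unless a rule for it were declared, which matches the caveat in item 6.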

DBMS Vs File Management System
When a computer user wants to store data electronically, they must do so by placing the data in files. Files are stored in specific locations (directories) on the hard disk. The user can create new files to place data in, delete a file that contains data, rename a file, and so on. These tasks are known as file management, a function provided by the Operating System (OS).

Criteria | File System | DBMS
Size of system | Small system | Large system
Cost | Cheap | Expensive
No. of files | Few files | Many files
Type of data | Files | Tables
Structural complexity | Simple | Complex
Redundancy | Redundant data | Reduced redundancy
Inconsistency | Inconsistent | Consistent
Data isolation | Isolation | Data can be shared
Integrity checks | Programmer has few options for integrity | Vast options and rigorous integrity
Security | No security | Rigorous security
Backup / recovery | Simple, primitive backup / recovery | Complex and sophisticated backup / recovery
No. of users accessing application | Often single user | Multiple users

1.9 Schema, Sub-Schema & Instances


The layout plan of a database is known as its schema. The schema gives the names of the entities and attributes and specifies the relationships between them. It specifies the framework into which the values of the data items / fields are fitted. The schema includes the definition of the database name, the record types and the components that make up those records. The schema remains the same while the values filled into it change from instant to instant.

Example: the schema diagram of M/s ABC Company, with the 3 tables PRODUCT, CUSTOMER and SALES, is the schema of the database.

PRODUCT: PROD-ID, PROD-DESC, UNIT-COST
CUSTOMER: CUST-ID, CUST-NAME, CUST-STREET, CUST-CITY
SALES: CUST-ID, PROD-ID, PROD-QTY, PROD-PRICE

Sub-Schema
A sub-schema is a subset of the schema: the portion of the database seen by an application program, giving it the desired information from the database. A sub-schema refers to a user's view of the data item types and record types which he or she uses. It gives the user a window through which he or she can view only that part of the database which is of interest.

The example above has 3 sub-schemas, corresponding to the 3 table structures:

Sub-schema of PRODUCT table: PROD-ID, PROD-DESC
Sub-schema of CUSTOMER table: CUST-ID, CUST-NAME, CUST-CITY
Sub-schema of SALES table: CUST-ID, PROD-ID, PROD-QTY

Different application programs can have different views of the data because they use different sub-schemas. An individual application program can change its own sub-schema without affecting the sub-schema views of others. The DBMS software derives the sub-schema data requested by the application programs from the schema data. The DBA (Database Administrator) ensures that the sub-schema requested by an application program is derivable from the schema.
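One common way to realize a sub-schema in a relational DBMS is a view. The sketch below assumes SQLite via Python's `sqlite3` module and rewrites the hyphenated names of the example (PROD-ID and so on) with underscores so that they are valid SQL identifiers; the view names and the helper `column_names` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Schema: the three tables of the ABC Company example.
CREATE TABLE product  (prod_id TEXT PRIMARY KEY, prod_desc TEXT, unit_cost INTEGER);
CREATE TABLE customer (cust_id TEXT PRIMARY KEY, cust_name TEXT,
                       cust_street TEXT, cust_city TEXT);
CREATE TABLE sales    (cust_id TEXT, prod_id TEXT, prod_qty INTEGER, prod_price INTEGER);

-- Sub-schemas: each view exposes only the columns one application needs.
CREATE VIEW product_view  AS SELECT prod_id, prod_desc FROM product;
CREATE VIEW customer_view AS SELECT cust_id, cust_name, cust_city FROM customer;
CREATE VIEW sales_view    AS SELECT cust_id, prod_id, prod_qty FROM sales;
""")

def column_names(relation):
    """List the column names of a table or view."""
    cursor = conn.execute(f"SELECT * FROM {relation}")
    return [col[0] for col in cursor.description]

print(column_names("product"))
print(column_names("product_view"))
```

An application that reads only `product_view` never sees UNIT-COST, and the view is derived from the schema by the DBMS rather than stored separately.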

Instances


When the schema framework is filled with data items, the contents of the database at any point of time are referred to as an instance of the database. In other words, the collection of information stored in the database at a particular moment is called an instance of the database. It is also called the state of the database, or a snapshot.

Example of a database instance of M/s ABC Company (PRODUCT table):

PROD-ID | PROD-DESC | UNIT-COST
A12141 | Almirah | 4000
B14147 | Dryer | 1500
A12123 | Freeze | 8500
T11092 | Table | 800
CH0014 | Chair | 1200

Remark:
Database Schema - refers to the database definition (structure of the database).
Database Sub-Schema - refers to a subset of the database definition (structure of a part of the database).
Database Instance - the logical records in the database at any point of time.
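The difference between schema and instance can be checked with a short experiment: the structure stays fixed while the stored rows change. This is a sketch using SQLite via Python's `sqlite3` module; the rows come from the PRODUCT example, and the helper names `schema` and `instance` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (prod_id TEXT PRIMARY KEY, prod_desc TEXT, unit_cost INTEGER)")

def schema():
    """Column names of PRODUCT: the definition, which does not change with the data."""
    return [row[1] for row in conn.execute("PRAGMA table_info(product)")]

def instance():
    """All rows currently stored: the state (snapshot) of the database right now."""
    return conn.execute("SELECT * FROM product ORDER BY prod_id").fetchall()

conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                 [("A12141", "Almirah", 4000), ("B14147", "Dryer", 1500)])
schema_before, instance_before = schema(), instance()

conn.execute("INSERT INTO product VALUES (?, ?, ?)", ("T11092", "Table", 800))
schema_after, instance_after = schema(), instance()

print(schema_before == schema_after)
print(len(instance_before), len(instance_after))
```

The two snapshots differ by one row, yet the schema read before and after the insert is identical.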
1.3 Functions of a DBMS
The functions performed by a typical DBMS are the following:

Data Definition
The DBMS provides functions to define the structure of the data in the application. These include defining and modifying the record structure, the type and size of fields, and the various constraints / conditions to be satisfied by the data in each field.

Data Manipulation
Once the data structure is defined, data needs to be inserted, modified or deleted. The functions which perform these operations are also part of the DBMS. These functions can handle both planned and unplanned data manipulation needs. Planned queries are those which form part of the application. Unplanned queries are ad-hoc queries which are performed on a need basis.

Data Security & Integrity
The DBMS contains functions which handle the security and integrity of data in the application. These can easily be invoked by the application, so the application programmer need not code these functions in his / her programs.

Data Recovery & Concurrency
Recovery of data after a system failure and concurrent access to records by multiple users are also handled by the DBMS.

Data Dictionary Maintenance
Maintaining the data dictionary, which contains the data definition of the application, is also one of the functions of a DBMS.

Performance
Optimizing the performance of queries is one of the important functions of a DBMS. Hence the DBMS has a set of programs forming the Query Optimizer, which evaluates the different implementations of a query and chooses the best among them.

Thus the DBMS provides an environment that is both convenient and efficient to use when there is a large volume of data and many transactions to be processed.

1.4 Role of the Database Administrator (DBA)
The database administrator (DBA) is the person or group in charge of implementing the DBMS in an organization. The DBA's job requires a high degree of technical expertise and the ability to understand and interpret management requirements at a senior level. In practice the DBA may consist of a team of people rather than just one person. The DBA is like the super-user of the system. The role of the DBA is very important and is defined by the following functions.

Defining the conceptual schema and database creation: The DBA defines the schema, which contains the structure of the data in the application. The DBA determines what data needs to be present in the system and how this data has to be represented and organized.

Planning storage structures and access strategies: The DBA must also decide how the data is to be represented in the database and must specify the representation by writing the storage structure definition (using the internal data definition language). The associated mapping between the storage structure definition and the conceptual schema must also be specified.

Providing support to users: It is the responsibility of the DBA to provide support to the users, to ensure that the data they require is available, and to write the necessary external schemas. The mapping between any given external schema and the conceptual schema must also be specified. The DBA needs to interact continuously with the users to understand the data in the system and its use.

Defining security & integrity checks: The DBA finds out which access restrictions are needed and defines security checks so that no malicious user can access the database and it remains protected. Data integrity checks are also defined by the DBA.

Defining backup / recovery procedures: In case of damage to any portion of the database, caused by human error or by a failure in the hardware or the supporting operating system, it is essential to be able to repair the data concerned with a minimum of delay and with as little effect as possible on the rest of the system. The DBA therefore also defines procedures for backup and recovery. Defining backup procedures includes specifying what data is to be backed up, the periodicity of taking backups, and the medium and storage place for the backup data.

Monitoring performance and responding to changes in requirements: The DBA has to continuously monitor the performance of the queries and take measures to optimize all the queries in the application.
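One way a DBA might monitor a query and respond is to ask the DBMS how it plans to run the query, then add an index and compare. The sketch below assumes SQLite, whose `EXPLAIN QUERY PLAN` statement reports the chosen plan; the table, the index name `idx_student_name` and the helper `plan` are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany("INSERT INTO student (name, city) VALUES (?, ?)",
                 [(f"student{i}", "Delhi") for i in range(1000)])

QUERY = "SELECT roll_no, city FROM student WHERE name = ?"

def plan(sql, params):
    """Return the plan text SQLite reports for a query (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " ".join(row[-1] for row in rows)

# Without an index on name, the optimizer has to scan the whole table.
plan_before = plan(QUERY, ("student42",))

# The tuning step: add an index, then ask for the plan again.
conn.execute("CREATE INDEX idx_student_name ON student(name)")
plan_after = plan(QUERY, ("student42",))

print(plan_before)
print(plan_after)
```

The second plan mentions the new index, showing the query optimizer choosing a different implementation of the same query once a better access path exists.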

Three Level Architecture:


The American National Standards Institute (ANSI) Standards Planning and Requirements Committee (SPARC) recognised the need for a three-level architecture in 1975. The three levels or layers of the DBMS architecture are:
- External Level
- Conceptual Level
- Internal Level

Objectives of Three-Level Architecture


- All users should be able to access the same data.
- A user's view is immune to changes made in other views.
- Users should not need to know physical database storage details.
- The DBA should be able to change database storage structures without affecting the users' views.
- The internal structure of the database should be unaffected by changes to physical aspects of storage.
- The DBA should be able to change the conceptual structure of the database without affecting all users.

1. The Internal Level: The internal level has an internal schema, which describes the physical storage structure of the database. The internal schema uses a physical data model and describes the complete details of data storage and access paths for the database. The internal level is the one that concerns the way the data is physically stored on the hardware. It is concerned with such things as:
- Storage space allocation for data and indexes
- Record descriptions for storage (with stored sizes for data items)
- Record placement
- Data compression and data encryption techniques

2. Conceptual Level or Logical Level: The conceptual level has a conceptual schema, which describes the structure of the whole database for a community of users. This level describes what data is stored in the database and the relationships among the data. The conceptual schema hides the details of the physical storage structure and concentrates on describing entities, data types, relationships, user operations and constraints. For example, in the case of a student database, Rollno, Name, Class, Address etc. are attributes of the entity Student:

Student
Rollno | Name | Class | Address | TM | MO | %age
1234 | Nitesh | B.Tech | Rohini, Delhi | 500 | 382 | 74
1249 | Amit Kumar | BBA | Engineers Enclave | 500 | 410 | 82
2315 | Dinesh | MBA | 32, Lok Vihar, Delhi | 1600 | 1120 | 70

Each column heading is a field name; a column is also called a field or attribute.

The conceptual level also represents the constraints on the data, semantic information about the data, and security and integrity information. The conceptual level supports each external view, in that any data available to a user must be contained in, or derivable from, the conceptual level. However, this level must not contain any storage-dependent details.

3. External Level: The external level, also known as the view level, includes a number of external schemas or user views. Each external schema describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group. A view involves only those portions of a database which are of concern to a user, so the same database can have different views for different users. The external view insulates users from the details of the internal and conceptual levels. For example, one user may view dates in the form (day, month, year) while another may view dates as (year, month, day).

There is only one conceptual view, consisting of the abstract representation of the database in its entirety. Similarly, there is only one internal or physical view, representing the total database as it is physically stored. The three-schema architecture is a convenient tool with which the user can visualize the schema levels in a database system. Most DBMSs do not separate the three levels completely, but support the three-schema architecture to some extent.

Data Independence: The three-schema architecture can be used to further explain the concept of data independence, which can be defined as the capacity to change the schema at one level of a database system without having to change the schema at the next higher level. Specifically, data independence means that changes in storage structure and data access technique do NOT affect application programs. There are two types of data independence:

1. Logical Data Independence is the capacity to change the conceptual schema without having to change external schemas or application programs. The change is absorbed by the mapping between the external and conceptual levels. We may change the conceptual schema to expand the database (by adding a record type or data item), to change constraints, or to reduce the database (by removing a record type or data item).

2. Physical Data Independence is the capacity to change the physical storage structures or devices without affecting the conceptual schema. The change is absorbed by the mapping between the conceptual and internal levels. Physical data independence is achieved by the presence of the internal level of the database and the mapping or transformation from the conceptual level to the internal level. If the file organisation or the type of physical device has to change as a result of growth in the database or new technology, only the conceptual/internal mapping needs to be changed.

Logical data independence is more difficult to achieve than physical data independence, as it requires flexibility in the design of the database, and the designer has to foresee future requirements.
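The idea of data independence can be illustrated with a small sketch. The example below is only an illustration, assuming Python's built-in sqlite3 module and a hypothetical student table and student_names view (neither comes from these notes). The view plays the role of an external schema, adding a column stands in for a conceptual-level change, and adding an index stands in for an internal-level change. The "application program" reads only the view and should see the same result throughout.

```python
import sqlite3

# Hypothetical schema loosely based on the student example; all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT, class TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1234, "Nitesh", "B.Tech"), (1249, "Amit Kumar", "BBA")],
)

# External level: a view exposing only the part of the database one user group needs.
conn.execute("CREATE VIEW student_names AS SELECT rollno, name FROM student")


def read_view(connection):
    """The 'application program': it only ever queries the external view."""
    return connection.execute(
        "SELECT rollno, name FROM student_names ORDER BY rollno"
    ).fetchall()


before = read_view(conn)

# Logical data independence: expand the conceptual schema with a new data item.
conn.execute("ALTER TABLE student ADD COLUMN address TEXT")
after_logical_change = read_view(conn)

# Physical data independence: add an access path (an index) at the internal level.
conn.execute("CREATE INDEX idx_student_name ON student (name)")
after_physical_change = read_view(conn)

print(before == after_logical_change == after_physical_change)
```

In this sketch neither change requires the read_view function to be modified, which is the behaviour the two kinds of data independence are meant to guarantee.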


End Users:
End users are the people who interact with the application directly. They insert, delete and update data in the database, and they get information from the system as and when required. The different types of end users are as follows:

(a) Casual end users- These users access the database only occasionally, but they may need different information each time. They use a sophisticated database query language to specify their requests. Examples are middle or high level managers.

(b) Naive or parametric end users- Naive users are users who do not have any technical knowledge about the DBMS. They make up a sizeable portion of database end users. They use the database through application programs with a simple user interface, and perform all operations by using the simple commands that the interface provides. Example: the data entry operator in an office is responsible for entering records in the database. He performs this task by using menus, buttons etc. He does not need to know anything about the database or DBMS, because he interacts with the database through the application program. Similarly, an ATM user is guided through each step of a transaction and can check account balances, make withdrawals, etc.

(c) Online end users- These users may communicate with the database via an online terminal. They are aware of the presence of the database system and may require a certain amount of expertise in their interaction with the database.

(d) Sophisticated end users- Sophisticated users are familiar with the structure of the database and the facilities of the DBMS. These users include engineers, scientists, and business analysts. Such users can use a query language such as SQL to perform the required operations on databases. Some sophisticated users can also write application programs.

(e) Standalone end users- These users maintain personal databases by using ready-made program packages that provide easy-to-use menu-based or graphics-based interfaces. An example is the user of a tax package that stores a variety of personal financial data for tax purposes.

(f) Application programmers and system analysts- The application programmer is the person responsible for implementing the required functionality of the database for the end user. The application programmer works according to the specification provided by the system analyst. The system analyst determines the requirements of end users, especially parametric end users.

(g) Database Administrator- The database administrator is responsible for managing the whole database system. He designs, creates and maintains the database. He manages the users who can access the database, and controls integrity issues. He also monitors the performance of the system and makes changes in the system as and when required.

DBMS Interfaces
User-friendly interfaces provided by a DBMS may include the following.

Menu-Based Interfaces for Web Clients or Browsing: These interfaces present the user with lists of options, called menus, that lead the user through the formulation of a request. Menus do away with the need to memorize the specific commands and syntax of a query language. Pull-down menus are a very popular technique in Web-based user interfaces. They are also often used in browsing interfaces, which allow a user to look through the contents of a database.

Forms-Based Interfaces: A forms-based interface displays a form to each user. Forms are usually designed and programmed for naive users as interfaces to canned transactions. Some systems have utilities that define a form by letting the end user interactively construct a sample form on the screen.

Graphical User Interfaces: A graphical user interface (GUI) typically displays a schema to the user in diagrammatic form. The user can then specify a query by manipulating the diagram. In many cases, GUIs utilize both menus and forms. Most GUIs use a pointing device, such as a mouse, to pick certain parts of the displayed schema diagram.

Natural Language Interfaces: These interfaces accept requests written in English, Hindi or some other language and attempt to "understand" them. A natural language interface usually has its own "schema," which is similar to the database conceptual schema, as well as a dictionary of important words.

Interfaces for Parametric Users: Parametric users, such as bank tellers, often have a small set of operations that they must perform repeatedly. Systems analysts and programmers design and implement a special interface for each known class of naive users. Usually, a small set of abbreviated commands is included, with the goal of minimizing the number of keystrokes required for each request. For example, function keys in a terminal can be programmed to initiate the various commands.

Interfaces for the DBA: Most database systems contain privileged commands that can be used only by the DBA's staff. These include commands for creating accounts, setting system parameters, granting account authorization, changing a schema, and reorganizing the storage structures of a database.

When not to use a DBMS
Main costs of using a DBMS:
- High initial investment in hardware, software and training, and the possible need for additional hardware.
- Overhead for providing generality, security, recovery, integrity, and concurrency control.
- The generality that a DBMS provides for defining and processing data.

When a DBMS may be unnecessary:
- If the database and applications are simple, well defined, and not expected to change.
- If there are stringent real-time requirements that may not be met because of DBMS overhead.
- If access to data by multiple users is not required.

Database languages
Once the design of the database is completed and a DBMS is chosen to implement the database, the next step is to define the conceptual and internal schemas for the data and any mappings between the two. The languages that are used to do so are:


1. Data Definition Language (DDL) statements: DDL statements are used to define, alter or drop database objects. The following table gives an overview of the usage of DDL statements in ORACLE.

S. No.  Need and Usage          The SQL statement
1.      Create schema objects   CREATE
2.      Alter schema objects    ALTER
3.      Delete schema objects   DROP
4.      Rename schema objects   RENAME

2. Data Manipulation Language (DML) statements: Once the tables have been created, the DML statements enable users to query or manipulate data in existing schema objects. DML statements are normally the most frequently used commands. The following table gives an overview of the usage of DML statements in ORACLE.

S. No.  Need and Usage                                            The SQL statement
1.      Remove rows from tables or views                          DELETE
2.      Add new rows of data into a table or view                 INSERT
3.      Retrieve data from one or more tables                     SELECT
4.      Change column values in existing rows of a table or view  UPDATE
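The DDL and DML statements listed above can be tried out in a small sketch. The example below is an illustration only: it assumes Python's built-in sqlite3 module rather than ORACLE, and the table and column names are hypothetical. One visible difference is that SQLite has no standalone RENAME statement and renames a table with ALTER TABLE ... RENAME TO.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define, alter and rename a schema object.
conn.execute("CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN class TEXT")
conn.execute("ALTER TABLE student RENAME TO students")  # SQLite's form of RENAME

# DML: add, change, remove and retrieve rows.
conn.execute("INSERT INTO students VALUES (1234, 'Nitesh', 'B.Tech')")
conn.execute("INSERT INTO students VALUES (1249, 'Amit Kumar', 'BBA')")
conn.execute("UPDATE students SET class = 'MBA' WHERE rollno = 1249")
conn.execute("DELETE FROM students WHERE rollno = 1234")
rows = conn.execute("SELECT rollno, name, class FROM students").fetchall()
print(rows)

# DDL again: drop the object, then confirm that no user tables remain.
conn.execute("DROP TABLE students")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(tables_left)
```

The DDL statements change the structure of the database, while the DML statements only change or read the rows stored inside that structure.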

3. Data Control Language (DCL) statements: A privilege can be granted to a user with the help of the GRANT statement. The privileges assigned can be SELECT, ALTER, DELETE, EXECUTE, INSERT, INDEX etc. In addition to granting privileges, you can also revoke them by using the REVOKE command. The following table gives an overview of the usage of DCL statements in ORACLE.

S. No.  Need and Usage                             The SQL statement
1.      Grant and take away privileges and roles   GRANT, REVOKE
2.      Add a comment to the data dictionary       COMMENT

4. Transaction Control Language (TCL) statements: TCL statements are used to manage the changes made by DML statements. They allow statements to be grouped together into logical transactions. The following table gives an overview of the usage of TCL statements in ORACLE.

S. No.  Need and Usage                                                       The SQL statement
1.      Save the work done                                                   COMMIT
2.      Identify a point in a transaction to which you can later roll back   SAVEPOINT
3.      Restore the database to its state as of the last commit              ROLLBACK
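The effect of the three TCL statements can be seen in a short sketch. It is only an illustration, assuming Python's built-in sqlite3 module instead of ORACLE; the account table and the savepoint name are hypothetical. In SQLite the partial rollback is written ROLLBACK TO SAVEPOINT.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# transaction boundaries below are exactly the statements issued here.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")


def balance(connection):
    """Return the current balance of account 1."""
    return connection.execute(
        "SELECT balance FROM account WHERE id = 1"
    ).fetchall()[0][0]


conn.execute("BEGIN")
conn.execute("INSERT INTO account VALUES (1, 500)")
conn.execute("COMMIT")  # save the work done so far

conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 400 WHERE id = 1")
conn.execute("SAVEPOINT after_withdrawal")  # a point to roll back to later
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.execute("ROLLBACK TO SAVEPOINT after_withdrawal")  # undo only the second update
balance_after_partial_rollback = balance(conn)

conn.execute("ROLLBACK")  # undo everything since the last COMMIT
balance_after_full_rollback = balance(conn)

print(balance_after_partial_rollback, balance_after_full_rollback)
```

Rolling back to the savepoint keeps the first update (balance 400), while the final ROLLBACK returns the account to the committed balance of 500.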

Examples of Database: Address book, dictionary, telephone directory etc. In a dictionary, words are placed in alphabetical order so that a word can be found easily among, say, 10,000 words.


The various database models


A data model is an integrated collection of concepts for describing and manipulating data, relationships between data, and constraints on the data in an organization. It provides a clearer and more accurate description and representation of data. The purpose of a data model is to represent data and to make the data understandable. There have been many data models, but they fall into three broad categories:
- Object-based models (conceptual schema)
- Record-based models (external schema)
- Physical data models (internal schema)

Importance of Data models


- Data models are representations, usually graphical, of complex real-world data structures.
- They facilitate interaction among the designer, the applications programmer and the end user.
- End users have different views of and needs for data.
- A data model organizes data for various users.

1. Record-based data model


A record-based data model is used to specify the overall logical structure of the database. In this model the database consists of a number of fixed-format records of different types. Each record type defines a fixed number of fields, each having a fixed length. There are three principal types of record-based data model:
I. Hierarchical data model
II. Network data model
III. Relational data model

1. Hierarchical Model
The hierarchical data model is one of the oldest database models. It organizes data in a tree structure, with a hierarchy of parent and child data segments. This structure implies that a record can have repeating information, generally in the child data segments. Hierarchical DBMSs were popular from the late 1960s, with the introduction of IBM's Information Management System (IMS) DBMS, through the 1970s.
- The logical structure is represented as an upside-down tree.
- The hierarchical structure contains levels or segments.
- It depicts a set of one-to-many (1:M) relationships between a parent and its children segments.
- Each parent can have many children.
- Each child has only one parent.
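The parent/child rules of the hierarchical model can be sketched in a few lines of Python. This is an illustration only, not how IMS or any real hierarchical DBMS is implemented; the Segment class and the college data are hypothetical. Each segment stores a single parent reference, so a child can never have two parents, and a segment is reached by following the path down from the root.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """One node of a hierarchical database: exactly one parent, many children."""
    name: str
    # repr/compare are disabled on the parent link to avoid infinite recursion.
    parent: Optional["Segment"] = field(default=None, repr=False, compare=False)
    children: List["Segment"] = field(default_factory=list)

    def add_child(self, name: str) -> "Segment":
        child = Segment(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> List[str]:
        """Access path from the root down to this segment."""
        node: Optional["Segment"] = self
        names: List[str] = []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


# Hypothetical data: a college with departments and students.
root = Segment("College")
cse = root.add_child("Dept: CSE")
mba = root.add_child("Dept: MBA")
nitesh = cse.add_child("Student: Nitesh")
dinesh = mba.add_child("Student: Dinesh")

print(nitesh.path())
```

The path method mirrors how a hierarchical database reaches a child segment: always through its parent, starting at the root.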


Advantages & Disadvantages of Hierarchical model

The main advantages of this database model are:

Simplicity: Since the database is based on the hierarchical structure, the relationship between the various layers is logically simple. Thus the design of a hierarchical database is simple.

Data security: The hierarchical model was the first database model in which data security was provided and enforced by the DBMS.

Data integrity: Since the hierarchical model is based on the parent/child relationship, there is always a link between the parent segment and the child segments under it. A child segment is always referenced through its parent, so this model promotes data integrity.

Efficiency: The hierarchical model is very efficient when the database contains a large number of 1:N relationships and when the users require a large number of transactions using data whose relationships are fixed.

Disadvantages

Implementation Complexity: This model is simple and easy to design, but it is quite complex to implement. The database designers should have very good knowledge of the physical storage characteristics.

Database management problems: If you make any changes in the structure of a hierarchical database, you need to make the necessary changes in all the application programs that access the database.

Lacks structural independence: Structural independence exists when changes in the database structure do not affect the DBMS's ability to access the data. This model uses physical storage paths to navigate to the different data segments. Thus in a hierarchical database, the benefit of data independence is limited by structural dependence.

Programs Complexity: Due to structural dependence, the end users must know how the data is distributed physically in the database in order to access data. This requires knowledge of a complex pointer system, which is often beyond ordinary users.

Operational Anomalies: The hierarchical model suffers from insert, update, delete and retrieve anomalies. Thus the hierarchical model is not suitable for all cases.

Implementation Limitation: Many common relationships do not conform to the 1:N form. Many-to-many (N:N) relationships, which are more common in real life, are very difficult to implement in a hierarchical model.

2. Network Data Model: The network model replaces the hierarchical tree with a graph, thus allowing more general connections among the nodes. Its distinguishing feature is its ability to handle many-to-many (N:N) relationships. In other words, it allows a record to have more than one parent.

In network database terminology, a relationship is a set. Each set is made up of at least two types of records: an owner record (the parent in the hierarchical model) and a member record (the child record in the hierarchical model).

To define a network database one needs to define:
(a) the database record types, which consist of data items, and
(b) the set types.

A member record type can have that role in more than one set; hence the multiparent concept is supported. An owner record type can also be a member or owner in another set. The data model is a simple network, and link and intersection record types may exist, as well as sets between them. Thus, the complete network of relationships is represented by several pairwise sets; in each set one record type is the owner and one or more record types are members. A network structure thus allows 1:1, 1:M and M:M relationships among entities.
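The owner/member idea can be sketched as follows. This is a toy illustration in Python under assumed names (the NetworkDB class, the set types PLACES and HANDLES, and the sample records are all hypothetical); it does not reproduce any real network DBMS. It shows one member record belonging to two sets with different owners, which is what the hierarchical model cannot express.

```python
from collections import defaultdict


class NetworkDB:
    """Toy network database: named set types link owner records to member records."""

    def __init__(self):
        # set type -> owner record -> list of member records
        self.sets = defaultdict(lambda: defaultdict(list))

    def connect(self, set_type, owner, member):
        """Insert a member record into the set occurrence owned by `owner`."""
        self.sets[set_type][owner].append(member)

    def members(self, set_type, owner):
        """Navigate from an owner to its members within one set type."""
        return list(self.sets[set_type][owner])

    def owners_of(self, member):
        """All (set type, owner) pairs in which a record participates as a member."""
        return sorted(
            (set_type, owner)
            for set_type, by_owner in self.sets.items()
            for owner, members in by_owner.items()
            if member in members
        )


db = NetworkDB()
db.connect("PLACES", "Customer: Amit", "Order: 101")
db.connect("HANDLES", "Salesperson: Dinesh", "Order: 101")
db.connect("PLACES", "Customer: Amit", "Order: 102")

print(db.owners_of("Order: 101"))
```

Here order 101 has two owners, one in each set type, so the record effectively has more than one parent.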

ADVANTAGES
The network model retains almost all the advantages of the hierarchical model while eliminating some of its shortcomings.

Simplicity: Like the hierarchical model, the network model is conceptually simple and easy to design.

High-speed retrieval: The network model provides very efficient, high-speed retrieval.

Data Integrity: In a network model, no member can exist without an owner. A user must therefore first define the owner record and then the member record. This ensures integrity.

Ease of data access: Data access is easier and more flexible than in the hierarchical model.

Ability to handle more relationship types: The network model can handle one-to-many and many-to-many relationships.

Data Independence: The network model draws a clear line of demarcation between programs and the complex physical storage details. The application programs work independently of the data. Any changes made in the data characteristics do not affect the application program.

DISADVANTAGES

System complexity: In a network model, data are accessed one record at a time. This makes it essential for the database designers, administrators, and programmers to be familiar with the internal data structures to gain access to the data. Therefore, a user-friendly database management system cannot be created using the network model.

Operational Anomalies: In the network model, insertion, deletion and updating of any record require a large number of pointer adjustments, which makes the implementation very complex.

Lack of Structural independence: Making structural modifications to the database is very difficult in the network database model, as the data access method is navigational. Any changes made to the database structure require the application programs to be modified before they can access data. Though the network model achieves data independence, it still fails to achieve structural independence.

3. The Relational Model (RDBMS, relational database management system): In the relational model, unlike the hierarchical and network models, there are no physical links. The data is stored in two-dimensional tables (rows and columns). The data is manipulated based on the relational theory of mathematics.

An RDBMS is a database system based on the relational model developed by E.F. Codd. A relational database allows the definition of data structures, storage and retrieval operations, and integrity constraints. In such a database the data and the relations between them are organised in tables. A table is a collection of records, and each record in a table contains the same fields. Oracle, Sybase, DB2, Ingres, Informix and MS-SQL Server are a few of the popular relational DBMSs.

Properties of Relational Tables:
- Values are atomic
- Each row is unique
- Column values are of the same kind
- The sequence of columns is insignificant
- The sequence of rows is insignificant
- Each column has a unique name

Terminology used in Relational Model (examples are from the given case example):

Relation
  Meaning: A table.
  Example: Ord_Aug, Customers, Items etc.

Tuple
  Meaning: A row or a record in a relation.
  Example: A row from the Customers relation is a Customer tuple.

Attribute
  Meaning: A field or a column in a relation.
  Example: Ord_Date, Item#, CustName etc.

Cardinality of a relation
  Meaning: The number of tuples in a relation.
  Example: The cardinality of the Ord_Items relation is 8.

Degree of a relation
  Meaning: The number of attributes in a relation.
  Example: The degree of the Customers relation is 3.

Domain of an attribute
  Meaning: The set of all values that can be taken by the attribute.
  Example: The domain of Qty in Ord_Items is the set of all values which can represent the quantity of an ordered item.

Primary Key of a relation
  Meaning: An attribute or a combination of attributes that uniquely defines each tuple in a relation.
  Example: The primary key of the Customers relation is Cust#. The combination of Ord# and Item# forms the primary key of Ord_Items.

Foreign Key
  Meaning: An attribute or a combination of attributes in one relation R1 which indicates the relationship of R1 with another relation R2. The foreign key attributes in R1 must contain values matching those in R2.
  Example: Cust# in the Ord_Aug relation is a foreign key creating a reference from Ord_Aug to Customers. This is required to indicate the relationship between orders in Ord_Aug and Customers. Ord# and Item# in Ord_Items are foreign keys creating references from Ord_Items to Ord_Aug and Items respectively.
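Several of these terms can be checked in a small sketch. The example below is only an illustration, assuming Python's built-in sqlite3 module; the column names (cust_no in place of Cust#, plus a city column) and all sample rows are hypothetical, since the rows of the case example are not reproduced here. It declares a primary key and a foreign key, computes the cardinality and degree of the customers relation, and shows that a foreign key value with no matching customer is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("""
    CREATE TABLE customers (
        cust_no   INTEGER PRIMARY KEY,
        cust_name TEXT,
        city      TEXT
    )""")
conn.execute("""
    CREATE TABLE ord_aug (
        ord_no   INTEGER PRIMARY KEY,
        ord_date TEXT,
        cust_no  INTEGER REFERENCES customers (cust_no)
    )""")
conn.execute("INSERT INTO customers VALUES (1, 'Nitesh', 'Delhi')")
conn.execute("INSERT INTO customers VALUES (2, 'Dinesh', 'Delhi')")
conn.execute("INSERT INTO ord_aug VALUES (101, '2024-08-01', 1)")

# Cardinality = number of tuples; degree = number of attributes.
cardinality = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
degree = len(conn.execute("SELECT * FROM customers").description)

# An order that refers to a non-existent customer violates the foreign key.
try:
    conn.execute("INSERT INTO ord_aug VALUES (102, '2024-08-02', 99)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

print(cardinality, degree, fk_enforced)
```

The rejected insert is the relational counterpart of the rule that foreign key values in R1 must match values in R2.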

Advantages of the Relational Model


- Structural independence
- Improved conceptual simplicity
- Easier database design, implementation, management, and use
- Ad hoc query capability
- Powerful database management system

Disadvantages of the Relational Model


- Substantial hardware and system software overhead
- May not fit all business models
- Can facilitate poor design and implementation
- May promote "islands of information" problems


Comparison of Data Models


The following table gives a comparative study of the three traditional data models.

1. Representation of relationships
   Hierarchical data model: Relationship between records is of the parent-child type.
   Network data model: Relationship between records is expressed in the form of pointers or links.
   Relational data model: Relationship between records is represented by a relation that contains a key for each record involved in the relationship.

2. Many-to-many relationships
   Hierarchical data model: Many-to-many relationships cannot be expressed in this model.
   Network data model: Many-to-many relationships can also be implemented.
   Relational data model: Many-to-many relationships can be easily implemented.

3. Ease of implementation
   Hierarchical data model: It is a simple, straightforward and natural method of implementing record relationships.
   Network data model: Record relationship implementation is very complex due to the use of pointers.
   Relational data model: Relationship implementation is very easy through the use of a key or composite key field(s).

4. Usefulness
   Hierarchical data model: This type of model is useful only when there is some hierarchical character in the database.
   Network data model: The network model is useful for representing records which have many-to-many relationships.
   Relational data model: The relational model is useful for representing most real-world objects and the relationships among them.

5. Physical or logical connections
   Hierarchical data model: In order to represent links among records, pointers are used. Thus relations among records are physical.
   Network data model: In the network model also, the record relations are physical.
   Relational data model: The relational model does not maintain physical connections among records. Data is organized logically in the form of rows and columns, and stored in tables.

6. Searching
   Hierarchical data model: Searching for a record is very difficult, since one can retrieve a child only after going through its parent record.
   Network data model: Searching for a record is easy, since there are multiple access paths to data elements.
   Relational data model: A unique indexed key field is used to search for a data element.


STRUCTURE OF DBMS
DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations (insert, delete, update and retrieval) on the database. The components of DBMS perform these requested operations on the database and provide necessary data to the users. The various components of DBMS are shown below: -

1. DDL Compiler - The Data Definition Language (DDL) compiler processes schema definitions specified in the DDL. Its output includes metadata such as the names of the files, data items, storage details of each file, mapping information, constraints etc.

2. DML Compiler and Query Optimizer - The DML commands from the application program, such as insert, update, delete and retrieve, are sent to the DML compiler for compilation into object code for database access. The query optimizer then determines the best way to execute the query, and the optimized code is sent to the data manager.

3. Data Manager - The Data Manager is the central software component of the DBMS, also known as the Database Control System. The main functions of the Data Manager are:
- It converts operations in users' queries, coming either directly from application programs or through the query processor (the combination of DML compiler and query optimizer), from the user's logical view to the physical file system.
- It controls access to the DBMS information that is stored on disk.
- It controls the handling of buffers in main memory.
- It enforces constraints to maintain the consistency and integrity of the data.
- It synchronizes the simultaneous operations performed by concurrent users.
- It controls the backup and recovery operations.

4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It contains information about:
- Data: names of the tables, names of the attributes of each table, lengths of attributes, and number of rows in each table.
- Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
- Constraints on data, i.e. the range of values permitted.
- Detailed information on physical database design, such as storage structures, access paths, and file and record sizes.
- Access authorization: the description of database users, their responsibilities and their access rights.
- Usage statistics, such as the frequency of queries and transactions.

The data dictionary is used to control data integrity, database operation and accuracy, and is an important part of the DBMS.

Importance of Data Dictionary
A data dictionary is necessary in databases for the following reasons:
- It improves the control of the DBA over the information system and the users' understanding of how to use the system.
- It helps in documenting the database design process by storing documentation of the result of every design phase and of design decisions.
- It helps in searching for the views defined on the database and the definitions of those views.
- It provides great assistance in producing a report of which data elements (i.e. data values) are used in all the programs.
- It promotes data independence, i.e. application programs are not affected by the addition or modification of structures in the database.

5. Data Files - These contain the data portion of the database.

6. Compiled DML - The DML compiler converts high-level queries into low-level file access commands known as compiled DML.

7. End Users - They have already been discussed in a previous section.

COMPONENTS OF DATABASE SYSTEM
A database system is composed of four components: Data, Hardware, Software and Users, which coordinate with each other to form an effective database system.

Fig. 1.1 Data Base System


1. Data - Data is a very important component of the database system. Most organizations generate, store and process large amounts of data. The data acts as a bridge between the machine parts, i.e. hardware and software, and the users who access it directly or through application programs. Data may be of different types:
- User Data: It consists of table(s) of data called relation(s), where columns are called fields or attributes and rows are called records. A relation must be structured properly.
- Metadata: A description of the structure of the database is known as metadata. It basically means "data about data". System tables store the metadata, which includes the number of tables and the table names, the number of fields and the field names, and the primary key fields.
- Application Metadata: It stores the structure and format of queries, reports and other application components.

2. Hardware - The hardware consists of the secondary storage devices on which data is stored, such as magnetic disks (hard disk, zip disk, floppy disks), optical disks (CD-ROM), magnetic tapes etc., together with the input/output devices (mouse, keyboard, printers), processors, main memory etc., which are used for storing and retrieving the data in a fast and efficient manner. Since databases can range from those of a single user with a desktop computer to those on mainframe computers with thousands of users, proper care should be taken in choosing appropriate hardware devices for a required database.

3. Software - The software part consists of the DBMS, which acts as a bridge between the user and the database. In other words, it is the software that interacts with the users, the application programs, and the database and file system of a particular storage medium (hard disk, magnetic tapes etc.) to insert, update, delete and retrieve data. To perform operations such as insertion, deletion and updating, we can use either query languages like SQL, QUEL and Gupta SQL, or application software such as Visual Basic, Developer etc.

4. Users - Users are the persons who need information from the database to carry out their primary business responsibilities, i.e. personnel, staff, clerical workers, managers, executives etc. On the basis of their jobs and requirements, they are given access to the database totally or partially. The various types of users who can access the database are:
- Database Administrators (DBA)
- Database Designers
- End Users
- Application Programmers
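The metadata described under Data above (table names, field names and primary key fields held in system tables) can be inspected directly in many systems. The sketch below is an illustration only, assuming Python's built-in sqlite3 module, where the system catalog is the sqlite_master table and column details come from PRAGMA table_info; the student and course tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT, class TEXT)")
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")

# Table names come from SQLite's system catalog, sqlite_master.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]


def describe(table):
    """Return (field names, primary-key fields) for one table via PRAGMA table_info."""
    info = conn.execute(f"PRAGMA table_info({table})").fetchall()
    fields = [col[1] for col in info]                 # col[1] is the column name
    primary_key = [col[1] for col in info if col[5]]  # col[5] > 0 marks a key column
    return fields, primary_key


for name in table_names:
    print(name, describe(name))
```

Everything printed by this sketch is "data about data": none of it comes from the rows that users store in the tables.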
