DBMS Unit I
A field is the smallest unit of data, represented as a value in the database, e.g. the value of a field such as Employee Name, Address or Date of Birth. A record is a collection of logically related fields. A file is a collection (sequence) of records.

Emp No   Emp Name   DOB          City
E001     Smith      23-08-1978   Berlin
E002     Mathew     12-03-1977   Moscow
E003     Martin     12-12-1965   Paris

Each row of the table above is 1 record.
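The field / record / file hierarchy described above can be sketched in a few lines of Python. This is only an illustration: the variable names (`FIELD_NAMES`, `employee_file`) are assumptions made for the sketch, and the values are the ones from the employee table.

```python
# Hypothetical in-memory model of the employee file shown above.
# A field is one value, a record is one row of related fields,
# and a file is a sequence of records.
FIELD_NAMES = ("emp_no", "emp_name", "dob", "city")

employee_file = [
    dict(zip(FIELD_NAMES, ("E001", "Smith", "23-08-1978", "Berlin"))),
    dict(zip(FIELD_NAMES, ("E002", "Mathew", "12-03-1977", "Moscow"))),
    dict(zip(FIELD_NAMES, ("E003", "Martin", "12-12-1965", "Paris"))),
]

record = employee_file[0]          # one record (the first row)
field_value = record["emp_name"]   # one field of that record

print(field_value)
print(len(employee_file), "records in the file")
```

Here the list plays the role of the file, each dictionary is a record, and each dictionary value is a field.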
Database
A database is an organized collection of related information. Data must be organized because unorganized information has no meaning. A database is designed, built and populated with data for a specific purpose.
Examples of databases: address book, dictionary, telephone directory, student record register, etc. In each of these the data is stored in a particular order, i.e. in an organized form. In a dictionary the words are placed in alphabetical order so that one word can be found easily among, say, 10,000 words.
A database can be defined as a collection of interrelated data that:
o Is stored together without unnecessary redundancy.
o Serves multiple applications, in which each user has his own view of the data.
o Is protected from unauthorized access by a security mechanism, while concurrent access to the data is provided.
o Is structured so that new data can be added, existing data modified and other operations performed.
Database Types
Flat-file, Hierarchical, Network, Relational, Object-oriented, Object-relational
Operations performed on a database:
o To view the stored information (e.g. to see the address of some person in an address book).
o To add new information (e.g. to add a new person's address).
o To modify existing information.
o To delete unwanted information.
o To arrange information in a particular order (say alphabetically, or by marks in ascending order).

A database consists of the following 4 components: Data Items, Relationships, Constraints and Schema.
[Figure: the four components (Data Items, Relationships, Constraints, Schema) and the Physical Database]
Data: A data item is a distinct piece of information.
Relationship: Represents a correspondence (correlation) between various data elements.
Constraints: Rules that define correct database states.
Schema: Describes the organization of data and relationships within the database.
[Figure: separate files kept by each department in a file processing system]
Office:    Roll No, Name, Address, Class, Percentage
Accounts:  Roll No, Name, Address, Course, Fees
Hostel:    Roll No, Name, Address, Room-Type, Rent
Library
Limitations / Disadvantages of File Processing System
1. Separated & Isolated Data: To make a decision, a user might need data from separate files. First the files had to be evaluated by analysts and programmers to determine the specific data required from each file and the relationships between the data; only then could applications be written in a programming language to process and extract the needed data.
2. Duplication of Data: Once data is stored in separate files, there is always a possibility of duplication, which has many disadvantages:
a. Duplication is wasteful. It costs time and money to enter the data more than once.
b. It takes additional storage space, again with associated costs.
c. Duplication leads to loss of data integrity and to inconsistency. Consider a student database with separate copies of files required at Office, Accounts, Hostel and Library. A change made at Office is not shown at Accounts, so the data for the same student differs between Office and Accounts, which is wrong.
3. No Standard Maintained / Incompatible File Formats: The structure of each file is embedded in the application programs and depends on the application programming language. Users also have different preferences: for example, someone who is comfortable with MS-Excel will use that, whereas accounts people will be happy to work in TALLY. Using different software leads to inconsistency and makes it hard to draw any inference from the combined data. The direct incompatibility of such files makes them difficult to process jointly.
4. Difficulty in Representing Data from the User's View: To create useful applications for the user, data from various files must often be combined. In file processing it is difficult to determine the relationships between isolated data in order to meet user requirements.
5. Data Inflexibility: Program-data interdependency and data isolation lead to inflexibility in answering ad-hoc information requests.
6. Poor Data Control: A file processing system being decentralized in nature, it is very common for the same data field to have multiple names, defined by the various departments of an organization and depending on the file.
7. Limited or No Data Sharing: Each application has its own private files, and users have little or no opportunity to share data outside their own application.
8. Inadequate Data Manipulation Capabilities: Calculations based on the data are difficult, and users find them hard to implement.
9. Excessive Programming Effort: For each new application, programmers essentially start from scratch by designing new file formats and descriptions and then writing the file access logic for each new program.
10. Security Problems: Each user should be allowed to access only the data concerning his own area of application. Since application programs are added to a file-oriented system in an ad-hoc manner, it is difficult to enforce such a security system.
[Figure: users (Office, Accounts) working through their application programs on the physical data for Office, Accounts, Hostel and Library]
DBMS
A Database Management System (DBMS) is a set of computer programs that allows users to define, create and maintain a database and provides controlled access to the data. It provides facilities for controlling data access, enforcing data integrity, managing concurrency control and restoring the database. A DBMS is an intermediate layer between programs and data: a program accesses the DBMS, which then accesses the data. DBMSs range from small systems that run on personal computers to huge systems that run on mainframes. Examples of applications:
o Computerized library system
o Automated teller machines
o Flight reservation system
o Computerized inventory system
A DBMS is a piece of software that provides services for accessing a database while maintaining all the required features of the data. Commercially available database systems include DBASE, FOXPRO, IMS, ACCESS, Oracle, Sybase, MySQL, etc.

Advantages of DBMS
A Database Management System (DBMS) is a software package that allows data to be effectively stored, retrieved and manipulated. The data stored in a DBMS package (such as SQL Server, Oracle or MS-Access) can be accessed by multiple users and by multiple application programs. The DBMS is preferred over the conventional file processing system due to the following advantages:

1. Controlling Data Redundancy - In the conventional file processing system, every user group maintains its own files for handling its data. This may lead to:
o Duplication of the same data in different files.
o Wastage of storage space, since duplicated data is stored.
o Errors caused by updating the same data in different files.
o Time wasted in entering data again and again.
o Computer resources being used needlessly.
o Difficulty in combining information.
Example: in a college database there may be a number of applications, such as Office, Accounts, Library and Hostel, that each have private files:

Office            Library         Hostel         Accounts
Roll No           Roll No         Roll No        Roll No
Name              Name            Name           Name
Class             Class           Class          Class
Father Name       DOB             Father Name    Address
DOB               Address         DOB            Phone No
Address           Phone No        Address        Fee
Phone No          Books_Issued    Phone No       Installments
Previous Record   Fine            Room No        Discount
Attendance        Etc.            Mess Bill      Balance
Marks                             Etc.           Total
Etc.                                             Etc.

In this case some student data, such as Roll No, Name, Class, Phone No and Address, is common to all the files. This leads to redundancy, and if the data changes in one section, such as Office, the change is not reflected in the other sections. E.g. student Rohit changes his phone number and informs the Office, but the change is not reflected in Accounts.

2. Elimination of Inconsistency - When data is duplicated and a change made at one site is not propagated to the other sites, the data becomes inconsistent, as the copies no longer agree. So we need to remove the duplication of data across multiple files to eliminate inconsistency. For example, in Office the student with Roll No = 5 is recorded as living at Rohini, but in Library the same person is shown at Pitampura. In this state the entries for the same object do not agree with each other (one is updated and the other is not), and the database is said to be inconsistent. Centralizing the database controls the duplication and hence removes the inconsistency.
Data inconsistency is often encountered in everyday life. We have all come across situations where a new address is communicated to an organization that we deal with (e.g. a telecom company, gas company or bank), and some communications from that organization arrive at the new address while others continue to be mailed to the old address.
Consider again the example of the result system. Suppose a student with Roll No 201 changes his course from 'Computer' to 'Arts'. The change is made in the SUBJECT file but not in the RESULT file, which makes the data inconsistent. So we need to centralize the database so that a change, once made, is reflected in all the tables where a particular field is stored. The update is then carried through automatically; this is known as propagating updates.

3. Data Can be Shared - Data sharing is a valuable aspect of a DBMS: different users access the same database.

4. Better Service to the Users - Centralizing the data in the database means that users can easily obtain new and combined information that would otherwise have been impossible to obtain. A DBMS should also allow users who do not know programming to interact with the data more easily, unlike a file processing system, where a programmer may need to write a new program to meet every new demand.

5. Flexibility of the System is Improved - Changes are often necessary to the contents of the data stored in any system, and these changes are made more easily in a centralized database than in a conventional system. Application programs need not be changed when the data in the database changes.

6. Integrity can be Improved - Since the data of an organization using the database approach is centralized and is used by a number of users at a time, it is essential to enforce integrity constraints. Integrity ensures that the data in the database is always accurate, so that incorrect information cannot be stored. A DBMS should provide capabilities for defining and enforcing constraints.
For example, consider the result system already discussed. Since multiple files are to be maintained, you may sometimes enter a value for course that does not exist. Suppose course can have the values Computer, Accounts, Economics and Arts, but we enter the value 'Hindi'. This leads to inconsistent data, and hence to a lack of integrity.
Even a centralized database may still contain incorrect data. For example:
o The salary of a full-time employee may be entered as Rs. 500 rather than Rs. 5000.
o A student may be shown to have borrowed books but have no enrollment.
o A list of employee numbers for a given department may include a number of non-existent employees.

7. Standards can be Enforced - Since all access to the database must be through the DBMS, standards are easier to enforce. Standards may relate to the naming of data, the format of data, the structure of the data, etc. Standardizing stored data formats is usually desirable for the purpose of data interchange or migration between systems.

8. Security can be Improved - In conventional systems, applications are developed in an ad-hoc (temporary) manner. Often different systems of an organization access different components of the operational data, and in such an environment enforcing security can be quite difficult. Setting up a database makes it easier to enforce security restrictions, since the data is now centralized. It is easier to control who has access to which parts of the database. Different checks can be established for each type of access (retrieve, modify, delete, etc.) to each piece of information in the database.
Consider the example of a bank, in which employees at different levels may be given access to different types of data in the database. A clerk may be given the authority to know only the names of all the customers who have a loan with the bank, but not the details of each loan a customer may have. This is accomplished by giving each employee the appropriate privileges.

9. Organization's Requirements can be Identified - All organizations have sections and departments, and each of these units often considers its own work, and therefore its own needs, the most important. Once a database has been set up with centralized control, it becomes necessary to identify the organization's requirements and to balance the needs of the competing units. It may therefore become necessary to ignore some requests for information if they conflict with a higher-priority need of the organization.
It is the responsibility of the DBA (Database Administrator) to structure the database system to provide the overall service that is best for the organization. For example, a DBA must choose the file structure and access method that give a faster response for highly critical applications than for less critical ones.

10. Overall Cost of Developing and Maintaining Systems is Lower - It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, one normally expects the overall cost of setting up the database and of developing and maintaining application programs to be far lower than for a similar service using conventional systems, since programmers can be more productive using the non-procedural languages developed with DBMSs than using procedural languages.

11. Data Model must be Developed - Perhaps the most important advantage of setting up a database system is the requirement that an overall data model for the organization be built. In conventional systems, files are more likely to be designed around the needs of particular applications, and the overall view is often not considered. Building an overall view of an organization's data is usually cost-effective in the long term.

12. Provides Backup and Recovery - A centralized database provides schemes for backup and for recovery from failures, including disk crashes, power failures and software errors. These help the database recover from an inconsistent state to the state that existed prior to the failure, though the methods are very complex.

The disadvantages are as follows:
o Complexity: Database developers, the DBA and end users must understand the functionality of the DBMS.
o Size: A DBMS is a large piece of software, occupying many megabytes of disk space and needing substantial memory to run.
o High cost of software and hardware: initial cost plus recurrent annual maintenance cost.
o Technical expertise is required.
o Power dependency.
o Overhead for providing security, concurrency, recovery and integrity functions.
o Cost of conversion from a manual / file system to a DBMS.
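The integrity examples under advantage 6 (a course limited to Computer, Accounts, Economics and Arts, and a borrower who has no enrollment) can be tried out with a small sketch. It uses Python's built-in sqlite3 module rather than any of the products named above, and the table names, column names and the student 'Meena' are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: the CHECK constraint lists the allowed courses,
# the REFERENCES clause ties every book issue to an enrolled student.
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        course  TEXT NOT NULL
                CHECK (course IN ('Computer', 'Accounts', 'Economics', 'Arts'))
    )
""")
conn.execute("""
    CREATE TABLE book_issue (
        issue_id INTEGER PRIMARY KEY,
        roll_no  INTEGER NOT NULL REFERENCES student(roll_no),
        title    TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO student VALUES (201, 'Rohit', 'Computer')")
conn.commit()


def try_sql(sql):
    """Run one statement; return True if accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


bad_course = try_sql("INSERT INTO student VALUES (202, 'Meena', 'Hindi')")
orphan_issue = try_sql("INSERT INTO book_issue VALUES (1, 999, 'DBMS Notes')")
valid_issue = try_sql("INSERT INTO book_issue VALUES (2, 201, 'DBMS Notes')")

print(bad_course, orphan_issue, valid_issue)
```

The first two statements are rejected by the DBMS itself, so no application program has to repeat the check. A wrong but plausible value, such as a salary of 500 instead of 5000, would still be accepted unless a rule for it is also declared.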
DBMS Vs File Management System
When a computer user wants to store data electronically, they must do so by placing the data in files. Files are stored in specific locations on the hard disk (directories). The user can create new files to place data in, delete a file that contains data, rename a file, etc. All of this is known as file management, a function provided by the Operating System (OS).

Criteria                            File System                               DBMS
Size of System                      Small system                              Large system
Cost                                Cheap                                     Expensive
No. of Files                        Few files                                 Many files
Type of Data                        Files                                     Tables
Structural Complexity               Simple                                    Complex
Redundancy                          Redundant data                            Reduced redundancy
Inconsistency                       Inconsistent                              Consistent
Data Isolation                      Isolation                                 Data can be shared
Integrity Checks                    Programmer has few options for integrity  Vast options and rigorous integrity checks
Security                            No security                               Rigorous security
Backup / Recovery                   Simple, primitive backup/recovery         Complex and sophisticated backup/recovery
No. of Users accessing Application  Often single user                         Multiple users
[Figure: schema example with the fields PROD-ID, PROD-DESC, UNIT-COST, CUST-NAME, CUST-STREET, CUST-CITY, PROD-QTY, PROD-PRICE]
Sub-Schema
A sub-schema is a subset of the schema: the portion of the database seen by an application program, giving it the desired information from the database. A subschema refers to a user's view of the data item types and record types that he or she uses. It gives the user a window through which he or she can view only that part of the database which is of interest. The above example has 3 sub-schemas corresponding to the 3 table structures:

Sub Schema of PRODUCT Table
PRODUCT: PROD-ID, PROD-DESC

Sub Schema of CUSTOMER Table
CUSTOMER: CUST-ID, CUST-NAME, CUST-CITY
Different application programs can have different views of the data because they use different sub-schemas. An individual application program can change its own subschema without affecting the subschema views of others. The DBMS software derives the subschema data requested by an application program from the schema data. The DBA (Database Administrator) ensures that the subschema requested by an application program is derivable from the schema.
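In relational systems a subschema is commonly realised as a view. The sketch below is a minimal illustration using Python's built-in sqlite3 module; the view names, the identifiers 'P1' and 'C1' and the customer values are assumptions, while the column names follow the PRODUCT and CUSTOMER fields listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Schema: the full definition of both tables.
    CREATE TABLE product (
        prod_id   TEXT PRIMARY KEY,
        prod_desc TEXT,
        unit_cost INTEGER
    );
    CREATE TABLE customer (
        cust_id     TEXT PRIMARY KEY,
        cust_name   TEXT,
        cust_street TEXT,
        cust_city   TEXT
    );

    -- Subschemas: each view exposes only the columns one application needs.
    CREATE VIEW product_view  AS SELECT prod_id, prod_desc FROM product;
    CREATE VIEW customer_view AS SELECT cust_id, cust_name, cust_city FROM customer;

    -- Illustrative rows; the customer values are made up.
    INSERT INTO product  VALUES ('P1', 'Almirah', 4000);
    INSERT INTO customer VALUES ('C1', 'Asha', '12 Main Road', 'Delhi');
""")


def column_names(relation):
    """Return the column names a program sees when it reads the given table or view."""
    cursor = conn.execute(f"SELECT * FROM {relation}")
    return [col[0] for col in cursor.description]


product_columns = column_names("product_view")
customer_columns = column_names("customer_view")
product_rows = conn.execute("SELECT * FROM product_view").fetchall()

print(product_columns)
print(customer_columns)
```

A program that reads `product_view` never sees `unit_cost`, and one that reads `customer_view` never sees `cust_street`, although both views are derived from the same schema.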
PROD-DESC   UNIT-COST
Almirah     4000
Dryer       1500
Freeze      8500
Table       800
Chair       1200
Remark:
Database Schema - Refers to the database definition (structure of the database).
Database Sub Schema - Refers to a subset of the database definition (structure of a part of the database).
Database Instances - The logical records in the database at any point of time.
1.3 Functions of a DBMS
The functions performed by a typical DBMS are the following:
Data Definition: The DBMS provides functions to define the structure of the data in the application. These include defining and modifying the record structure, the type and size of fields, and the various constraints/conditions to be satisfied by the data in each field.
Data Manipulation: Once the data structure is defined, data needs to be inserted, modified or deleted. The functions which perform these operations are also part of the DBMS. These functions can handle both planned and unplanned data manipulation needs. Planned queries are those which form part of the application. Unplanned queries are ad-hoc queries which are performed on a need basis.
Data Security & Integrity: The DBMS contains functions which handle the security and integrity of data in the application. These can be easily invoked by the application, so the application programmer need not code these functions in his/her programs.
Data Recovery & Concurrency: Recovery of data after a system failure and concurrent access to records by multiple users are also handled by the DBMS.
Data Dictionary Maintenance: Maintaining the Data Dictionary, which contains the data definition of the application, is also one of the functions of a DBMS.
Performance:
Optimizing the performance of queries is one of the important functions of a DBMS. Hence the DBMS has a set of programs forming the Query Optimizer, which evaluates the different implementations of a query and chooses the best among them.
Thus the DBMS provides an environment that is both convenient and efficient to use when there is a large volume of data and many transactions to be processed.

1.4 Role of the Database Administrator (DBA)
The database administrator (DBA) is the person or group in charge of implementing the DBMS in an organization. The job requires a high degree of technical expertise and the ability to understand and interpret management requirements at a senior level. In practice the DBA may be a team of people rather than just one person. The DBA is like the super-user of the system. The role of the DBA is very important and is defined by the following functions.
Defining the Conceptual Schema and Database Creation: The DBA defines the schema, which contains the structure of the data in the application. The DBA determines what data needs to be present in the system and how this data has to be represented and organized.
Planning Storage Structures and Access Strategies: The DBA must also decide how the data is to be represented in the database and must specify the representation by writing the storage structure definition (using the internal data definition language). The associated mapping between the storage structure definition and the conceptual schema must also be specified.
Providing Support to Users: It is the responsibility of the DBA to support the users, to ensure that the data they require is available, and to write the necessary external schemas. The mapping between any given external schema and the conceptual schema must also be specified. The DBA needs to interact continuously with the users to understand the data in the system and its use.
Defining Security & Integrity Checks: The DBA finds out which access restrictions are needed and defines security checks so that no malicious user can access the database and it remains protected. Data integrity checks are also defined by the DBA.
Defining Backup / Recovery Procedures: In case of damage to any portion of the database, caused by human error or by a failure in the hardware or supporting operating system, it is essential to be able to repair the data concerned with a minimum of delay and with as little effect as possible on the rest of the system. The DBA therefore defines procedures for backup and recovery. Defining backup procedures includes specifying what data is to be backed up, how often backups are taken, and the medium and storage place for the backup data.
Monitoring Performance and Responding to Changes in Requirements: The DBA has to continuously monitor the performance of the queries and take measures to optimize all the queries in the application.
The internal structure of the database should be unaffected by changes to the physical aspects of storage. The DBA should be able to change the conceptual structure of the database without affecting all users.
1. The Internal Level: The internal level has an internal schema, which describes the physical storage structure of the database. The internal schema uses a physical data model and describes the complete details of data storage and access paths for the database. The internal level concerns the way the data are physically stored on the hardware. It is concerned with such things as:
o Storage space allocation for data and indexes
o Record description for storage (with stored sizes for data items)
o Record placement
o Data compression and data encryption techniques

2. Conceptual Level or Logical Level: The conceptual level has a conceptual schema, which describes the structure of the whole database for a community of users. This level describes what data is stored in the database and the relationships among the data. The conceptual schema hides the details of physical storage structure and concentrates on describing entities, data types, relationships, user operations and constraints. For example, in a student database, Rollno, Name, Class, Address, etc. are attributes of the entity Student.

Student
Rollno   %age
1234     74
1249     82
2315     70

The conceptual level also represents the constraints on the data, semantic information about the data, and security and integrity information. The conceptual level supports each external view, in that any data available to a user must be contained in or derivable from the conceptual level. However, this level must not contain any storage-dependent details.
3. External Level: The external or view level includes a number of external schemas or user views. Each external schema describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group. A view involves only those portions of a database which are of concern to a user. Therefore the same database can have different views for different users. The external view insulates users from the details of the internal and conceptual levels. For example, one user may view dates in the form (day, month, year) while another may view dates as (year, month, day).
There will be only one conceptual view, consisting of the abstract representation of the database in its entirety. Similarly there will be only one internal or physical view, representing the total database as it is physically stored. The three schema architecture is a convenient tool with which the user can visualize the schema levels in a database system. Most DBMSs do not separate the three levels completely, but support the three schema architecture to some extent.
Data Independence: The three schema architecture can be used to further explain the concept of data independence, which can be defined as the capacity to change the schema at one level of a database system without having to change the schema at the next higher level. Specifically, data independence means that changes in storage structure and data access technique do NOT affect application programs. There are two types of data independence:

1. Logical Data Independence is the capacity to change the conceptual schema without having to change external schemas or application programs. The change is absorbed by the mapping between the external and conceptual levels. We may change the conceptual schema to expand the database (by adding a record type or data item), to change constraints, or to reduce the database (by removing a record type or data item).

2. Physical Data Independence means that the physical storage structures or devices can be changed without affecting the conceptual schema. The change is absorbed by the mapping between the conceptual and internal levels. Physical data independence is achieved by the presence of the internal level of the database and the mapping or transformation from the conceptual level to the internal level. If the file organisation or the type of physical device used has to change, as a result of growth in the database or new technology, only the conceptual/internal mapping needs to change.

Logical data independence is more difficult to achieve than physical data independence, as it requires flexibility in the design of the database, and the designer has to foresee future requirements.
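Both kinds of data independence can be observed in a small experiment. The sketch below uses Python's built-in sqlite3 module, with a view standing in for an external schema and an index standing in for a change of access path. The names `result_view` and `idx_student_percentage`, and the student names 'A', 'B' and 'C', are assumptions; the roll numbers and percentages are those of the Student example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual level: the base table.
    CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT, percentage INTEGER);
    INSERT INTO student VALUES (1234, 'A', 74), (1249, 'B', 82), (2315, 'C', 70);

    -- External level: the view an application program queries.
    CREATE VIEW result_view AS SELECT rollno, percentage FROM student;
""")

# The "application program": this query text never changes below.
QUERY = "SELECT rollno, percentage FROM result_view WHERE percentage > 72 ORDER BY rollno"
before = conn.execute(QUERY).fetchall()

# Logical change: a new data item is added to the conceptual schema.
conn.execute("ALTER TABLE student ADD COLUMN address TEXT")
after_logical = conn.execute(QUERY).fetchall()

# Physical change: a new access path (index) is added at the internal level.
conn.execute("CREATE INDEX idx_student_percentage ON student (percentage)")
after_physical = conn.execute(QUERY).fetchall()

print(before)
print(before == after_logical == after_physical)
```

The query issued by the application is the same string in all three runs and returns the same rows. Adding a column is the easy case of a logical change; removing or restructuring data items would require the view definition (the external/conceptual mapping) to be rewritten, which is why logical independence is the harder of the two.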
End Users:
End users are those persons who interact with the application directly. They are responsible for inserting, deleting and updating data in the database. They get information from the system as and when required. Different types of end users are as follows:

(a) Casual end users - These users access the database occasionally, but they need different information each time. They use a sophisticated database query language to specify their requests. Examples are middle or high level managers.

(b) Naive or parametric end users - Naive users are those users who do not have any technical knowledge about the DBMS. They access a sizeable portion of the database. They use the database through application programs with a simple user interface, and perform all operations using the simple commands that the interface provides. Example: the data entry operator in an office is responsible for entering records in the database. He performs this task by using menus, buttons, etc. He does not know anything about the database or the DBMS; he interacts with the database through the application program. Similarly, an ATM user is guided through each step of a transaction, such as checking an account balance or making a withdrawal.

(c) Online end users - These users may communicate with the database via an online terminal. They are aware of the presence of the database system and may need a certain amount of expertise in interacting with the database.

(d) Sophisticated end users - Sophisticated users are familiar with the structure of the database and the facilities of the DBMS. These users include engineers, scientists and business analysts. Such users can use a query language such as SQL to perform the required operations on databases. Some sophisticated users can also write application programs.

(e) Standalone end users - These users maintain personal databases by using ready-made program packages that provide easy-to-use menu-based or graphics-based interfaces. An example is the user of a tax package that stores a variety of personal financial data for tax purposes.

(f) Application programmers and system analysts - The application programmer is the person responsible for implementing the required functionality of the database for the end user. The application programmer works according to the specification provided by the system analyst. The system analyst determines the requirements of end users, especially parametric end users.

(g) Database Administrator - The database administrator is responsible for managing the whole database system. He designs, creates and maintains the database. He manages the users who can access the database and controls integrity issues. He also monitors the performance of the system and makes changes in the system as and when required.

Prepared by Sushila Gupta
DBMS Interfaces
User-friendly interfaces provided by a DBMS may include the following.
Menu-Based Interfaces for Web Clients or Browsing: These interfaces present the user with lists of options, called menus, that lead the user through the formulation of a request. Menus do away with the need to memorize the specific commands and syntax of a query language. Pull-down menus are a very popular technique in Web-based user interfaces. They are also often used in browsing interfaces, which allow a user to look through the contents of a database.
Forms-Based Interfaces: A forms-based interface displays a form to each user. Forms are usually designed and programmed for naive users as interfaces to canned transactions. Some systems have utilities that define a form by letting the end user interactively construct a sample form on the screen.
Graphical User Interfaces: A graphical user interface (GUI) typically displays a schema to the user in diagrammatic form. The user can then specify a query by manipulating the diagram. In many cases, GUIs utilize both menus and forms. Most GUIs use a pointing device, such as a mouse, to pick certain parts of the displayed schema diagram.
Natural Language Interfaces: These interfaces accept requests written in English, Hindi or some other language and attempt to "understand" them. A natural language interface usually has its own "schema", which is similar to the database conceptual schema, as well as a dictionary of important words.
Interfaces for Parametric Users: Parametric users, such as bank tellers, often have a small set of operations that they must perform repeatedly. Systems analysts and programmers design and implement a special interface for each known class of naive users. Usually a small set of abbreviated commands is included, with the goal of minimizing the number of keystrokes required for each request. For example, function keys on a terminal can be programmed to initiate the various commands.
Interfaces for the DBA: Most database systems contain privileged commands that can be used only by the DBA's staff. These include commands for creating accounts, setting system parameters, granting account authorization, changing a schema, and reorganizing the storage structures of a database.
When not to use a DBMS

Main costs of using a DBMS:
- High initial investment in hardware, software and training, and the possible need for additional hardware.
- Overhead for providing generality, security, recovery, integrity, and concurrency control.
- Generality that a DBMS provides for defining and processing data.

When a DBMS may be unnecessary:
- If the database and applications are simple, well defined, and not expected to change.
- If there are stringent real-time requirements that may not be met because of DBMS overhead.
- If access to data by multiple users is not required.

Database languages

Once the design of the database is completed and a DBMS is chosen to implement the database, the next step is to define the conceptual and internal schemas for the data and any mappings between the two. The languages that are used to do so are:
1. Data Definition Language (DDL) statements: DDL statements are used to define, alter or drop database objects. The following table gives an overview of the usage of DDL statements in ORACLE.

S. No.   Statement
1.       CREATE
2.       ALTER
3.       DROP
4.       RENAME

2. Data Manipulation Language (DML) statements: Once the tables have been created, the DML statements enable users to query or manipulate data in existing schema objects. DML statements are normally the most frequently used commands. The following table gives an overview of the usage of DML statements in ORACLE.

[Table: DML statements in ORACLE (4 entries)]
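The difference between DDL and DML can be seen in a short sketch. It uses Python's built-in sqlite3 module instead of ORACLE, so the syntax is only approximately the same: ORACLE has a separate RENAME statement, whereas SQLite renames a table with ALTER TABLE ... RENAME TO. The DML part uses the standard SQL statements INSERT, UPDATE, DELETE and SELECT. The table name emp, the function name and the new city "Rome" are made up for the illustration; the employee rows are the ones used as an example earlier in these notes.

```python
import sqlite3


def ddl_dml_demo():
    """Run a few DDL and DML statements against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # DDL: define, alter and rename a schema object.
    cur.execute("CREATE TABLE emp (emp_no TEXT PRIMARY KEY, emp_name TEXT, city TEXT)")
    cur.execute("ALTER TABLE emp ADD COLUMN dob TEXT")
    cur.execute("ALTER TABLE emp RENAME TO employee")  # SQLite's form of RENAME

    # DML: add, modify, delete and query rows in the existing table.
    cur.executemany(
        "INSERT INTO employee (emp_no, emp_name, city, dob) VALUES (?, ?, ?, ?)",
        [
            ("E001", "Smith", "Berlin", "23-08-1978"),
            ("E002", "Mathew", "Moscow", "12-03-1977"),
            ("E003", "Martin", "Paris", "12-12-1965"),
        ],
    )
    cur.execute("UPDATE employee SET city = ? WHERE emp_no = ?", ("Rome", "E002"))
    cur.execute("DELETE FROM employee WHERE emp_no = ?", ("E003",))
    rows = cur.execute(
        "SELECT emp_no, emp_name, city FROM employee ORDER BY emp_no"
    ).fetchall()

    # DDL again: drop the object; afterwards no user table is left.
    cur.execute("DROP TABLE employee")
    remaining = cur.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    conn.close()
    return rows, remaining
```

The DDL statements change the structure of the database (which tables and columns exist), while the DML statements only change or read the rows stored inside that structure.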
3. Data Control Language (DCL) statements: A privilege can be granted to a user with the help of the GRANT statement. The privileges assigned can be SELECT, ALTER, DELETE, EXECUTE, INSERT, INDEX etc. In addition to granting privileges, you can also revoke them by using the REVOKE command. The following table gives an overview of the usage of DCL statements in ORACLE.

S. No.   Statement
1.       GRANT
2.       REVOKE
4. Transaction Control Language (TCL) statements: TCL statements are used to manage the changes made by DML statements. They allow statements to be grouped together into logical transactions. The following table gives an overview of the usage of TCL statements in ORACLE.

[Table: TCL statements in ORACLE (3 entries)]
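How TCL groups DML statements into logical transactions can be sketched with Python's sqlite3 module. The statements used here (BEGIN, COMMIT, ROLLBACK and SAVEPOINT) are the usual SQL transaction-control statements as SQLite spells them; they are chosen for illustration and are not taken from the ORACLE table above. The account table, its values and the savepoint name are hypothetical. DCL is not shown because SQLite has no GRANT or REVOKE statements.

```python
import sqlite3


def tcl_demo():
    """Show commit, partial rollback to a savepoint, and full rollback."""
    # isolation_level=None stops the sqlite3 module from opening transactions
    # on its own, so the BEGIN / COMMIT / ROLLBACK below are the only ones issued.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    cur = conn.cursor()
    cur.execute("CREATE TABLE account (acct TEXT PRIMARY KEY, balance INTEGER)")
    cur.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")

    # Transaction 1: both updates of a transfer are committed as one unit.
    cur.execute("BEGIN")
    cur.execute("UPDATE account SET balance = balance - 30 WHERE acct = 'A'")
    cur.execute("UPDATE account SET balance = balance + 30 WHERE acct = 'B'")
    cur.execute("COMMIT")

    # Transaction 2: undo only the work done after the savepoint.
    cur.execute("BEGIN")
    cur.execute("UPDATE account SET balance = balance + 5 WHERE acct = 'A'")
    cur.execute("SAVEPOINT before_mistake")
    cur.execute("UPDATE account SET balance = 0 WHERE acct = 'B'")
    cur.execute("ROLLBACK TO SAVEPOINT before_mistake")
    cur.execute("COMMIT")

    # Transaction 3: abandon every change made since BEGIN.
    cur.execute("BEGIN")
    cur.execute("DELETE FROM account")
    cur.execute("ROLLBACK")

    result = cur.execute("SELECT acct, balance FROM account ORDER BY acct").fetchall()
    conn.close()
    return result
```

After the three transactions, account A holds 75 and account B holds 80: the transfer and the +5 update were committed, while the mistaken update and the DELETE were rolled back.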
Advantages & Disadvantages of Hierarchical model

The main advantages of this database model are:

Simplicity: Since the database is based on the hierarchical structure, the relationship between the various layers is logically simple. Thus the design of a hierarchical database is simple.

Data security: The hierarchical model was the first database model in which data security is provided and enforced by the DBMS.

Data integrity: Since the hierarchical model is based on the parent/child relationship, there is always a link between the parent segment and the child segments under it. The child segments are always referenced through their parent, so this model promotes data integrity.

Efficiency: The hierarchical model is very efficient when the database contains a large number of 1:N relationships and when the users require a large number of transactions using data whose relationships are fixed.

Disadvantages

Implementation complexity: This model is simple and easy to design, but it is quite complex to implement. The database designers should have very good knowledge of the physical storage characteristics.

Database management problems: If you make any changes in the structure of a hierarchical database, you need to make the necessary changes in all the application programs that access the database.

Lack of structural independence: Structural independence exists when changes in the database structure do not affect the DBMS's ability to access the data. This model uses physical storage paths to navigate to the different data segments. Thus in a hierarchical database the benefits of data independence are limited by structural dependence.

Program complexity: Due to structural dependence, the end users must know how the data is distributed physically in the database in order to access data. This requires knowledge of a complex pointer system, which is often beyond ordinary users.

Operational anomalies: The hierarchical model suffers from insert, update, delete and retrieve anomalies. Thus the hierarchical model is not suitable for all cases.

Implementation limitation: Many of the common relationships do not conform to the 1:N form. Many-to-many (N:N) relationships, which are more common in real life, are very difficult to implement in a hierarchical model.
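The parent/child structure and the navigational access described above can be illustrated with a small Python sketch. It is a toy model, not how a real hierarchical DBMS stores segments; the class Segment, the helper functions and the example company tree are all hypothetical.

```python
class Segment:
    """One record (segment) in a toy hierarchy; each segment has at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def find(root, path):
    """Reach a segment by walking down from the root, one child name per level."""
    node = root
    for name in path:
        node = next((c for c in node.children if c.name == name), None)
        if node is None:
            return None  # the path does not exist in this hierarchy
    return node


def path_to(segment):
    """Follow parent links upwards to recover the access path of a segment."""
    names = []
    while segment is not None:
        names.append(segment.name)
        segment = segment.parent
    return list(reversed(names))


def build_example():
    """Hypothetical tree: Company -> departments -> employees (all 1:N)."""
    company = Segment("Company")
    sales = Segment("Sales", company)
    hr = Segment("HR", company)
    Segment("Smith", sales)
    Segment("Martin", hr)
    return company
```

A child such as Smith can only be retrieved by going through Company and then Sales, which is why programs must know the structure. Since a segment holds a single parent link, placing Smith under both Sales and HR (an N:N relationship) would require duplicating the record.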
2. Network Data Model: The network model replaces the hierarchical tree with a graph, thus allowing more general connections among the nodes. Its main distinguishing feature is its ability to handle many-to-many (N:N) relationships. In other words, it allows a record to have more than one parent.

In network database terminology, a relationship is a set. Each set is made up of at least two types of records: an owner record (the parent in the hierarchical model) and a member record (the child record in the hierarchical model).

To define a network database one needs to define: (a) the database record types which consist of data items,

A member record type can have that role in more than one set; hence the multiparent concept is supported. An owner record type can also be a member or owner in another set. The data model is a simple network, and link and intersection record types may exist, as well as sets between them. Thus, the complete network of relationships is represented by several pairwise sets; in each set one record type is the owner and one or more record types are members. A network structure thus allows 1:M and M:M relationships; 1:1 relationships are also permitted among entities.
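The owner/member sets described above can be modelled in a few lines of Python. This is a conceptual sketch only: the class NetworkDB, the set names Supplies and Uses, and the supplier, project and part records are hypothetical. The sketch also assumes that a member record has at most one owner within a given set type, so a record gets several parents by being a member of several different sets.

```python
class NetworkDB:
    """Toy network database: each set type links owner records to member records."""

    def __init__(self):
        # (set_name, member) -> owner. Assumption of this sketch: a member
        # has at most one owner within a given set type.
        self.owner_of = {}

    def connect(self, set_name, owner, member):
        key = (set_name, member)
        if key in self.owner_of:
            raise ValueError(f"{member} already has an owner in set {set_name}")
        self.owner_of[key] = owner

    def members(self, set_name, owner):
        """All member records owned by `owner` in the given set."""
        return sorted(
            m for (s, m), o in self.owner_of.items() if s == set_name and o == owner
        )

    def owners(self, member):
        """Every (set, owner) pair for a member: its 'parents' across all sets."""
        return sorted((s, o) for (s, m), o in self.owner_of.items() if m == member)


def build_example():
    db = NetworkDB()
    db.connect("Supplies", "Supplier S1", "Part P1")
    db.connect("Supplies", "Supplier S1", "Part P2")
    db.connect("Uses", "Project J1", "Part P1")  # P1 gets a second parent
    return db
```

Here Part P1 is a member of two sets and therefore has two parents, Supplier S1 and Project J1, something a strict tree cannot express without duplicating the record.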
ADVANTAGES

The network model retains almost all the advantages of the hierarchical model while eliminating some of its shortcomings.

Simplicity: The network model is conceptually simple and easy to design, and it provides very efficient, high-speed retrieval.

Data integrity: In a network model, no member can exist without an owner. A user must therefore first define the owner record and then the member record. This ensures integrity.

Ease of data access: Data access is easier and more flexible than in the hierarchical model.

Ability to handle more relationship types: The network model can handle one-to-many and many-to-many relationships.

Data independence: The network model draws a clear line of demarcation between programs and the complex physical storage details. The application programs work independently of the data. Any changes made in the data characteristics do not affect the application programs.

DISADVANTAGES

System complexity: In a network model, data are accessed one record at a time. This makes it essential for the database designers, administrators, and programmers to be familiar with the internal data structures to gain access to the data. Therefore, a user-friendly database management system cannot be created using the network model.

Operational anomalies: In the network model, the insertion, deletion and updating of any record require a large number of pointer adjustments, which makes its implementation very complex.

Lack of structural independence: Making structural modifications to the database is very difficult in the network database model, as the data access method is navigational. Any changes made to the database structure require the application programs to be modified before they can access data. Though the network model achieves data independence, it still fails to achieve structural independence.
3. The Relational Model: In the relational model, unlike the hierarchical and network models, there are no physical links. The data is stored in two-dimensional tables (rows and columns) and is manipulated on the basis of the relational theory of mathematics.

A relational database management system (RDBMS) is a DBMS based on the relational model developed by E.F. Codd. A relational database allows the definition of data structures, storage and retrieval operations, and integrity constraints. In such a database the data and the relations between them are organised in tables. A table is a collection of records, and each record in a table contains the same fields. Oracle, Sybase, DB2, Ingres, Informix and MS SQL Server are a few of the popular relational DBMSs.
The sequence of rows is insignificant.
Each column has a unique name.
Terminology used in Relational Model (examples are from the given case example):

Relation - A table. Example: Ord_Aug, Customers, Items etc.

Tuple - A row or a record in a relation. Example: A row from the Customers relation is a Customer tuple.

Attribute - A field or a column in a relation. Example: Ord_Date, Item#, CustName etc.

Cardinality of a relation - The number of tuples in a relation. Example: The cardinality of the Ord_Items relation is 8.

Degree of a relation - The number of attributes in a relation. Example: The degree of the Customers relation is 3.

Domain of an attribute - The set of all values that can be taken by the attribute. Example: The domain of Qty in Ord_Items is the set of all values which can represent the quantity of an ordered item.

Primary key of a relation - An attribute or a combination of attributes that uniquely defines each tuple in a relation. Example: The primary key of the Customers relation is Cust#. The combination of Ord# and Item# forms the primary key of Ord_Items.

Foreign key - An attribute or a combination of attributes in one relation R1 which indicates the relationship of R1 with another relation R2. The foreign key attributes in R1 must contain values matching those in R2. Example: Cust# in the Ord_Aug relation is a foreign key creating a reference from Ord_Aug to Customers. This is required to indicate the relationship between orders in Ord_Aug and Customers. Ord# and Item# in Ord_Items are foreign keys creating references from Ord_Items to Ord_Aug and Items respectively.
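The terms above can be tried out on a small database. The sketch below uses Python's sqlite3 module; the relation names Customers and Ord_Aug follow the case example, but the column names (cust_no standing in for Cust#, ord_no for Ord#, and a third Customers column city) and all the rows are assumptions made for the illustration.

```python
import sqlite3


def relational_demo():
    """Compute cardinality and degree, and show primary/foreign key enforcement."""
    conn = sqlite3.connect(":memory:")
    # SQLite checks foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE Customers (cust_no TEXT PRIMARY KEY, cust_name TEXT, city TEXT)"
    )
    conn.execute(
        """CREATE TABLE Ord_Aug (
               ord_no   INTEGER PRIMARY KEY,
               ord_date TEXT,
               cust_no  TEXT REFERENCES Customers(cust_no)  -- foreign key
           )"""
    )
    conn.executemany(
        "INSERT INTO Customers VALUES (?, ?, ?)",
        [("C1", "Smith", "Berlin"), ("C2", "Martin", "Paris")],
    )
    conn.execute("INSERT INTO Ord_Aug VALUES (1, '2024-08-01', 'C1')")

    # Cardinality = number of tuples; degree = number of attributes.
    cardinality = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]
    degree = len(conn.execute("PRAGMA table_info(Customers)").fetchall())

    # An order that refers to a customer who does not exist violates the foreign key.
    try:
        conn.execute("INSERT INTO Ord_Aug VALUES (2, '2024-08-02', 'C9')")
        fk_rejected = False
    except sqlite3.IntegrityError:
        fk_rejected = True

    # A second tuple with the same primary key value violates uniqueness.
    try:
        conn.execute("INSERT INTO Customers VALUES ('C1', 'Duplicate', 'Rome')")
        pk_rejected = False
    except sqlite3.IntegrityError:
        pk_rejected = True

    conn.close()
    return cardinality, degree, fk_rejected, pk_rejected
```

With two customer rows and three columns, the Customers relation in this sketch has cardinality 2 and degree 3, and both invalid inserts are rejected by the DBMS rather than by the application program.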
Advantages of the relational model:
- Structural independence
- Improved conceptual simplicity
- Easier database design, implementation, management, and use
- Ad hoc query capability
- Powerful database management system

Disadvantages of the relational model:
- Substantial hardware and system software overhead
- May not fit all business models
- Can facilitate poor design and implementation
- May promote "islands of information" problems
Comparison of the hierarchical, network and relational models:

Hierarchical model
- Many-to-many relationships cannot be expressed in this model.
- It is a simple, straightforward and natural method of implementing record relationships.
- This type of model is useful only when there is some hierarchical character in the database.
- In order to represent links among records, pointers are used. Thus relations among records are physical.
- Searching for a record is very difficult, since one can retrieve a child only after going through its parent record.

Network model
- Many-to-many relationships can also be implemented.
- Record relationship implementation is very complex due to the use of pointers.
- The network model is useful for representing records which have many-to-many relationships.
- Searching for a record is easy, since there are multiple access paths to data elements.

Relational model
- A relationship between records is represented by a relation that contains a key for each record involved in the relationship.
- Many-to-many relationships can be easily implemented.
- Relationship implementation is very easy through the use of a key or composite key field(s).
- The relational model is useful for representing most real-world objects and the relationships among them.
- The relational model does not maintain physical connections among records. Data is organized logically in the form of rows and columns, and stored in tables.
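The point that the relational model implements a many-to-many relationship through a relation with a composite key can be sketched as follows, again with Python's sqlite3 module. The relation names follow the case example (Ord_Aug, Items, Ord_Items); the column names and the rows are hypothetical.

```python
import sqlite3


def many_to_many_demo():
    """Orders and items are linked N:N through Ord_Items and its composite key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Ord_Aug (ord_no INTEGER PRIMARY KEY, ord_date TEXT);
        CREATE TABLE Items   (item_no TEXT PRIMARY KEY, descr TEXT);
        CREATE TABLE Ord_Items (
            ord_no  INTEGER REFERENCES Ord_Aug(ord_no),
            item_no TEXT    REFERENCES Items(item_no),
            qty     INTEGER,
            PRIMARY KEY (ord_no, item_no)   -- composite key, one row per pairing
        );
        INSERT INTO Ord_Aug VALUES (1, '2024-08-01'), (2, '2024-08-02');
        INSERT INTO Items   VALUES ('I1', 'Pen'), ('I2', 'Ink');
        INSERT INTO Ord_Items VALUES (1, 'I1', 10), (1, 'I2', 2), (2, 'I1', 5);
        """
    )
    # One order contains many items ...
    items_in_order_1 = conn.execute(
        "SELECT i.descr, oi.qty FROM Ord_Items oi "
        "JOIN Items i ON i.item_no = oi.item_no "
        "WHERE oi.ord_no = 1 ORDER BY i.item_no"
    ).fetchall()
    # ... and one item appears in many orders.
    orders_with_item_1 = conn.execute(
        "SELECT ord_no FROM Ord_Items WHERE item_no = 'I1' ORDER BY ord_no"
    ).fetchall()
    conn.close()
    return items_in_order_1, orders_with_item_1
```

No pointers are stored anywhere: the link between an order and an item is just a row in Ord_Items holding the two key values, and the join recovers the relationship at query time.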
STRUCTURE OF DBMS
DBMS (Database Management System) acts as an interface between the user and the database. The user requests the DBMS to perform various operations (insert, delete, update and retrieval) on the database. The components of DBMS perform these requested operations on the database and provide necessary data to the users. The various components of DBMS are shown below: -
1. DDL Compiler - The Data Definition Language compiler processes schema definitions specified in the DDL. These include metadata such as the names of the files, data items, storage details of each file, mapping information and constraints.

2. DML Compiler and Query Optimizer - The DML commands (insert, update, delete, retrieve) from the application program are sent to the DML compiler for compilation into object code for database access. The query optimizer then optimizes the object code to find the best way to execute the query, and the result is sent to the data manager.

3. Data Manager - The Data Manager is the central software component of the DBMS, also known as the Database Control System. The main functions of the Data Manager are:
- It converts the operations in users' queries, coming from the application programs or from the combination of DML compiler and query optimizer (known as the Query Processor), from the user's logical view to the physical file system.
- It controls access to the DBMS information that is stored on disk.
- It handles buffers in main memory.
- It enforces constraints to maintain the consistency and integrity of the data.
- It synchronizes the simultaneous operations performed by concurrent users.
- It controls the backup and recovery operations.

4. Data Dictionary - The Data Dictionary is a repository of descriptions of the data in the database. It contains information about:
- Data: names of the tables, names of the attributes of each table, length of the attributes, and number of rows in each table.
- Relationships between database transactions and the data items referenced by them, which is useful in determining which transactions are affected when certain data definitions are changed.
- Constraints on data, i.e. the range of values permitted.
- Detailed information on physical database design, such as storage structure, access paths, files and record sizes.
- Access authorization: the description of database users, their responsibilities and their access rights.
- Usage statistics, such as the frequency of queries and transactions.

The data dictionary is used to control data integrity, database operation and accuracy. It may be used as an important part of the DBMS.

Importance of Data Dictionary

The Data Dictionary is necessary in databases for the following reasons:
- It improves the DBA's control over the information system and the users' understanding of how to use the system.
- It helps in documenting the database design process by storing documentation of the result of every design phase and of the design decisions.
- It helps in looking up the views defined on the database and the definitions of those views.
- It provides great assistance in producing a report of which data elements (i.e. data values) are used in all the programs.
- It promotes data independence, i.e. application programs are not affected by additions to or modifications of structures in the database.

5. Data Files - These contain the data portion of the database.

6. Compiled DML - The DML compiler converts the high-level queries into low-level file access commands known as compiled DML.

7. End Users - They have already been discussed in a previous section.

Prepared by Sushila Gupta

COMPONENTS OF DATABASE SYSTEM

A database system is composed of four components: Data, Hardware, Software and Users, which coordinate with each other to form an effective database system.
1. Data - It is a very important component of the database system. Most organizations generate, store and process large amounts of data. The data acts as a bridge between the machine parts (hardware and software) and the users, who access it directly or through application programs. Data may be of different types:

User Data - It consists of one or more tables of data called relations, where the columns are called fields or attributes and the rows are called records. A relation must be structured properly.

Metadata - A description of the structure of the database is known as metadata. It basically means "data about data". System tables store the metadata, which includes:
- Number of tables and table names
- Number of fields and field names
- Primary key fields

Application Metadata - It stores the structure and format of queries, reports and other application components.

2. Hardware - The hardware consists of the secondary storage devices, such as magnetic disks (hard disk, zip disk, floppy disks), optical disks (CD-ROM), magnetic tapes etc., on which data is stored, together with the input/output devices (mouse, keyboard, printers), processors, main memory etc., which are used for storing and retrieving the data in a fast and efficient manner. Since databases can range from those of a single user with a desktop computer to those on mainframe computers with thousands of users, proper care should be taken in choosing appropriate hardware devices for a required database.

3. Software - The software part consists of the DBMS, which acts as a bridge between the user and the database. In other words, it is the software that interacts with the users, the application programs, and the database and file system of a particular storage medium (hard disk, magnetic tapes etc.) to insert, update, delete and retrieve data. For performing operations such as insertion, deletion and update, we can use either query languages like SQL, QUEL and Gupta SQL, or application software such as Visual Basic, Developer etc.

4. Users - Users are those persons who need information from the database to carry out their primary business responsibilities, i.e. personnel, staff, clerical workers, managers, executives etc. On the basis of their job and requirements, they are provided access to the database totally or partially. The various types of users who can access the database are:
- Database Administrators (DBA)
- Database Designers
- End Users
- Application Programmers
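The metadata kept in system tables (table names, field names, primary key fields) and the kind of information held in a data dictionary can be inspected directly in most relational systems. The sketch below shows how this looks in SQLite through Python's sqlite3 module: sqlite_master is SQLite's own system table and PRAGMA table_info its way of describing columns. Other DBMSs expose the same information through differently named catalog tables. The table emp and the function name are hypothetical.

```python
import sqlite3


def catalog_demo():
    """Read 'data about data' from SQLite's system catalog."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE emp (emp_no TEXT PRIMARY KEY, emp_name TEXT NOT NULL, city TEXT)"
    )
    conn.execute("INSERT INTO emp VALUES ('E001', 'Smith', 'Berlin')")

    # Table names are stored in the system table sqlite_master.
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]

    # PRAGMA table_info returns one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    info = conn.execute("PRAGMA table_info(emp)").fetchall()
    fields = [row[1] for row in info]
    primary_key = [row[1] for row in info if row[5] > 0]

    # The number of rows is obtained here by counting the user data itself.
    row_count = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
    conn.close()
    return {
        "tables": tables,
        "fields": fields,
        "primary_key": primary_key,
        "rows": row_count,
    }
```

The user data (the row for Smith) and the metadata (the description of the emp table) are stored separately, and it is the metadata that lets the DBMS, and tools built on it, work without the structure being hard-coded into each program.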