The document provides an overview of databases, including what they are, their main components and uses. It discusses that databases solve many problems with traditional file-based data management, including data duplication, inconsistencies and lack of access control. The key components of a database system are the database itself (data and metadata), database management system software, users and procedures. A DBMS provides facilities for defining, querying and controlling access to the database.
Topics covered:
• What a database is, the various types of databases, and why they are valuable assets for decision making
• The importance of database design
• How modern databases evolved from file systems
• The flaws in file system data management
• The main components of the database system
• The main functions of a database management system (DBMS)

• Databases solve many of the problems encountered in data management
• Used in almost all modern settings involving data management:
  • Business
  • Research
  • Administration
• Important to understand how databases work and interact with other applications

Everyday activities that involve a database:
1. Purchases from the supermarket
2. Purchases using your credit card
3. Booking a holiday at the travel agents
4. Using the local library
5. Taking out insurance
6. Renting a video
7. Using the Internet
8. Studying at university

• Data are raw facts
• Information is the result of processing raw data to reveal meaning
• Information requires context to reveal meaning
• Raw data must be formatted for storage, processing, and presentation
• Data are the foundation of information, which is the bedrock of knowledge

• Data: building blocks of information
• Information produced by processing data
• Information used to reveal meaning in data
• Accurate, relevant, timely information is the key to good decision making
• Good decision making is the key to organizational survival

File-based system - A collection of application programs that perform services for the end-users such as the production of reports. Each program defines and manages its own data.

Limitations of the file-based approach:
1. Separation and isolation of data
2. Duplication of data
3. Data dependence
4. Incompatible file formats
5. Fixed queries/proliferation of application programs
All the above limitations of the file-based approach can be attributed to two factors:
(1) the definition of the data is embedded in the application programs, rather than being stored separately and independently;
(2) there is no control over the access and manipulation of data beyond that imposed by the application programs.

o Structural dependence: access to a file is dependent on its own structure
  o All file system programs must be modified to conform to a new file structure
o Structural independence: change file structure without affecting data access
o Data dependence: data access changes when data storage characteristics change
o Data independence: data storage characteristics do not affect data access
o Practical significance of data dependence is the difference between logical and physical format
o Logical data format: how a human views the data
o Physical data format: how the computer must work with the data
o Each program must contain:
  ◦ Lines specifying opening of specific file type
  ◦ Record specification
  ◦ Field definitions

o File system structure makes it difficult to combine data from multiple sources
  o Vulnerable to security breaches
o Organizational structure promotes storage of same data in different locations
  o Islands of information
o Data stored in different locations is unlikely to be updated consistently
o Data redundancy: same data stored unnecessarily in different places
o Data inconsistency: different and conflicting versions of same data occur at different places
o Data anomalies: abnormalities when all changes in redundant data are not made correctly
  ◦ Update anomalies
  ◦ Insertion anomalies
  ◦ Deletion anomalies

o Most users lack the skill to properly design databases, despite multiple personal productivity tools being available
o Data-modeling skills are vital in the data design process
o Good data modeling facilitates communication between the designer, user, and the developer

o Database: a shared collection of logically related data, and a description of this data, designed to meet the information needs of an organization.
  ◦ End-user data: raw facts of interest to end user
  ◦ Metadata: data about data
    ◦ Provides description of data characteristics and relationships in data
    ◦ Complements and expands value of data
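The difference between end-user data and metadata can be seen in a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the `customer` table, its columns, and the `describe_table` helper are hypothetical names chosen only for illustration, and SQLite is just one DBMS among many that exposes its catalog to queries.

```python
import sqlite3


def describe_table(conn, table):
    """Return (column name, declared type) pairs read from SQLite's catalog."""
    # PRAGMA table_info yields one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customer (name) VALUES ('Ada')")

# End-user data: the raw facts stored in the table.
end_user_data = conn.execute("SELECT id, name FROM customer").fetchall()

# Metadata: the description of those facts, kept by the DBMS itself.
metadata = describe_table(conn, "customer")

print(end_user_data)  # [(1, 'Ada')]
print(metadata)       # [('id', 'INTEGER'), ('name', 'TEXT')]
```

The application never declares the record layout itself; it asks the database for it, which is the opposite of the file-based approach where each program embeds its own field definitions.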
o Database management system (DBMS): a software system that enables users to define, create, maintain, and control access to the database. The DBMS is the software that interacts with the users’ application programs and the database.

A DBMS typically provides the following facilities:
1. It allows users to define the database, usually through a Data Definition Language (DDL). The DDL allows users to specify the data types and structures and the constraints on the data to be stored in the database.
2. It allows users to insert, update, delete, and retrieve data from the database, usually through a Data Manipulation Language (DML). Having a central repository for all data and data descriptions allows the DML to provide a general inquiry facility to this data, called a query language.
3. It provides controlled access to the database.
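The first two facilities above can be sketched with SQL issued through Python's built-in sqlite3 module. The `product` table and its rows are hypothetical, invented for illustration. SQLite has no user accounts, so the third facility (controlled access) is not shown; the sketch instead shows the DBMS enforcing a constraint that was declared in the DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure, the data types, and the constraints.
conn.execute("""
    CREATE TABLE product (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        price REAL NOT NULL CHECK (price >= 0)
    )
""")

# DML: insert, update, delete, and retrieve data.
conn.executemany(
    "INSERT INTO product (name, price) VALUES (?, ?)",
    [("pen", 1.5), ("notebook", 4.0), ("stapler", 9.0)],
)
conn.execute("UPDATE product SET price = 2.0 WHERE name = 'pen'")
conn.execute("DELETE FROM product WHERE name = 'stapler'")
remaining = conn.execute(
    "SELECT name, price FROM product ORDER BY id"
).fetchall()
print(remaining)  # [('pen', 2.0), ('notebook', 4.0)]

# The CHECK constraint from the DDL is enforced by the DBMS,
# not by the application program.
try:
    conn.execute("INSERT INTO product (name, price) VALUES ('eraser', -1.0)")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True
print(constraint_enforced)  # True
```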
A DBMS provides another facility known as a view mechanism, which allows each user to have his or her own view of the database (a view is in essence some subset of the database).
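A view as "some subset of the database" can be sketched as follows, again with Python's built-in sqlite3 module. The `staff` table, its sample rows, and the view name `staff_b003` are assumptions made for illustration; the choice to hide the salary column and restrict rows to one branch is just one possible subset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, branch TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO staff (name, branch, salary) VALUES (?, ?, ?)",
    [("Ann", "B003", 30000.0), ("Bob", "B005", 24000.0)],
)

# The view exposes a subset of columns (no salary) and rows (one branch only).
conn.execute("""
    CREATE VIEW staff_b003 AS
    SELECT id, name FROM staff WHERE branch = 'B003'
""")

cursor = conn.execute("SELECT * FROM staff_b003")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

print(view_columns)  # ['id', 'name']
print(view_rows)     # [(1, 'Ann')]
```

A user who queries only `staff_b003` sees neither the salary column nor staff at other branches, while the underlying table remains complete.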
As well as reducing complexity by letting users see the data in the way they want to see it, views have several other benefits:
1. Views provide a level of security.
2. Views provide a mechanism to customize the appearance of the database.
3. A view can present a consistent, unchanging picture of the structure of the database, even if the underlying database is changed.

Components of the database system:
1. Hardware
2. Software
3. Data
   The structure of the database is called the schema.
4. Procedures
5. People

Functions of a DBMS:
1. Data dictionary management
2. Data storage management
3. Data transformation and presentation
4. Security management
5. Multiuser access control
6. Backup and recovery management
7. Data integrity management
8. Database access languages and application programming interfaces
9. Database communication interfaces

Roles of the people involved:
1. Data and Database Administrator
Data Administrator (DA) - is responsible for the management of the data resource, including database planning, development and maintenance of standards, policies and procedures, and conceptual/logical database design. The DA consults with and advises senior managers, ensuring that the direction of database development will ultimately support corporate objectives.
Database Administrator (DBA) - is responsible for the physical realization of the database, including physical database design and implementation, security and integrity control, maintenance of the operational system, and ensuring satisfactory performance of the applications for users. The role of the DBA is more technically oriented than the role of the DA, requiring detailed knowledge of the target DBMS and the system environment.
2. Database designer
Logical database designer - is concerned with identifying the data (that is, the entities and attributes), the relationships between the data, and the constraints on the data that is to be stored in the database. The logical database designer must have a thorough and complete understanding of the organization’s data and any constraints on this data (the constraints are sometimes called business rules).
Physical database designer - decides how the logical database design is to be physically realized. This involves:
  ◦ mapping the logical database design into a set of tables and integrity constraints;
  ◦ selecting specific storage structures and access methods for the data to achieve good performance;
  ◦ designing any security measures required on the data.
3. Application Developers
Once the database has been implemented, the application programs that provide the required functionality for the end-users must be implemented. This is the responsibility of the application developers.
4. End-Users
The end-users are the ‘clients’ for the database, which has been designed and implemented, and is being maintained to serve their information needs.
Naïve users - are typically unaware of the DBMS. They access the database through specially written application programs that attempt to make the operations as simple as possible.
Sophisticated users - are familiar with the structure of the database and the facilities offered by the DBMS. Sophisticated end-users may use a high-level query language such as SQL to perform the required operations.

• The roots of the DBMS lie in file-based systems.
• The hierarchical and CODASYL systems represent the first generation of DBMSs.
• The hierarchical model is typified by IMS (Information Management System) and the network or CODASYL model by IDS (Integrated Data Store), both developed in the mid-1960s.
• The relational model, proposed by E. F. Codd in 1970, represents the second generation of DBMSs. It has had a fundamental effect on the DBMS community and there are now over one hundred relational DBMSs.
• The third generation of DBMSs is represented by the Object-Relational DBMS and the Object-Oriented DBMS.

o Databases can be classified according to:
  o Number of users
  o Database location(s)
  o Expected type and extent of use
o Single-user database supports only one user at a time
  o Desktop database: single-user; runs on PC
o Multiuser database supports multiple users at the same time
  o Workgroup and enterprise databases
o Centralized database: data located at a single site
o Distributed database: data distributed across several different sites
o Operational database: supports a company’s day-to-day operations
  o Transactional or production database
o Data warehouse: stores data used for tactical or strategic decisions

o Unstructured data exist in their original state
o Structured data result from formatting
  o Structure applied based on type of processing to be performed
o Semistructured data have been processed to some extent
o Extensible Markup Language (XML) represents data elements in textual format
  o XML database supports semistructured XML data

o Database design focuses on design of database structure used for end-user data
  o Designer must identify database’s expected use
o Well-designed database:
  o Facilitates data management
  o Generates accurate and valuable information
o Poorly designed database:
  o Causes difficult-to-trace errors