Chapter 4
- A flat-file database is a collection of data in which the tables and records have no
relation to any other tables. (It could be a single table.)
- If you have created an Excel spreadsheet, you have created a basic flat file. A workbook
with multiple tabs is a flat-file database. The same values may appear in more than one
worksheet, but they are not linked together.
Example:
- From the above set-up, you can now analyse the problems you might
encounter using the flat-file database.
1. Handling is difficult.
- Inserting
- Updating
- Deleting
Example:
EMPLOYEE
Payroll Code | First Name | Surname | Address | Town | Post Code | Monthly Pay
- Notice the redundancy: the same data is duplicated in the employee and payroll
worksheets.
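The update problem can be sketched as follows, assuming two unlinked worksheets modelled as Python lists of dictionaries. The payroll code, surname, and addresses are invented for illustration only.

```python
# Two unlinked "worksheets" that each hold their own copy of the address.
employee_sheet = [
    {"payroll_code": "P001", "surname": "Reyes", "address": "12 High St"},
]
payroll_sheet = [
    {"payroll_code": "P001", "address": "12 High St", "monthly_pay": 2500},
]

def update_address(sheet, payroll_code, new_address):
    """Update the address in one worksheet only; nothing propagates."""
    for row in sheet:
        if row["payroll_code"] == payroll_code:
            row["address"] = new_address

# The employee moves, but only the employee worksheet is edited.
update_address(employee_sheet, "P001", "99 Park Rd")

# The two copies now disagree: an update anomaly caused by redundancy.
addresses_match = employee_sheet[0]["address"] == payroll_sheet[0]["address"]
```

Because nothing links the two worksheets, every insert, update, or delete has to be repeated by hand in each copy.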
2. Database Approach
- The entire data is stored in a single repository and multiple users can
access the data based on their interest.
Example:
RELATIONS
Relation_Name | No. of columns
EMPLOYEE | 7
PAYROLL | 7
STOCK | 7
ORDERS | 7
COLUMNS
- The structure of data files is stored in the DBMS catalog and is separate from the
access programs. If the structure of the data files changes, the programs are not
affected. This is called “program-data independence.”
- A database has many users, each of whom may require a different view
of the database.
- Concurrency Control: When multiple users share the same database at the same time,
the DBMS should prevent two users from editing the same data simultaneously.
Topic Outline
● Database Management System
● Users
● The Database Administrator
● The Physical Database
● DBMS Models
Features of DBMS
Program Development
● The DBMS includes application development software. Programmers and end users can use
this feature to build applications that access the database. Program development is the
process of gathering and assessing real-world requirements, designing the system's data
and features, and then putting those designs into practice.
Database Access
● The most critical feature of a DBMS is that it gives authorised users both formal and
informal access to the database. Each kind of data can be created, stored, managed,
manipulated, retrieved, and updated through this software. Administering the data of an
accounting system is one of the primary uses of a DBMS.
Three software modules facilitate database access: the Data Definition Language (DDL), the
Data Manipulation Language (DML), and the Query Language.
Users
● A user is anyone who uses the database.
(Figure 4.3)
DBMS Operation
1. A user program sends a request for data to the DBMS. The requests are written in a
special DML that is embedded in the user program.
2. The DBMS analyses the request by matching the called data elements against the user view
and the conceptual view. If the data request matches, it is authorised, and processing proceeds
to Step 3. If it does not match the views, access is denied.
3. The DBMS determines the data structure parameters from the internal view and passes them
to the operating system, which performs the actual data retrieval. Data structure parameters
describe the organisation and access method for retrieving the requested data.
4. Using the appropriate access method (an operating system utility program), the operating
system interacts with the disk storage device to retrieve the data from the physical database.
5. The operating system then stores the data in a main memory buffer area managed by the
DBMS.
6. The DBMS transfers the data to the user’s work location in main memory. At this point, the
user’s program is free to access and manipulate the data.
7. When processing is complete, Steps 4, 5, and 6 are reversed to restore the processed data to
the database.
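Step 2 of the list above can be sketched as a set comparison. This is a minimal illustration, not how any particular DBMS implements authorisation. The view contents, the user name, and the function name are all hypothetical.

```python
# Hypothetical conceptual view (schema) and one user view (subschema).
CONCEPTUAL_VIEW = {"cust_num", "name", "address", "balance", "credit_limit"}
USER_VIEWS = {"sales_clerk": {"cust_num", "name", "address"}}

def authorise(user, requested):
    """Step 2: allow the request only if every element is in both views."""
    wanted = set(requested)
    user_view = USER_VIEWS.get(user, set())
    return wanted <= user_view and wanted <= CONCEPTUAL_VIEW

# Both elements are in the clerk's view, so processing would proceed to Step 3.
allowed = authorise("sales_clerk", ["cust_num", "name"])

# credit_limit is in the conceptual view but not in the clerk's view: access denied.
denied = authorise("sales_clerk", ["cust_num", "credit_limit"])
```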
( figure 4.5 )
● This example demonstrates how effective SQL is as a data processing tool. Although SQL is
not plain English, it requires significantly less programming knowledge and training than
third-generation languages. In fact, some query tools do not require any understanding of
SQL: users choose data visually by "pointing and clicking" on the desired attributes, and
the visual user interface generates the required SQL commands automatically. The real
advantage of the query feature is that it gives the manager or end user control over ad hoc
reporting and data processing. Reducing the manager's reliance on expert programmers
enhances the manager's capacity to respond quickly to emerging issues. However, the query
feature creates a significant control problem: management must ensure that it is not used
to gain unauthorised access to the database.
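As an illustration of an ad hoc query, the sketch below uses Python's built-in sqlite3 module. The inventory table, its columns, and its rows are assumptions made up for this example and do not come from the figure.

```python
import sqlite3

# In-memory database with a hypothetical inventory table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory ("
    "item_num INTEGER PRIMARY KEY, descr TEXT, "
    "on_hand INTEGER, reorder_point INTEGER)"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?, ?)",
    [(1, "Bracket", 40, 50), (2, "Hinge", 200, 75), (3, "Bolt", 10, 100)],
)

# Ad hoc question: which items have fallen below their reorder point?
rows = conn.execute(
    "SELECT item_num, descr FROM inventory "
    "WHERE on_hand < reorder_point ORDER BY item_num"
).fetchall()
```

The single SELECT statement states what data is wanted and leaves the retrieval procedure to the DBMS.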
Data Dictionary
● A data dictionary is a collection of names, definitions, and attributes of the data
elements that are used or captured in a database or information system. Every data element
in the database is described in the data dictionary. This gives all users (and programmers)
a common view of the data resource, which considerably improves the analysis of user needs.
The data dictionary may be in hard copy or digital form.
(Table 4.1)
● In large corporations, the database administrator (DBA) function may be supported by an
entire department of technical personnel. In smaller organisations, DBA responsibilities
may be assumed by someone in the computer services group. The DBA's responsibilities include
database planning; database design; database installation, operation, and maintenance; and
database growth and change.
(figure 4.6)
● This figure depicts some of the DBA's organisational interfaces; particularly significant is the
relationship that exists between the DBA, end users, and system specialists of the
organisation.
(table 4.2)
● This table lists the file processing activities that data structures must support. The
DBMS's efficiency in performing these tasks is a crucial determinant of its overall success
and depends heavily on how a particular file is structured.
Data Structures
● Data structures are the foundation of a database. A data structure allows records to be
located, stored, and retrieved, and enables movement from one record to another. A data
structure is a specialised format for arranging, processing, accessing, and storing data.
Data structures come in both simple and complex forms, all of which are designed to organise
data for a particular use, so that users can readily access the data they need and use it
appropriately. Data structures have two main components: (1) organisation and (2) access
method.
No single structure is optimal for all processing tasks, so choosing one involves a trade-off
among desirable features. Several criteria influence the selection of a data structure.
Data Organisation
● Data organisation is the physical arrangement of records on the storage device. It may be
sequential or random. Sequential files store their records in contiguous locations that
occupy a specified area of disk space. Random files store their records without regard for
how physically close they are to one another, so the records of a random file may be
dispersed across the disk.
● Data organisation is the process of categorising and classifying data to improve its usability.
You'll need to organise your data in the most logical and orderly way possible, similar to how
we organise critical documents in file folders, so you and anybody else who accesses it can
quickly find what they're searching for.
Data Access Methods
● Access methods are operating system-integrated computer programs that are used to search
for records and browse the database. In response to requests for data from the user's
application, the access method program finds and retrieves or stores the records during
database processing. The user is fully unaware of any tasks performed by the access method.
DBMS Models
● A data model is an abstract representation of the data about entities. Entities include
resources (assets), events (transactions), and agents (personnel, customers, etc.) within a
company. The function of a data model is to represent entities and their attributes in a
form that users can understand. Each DBMS is based on a particular conceptual model.
Database Terminology
Entity - anything about which the organization wants to capture data. Entities may be physical,
such as customers, employees, or inventories. They may also be conceptual, such as sales (to a
customer), accounts receivable (AR), or accounts payable (AP).
Record Type - the physical database representation of an entity. Database designers group the
record types that pertain to particular entities into tables (files). For instance, the sales
order record type, which physically represents the Sales Order entity, would consist of records
of sales to customers.
Occurrence - the number of records represented by a specific record type. For instance, if an
organization has 100 employees, the Employee record type is said to consist of 100 occurrences.
Attribute - a piece of data that describes an entity. For example, in a customer database, the
attributes might be name, address, and phone number. In a product database, the attributes might
be name, price, and date of manufacture.
Database - the collection of record types that an organization needs to support its operational
procedures. Some businesses use a distributed database strategy and build separate databases for
each of their main functional areas. Such a company might have different databases for
production, marketing, and accounting.
Associations - The record types that make up a database exist in relation to one another. An
association defines a relationship between two record types based on common attributes. The
relationship can be one-to-one, one-to-many, or many-to-many. The association allows one record
type to reach the data of another through a persistent reference.
One-to-one
● For every occurrence of Record Type X, there is only one occurrence of Record Type Y (or
possibly zero). For instance, for every occurrence of Employee in the employee table, there is
only one occurrence of Year-to-Date Earnings (or possibly zero for new employees).
One-to-many
● For every occurrence of Record Type X, there are zero, one, or many occurrences of Record
Type Y. For instance, for every occurrence (customer) in the customer table, there are zero,
one, or many sales orders in the sales order table. This means that a specific customer may
have made zero, one, or many purchases from the business during the period under
consideration.
Many-to-many
● For every occurrence of Record Type X, there are zero, one, or many occurrences of Record
Type Y, and vice versa. The M:M association is illustrated by the business relationship
between an organization's inventory and its suppliers. A particular supplier provides the
business with zero (the supplier is in the database, but the business does not purchase from
it), one, or many inventory items. Similarly, the business may purchase a particular
inventory item from zero (for example, the company manufactures the item in-house), one, or
many different suppliers.
(figure 4.8)
● This is an illustration of a hierarchical database's data structure. The hierarchical model
is built from sets that describe the relationship between two linked files. Each set contains
a parent and a child. Note that File B, at the second level, is both the child in one set and
the parent in another. Files at the same level with the same parent are called siblings. This
design is also known as a tree structure. The highest level in the tree is the root segment,
and the lowest file on a specific branch is called a leaf.
Navigational Database
● The hierarchical model is also called a navigational database because traversing the files
requires following a predetermined path. The path is established through explicit pointers
(links) between related records. The only way to reach data at lower levels of the tree is
from the root, via the pointers that lead down to the relevant records. For example
(figure 4.9), the customer record (the root) contains a pointer to the sales invoice record,
which points to the invoice line item record. The DBMS must access these records in turn
before it can obtain an invoice line item record. The steps in this process are covered in
greater depth later.
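The navigation described above can be sketched with explicit child pointers. This is a simplified illustration that assumes each record is a small Python object; the `Record` class, the `find` helper, and the record keys are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    key: str
    children: list = field(default_factory=list)  # explicit pointers to child records

# Build the path root -> sales invoice -> line item.
line_item = Record("line item 1")
invoice = Record("invoice 1921", [line_item])
customer = Record("customer 1875", [invoice])  # the root of the tree

def find(root, path):
    """Follow child pointers from the root along the keys listed in path."""
    current = root
    visited = [current.key]
    for key in path:
        current = next(c for c in current.children if c.key == key)
        visited.append(current.key)
    return current, visited

# The line item can only be reached by passing through its ancestors.
record, visited = find(customer, ["invoice 1921", "line item 1"])
```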
The hierarchical model's usefulness is limited by its rule that no child record can have more
than one parent. Many businesses require a view of data associations that permits multiple
parents, for example (figure 4.11).
Natural Relationship
● In this instance, the customer file and the salesperson file are the two natural parents of
the sales invoice file. A specific sales order is the product of both the customer's purchase
event and the salesperson's selling event. If management wants to connect sales activity with
customer service and personnel performance evaluation, it must view sales order records as the
logical child of both parents. Although logical, this relationship violates the hierarchical
model's single-parent rule. Because complex relationships cannot be represented, data
integration is restricted.
Hierarchical Representation with Data Redundancy
● The most common method for overcoming this issue is to duplicate the sales invoice file and
the associated line item file, which produces two distinct hierarchical representations.
Unfortunately, this improved functionality comes at the cost of increased data redundancy. The
network model, which we look at next, deals with this issue more efficiently.
Network Model
● The network model is used in database management systems to represent many-to-many
associations between record types. Although its structure resembles that of the hierarchical
model, it differs in that a member (child) record may have several parents.
● The network model was defined by CODASYL (the Conference on Data Systems Languages, formed
in 1959), whose Data Base Task Group published specifications for database design around
1970. The integrated database management system (IDMS), marketed commercially by Cullinane
(later Cullinet) Software, is the most well-known implementation of the network model. Even
though there have been numerous changes to this model over time, it is still in use today.
(figure 4.12)
● Like the hierarchical model, the network model is a navigational database with explicit
linkages between records and files. The difference is that the network model allows a child
record to have several parents.
● For example, Salesperson Number 1 and Customer Number 5 are the parents of Invoice Number 1.
Pointer fields in both parent records explicitly define the path to the invoice (child)
record. This invoice record has two links to related (sibling) records. The first is a
salesperson (SP) link to Invoice Number 2. This record is the result of a sale by Salesperson
Number 1 to Customer Number 6.
● The second pointer is the customer (C) link to Invoice Number 3. This represents the second
sale to Customer Number 5, which was processed this time by Salesperson Number 2. With this
data structure, management can track and report sales information on both customers and
salespeople.
● The structure can be accessed at either of the root-level records (salesperson or customer)
by supplying the appropriate primary key (SP # or Cust #). Beyond this, the access process is
similar to that described for the hierarchical model.
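The salesperson and customer links in this example can be sketched as two sets of pointers into the same invoice records. This is a simplification: a real network DBMS chains records with pointer fields, whereas the sketch assumes plain dictionaries. The invoice assignments follow the example above, and the function name is hypothetical.

```python
# Each parent record holds pointers (here, lists of invoice numbers) to its children.
salesperson_links = {1: [1, 2], 2: [3]}   # SP # -> invoices sold
customer_links = {5: [1, 3], 6: [2]}      # Cust # -> invoices bought

def parents_of(invoice):
    """Return the (salesperson, customer) pair whose links reach the invoice."""
    sp = next(s for s, invs in salesperson_links.items() if invoice in invs)
    cust = next(c for c, invs in customer_links.items() if invoice in invs)
    return sp, cust

# Invoice 1 has two parents, which the hierarchical model would not permit.
invoice_1_parents = parents_of(1)

# Entering at either root-level record reaches the same invoice.
via_salesperson = salesperson_links[1]
via_customer = customer_links[5]
```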
Relational Model
● The relational model is an abstract model used to organise and manage the data kept in a
database. It stores information in two-dimensional, interrelated tables, or relations, in
which each row corresponds to an entity and each column to one of its attributes.
● E. F. Codd first proposed the principles of the relational model in the late 1960s. The
formal model has its foundations in relational algebra and set theory, which provide the
theoretical basis for most of the data manipulation operations used. The most apparent
difference between the relational model and the navigational models is the way data
associations are presented to the user. The relational model represents data in
two-dimensional tables.
(figure 4.13)
● This figure shows an example of a database table called Customer.
● Attributes (data fields) run across the top of the table, forming columns. Tuples intersect
the columns to form the rows of the table. A tuple is a normalized array of data that is
similar, but not precisely equivalent, to a record in a flat-file system.
Four characteristics of a properly designed table
1. All occurrences at the intersection of a row and a column are a single value. No multiple
values (repeating groups) are allowed.
2. The attribute values in any column must all be of the same class.
3. Each column in a given table must be uniquely named.
4. Each row in the table must be unique in at least one attribute. This attribute is the primary key.
● The table must be normalized: each attribute in the row should be dependent on (and
uniquely defined by) the primary key and independent of the other attributes. The previous
section showed how navigational databases create associations between records by using
explicit linkages (pointers). In the relational model, the linkages are implicit.
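The fourth characteristic, a unique primary key for each row, can be demonstrated with Python's built-in sqlite3 module. The table layout and the customer values below are assumptions for illustration, not the contents of the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# cust_num is declared as the primary key, so each row must have a unique value.
conn.execute(
    "CREATE TABLE customer ("
    "cust_num INTEGER PRIMARY KEY, name TEXT NOT NULL, balance REAL)"
)
conn.execute("INSERT INTO customer VALUES (1875, 'J. Smith', 1820.00)")

# A second row with the same primary key violates the uniqueness rule.
try:
    conn.execute("INSERT INTO customer VALUES (1875, 'Duplicate', 0.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```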
Difference between the hierarchical and relational models
● To illustrate this distinction, compare the file structures of the relational tables in Figure 4.14
with those of the hierarchical example in Figure 4.10. The conceptual relationship between
files is the same, but note the absence of explicit pointers in the relational tables.
● Relations are formed by an attribute that is common to both tables in the relation. For
instance, the primary key of the Customer table (Cust #) is embedded as a foreign key in both
the Sales Invoice and Cash Receipts tables. Similarly, the primary key of the Sales Invoice
table (Invoice #) is embedded as a foreign key in the Line Item table. Invoice # and Item #
together make up the Line Item table's composite primary key. Both fields are required to
identify each record in the table uniquely, but only the invoice number portion of the key
provides the logical link to the Sales Invoice table.
● Links between records in related tables are established through logical operations of the
DBMS rather than through explicit addresses built into the database. For instance, if a user
wanted to see all the invoices for Customer 1875, the system would search the Sales Invoice
table for records with a foreign key value of 1875. Figure 4.14 shows that there is only one,
Invoice 1921. To find the line item details for this invoice, the Line Item table is searched
for records with a foreign key value of 1921. Two records are retrieved.
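The two searches just described can be sketched with Python's built-in sqlite3 module. Customer 1875 and Invoice 1921 come from the text. The column names, item numbers, and quantities are hypothetical stand-ins for the contents of Figure 4.14.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# REFERENCES documents the foreign keys; SQLite only enforces them
# when PRAGMA foreign_keys = ON, which this sketch does not rely on.
conn.executescript("""
CREATE TABLE customer (cust_num INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales_invoice (
    invoice_num INTEGER PRIMARY KEY,
    cust_num INTEGER REFERENCES customer(cust_num)
);
CREATE TABLE line_item (
    invoice_num INTEGER REFERENCES sales_invoice(invoice_num),
    item_num INTEGER,
    qnty INTEGER,
    PRIMARY KEY (invoice_num, item_num)  -- composite primary key
);
INSERT INTO customer VALUES (1875, 'Customer 1875');
INSERT INTO sales_invoice VALUES (1921, 1875);
INSERT INTO line_item VALUES (1921, 9215, 10), (1921, 3914, 1);
""")

# Search 1: invoices whose foreign key matches the customer number.
invoices = [r[0] for r in conn.execute(
    "SELECT invoice_num FROM sales_invoice WHERE cust_num = ?", (1875,))]

# Search 2: line items whose foreign key matches the invoice number.
items = conn.execute(
    "SELECT item_num, qnty FROM line_item WHERE invoice_num = ? ORDER BY item_num",
    (invoices[0],)).fetchall()
```

No pointer is stored anywhere: the link exists only because the same value appears in both tables.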
● The way foreign keys are assigned depends on the nature of the association between the two
tables. When the association is one-to-one, either table's primary key may be embedded as a
foreign key in the other. In a one-to-many association, the primary key on the "one" side is
embedded as the foreign key on the "many" side. One customer, for instance, can have many
invoice and cash receipt records, so Cust # is embedded in the records of the Sales Invoice
and Cash Receipts tables.
● The Sales Invoice and Line Item tables are linked one-to-many in the same way. Many-to-many
associations are not implemented by embedding foreign keys in the related tables. Instead, a
separate link table is created that contains the keys of the related tables.
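A link table for the inventory and supplier example might look like the sketch below, again using sqlite3. The table names, supplier names, and rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (item_num INTEGER PRIMARY KEY, descr TEXT);
CREATE TABLE supplier (supplier_num INTEGER PRIMARY KEY, name TEXT);
-- The link table holds only the keys of the two related tables.
CREATE TABLE inventory_supplier (
    item_num INTEGER REFERENCES inventory(item_num),
    supplier_num INTEGER REFERENCES supplier(supplier_num),
    PRIMARY KEY (item_num, supplier_num)
);
INSERT INTO inventory VALUES (1, 'Bolt'), (2, 'Hinge');
INSERT INTO supplier VALUES (10, 'Acme'), (20, 'Bestco'), (30, 'Unused');
INSERT INTO inventory_supplier VALUES (1, 10), (1, 20), (2, 10);
""")

def suppliers_of(item_num):
    """Supplier numbers linked to one inventory item."""
    return [r[0] for r in conn.execute(
        "SELECT supplier_num FROM inventory_supplier "
        "WHERE item_num = ? ORDER BY supplier_num", (item_num,))]

def items_from(supplier_num):
    """Item numbers linked to one supplier."""
    return [r[0] for r in conn.execute(
        "SELECT item_num FROM inventory_supplier "
        "WHERE supplier_num = ? ORDER BY item_num", (supplier_num,))]

bolt_suppliers = suppliers_of(1)   # one item, many suppliers
acme_items = items_from(10)        # one supplier, many items
unused_items = items_from(30)      # supplier in the database with zero items
```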
Remote IT units send requests for data to the central site, which processes the requests and transmits
the data back to the requesting IT unit. The actual processing of the data is performed at the remote IT
unit. The central site performs the functions of a file manager that services the data needs of the
remote sites.
Database Lockout
A database lockout is a software control (usually a function of the DBMS) that prevents multiple
simultaneous accesses to data. The previous example can be used to illustrate this technique.
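A lockout can be sketched as a table of record locks. This is a minimal single-process illustration, not how a production DBMS implements locking. The `LockManager` class, the record identifier, and the site names are hypothetical.

```python
class LockManager:
    """Grants at most one holder per record at any time."""

    def __init__(self):
        self._owners = {}  # record id -> user currently holding the lock

    def acquire(self, record_id, user):
        """Grant the lock only if no other user currently holds it."""
        holder = self._owners.get(record_id)
        if holder is not None and holder != user:
            return False  # locked out until the holder releases
        self._owners[record_id] = user
        return True

    def release(self, record_id, user):
        """Remove the lock if this user is the one holding it."""
        if self._owners.get(record_id) == user:
            del self._owners[record_id]

locks = LockManager()
first = locks.acquire("AR-1875", "site_A")    # site A locks the record
second = locks.acquire("AR-1875", "site_B")   # site B is locked out
locks.release("AR-1875", "site_A")            # site A finishes its update
retry = locks.acquire("AR-1875", "site_B")    # site B can now proceed
```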
Distributed Databases
Distributed databases can be either partitioned or replicated. We examine both approaches in the
following pages.
Partitioned Databases
The partitioned database approach splits the central database into segments or partitions that are
distributed to their primary users.
In a distributed environment, it is possible for multiple sites to lock out each other from the
database, thus preventing each from processing its transactions. A deadlock occurs here because
there is mutual exclusion to the data resource, and the transactions are in a “wait” state until the
locks are removed. This can result in transactions being incompletely processed and the database
being corrupted. A deadlock is a permanent condition that must be resolved by special software that
analyzes each deadlock condition to determine the best solution. Because of the implication for
transaction processing, accountants should be aware of the issues pertaining to deadlock resolutions.
Deadlock Resolution
Resolving a deadlock usually involves terminating one or more transactions to complete processing of
the other transactions in the deadlock. In preempting transactions, the deadlock resolution software
attempts to minimize the total cost of breaking the deadlock. Some of the factors that are considered
in this decision follow:
* The resources currently invested in the transaction. This may be measured by the number of updates
that the transaction has already performed and that must be repeated if the transaction is terminated.
* The transaction’s stage of completion. In general, deadlock resolution software will avoid
terminating transactions that are close to completion.
* The number of deadlocks associated with the transaction. Because terminating the transaction
breaks all deadlock involvement, the software should attempt to terminate transactions that are part
of more than one deadlock.
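The three factors above can be combined into a single cost score, as in the sketch below. The weights in the cost formula and the transaction figures are arbitrary assumptions chosen only to show the idea; real deadlock resolution software uses its own criteria.

```python
from dataclasses import dataclass

@dataclass
class Txn:
    name: str
    updates_done: int    # resources already invested in the transaction
    pct_complete: float  # stage of completion, from 0.0 to 1.0
    deadlocks: int       # number of deadlocks the transaction is part of

def termination_cost(t):
    """Hypothetical cost of terminating t; lower means a better candidate."""
    # Invested work and near-completion raise the cost; breaking
    # several deadlocks at once lowers it. The weights are arbitrary.
    return t.updates_done + 10 * t.pct_complete - 5 * t.deadlocks

def choose_victim(txns):
    """Pick the transaction that is cheapest to terminate."""
    return min(txns, key=termination_cost)

txns = [
    Txn("T1", updates_done=8, pct_complete=0.9, deadlocks=1),
    Txn("T2", updates_done=2, pct_complete=0.2, deadlocks=2),
    Txn("T3", updates_done=5, pct_complete=0.5, deadlocks=1),
]
victim = choose_victim(txns)
```

Under these assumed weights T2 is chosen because it has done the least work, is furthest from completion, and takes part in two deadlocks.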
Replicated Databases
Replicated databases are effective in companies where there is a high degree of data sharing but no
primary user. Since common data are replicated at each IT unit site, the data traffic between sites
is reduced considerably. Figure 4.18 illustrates the replicated database model.
The problem with this approach is maintaining current versions of the database at each site. Since
each IT unit processes only its transactions, common data replicated at each site are affected by
different transactions and reflect different values.
Concurrency Control
Database concurrency is the presence of complete and accurate data at all user sites. System
designers need to employ methods to ensure that transactions processed at each site are accurately
reflected in the databases of all the other sites.
First, special software groups transactions into classes to identify potential conflicts. For example,
read-only (query) transactions do not conflict with other classes of transactions.
The second part of the control process is to time-stamp each transaction. A system-wide clock is used
to keep all sites, some of which may be in different time zones, on the same logical time.
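One way the time stamps might be used is to apply conflicting transactions in time-stamp order at every site, so that each replica ends with the same value. The sketch below assumes transactions are simple tuples; the operations, amounts, and function name are hypothetical.

```python
def apply_in_timestamp_order(balance, transactions):
    """Apply (timestamp, site, operation, amount) tuples in time-stamp order."""
    for _, _, op, amount in sorted(transactions):
        if op == "add":
            balance += amount
        elif op == "set":
            balance = amount
    return balance

# The same two transactions reach the two sites in a different order.
site_a_arrival = [(2, "B", "set", 500), (1, "A", "add", 100)]
site_b_arrival = [(1, "A", "add", 100), (2, "B", "set", 500)]

# Sorting by the system-wide time stamp gives both sites the same schedule.
result_a = apply_in_timestamp_order(1000, site_a_arrival)
result_b = apply_in_timestamp_order(1000, site_b_arrival)
```

Without the common time stamp, site A would apply the transactions in arrival order and finish with a different balance from site B.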
The decision to distribute databases is one that should be entered into thoughtfully. There are many
issues and trade-offs to consider. Here are some of the most basic questions to be addressed:
* Should the organization’s data be centralized or distributed?
* If data distribution is desirable, should the databases be replicated or partitioned?
* If replicated, should the databases be totally replicated or partially replicated?
* If the database is to be partitioned, how should the data segments be allocated among the sites?
ACCESS CONTROLS
- Designed to prevent unauthorised individuals from viewing, retrieving, corrupting, or
destroying the entity’s data.
• User views (subschemas) are subsets of the database that define the user's data domain and
access.
• The database authorization table contains rules that limit user actions.
• User-defined procedures allow users to create a personal security program or routine.
• Data encryption procedures protect sensitive data.
• Biometric devices, which measure characteristics such as fingerprints or retina prints, control
access to the database.
• Inference controls should prevent users from inferring, through query options, specific data
values they are unauthorised to access.
• Since data sharing is a fundamental objective of the database approach, the environment is
vulnerable to damage from individual users.
• Four backup and recovery features are needed:
• The backup feature makes a periodic backup of the entire database, which is stored in a secure,
remote location.
• The transaction log provides an audit trail of all processed transactions.
• The checkpoint facility suspends all processing while the system reconciles the transaction log
and the database change log against the database.
• The recovery module uses the logs and backup files to restart the system after a failure.
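How a recovery module might combine the backup with the transaction log is sketched below. It assumes the database is a dictionary of account balances and the log is a list of (account, amount) entries recorded since the last backup. All names and figures are hypothetical.

```python
import copy

def recover(backup, transaction_log):
    """Rebuild the database from the last backup plus the logged transactions."""
    database = copy.deepcopy(backup)  # never modify the backup itself
    for account, amount in transaction_log:
        database[account] = database.get(account, 0) + amount
    return database

# State at the time of the last periodic backup.
backup = {"acct_1": 100, "acct_2": 250}

# Transactions processed after the backup and before the failure.
log = [("acct_1", 50), ("acct_2", -25), ("acct_3", 10)]

restored = recover(backup, log)
```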