DBMS Repeated From Book
[Figure: The three-schema architecture. End users work with external-level views; the external/conceptual mapping relates each external view to the conceptual level, and the conceptual/internal mapping relates the conceptual level to the internal level and the stored database.]
Mappings among schema levels are needed to transform
requests and data. Programs refer to an external schema, and
are mapped by the DBMS to the internal schema for execution.
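In relational systems an external schema is commonly realised as a set of views. A minimal sketch, with hypothetical table, view, and column names; the DBMS performs the external/conceptual and conceptual/internal mappings when the view is queried:

-- External schema for a payroll program: a view defined over conceptual-level tables.
CREATE VIEW PAYROLL_VIEW AS
SELECT E.ENO, E.ENAME, E.SALARY, D.DNAME
FROM EMPLOYEE E
JOIN DEPARTMENT D ON D.DNO = E.DNO;

-- The program refers only to the external schema; the DBMS maps the request
-- down to the base tables and then to the stored files for execution.
SELECT ENO, SALARY FROM PAYROLL_VIEW WHERE DNAME = 'RESEARCH';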
Data Independence:
The capacity to change the schema at one level without having to change the schema at the next higher level. Logical data independence is the ability to change the conceptual schema without affecting the external schemas; physical data independence is the ability to change the internal schema without affecting the conceptual schema.
Database Administrator:
Coordinates all the activities of the database system; the
database administrator has a good understanding of the
enterprise’s information resources and needs.
Responsibilities of the Database administrator include:
• Schema definition
• Storage structure and access method definition
• Schema and physical organization modification
• Granting user authority to access the database
• Specifying integrity constraints
• Acting as liaison with users
• Monitoring performance and responding to changes in
requirements
Transaction Management within Storage Manager:
• A transaction is a collection of operations that performs a
single logical function in a database application
[Figure: DBMS components: the query processor, the storage manager, and disk storage.]
Data Model:
• Organization is that of an arbitrary graph, represented by a network diagram.
Data Dictionary:
The data dictionary stores metadata. It should be accessible to the users of the database so that they can obtain this metadata. Some examples of the contents of the data dictionary are:
• Tables
• Columns
• Primary Keys
• Foreign Keys
[Figure: database design phases (data analysis, logical design, physical design).]
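For illustration, many relational DBMSs expose the data dictionary through catalog views; the sketch below assumes a product that supports the SQL-standard INFORMATION_SCHEMA and a hypothetical EMPLOYEE table:

-- Columns and data types recorded in the dictionary for one table.
SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'EMPLOYEE';

-- Tables known to the dictionary.
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES;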
• Schema Refinement: (Normalization)
Should Be:
Attribute:
Classification of attributes:
• Identifier Attributes
For example, consider the following:
If more than one value is associated with an entity, the attribute is multivalued. For example, the attribute Skill in the Employee entity is multivalued.
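A minimal sketch of how such a multivalued attribute is usually represented in a relational schema; the column types are assumptions. The Skill values are moved into a separate table keyed on the employee number together with the skill:

CREATE TABLE EMPLOYEE (
    ENO   CHAR(5) PRIMARY KEY,
    ENAME VARCHAR(40)
);

-- One row per (employee, skill) pair; an employee with three skills has three rows.
CREATE TABLE EMPLOYEE_SKILL (
    ENO   CHAR(5) REFERENCES EMPLOYEE(ENO),
    SKILL VARCHAR(30),
    PRIMARY KEY (ENO, SKILL)
);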
Identifier Attributes:
Business Logic:
[Figure: basic entity symbols, illustrated with SHIPMENT, ITEM, and CUSTOMER entities.]
Evaluation of DBMS:
Evaluation is done based on the following features:
• Data Definition
• Physical Definition
• Accessibility
• Transaction handling
• Utilities
• Development
The evaluation checklist also covers features such as:
• Physical definition
• Accessibility
• 4GL/SQL technology
• CASE tools
• Load/unload facilities
• User usage monitoring
• Database administration support
• Other features
Evaluation of Data Models:
An optimal data model should satisfy the criteria
tabulated below:
Criteria and descriptions:
• Structural validity: Consistency with the way the enterprise defines and organizes information.
• Simplicity: Ease of understanding by both information-systems professionals and non-technical users.
• Expressability: Ability to distinguish between different data, relationships between data, and constraints.
• Non-redundancy: Exclusion of extraneous information; in particular, the representation of any one piece of information exactly once.
• Sharability: Not specific to any particular application or technology and thereby usable by many.
• Extensibility: Ability to evolve to support new requirements with minimal effect on existing users.
• Integrity: Consistency with the way the enterprise uses and manages information.
• Diagrammatic representation: Ability to represent the model using an easily understood notation.
A fundamental consideration in this examination concerns the data to be recorded on the file. An equally important and less obvious consideration concerns how the data are to be placed on the file.
3.1.1 Sequential File Organization
It is the simplest method of storing and retrieving data from a file. Sequential organization simply means storing and sorting records in physical order on tape or disk. In a sequential organization, records can be added only at the end of the file; that is, in a sequential file, records are stored one after the other without concern for the actual value of the data in the records. It is not possible to insert a record in the middle of the file without rewriting the file.
When a sequential master file is updated, records from the master file and the transaction file are matched, one record at a time, resulting in an updated master file. It is a characteristic of sequential files that all records are stored by position: the first one is at the first position, the second one occupies the second position, and so on. There are no addresses or location assignments in sequential files. To read a sequential file, the system always starts at the beginning of the file. If the record sought is somewhere in the file, the system reads its way up to it, one record at a time. For example, if a particular record happens to be the fifteenth one in a file, the system starts at the first one and reads ahead one record at a time until the fifteenth one is reached. It cannot jump directly to the fifteenth record without starting from the beginning. In a sequential file the records are arranged in ascending or descending order according to a key field. This key field may be numeric, alphabetic, or a combination of both, but it must occupy the same place in each record, as it forms the basis for determining the order in which the records appear on the file. Sequential files are generally maintained on magnetic tape, disk, or a mass storage system. The advantages and disadvantages of the sequential file organization are given below:
Advantages:
• Files may be relatively easy to reconstruct since a good
measure of built in backup is usually available
Disadvantages:
3.1.2 Direct File Organization
In direct file organization, a formula or method is used to translate the key-field value for a record to the address or location of the record on the file. This formula or method is generally called an algorithm; the technique is also called hashed addressing. Hashing refers to the process of deriving a storage address from a record key. There are many algorithms for determining the storage location from the key field; some of them are:
Division by Prime: In this procedure, the actual key is divided by a prime number, using modular division. That is, the quotient is discarded and the storage location is signified by the remainder. If the key field consists of a large number of digits, for instance 10 digits (e.g. 2345632278), then strip off the first or last 4 digits and apply the division-by-prime method to what remains. Other common algorithms include folding, extraction, and squaring.
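A minimal sketch of the division-by-prime calculation in SQL, assuming a dialect that provides the MOD function (the prime 97 is an arbitrary choice):

-- The 10-digit key 2345632278 with its last four digits stripped off leaves 234563.
-- Modular division by the prime 97 keeps only the remainder, which is used as the
-- storage location (a bucket number in the range 0 to 96).
SELECT MOD(234563, 97) AS bucket;   -- yields 17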
The advantages and disadvantages of direct file
organization are as follows:
Advantages:
• Immediate access to records for inquiry and updating
purposes is possible
• Immediate updating of several files as a result of single
transaction is possible
• Time taken for sorting the transaction can be saved
Disadvantages:
• Records in the on-line file may be exposed to the risk of a loss of accuracy, so a procedure for special backup and reconstruction is required
• Compared with sequentially organized files, this organization may be less efficient in its use of storage space
• Adding and deleting records is more difficult than with sequential files
• Relatively expensive hardware and software resources
are required
3.1.3 Index Sequential File Organization
The third way of accessing records stored in the system
is through an index. The basic form of an index includes a record key and the storage address for the record. To find a record when the storage address is unknown, it would otherwise be necessary to scan the records. However, if an index is used, the search will be faster, since it takes less time to search an index than an entire file of data.
An indexed file offers the simplicity of a sequential file while at the same time offering a capability for direct access. The records must initially be stored on the file in sequential order according to a key field. In addition, as the records are being recorded on the file, one or more indexes are established by the system to associate the key-field value(s) with the storage location of the record on the file. These indexes are then used by the system to allow a record to be directly accessed.
To find a specific record when the file is stored under an
indexed organization, the index is searched first to find the key
of the record wanted. When it is found, the corresponding
storage address is noted and then the program can access the
record directly. This method uses a sequential scan of the index,
followed by direct access to the appropriate record. The index
helps to speed up the search compared with a sequential file, but
it is slower than the direct addressing.
The indexed files are generally maintained on magnetic
disk or on a mass storage system. The primary differences
between direct and indexed organized files are as follows:
Records may be accessed from a direct organized file only randomly, whereas records may be accessed sequentially or randomly from an indexed organized file.
Direct organized files utilize an algorithm to determine the
location of a record, whereas indexed organized files utilize an
index to locate a record to be randomly accessed. The
advantages and disadvantages of indexed sequential file
organization are as follows:
Advantages:
• Access types: The types of access that are supported
efficiently. Access types can include finding records with
a specified attribute value and finding records whose
attribute values fall in a specified range.
Deletion: To delete a record, the system first looks up the record
to be deleted. The actions the system takes next depend on
whether the index is dense or sparse:
Dense Indices:
• If the deleted record was the only record with its particular
search-key value, then the system deletes the
corresponding index record from the index.
Sparse Indices:
• If the index does not contain an entry with the search-key value of the deleted record, nothing needs to be done. Otherwise, the following actions are taken:
• If the deleted record was the only record with its search
key, the system replaces the corresponding index record
with an index record for the next search-key value (in
search-key order). If the next search-key value already
has an index entry, the entry is deleted instead of being
replaced.
Multiple-Key Access:
Use multiple indices for certain types of queries.
Example:
SELECT * FROM ACCOUNT
WHERE BRANCH_NAME = 'CHENNAI' AND BALANCE = 1000
Possible strategies for processing query using indices on single
attributes:
• Use the index on BRANCH_NAME to find accounts of the CHENNAI branch; then test BALANCE = 1000
• Use the index on BALANCE to find accounts with balances of 1000; then test BRANCH_NAME = 'CHENNAI'
• Use the BRANCH_NAME index to find pointers to all records pertaining to the CHENNAI branch. Similarly use the index on BALANCE. Take the intersection of both sets of pointers obtained.
Indices on Multiple Keys:
Composite search keys are search keys containing more than
one attribute
Example: (BRANCH_NAME, BALANCE)
Indices on Multiple Attributes:
Suppose we have an index on combined search-key
(BRANCH_NAME, BALANCE)
With the where clause
WHERE BRANCH_NAME = 'CHENNAI' AND BALANCE = 1000
the index on (BRANCH_NAME, BALANCE) can be used to fetch only records that satisfy both conditions.
Using separate indices is less efficient — we may fetch many
records (or pointers) that satisfy only one of the conditions.
The combined search key (BRANCH_NAME, BALANCE) can also efficiently handle
WHERE BRANCH_NAME = 'CHENNAI' AND BALANCE < 1000
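A hedged sketch of both approaches, assuming an ACCOUNT table and using CREATE INDEX syntax that most products accept even though it is not part of the SQL standard (index names are hypothetical):

-- Separate single-attribute indices.
CREATE INDEX IDX_ACCOUNT_BRANCH  ON ACCOUNT (BRANCH_NAME);
CREATE INDEX IDX_ACCOUNT_BALANCE ON ACCOUNT (BALANCE);

-- A composite index on the combined search key.
CREATE INDEX IDX_ACCOUNT_BRANCH_BALANCE ON ACCOUNT (BRANCH_NAME, BALANCE);

-- Both conditions can be answered from the composite index alone:
SELECT * FROM ACCOUNT
WHERE BRANCH_NAME = 'CHENNAI' AND BALANCE = 1000;

-- The composite index also handles equality on the first attribute
-- combined with a range condition on the second:
SELECT * FROM ACCOUNT
WHERE BRANCH_NAME = 'CHENNAI' AND BALANCE < 1000;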
4.2.1 Properties of Normalization:
Prime Attribute:
An attribute is said to be prime if it is a candidate key or primary key, or is part of a candidate key or primary key.
Non-Prime Attribute:
An attribute is said to be non-prime if it is neither a candidate key nor a primary key, nor part of a candidate key or primary key.
Transitive Functional Dependency:
A functional dependency X → Z is transitive if there is a set of attributes Y such that X → Y and Y → Z hold, and Y is neither a candidate key nor a subset of any key of the relation.
EXAMPLE:
Consider a DEPARTMENT relation (DNO, DNAME, DLOCATION) in which DLOCATION is multivalued: the AD (ADMINISTRATION) department is located only at CHENNAI, but other departments have several locations. Because DLOCATION is not atomic, the relation is not in First Normal Form (1NF). It can be normalized in one of the following ways:
• Remove the attribute DLOCATION that violates 1NF and place it in a separate relation DEPARTMENT_LOCATION (DNO, DLOCATION). This decomposes the non-1NF relation into two 1NF relations.
• Expand the key so that there is a separate tuple in the original relation for each location of a department. In this case, the primary key becomes the combination (DNO, DLOCATION). This solution has the disadvantage of introducing redundancy into the relation.
• If a maximum number of values is known for the attribute (for example, if it is known that at most three locations can exist for a department), replace the DLOCATION attribute by three atomic attributes: DLOCATION1, DLOCATION2 and DLOCATION3. This solution has the disadvantage of introducing null values if most departments have fewer than three locations.
Solution 1:

DEPARTMENT:
DNO   DNAME
RE    RESEARCH
AD    ADMINISTRATION
DC    DATA COLLECTION

DEPARTMENT_LOCATION:
DNO   DLOCATION
RE    MADURAI
RE    CHENNAI
RE    PUNE
AD    CHENNAI
DC    CHENNAI
DC    NAGERCOIL
DC    MARTANDAM

Solution 2:

DEPARTMENT:
DNO   DNAME            DLOCATION
RE    RESEARCH         CHENNAI
RE    RESEARCH         MADURAI
RE    RESEARCH         PUNE
AD    ADMINISTRATION   CHENNAI

The primary key of the above relation is the combination (DNO, DLOCATION). The major drawback of this approach is redundancy.
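A minimal SQL sketch of the Solution 1 schema; the column types are assumptions:

CREATE TABLE DEPARTMENT (
    DNO   CHAR(2) PRIMARY KEY,
    DNAME VARCHAR(30)
);

-- One row per (department, location) pair, so DLOCATION is atomic in every tuple.
CREATE TABLE DEPARTMENT_LOCATION (
    DNO       CHAR(2) REFERENCES DEPARTMENT(DNO),
    DLOCATION VARCHAR(30),
    PRIMARY KEY (DNO, DLOCATION)
);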
Second Normal Form (2NF):
Consider a relation EMPLOYEE_PROJECT whose functional dependencies include:
FD 3: PNO → PNAME
Is the above relation in 2NF?
No. The above relation EMPLOYEE_PROJECT is not in 2NF.
Justification:
The primary key of the relation is (ENO, PNO, DATE). For a relation to be in 2NF, each non-prime attribute must be fully functionally dependent on the key of the relation.
It’s clear from the above functional dependencies there
are non prime attributes that are partially functionally dependent
on the key of the relation.
• The non-prime attributes ENAME and DESIGNATION are dependent only on part of the key, namely ENO.
Decomposition produces, among others, the relations PROJECT (PNO, PNAME) and WORKS.
Referential integrity constraints hold between the above two relations: for example, every project number (PNO) appearing in WORKS must also appear in PROJECT.
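In SQL such a constraint can be declared as a foreign key. A minimal sketch, assuming the PROJECT and WORKS tables already exist (the constraint name is hypothetical):

-- Every project number recorded in WORKS must identify an existing project.
ALTER TABLE WORKS
    ADD CONSTRAINT FK_WORKS_PROJECT FOREIGN KEY (PNO) REFERENCES PROJECT (PNO);

-- An analogous foreign key on ENO would reference the EMPLOYEE relation.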
EXAMPLE:
A company is organized into departments, and an employee works in one department. Consider the relation EMPLOYEE_DEPARTMENT with the functional dependencies:
ENO → DNO
DNO → DNAME
Third Normal Form states that no non-prime attribute may be transitively determined by a candidate key or primary key of a relation through a non-prime attribute.
Is the above relation in 3NF?
No. The above relation is not in Third Normal Form, because DNAME is transitively determined by ENO through the non-prime attribute DNO.
How to Normalize?
Decomposition!
• Create a relation with the original key, retaining the attributes that are not functionally determined by other non-key attribute(s).
• Create a relation that includes the non-key attribute(s) that functionally determine other non-key attribute(s), together with the non-key attribute(s) they determine.
EMPLOYEE (ENO, DNO)
DEPARTMENT (DNO, DNAME)
Boyce-Codd Normal Form (BCNF):
A relation R is said to be in BCNF if every determinant is a candidate key. Consider the following relation.
Candidate_Interview:
• Create another relation with the determinant that is not a candidate key and the attribute(s) it determines.
R1
You can infer that the relation R2 has the following determinant:
directly related to the key. These kinds of relationships are called multi-valued dependencies.
Consider the relation EMP_PROJ_HOBBY (ENO, PNO, HOBBY), in which ENO multi-determines both PNO and HOBBY. It can be decomposed into:
R1 (ENO, PNO)
R2 (ENO, HOBBY)
Lossless-Join Dependency:
Lossless-join dependency is a property of decomposition which ensures that no spurious (additional) tuples are generated when relations are reunited through a natural join operation.
Domain-Key Normal Form
This level of normalization is simply a model taken to the
point where there are no opportunities for modification
anomalies.
• A logical description is a further restriction of the values the domain allows
• Logical consequence: find a constraint on keys and/or domains which, if it is enforced, means that the desired constraint is also enforced
The first commercial implementation of SQL was introduced in 1979 by Oracle. Today there are three standards of SQL, SQL89 (SQL1), SQL92 (SQL2) and SQL99 (SQL3), and numerous flavors of SQL are available. SQL is used to manipulate data stored in Relational Database Management Systems (RDBMS). SQL provides commands through which data can be extracted, sorted, updated, deleted and inserted. SQL is an ANSI (American National Standards Institute) standard computer language for accessing and manipulating database systems. SQL can be used with any RDBMS such as MySQL, PostgreSQL, Oracle, Microsoft SQL Server, Sybase, Ingres, etc. All the important and common SQL statements are supported by these RDBMS; however, each has its own set of proprietary statements and extensions.
In a Nutshell
• SQL stands for Structured Query Language
• SQL is an ANSI standard computer language
• SQL allows you to access a database
• SQL allows you to execute queries against a database
• SQL allows you to retrieve data from a database
• SQL allows you to insert new records in a database
• SQL allows you to delete records from a database
• SQL allows you to update records in a database
SQL Language Elements
The SQL language is sub-divided into several language
elements, including:
• Statements, which may have a persistent effect on schemas and data, or which may control transactions, program flow, connections, sessions, or diagnostics.
• Queries, which retrieve data based on specific criteria.
• Expressions, which can produce either scalar values or tables consisting of columns and rows of data.
• Predicates, which specify conditions that can be evaluated to SQL three-valued logic (3VL) Boolean truth values and which are used to limit the effects of statements and queries, or to change program flow.
• Clauses, which are (in some cases optional) constituent components of statements and queries.
Whitespace is generally ignored in SQL statements and queries, making it easier to format SQL code for readability.
SQL statements also include the semicolon (";") statement terminator. Though not required on every platform, it is defined as a standard part of the SQL grammar.
SQL Data Manipulation Language (DML):
• SELECT - Extracts data from a database table
• UPDATE - Updates data in a database table
• DELETE - Deletes data from a database table
• INSERT INTO - Inserts new data into a database table
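A brief sketch of the four statements against a hypothetical EMPLOYEE table:

SELECT ENO, ENAME FROM EMPLOYEE WHERE DNO = 'RE';      -- extract data

INSERT INTO EMPLOYEE (ENO, ENAME, DNO)                 -- insert new data
VALUES ('E1001', 'KUMAR', 'RE');

UPDATE EMPLOYEE SET DNO = 'AD' WHERE ENO = 'E1001';    -- update data

DELETE FROM EMPLOYEE WHERE ENO = 'E1001';              -- delete data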
SQL Data Control Language (DCL):
DCL handles the authorization aspects of data and permits the user to control who has access to see or manipulate data within the database. Its two main keywords are:
• GRANT authorizes one or more users to perform an operation or a set of operations on an object.
• REVOKE removes or restricts the capability of a user to perform an operation or a set of operations.
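For example (the user name and table are hypothetical):

-- Allow user CLERK1 to read and insert rows in the EMPLOYEE table.
GRANT SELECT, INSERT ON EMPLOYEE TO CLERK1;

-- Later withdraw the insert capability, leaving read access in place.
REVOKE INSERT ON EMPLOYEE FROM CLERK1;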
Transaction Controls:
Transactions, if available, can be used to wrap around the DML operations:
• BEGIN WORK (or START TRANSACTION, depending on the SQL dialect) can be used to mark the start of a database transaction, which either completes entirely or not at all.
• COMMIT causes all data changes in a transaction to be made permanent.
• ROLLBACK causes all data changes since the last COMMIT or ROLLBACK to be discarded, so that the data is returned to the state it was in before those changes were made.
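A minimal sketch, assuming a dialect that accepts START TRANSACTION and a hypothetical ACCOUNT table:

START TRANSACTION;

UPDATE ACCOUNT SET BALANCE = BALANCE - 1000 WHERE ACC_NO = 'A101';
UPDATE ACCOUNT SET BALANCE = BALANCE + 1000 WHERE ACC_NO = 'A202';

-- Make both updates permanent; if anything had gone wrong before this point,
-- ROLLBACK would discard both of them instead.
COMMIT;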
Two key components of any DBMS are the query processor and the transaction manager.
The query processor translates queries into a sequence of retrieval requests on the stored data. There may be many alternative translations for a given query, which are known as query plans. The task of selecting a good query plan is known as query optimization; a good query plan is one that has a relatively low cost of execution compared with the alternative query plans. A transaction is a sequence of queries and/or updates. The transaction manager coordinates concurrently executing transactions so as to guarantee the so-called ACID properties:
There are several alternative architectures for Distributed Database systems. To
improve the performance of global queries in distributed
databases, data items can be split into fragments that can be
stored at sites requiring frequent access to them. Data items or
fragments of data items can also be replicated across more than
one site. Techniques are therefore needed for deciding the best
way to fragment and replicate the data to optimize the
performance of applications.
A key difference between processing global queries in a
Distributed Database system and processing queries in a
centralized database system is that distributed database queries
may require data to be transmitted over the network. Thus, new
query-processing algorithms are needed that include data
transmission as an explicit part of their processing. Also, the
global query optimizer needs to take data transmission costs into
account when generating and evaluating alternative query plans.
A key difference between global transactions in a
Distributed Database system and transactions in a centralized
database system is that global transactions are divided into a
number of sub-transactions. Each sub-transaction is executed
by a single DATABASE server, which guarantees its ACID
properties. However, an extra level of coordination of the sub-
transactions is needed to guarantee that the overall global
transactions also exhibit the ACID properties.
[Figure: taxonomy of multi-DBMS architectures: unfederated systems and federated systems, the latter with either a single or multiple federated schemas.]
integrated view through which global queries and transactions can access the information stored in the local databases. A federated DDB is loosely coupled if there is no global schema provided by a global DBA, and it is the users' responsibility to define the global schemas they require to support their applications. This chapter concentrates on tightly coupled DDBs, which present the extra difficulty of having to provide an integrated view of the information stored in the local databases.
The presence of a single database administration authority in an
unfederated multi-DBMS makes it likely that the multi-DBMS will
be a homogeneous one, both physically and semantically.
Physical homogeneity means that the local databases are all
managed by the same type of DBMS, supporting the same data
model, Data Definition Language (DDL)/ Data Manipulation
Language (DML), query processing, transaction management,
and so forth. Semantic homogeneity means that different local
databases store any information they have in common in a
consistent manner, so that integration of the information does not
require it to be transformed in any way. In contrast, the presence
of multiple database administration authorities in a federated
multi-DBMS makes it likely that it will be heterogeneous. The
heterogeneity may be physical, semantic, or both. Physical
heterogeneity means that different local DBs may be managed
by different types of DBMSs for example, different products or
different versions of one product. Thus, the local conceptual
schemas may be defined in different data models (e.g., network,
hierarchical, relational, object-oriented), the DDL/DML supported
by local DATABASEs may be different (e.g., network or
hierarchical, different versions of Structured Query Language
(SQL), Object Query Language (OQL), the query processors
may use different algorithms and cost models, the transaction
managers may support different concurrency control and
recovery mechanisms, and so forth. Semantic heterogeneity
means that different local DATABASEs may
model the same information using different schema constructs or may use the same schema construct to model different information. For example, people's names may be stored using
different string lengths, or a relation named student in one
DATABASE may contain only undergraduate students while a
relation named student in another DATABASE contains both
undergraduate and postgraduate students. If there is semantic
heterogeneity in a multi-DBMS, it is necessary to perform
semantic integration of the export schemas. That requires the
export schemas to be transformed so as to eliminate any
inconsistencies between them.
A heterogeneous multi-DBMS must integrate the export
schemas of the local DATABASEs into one or more global
schemas, which provide an integrated view through which
global queries and transactions can access the federation. This
view must be constructed while preserving the autonomy of the
local DATABASEs, that is, leaving control of them in the hands
of the local DBAs. The following types of schemas are
addressed in a heterogeneous DDB system.
• A local schema for each local DATABASE. The local schema is the conceptual schema of the local DATABASE. Each local DATABASE continues to operate as
an autonomous entity, and the content of its local
schema is under the control of its local DBAs. Each local
DATABASE will also have a physical schema and
possibly a number of external schemas that are views of
its local schema. However, those schemas are not
considered to be part of the heterogeneous multi-DBMS
architecture.
• A component schema corresponding to each local
schema. The local DATABASEs may support different
data models and different DDL/ DMLs. Thus, the local
schemas have to be translated into some common data model (CDM) before they can be integrated.
• One or more export schemas corresponding to each component schema. Each export schema is a view over the component schema that the local DBAs want to make
available to the federation. The export schemas define
what part of the locally held information can be accessed
by global queries and transactions.
Data Distribution:
The unit of distribution can be entire table(s) or a subset of records. Fragmentation and replication are the two techniques through which data is stored in a distributed environment. Fragmentation permits parallel execution of a single query by dividing it into a set of sub-queries that operate on fragments; it therefore typically increases the level of concurrency and the system throughput. However, there may be performance degradation when several fragments have to be integrated (joins, unions).
Four alternatives for Fragmentation:
• Horizontal fragmentation
• Vertical fragmentation
• Hybrid (e.g. vertical-horizontal) fragmentation
• Derived horizontal fragmentation
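A hedged sketch of horizontal and vertical fragmentation of a hypothetical EMPLOYEE table, using the widely supported (though not strictly standard) CREATE TABLE ... AS SELECT form; in a real distributed DBMS each fragment would be allocated to a different site:

-- Horizontal fragmentation: each fragment holds the rows of one location.
CREATE TABLE EMPLOYEE_CHENNAI AS
    SELECT * FROM EMPLOYEE WHERE DLOCATION = 'CHENNAI';

CREATE TABLE EMPLOYEE_MADURAI AS
    SELECT * FROM EMPLOYEE WHERE DLOCATION = 'MADURAI';

-- Vertical fragmentation: each fragment holds a subset of the columns, and both
-- fragments carry the key ENO so the original table can be rebuilt by a join.
CREATE TABLE EMPLOYEE_PAY      AS SELECT ENO, SALARY           FROM EMPLOYEE;
CREATE TABLE EMPLOYEE_PERSONAL AS SELECT ENO, ENAME, DLOCATION FROM EMPLOYEE;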
The addition of object-oriented features to RDBMSs to make them ORDBMSs, and the emergence of object-relational mappers (ORMs), have allowed RDBMSs to successfully defend their dominance in the data center for server-side persistence.
Object databases are now established as a complement,
not a replacement for relational databases. They found their
place as embeddable persistence solutions in devices, on
clients, in packaged software, in real-time control systems, and
to power websites. The open source community has created a
new wave of enthusiasm that's now fueling the rapid growth of
ODBMS installations.
6.1 Object-Oriented DBMS
Object-Oriented Concepts:
Object:
An object is an abstract representation of a real-world entity that has a unique identity, embedded properties, and the ability to interact with other objects and itself. It can also be defined as an entity that has a well-defined role in the application domain, as well as state, behavior, and identity. Objects exhibit behavior.
Behavior represents how an object acts and reacts. Behavior is
expressed through operations that can be performed on it.
Object Identifier:
• An object ID (OID) represents the object's identity, which is unique to that object.
• The OID is assigned by the system at the moment of the object's creation and cannot be changed under any circumstance.
• The OID is deleted only if the object is deleted, and that OID can never be reused.
Attributes:
• Objects are described by their attributes, known as
instance variables.
• Attributes have a domain. The domain logically groups and describes the set of all possible values that an attribute can have.
• An attribute can be single-valued or multivalued.
• Attributes may reference one or more other objects.
Object State:
The object state is the set of values that the object's attributes have at a given point in time.
Classes:
• Objects that share common characteristics are grouped
into classes. A class is a collection of similar objects with
shared structure (attributes) and behavior (methods).
• Each object in a class is known as a class instance or
object instance.
Protocol:
• The class’s collection of messages, each identified by a
message name, constitutes the object or class protocol.
• The protocol represents an object’s public aspect; i.e., it
is known by other objects as well as end users.
• The implementation of the object’s structure and methods
constitutes the object’s private aspect.
• A message can be sent to an object instance or the
class. When the receiver object is a class, the message
will invoke a class method.
Superclasses, Subclasses, and Inheritance:
Classes are organized into a class hierarchy.
Example: Musical instrument class hierarchy
Piano, Violin, and Guitar are subclasses of Stringed instruments,
which is, in turn, a subclass of Musical instruments. Musical
instruments defines the superclass of Stringed instruments,
which is, in turn, the superclass of the Piano, Violin, and Guitar
classes. Inheritance is the ability of an object within the
hierarchy to inherit the data structure and behavior (methods) of
the classes above it.
Characteristics of an OO Data Model:
• Support the representation of complex objects.
• Be extensible; i.e., it must be capable of defining new
data types as well as the operations to be performed on
them.
• Support encapsulation; i.e., the data representation and the method's implementation must be hidden from external entities.
• Exhibit inheritance; an object must be able to inherit the properties (data and methods) of other objects.
• Support the notion of object identity (OID).
• The OODM models real-world entities as objects.
• Each object is composed of attributes and a set of methods.
• Each attribute can reference another object or a set of objects.
• The attributes and the methods' implementation are hidden, or encapsulated, from other objects.
• Each object is identified by a unique object ID (OID), which is independent of the value of its attributes.
• Similar objects are described and grouped in a class that contains the description of the data and the methods' implementation.
• The class describes a type of object.
• Classes are organized in a class hierarchy.
• Each object of a class inherits all properties of its superclasses in the class hierarchy.
Object-Oriented Data Modeling:
• Centers around objects and classes
• Involves inheritance
• Encapsulates both data and behavior
Benefits of Object-Oriented Modeling:
• Ability to tackle challenging problems
• Improved communication between users, analysts,
designer, and programmers
• Increased consistency in analysis and design
• Explicit representation of commonality among system components
• System robustness
• Reusability of analysis, design, and programming results
• Object-oriented modeling is frequently accomplished
using the Unified Modeling Language (UML)
WHAT AN OODBMS SHOULD SUPPORT?
• Atomic and Complex Objects
• Methods and Messages
• Object Identity
• Single Inheritance
• Polymorphism - Overloading and Late-binding
• Persistence
• Shared Objects
In addition an OODBMS can optionally support the following:
• Multiple Inheritance
• Exception Messages
• Distribution
• Long Transactions
• Versions
Characteristics That ‘Must Be’ Supported by an OODBMS
As Specified By The OO Database Manifesto:
• Complex Objects
• Object Identity
• Encapsulation
• Classes
• Inheritance
• Overriding and Late-binding
• Extensibility
• Computational Completeness
• Persistence
• Concurrency
• Recovery
• Ad-hoc querying
Advantages of OODBMS:
• Enriched modelling capabilities
• Extensibility
• Removal of Impedance Mismatch
• Support for schema evolution.
• Support for long duration transactions.
• Applicable for advanced database applications
• Improved performance.
Applications of OODBMS:
• Computer-Aided Design (CAD).
• Computer-Aided Manufacturing (CAM).
• Computer-Aided Software Engineering (CASE).
• Office Information Systems (OIS).
• Multimedia Systems.
• Digital Publishing.
• Geographic Information Systems (GIS).
• Scientific and Medical Systems.
Disadvantages of OODBMS:
• Lack of a universal data model
• Lack of experience
• Lack of standards.
• Ad-hoc querying compromises encapsulation.
• Locking at object-level impacts performance
• Complexity
• Lack of support for views
• Lack of support for security
Object Relational DBMS:
Object-Relational databases extend the Relational Data
Model to address those weaknesses identified previously. An
Object-Relational database adds features associated with
object-oriented systems to the Relational Data Model. In
essence ORDBMSs are an attempt to add OO to Tables.
Major Difference Between an OODBMS and an ORDBMS:
OODBMSs try to add DBMS functionality to one or more
OO programming languages. [Revolutionary in that they
abandon SQL]
ORDBMSs try to add richer data types and OO features to a relational DBMS
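A brief sketch in SQL:1999-style object-relational syntax (exact support and syntax vary considerably between products; the type, table, and column names are assumptions):

-- A user-defined structured type.
CREATE TYPE ADDRESS_T AS (
    STREET VARCHAR(30),
    CITY   VARCHAR(20)
) NOT FINAL;

-- A relational table with a column of the structured type.
CREATE TABLE CUSTOMER (
    CUST_NO CHAR(6) PRIMARY KEY,
    NAME    VARCHAR(40),
    ADDRESS ADDRESS_T
);

-- Accessing an attribute of the structured type.
SELECT C.NAME FROM CUSTOMER C WHERE C.ADDRESS.CITY = 'CHENNAI';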
In a traditional mainframe architecture, the combination of processing power and data storage is located at only one site, so implementing a distributed database is not possible.
Although client-server systems are usually identified with
distributed data storage, there is no requirement for data
storage to be distributed in client-server environments - data
may be centralized on one mainframe, distributed widely
throughout the organization, or anything in between. Many
organizations have seen the ability to move to client-server as
an opportunity to replace their expensive mainframe data
centers with less expensive minicomputers and
microcomputers. Such a strategy has come to be known as
downsizing or rightsizing.
“Client/server systems operate in a networked
environment, splitting the processing of an application between
a front-end client and a back-end processor.“
• Client and server may reside on same computer
• Both are intelligent and programmable
Application Logic Components:
• Presentation logic
Input
Output
• Processing logic
I/O processing
Business rules
Data management
• Storage logic
Data storage and retrieval
DBMS functions
File Server Architecture:
• “A file server is a device that manages file operations and
is shared by each of the client PCs."
• Fat client: does most processing
Limitations:
• Whole file or table transferred to client
• Client must have full version of DBMS
• Each client DBMS must manage database integrity
Database Server Architecture:
• Client workstation:
user interface, presentation logic, data processing
logic, business rules logic
• Database server:
database storage, access, and processing
Advantages: less traffic, more control over data
• Stored procedures: first use of business logic at database
server
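As an illustration of business logic held at the database server, a minimal stored-procedure sketch in SQL/PSM-style syntax (dialects differ noticeably, and the table, column, and procedure names are hypothetical):

-- Business rule kept at the server: transfer funds between two accounts.
CREATE PROCEDURE TRANSFER_FUNDS (IN FROM_ACC CHAR(6), IN TO_ACC CHAR(6), IN AMOUNT DECIMAL(10,2))
BEGIN
    UPDATE ACCOUNT SET BALANCE = BALANCE - AMOUNT WHERE ACC_NO = FROM_ACC;
    UPDATE ACCOUNT SET BALANCE = BALANCE + AMOUNT WHERE ACC_NO = TO_ACC;
END;

-- A client then issues a single call instead of shipping whole tables:
CALL TRANSFER_FUNDS('A101', 'A202', 500.00);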
Three-Tier Architectures:
• Application server in addition to client and database server
• Thin clients: do less processing
• Application server contains “standard” programs
• Benefits:
scalability
technological flexibility
lower long-term costs
better match business needs
improved customer service
competitive advantage
reduced risk
Characteristics of a Client
• Typically interacts directly with end-users using a
graphical user interface
Characteristics of a Server
• Receiver of request which is sent by client is known as
server
• Passive (slave)
• Waits for requests from clients
• Upon receipt of requests, processes them and then serves replies
• Usually accepts connections from a large number of
clients
• Typically does not interact directly with end-users
Examples of Client Server Databases: Oracle, Sybase, SQL Server, Informix, etc.
7.1 Knowledge based Management Systems
Commercial relational DBMSs are tailored to efficiently
support fixed format data models in what is known as data
management. Nevertheless, the growing demands of data analysis are pushing the technological frontier so that two other dimensions can be supported by such systems: object management and knowledge management.
7.2 Definition and importance of Knowledge
Knowledge:
Knowledge is defined variously as:
i. Expertise and skills acquired by a person through experience or education; the theoretical or practical understanding of a subject
ii. What is known in a particular field or in total; facts and
information
iii. Awareness or familiarity gained by experience of a fact or
situation.
There is however no single agreed definition of
knowledge presently, nor any prospect of one, and there remain
numerous competing theories.
Knowledge acquisition involves complex cognitive
processes like perception, learning, communication, association
and reasoning. The term knowledge is also used to mean the confident understanding of a subject, with the ability to use it for a specific purpose.
7.3 Difference of KBMS and DBMS
Knowledge Based Systems:
Knowledge-based expert systems, or simply expert
systems, use human knowledge to solve problems that normally
would require human intelligence. These expert systems
represent the expertise knowledge as data or rules within the
computer. These rules and data can be called upon when
needed to solve problems. Books and manuals have a
tremendous amount of knowledge but a human has to read and
interpret the knowledge for it to be used. Conventional computer
programs perform tasks using conventional decision-making
logic containing little knowledge other than the basic algorithm for
solving that specific problem and the necessary boundary
conditions. This program knowledge is often embedded as part
of the programming code, so that as the knowledge changes, the
program has to be changed and then rebuilt. Knowledge-based
systems collect the small fragments of human know-how into a
knowledge-base which is used to reason through a problem,
using the knowledge that is appropriate. A different problem,
within the domain of the knowledge-base, can be solved using
the same program without reprogramming. The ability of these
systems to explain the reasoning process through back-traces
and to handle levels of confidence and uncertainty provides an
additional feature that conventional programming doesn’t
handle.
A knowledge base is a special kind of database for
knowledge management. It provides the means for the
computerized collection, organization, and retrieval of
knowledge. An active area of research in artificial intelligence is
knowledge representation. Early work in Artificial Intelligence (AI) focused on techniques such as representation and problem-solving; scant attention was paid to the issues on which database (DB) research has focused (e.g., data sharing, query optimization, transaction processing).
Knowledge base systems, also known as expert systems, are a facet of Artificial Intelligence (AI). AI is a sub-field of computer science that focuses on the development of intelligent software and hardware systems that emulate human reasoning techniques and capabilities. Knowledge base systems emulate the decision-making processes of humans and are one of the most commercially successful AI technologies. These systems are used in a variety of applications for business, science and engineering. Business applications capture a company's critical business knowledge and utilize it for decision support.
Knowledge management entails the ability to store "rules" (as defined in first-order logic) that are part of the semantics of an application. These rules allow the derivation of data that is not directly stored in the database. A number of application domains would benefit from knowledge-management capabilities, and, therefore, a simple, powerful, and efficient mechanism to add the knowledge dimension to an off-the-shelf DBMS can be rather useful.
Formally, a knowledge-base management system (KBMS) is a system that:
• Provides support for efficient access, transaction management, and all other functionality associated with DBMSs.
• Provides a single, declarative language to serve the roles played by both the data manipulation language and the host language in a DBMS.