Student Notes
Module Structure
Data and Access Control
Database Security
Discretionary Access Control
Multilevel Access Control
Distributed Access Control
View Management
Views in Centralized DBMSs
Views in Distributed DBMSs
Maintenance of Materialized Views
Data Security
• Data security is an important function of a database system that protects data against
unauthorized access.
• Data security includes two aspects:
– Data protection and
– Access control.
Data Protection
• Data protection is required to prevent unauthorized users from understanding the physical
content of data.
• This function is typically provided by file systems in the context of centralized and
distributed operating systems.
• The main data protection approach is data encryption, which is useful both for information
stored on disk and for information exchanged on a network.
• Encrypted (encoded) data can be decrypted (decoded) only by authorized users who “know”
the code.
Access Control
• Access control must guarantee that only authorized users perform operations they are
allowed to perform on the database.
• Many different users may have access to a large collection of data under the control of a
single centralized or distributed system.
• The centralized or distributed DBMS must thus be able to restrict access to a subset of
the database to a subset of the users.
• Access control in database systems differs in several aspects from that in traditional file
systems.
• Authorizations must be refined so that different users have different rights on the same
database objects.
DISTRIBUTED DATA SYSTEMS - SSZG554
• This requirement implies the ability to specify subsets of objects more precisely than by
name and to distinguish between groups of users.
• In addition, the decentralized control of authorizations is of particular importance in a
distributed context.
• In relational systems, authorizations can be uniformly controlled by database administrators
using high-level constructs.
• There are two main approaches to database access control. The first approach is called
discretionary and has long been provided by DBMSs.
• Discretionary access control (or authorization control) defines access rights based on the
users, the type of access (e.g., SELECT, UPDATE) and the objects to be accessed.
• The second approach, called mandatory or multilevel, further increases security by restricting
access to classified data to cleared users.
• Support of multilevel access control by major DBMSs is more recent and stems from
increased security threats coming from the Internet.
• Authorization control can be characterized based on who (the grantors) can grant the rights.
• In its simplest form, the control is centralized: a single user or user class, the database
administrators, has all privileges on the database objects and is the only one allowed to use
the GRANT and REVOKE statements.
• A more flexible but complex form of control is decentralized: the creator of an object
becomes its owner and is granted all privileges on it.
• In particular, there is the additional operation type GRANT, which transfers all the rights of
the grantor performing the statement to the specified subjects.
• Therefore, the person receiving the right (the grantee) may subsequently grant privileges on
that object.
• The main difficulty with this approach is that the revoking process must be recursive.
• For example, suppose A granted B, who in turn granted C, the GRANT privilege on object O.
If A wants to revoke all the privileges of B on O, all the privileges of C on O must also be
revoked.
• To perform revocation, the system must maintain a hierarchy of grants per object where the
creator of the object is the root.
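The recursive revocation described above can be sketched as follows. This is a minimal illustration, not how any particular DBMS implements it: the class `GrantTree` and its method names are hypothetical, and it assumes each subject receives the privilege from at most one grantor, so the grants on an object form a strict tree.

```python
from collections import defaultdict


class GrantTree:
    """Hierarchy of grants for ONE object; the object's creator is the root."""

    def __init__(self, owner):
        self.owner = owner
        self.holders = {owner}
        # grantor -> set of grantees that received the privilege from that grantor
        self.granted_by = defaultdict(set)

    def grant(self, grantor, grantee):
        if grantor not in self.holders:
            raise PermissionError(f"{grantor} holds no privilege to grant")
        if grantee in self.holders:
            return  # assumption: keep a strict tree, ignore a second grant
        self.holders.add(grantee)
        self.granted_by[grantor].add(grantee)

    def revoke(self, grantor, grantee):
        """Revoke grantee's privilege and, recursively, everything it granted."""
        if grantee not in self.granted_by[grantor]:
            raise ValueError(f"{grantor} did not grant to {grantee}")
        self.granted_by[grantor].remove(grantee)
        self._drop(grantee)

    def _drop(self, subject):
        # Remove the whole subtree rooted at subject.
        for child in list(self.granted_by.pop(subject, set())):
            self._drop(child)
        self.holders.discard(subject)


# A grants to B, B grants to C, then A revokes B: C loses the privilege too.
tree = GrantTree(owner="A")
tree.grant("A", "B")
tree.grant("B", "C")
tree.revoke("A", "B")
remaining = sorted(tree.holders)
```

A real system would also have to handle a subject that received the same privilege from several grantors, which this sketch deliberately leaves out.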
• The privileges of the subjects over objects are recorded in the catalog (directory) as
authorization rules.
• The most convenient approach is to consider all the privileges as an authorization matrix, in
which a row defines a subject, a column an object, and the matrix entry for a pair (subject,
object) gives the authorized operations.
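As a rough illustration of the authorization matrix, the sketch below stores it sparsely as a dictionary keyed by (subject, object). The subjects, objects, and operation sets are made up for the example, and `is_authorized` is a hypothetical helper name.

```python
# Rows are subjects, columns are objects, entries are sets of authorized
# operations. All entries below are illustrative only.
AUTH_MATRIX = {
    ("A", "R"): {"SELECT", "UPDATE"},
    ("A", "S"): {"SELECT", "UPDATE"},
    ("B", "S"): {"SELECT"},
}


def is_authorized(subject, obj, operation):
    """Return True if the matrix entry for (subject, obj) allows the operation."""
    return operation in AUTH_MATRIX.get((subject, obj), set())


a_can_update_r = is_authorized("A", "R", "UPDATE")
b_can_read_r = is_authorized("B", "R", "SELECT")
```

A missing entry is treated as an empty set of operations, so anything not explicitly granted is denied.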
• Consider user A who has authorized access to relations R and S and user B who has
authorized access to relation S only.
• If B somehow manages to modify an application program used by A so that it writes R data
into S, then B can read unauthorized data without violating authorization rules.
• Multilevel access control further improves security by defining different security levels for
both subjects and data objects.
• The security levels are Top Secret (TS), Secret (S), Confidential (C) and Unclassified (U),
ordered as TS > S > C > U, where “>” means “more secure”.
Access in read and write modes by subjects is restricted by two simple rules:
1. A subject S is allowed to read an object of security level l only if level(S) >= l.
2. A subject S is allowed to write an object of security level l only if level(S) <= l.
• Rule 1 (called “no read up”) protects data from unauthorized disclosure, i.e., a subject at a
given security level can only read objects at the same or lower security levels.
• For instance, a subject with secret clearance cannot read top-secret data.
• Rule 2 (called “no write down”) protects data from unauthorized change, i.e., a subject at a
given security level can only write objects at the same or higher security levels.
• For instance, a subject with top-secret clearance can only write top-secret data but cannot
write secret data.
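The two rules can be checked mechanically once the levels are given an order. The sketch below is only an illustration: the names `Level`, `can_read`, and `can_write` are hypothetical, and the integer values merely encode TS > S > C > U.

```python
from enum import IntEnum


class Level(IntEnum):
    """Security levels, ordered so that a larger value means "more secure"."""
    U = 0
    C = 1
    S = 2
    TS = 3


def can_read(subject_level, object_level):
    # Rule 1, "no read up": read only at the same or a lower level.
    return subject_level >= object_level


def can_write(subject_level, object_level):
    # Rule 2, "no write down": write only at the same or a higher level.
    return subject_level <= object_level


secret_reads_top_secret = can_read(Level.S, Level.TS)
top_secret_writes_secret = can_write(Level.TS, Level.S)
```

Both example calls return False, matching the two "for instance" cases in the notes.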
• In the relational model, data objects can be relations, tuples or attributes.
• Thus, a relation can be classified at different levels:
– Relation - all tuples in the relation have the same security level
– Tuple - every tuple has a security level
– Attribute - every distinct attribute value has a security level
• A classified relation is thus called a multilevel relation to reflect that it will appear
differently (with different data) to subjects with different clearances.
• A multilevel relation classified at the tuple level can be represented by adding a security
level attribute to each tuple.
• Similarly, a multilevel relation classified at attribute level can be represented by adding a
corresponding security level to each attribute.
• For example, a multilevel relation PROJ* can be obtained from relation PROJ by classifying
it at the attribute level.
• The entire relation also has a security level, which is the lowest security level of any data it
contains.
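To make the idea of attribute-level classification concrete, here is a small sketch. The rows of `PROJ_STAR`, its attribute names, and the filtering policy in `visible` (hide a tuple whose key is above the clearance, show other hidden values as None) are all assumptions made for illustration and are not taken from these notes.

```python
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}

# Each tuple maps an attribute to (value, security level). Hypothetical data.
PROJ_STAR = [
    {"PNO": ("P1", "C"), "PNAME": ("Instrumentation", "C"), "BUDGET": (150000, "C")},
    {"PNO": ("P2", "C"), "PNAME": ("DB Develop.", "C"), "BUDGET": (135000, "S")},
    {"PNO": ("P3", "S"), "PNAME": ("CAD/CAM", "S"), "BUDGET": (250000, "S")},
]


def visible(relation, clearance):
    """Return the relation as it appears to a subject with the given clearance."""
    limit = LEVELS[clearance]
    result = []
    for row in relation:
        if LEVELS[row["PNO"][1]] > limit:
            continue  # assumed policy: a tuple with a hidden key is not shown
        result.append({
            attr: (value if LEVELS[level] <= limit else None)
            for attr, (value, level) in row.items()
        })
    return result


def relation_level(relation):
    """Lowest security level of any data value in the relation."""
    levels = [level for row in relation for (_, level) in row.values()]
    return min(levels, key=LEVELS.get)


confidential_view = visible(PROJ_STAR, "C")
whole_relation_level = relation_level(PROJ_STAR)
```

With this data, a confidential subject sees P1 in full and P2 without its budget, while a secret subject sees all three tuples.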
• If group information as well as access rules are fully replicated at all sites, the enforcement
of access rights is similar to that of a centralized system.
View Management
• One of the main advantages of the relational model is that it provides full logical data
independence.
• External schemas enable user groups to have their particular view of the database.
• In a relational system, a view is a virtual relation, defined as the result of a query on base
relations (or real relations), but not materialized like a base relation, which is stored in the
database.
• A view is a dynamic window in the sense that it reflects all updates to the database.
• An external schema can be defined as a set of views and/or base relations.
• Besides their use in external schemas, views are useful for ensuring data security in a simple
way.
• By selecting a subset of the database, views hide some data.
• If users may only access the database through views, they cannot see or manipulate the
hidden data, which are therefore secure.
• In a distributed DBMS, a view can be derived from distributed relations, and the access to a
view requires the execution of the distributed query corresponding to the view definition.
• An important issue in a distributed DBMS is to make view materialization efficient.
• The single effect of a view definition statement, such as the one defining the view SYSAN, is
the storage of the view definition in the catalog.
• No other information needs to be recorded.
• Therefore, the result of the query defining the view is not produced.
• However, the view SYSAN can be manipulated as a base relation.
• The query “Find the names of all the system analysts with their project number and
responsibility” can be expressed as a join of the view SYSAN and relation
ASG(ENO,PNO,RESP,DUR) on ENO.
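The view definition and the query themselves are not reproduced in these notes. The sketch below is one plausible reconstruction, run on SQLite through Python's `sqlite3` module. It assumes a base relation EMP(ENO, ENAME, TITLE), inferred from the tuple (201, Smith, Syst. Anal.) mentioned later, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- EMP schema is an assumption; ASG schema is the one given in the notes.
    CREATE TABLE EMP (ENO INTEGER PRIMARY KEY, ENAME TEXT, TITLE TEXT);
    CREATE TABLE ASG (ENO INTEGER, PNO TEXT, RESP TEXT, DUR INTEGER);
    -- Hypothetical sample rows.
    INSERT INTO EMP VALUES (1, 'Doe', 'Elect. Eng.'), (2, 'Smith', 'Syst. Anal.');
    INSERT INTO ASG VALUES (1, 'P1', 'Manager', 12), (2, 'P1', 'Analyst', 24);

    CREATE VIEW SYSAN AS
        SELECT ENO, ENAME FROM EMP WHERE TITLE = 'Syst. Anal.';
""")

# Only the definition is stored: SQLite keeps it as text in its catalog table.
definition = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'view' AND name = 'SYSAN'"
).fetchone()[0]

# The view is then queried like a base relation.
rows = conn.execute("""
    SELECT ENAME, PNO, RESP
    FROM SYSAN, ASG
    WHERE SYSAN.ENO = ASG.ENO
""").fetchall()
```

Here `sqlite_master` plays the role of the catalog: creating the view adds one row holding the definition text and produces no result tuples.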
• Mapping a query expressed on views into a query expressed on base relations can be done
by query modification.
• With this technique, the variables are changed to range on base relations and the query
qualification is merged with the view qualification.
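• The preceding query can be modified into a query on the base relations EMP and ASG, in which
the view qualification on TITLE is added to the query qualification.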
• The modified query is expressed on base relations and can therefore be processed by the
query processor.
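Query modification can be checked on a small example: a query on the view and its hand-modified form on base relations should return the same tuples. The sketch below uses SQLite through `sqlite3`. The EMP schema, the sample rows, and the exact text of both queries are assumptions, since the original statements are not reproduced in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema and hypothetical rows.
    CREATE TABLE EMP (ENO INTEGER PRIMARY KEY, ENAME TEXT, TITLE TEXT);
    CREATE TABLE ASG (ENO INTEGER, PNO TEXT, RESP TEXT, DUR INTEGER);
    INSERT INTO EMP VALUES (1, 'Doe', 'Elect. Eng.'), (2, 'Smith', 'Syst. Anal.');
    INSERT INTO ASG VALUES (1, 'P1', 'Manager', 12), (2, 'P1', 'Analyst', 24);
    CREATE VIEW SYSAN AS
        SELECT ENO, ENAME FROM EMP WHERE TITLE = 'Syst. Anal.';
""")

# Query expressed on the view.
QUERY_ON_VIEW = """
    SELECT ENAME, PNO, RESP
    FROM SYSAN, ASG
    WHERE SYSAN.ENO = ASG.ENO
    ORDER BY ENAME, PNO
"""

# Modified query: the variable now ranges over EMP, and the view
# qualification (TITLE = 'Syst. Anal.') is merged into the WHERE clause.
QUERY_ON_BASE = """
    SELECT ENAME, PNO, RESP
    FROM EMP, ASG
    WHERE EMP.ENO = ASG.ENO AND TITLE = 'Syst. Anal.'
    ORDER BY ENAME, PNO
"""

view_rows = conn.execute(QUERY_ON_VIEW).fetchall()
base_rows = conn.execute(QUERY_ON_BASE).fetchall()
```

The rewrite depends only on the view definition, not on the data, which is why it can be done at compile time.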
• It is important to note that view processing can be done at compile time.
• The view mechanism can also be used for refining the access controls to include subsets of
objects.
• To specify any user from whom one wants to hide data, the keyword USER generally refers
to the logged-on user identifier.
• The view ESAME restricts the access by any user to those employees having the same title.
• If the user who creates ESAME is an electrical engineer, as in this case, the view represents
the set of all electrical engineers.
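A view in the spirit of ESAME can be sketched in SQLite, with one caveat: SQLite has no USER keyword, so the sketch stands in for it with a one-row table `LOGGED_ON` holding the current user's employee number. That table, the helper `names_visible_to`, the EMP schema, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP (ENO INTEGER PRIMARY KEY, ENAME TEXT, TITLE TEXT);
    -- Stand-in for the USER keyword: holds the logged-on user's ENO.
    CREATE TABLE LOGGED_ON (USER_ENO INTEGER);
    -- Hypothetical sample rows.
    INSERT INTO EMP VALUES
        (1, 'Doe', 'Elect. Eng.'),
        (2, 'Smith', 'Syst. Anal.'),
        (3, 'Lee', 'Elect. Eng.');

    -- E1 is the logged-on user; E2 ranges over employees with the same title.
    CREATE VIEW ESAME AS
        SELECT E2.ENO, E2.ENAME, E2.TITLE
        FROM EMP E1, EMP E2, LOGGED_ON
        WHERE E1.TITLE = E2.TITLE AND E1.ENO = LOGGED_ON.USER_ENO;
""")


def names_visible_to(eno):
    """Simulate a login as employee eno and read the view."""
    conn.execute("DELETE FROM LOGGED_ON")
    conn.execute("INSERT INTO LOGGED_ON VALUES (?)", (eno,))
    return [name for (name,) in
            conn.execute("SELECT ENAME FROM ESAME ORDER BY ENAME")]


seen_by_engineer = names_visible_to(1)
seen_by_analyst = names_visible_to(2)
```

The same view definition yields a different set of employees for each user, which is how the mechanism refines access control to subsets of an object.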
• Views can be defined using arbitrarily complex relational queries involving selection,
projection, join, aggregate functions, and so on.
• All views can be interrogated as base relations, but not all views can be manipulated as such.
• Updates through views can be handled automatically only if they can be propagated
correctly to the base relations.
• We can classify views as being updatable and not updatable.
• A view is updatable only if the updates to the view can be propagated to the base relations
without ambiguity.
• The view SYSAN above is updatable; the insertion, for example, of a new system analyst
<201, Smith> will be mapped into the insertion of a new employee (201, Smith, Syst. Anal.).
• If attributes other than TITLE were hidden by the view, they would be assigned null values.
• By contrast, a view that projects only employee names and responsibilities from the join of
EMP and ASG is not updatable.
• The deletion, for example, of the tuple <Smith, Analyst> cannot be propagated, since it is
ambiguous.
• Deletions of Smith in relation EMP or analyst in relation ASG are both meaningful, but the
system does not know which is correct.
• Current systems are very restrictive about supporting updates through views.
• In these systems, views can be updated only if they are derived from a single relation by
selection and projection.
• This precludes views defined by joins, aggregates, and so on.
• It is interesting to note that, in principle, views derived by join are updatable if they include
the keys of the base relations.
• The query processor maps the distributed query into a query on physical fragments.
• Evaluating views derived from distributed relations may be costly.
• In a given organization, it is likely that many users access the same view, which must be
recomputed for each user.
• An alternative solution is to avoid view derivation by maintaining actual versions of the
views, called materialized views.
• A materialized view stores the tuples of a view in a database relation, like the other
database tuples, possibly with indices.
• Thus, access to a materialized view is much faster than deriving the view, in particular, in a
distributed DBMS where base relations can be remote.
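The difference between a virtual view and a materialized one can be seen in a short sketch. SQLite has no materialized views, so the example emulates one with an ordinary table filled from the view. The table name `SYSAN_MAT`, the EMP schema, and the rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema and hypothetical rows.
    CREATE TABLE EMP (ENO INTEGER PRIMARY KEY, ENAME TEXT, TITLE TEXT);
    INSERT INTO EMP VALUES (1, 'Doe', 'Elect. Eng.'), (2, 'Smith', 'Syst. Anal.');

    -- Virtual view: only the definition is stored.
    CREATE VIEW SYSAN AS
        SELECT ENO, ENAME FROM EMP WHERE TITLE = 'Syst. Anal.';

    -- Emulated materialized view: the tuples are stored, possibly with an index.
    CREATE TABLE SYSAN_MAT AS SELECT ENO, ENAME FROM SYSAN;
    CREATE INDEX SYSAN_MAT_ENO ON SYSAN_MAT (ENO);
""")

# A later update to the base relation ...
conn.execute("INSERT INTO EMP VALUES (3, 'Lee', 'Syst. Anal.')")

# ... is reflected by the virtual view but not by the stored copy.
virtual_count = conn.execute("SELECT COUNT(*) FROM SYSAN").fetchone()[0]
stored_count = conn.execute("SELECT COUNT(*) FROM SYSAN_MAT").fetchone()[0]
```

Reading `SYSAN_MAT` avoids re-evaluating the defining query, but the stored copy stays stale until it is refreshed.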
• The views managed with these strategies are also called snapshots.
• The second question (how to refresh a view) is an important efficiency issue.
• The simplest way to refresh a view is to recompute it from scratch using the base data.
• In some cases, this may be the most efficient strategy, e.g., if a large subset of the base data
has been changed.
• However, there are many cases where only a small subset of the view needs to be changed.
• In these cases, a better strategy is to compute the view incrementally, by computing only the
changes to the view.
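The two refresh strategies can be contrasted in a small sketch for a selection-projection view. The function names and sample rows are hypothetical. The sketch also assumes the view keeps the key ENO, so that each view tuple comes from exactly one base tuple and plain set operations are safe.

```python
# Base tuples are (ENO, ENAME, TITLE); the view keeps (ENO, ENAME) of analysts.
def qualifies(row):
    return row[2] == "Syst. Anal."


def project(row):
    return row[:2]


def recompute(base):
    """Full refresh: derive the view from scratch from the base data."""
    return {project(r) for r in base if qualifies(r)}


def refresh_incrementally(view, inserted, deleted):
    """Incremental refresh: apply only the changes implied by the base deltas."""
    view -= {project(r) for r in deleted if qualifies(r)}
    view |= {project(r) for r in inserted if qualifies(r)}


# Hypothetical data.
base = {(1, "Doe", "Elect. Eng."), (2, "Smith", "Syst. Anal.")}
view = recompute(base)

inserted = {(3, "Lee", "Syst. Anal."), (4, "Kim", "Elect. Eng.")}
deleted = {(2, "Smith", "Syst. Anal.")}
base = (base - deleted) | inserted

refresh_incrementally(view, inserted, deleted)
matches_full_refresh = view == recompute(base)
```

The incremental refresh touches only the inserted and deleted tuples, whereas `recompute` scans the whole base relation. Views involving joins or aggregates need more elaborate change propagation than this sketch shows.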