
UNIT 5

DATABASE SECURITY
Database Security Issues – Discretionary Access Control Based on Granting and Revoking Privileges – Mandatory Access Control and Role-Based Access Control for Multilevel Security – SQL Injection – Statistical Database Security – Flow Control – Encryption and Public Key Infrastructures – Preserving Data Privacy – Challenges to Maintaining Database Security – Database Survivability – Oracle Label-Based Security.

10 Common Database Security Issues


Databases are very attractive targets for hackers because they contain valuable and sensitive information. This can range from financial or intellectual property to corporate data and personal user data. Cybercriminals can profit by breaching the servers of companies and damaging the databases in the process. Thus, database security testing is a must.

There are numerous incidents where hackers have targeted companies dealing with personal customer details. The Equifax, Facebook, Yahoo, Apple, Gmail, Slack, and eBay data breaches were in the news in the past few years, just to name a few. Such rampant activity has raised the need for cybersecurity software and web app testing, which aim to protect the data that people share with online businesses. Applying these measures makes it far harder for attackers to reach the records and documents held in online databases. Complying with the GDPR also helps a great deal on the way to strengthening user data protection.

Here is a list of the top 10 vulnerabilities commonly found in database-driven systems, along with tips on how to eliminate them.

No Security Testing Before Deployment


One of the most common causes of database weaknesses is negligence at the deployment stage of the development process. Although functional testing is conducted to ensure performance, it cannot show whether the database is doing something it is not supposed to do. Thus, it is important to test website security with different types of tests before complete deployment.

Poor Encryption and Data Breaches Come Together


You might consider the database a back-end part of your setup and focus more on the elimination of Internet-borne threats. It does not really work that way. Databases expose network interfaces, and traffic over them can easily be intercepted by hackers if your software security is poor. To avoid such situations, it is important to use TLS- or SSL-encrypted communication channels.

Feeble Cybersecurity Software = Broken Database


Case in point: the Equifax data breach. Company representatives admitted that the data of 147 million consumers was compromised, so the consequences were huge. The case proved how important cybersecurity software is for defending a database. Unfortunately, whether due to a lack of resources or a lack of time, many businesses do not bother to conduct user data security testing and do not apply regular patches to their systems, thus leaving them susceptible to data leaks.

Stolen Database Backups


There are two kinds of threats to your databases: external and internal. In some cases, companies struggle with internal threats even more than with external ones. Business owners can never be 100% sure of their employees' loyalty, no matter what computer security software they use and how responsible the employees seem to be. Anybody who has access to sensitive data can steal it and sell it to third-party organizations for profit. However, there are ways to reduce the risk: encrypt database backups, implement strict security standards, apply fines in case of violations, use cybersecurity software, and continuously increase your teams' awareness via corporate meetings and personal consulting.

Flaws in Features as a Database Security Issue
Databases can be hacked through flaws in their features. Attackers can abuse legitimate credentials and compel the system to run arbitrary code. Although it sounds complex, such access is actually gained through basic flaws inherent in the features. Security testing can protect the database from third-party access; in addition, the simpler its functional structure, the better the chances of ensuring good protection for each database feature.

Weak and Complex DB Infrastructure


Hackers do not generally take control of an entire database in one go. They opt to play a game of hopscotch: they find a particular weakness within the infrastructure and use it to their advantage, launching a string of attacks until they finally reach the back end. Security software is not capable of fully protecting your system from such manipulation. Even if you pay attention to specific feature flaws, it is important not to leave the overall database infrastructure too complex; when it is complex, there is a chance you will forget or neglect to check and fix its weaknesses. Thus, it is important that every department maintains the same degree of control and segregates its systems to decentralize focus and reduce possible risks.

Limitless Administration Access = Poor Data Protection


A smart division of duties between administrators and users ensures that full access is limited to experienced teams. Users who are not involved in the database administration process will then experience more difficulty if they try to steal any data. Limiting the number of privileged user accounts is even better, because hackers will face more problems in gaining control over the database as well. This applies to any type of business, but it matters most in the financial industry. Thus, it is good not only to care about who has access to sensitive data but also to perform banking software testing before releasing such software.

Test Website Security to Avoid SQL Injections


This is a major roadblock on the way to database protection. Injection attacks target applications, and database administrators are forced to clean up the mess of malicious code and variables inserted into query strings. Web application security testing and firewall implementation are the best options for protecting web-facing databases. Although this is a big problem for online businesses, it is not one of the major mobile security challenges, which is a great advantage for owners who only have a mobile version of their application.

Inadequate Key Management


Encrypting sensitive data is good, but it is also important to pay attention to who exactly has access to the keys. Since keys are often stored on somebody's hard drive, they are an easy target for whoever wants to steal them. If you leave such important software security tools unguarded, be aware that this makes your system vulnerable to attack.

Irregularities in Databases
It is inconsistencies that lead to vulnerabilities. Test website security and verify data protection on a regular basis. If any discrepancies are found, they have to be fixed as soon as possible. Your developers should be aware of any threat that might affect the database. Though this is not easy work, proper tracking keeps the information secure.

Despite being aware of the need for security testing, numerous businesses still fail to implement it. Fatal mistakes usually appear during the development stages, but also during app integration or while patching and updating the database. Cybercriminals take advantage of these failures to make a profit, and as a result your business is at risk.

Discretionary Access Control Based on Granting and Revoking Privileges


The typical method of enforcing discretionary access control in a database system is based on the granting and revoking
of privileges. Let us consider privileges in the context of a relational DBMS. In particular, we will discuss a system of
privileges somewhat similar to the one originally developed for the SQL language (see Chapters 4 and 5). Many current
relational DBMSs use some variation of this technique. The main idea is to include statements in the query language that
allow the DBA and selected users to grant and revoke privileges.

1. Types of Discretionary Privileges


In SQL2 and later versions, the concept of an authorization identifier is used to refer, roughly speaking, to a user account (or
group of user accounts). For simplicity, we will use the words user or account interchangeably in place
of authorization identifier. The DBMS must provide selective access to each relation in the database based on specific
accounts. Operations may also be controlled; thus, having an account does not necessarily entitle the account holder to all the
functionality provided by the DBMS. Informally, there are two levels for assigning privileges to use the database system:
The account level. At this level, the DBA specifies the particular privileges that each account holds independently of
the relations in the database.
The relation (or table) level. At this level, the DBA can control the privilege to access each individual relation or view in the
database.
In SQL, the following types of privileges can be granted on each individual relation R:
SELECT (retrieval or read) privilege on R. This gives the account the privilege to use the SELECT statement to retrieve tuples from R.
Modification privileges on R. This gives the account the capability to modify the tuples of R through the UPDATE, DELETE, and INSERT statements.
References privilege on R. This gives the account the capability to reference (or refer to) a relation R when specifying integrity constraints. This privilege can also be restricted to specific attributes of R.
Notice that to create a view, the account must have the SELECT privilege on all relations involved in the view definition in
order to specify the query that corresponds to the view.
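As a small sketch (the table, column, and account names are hypothetical), the REFERENCES privilege can itself be granted, optionally restricted to a single attribute:

GRANT REFERENCES (Dnumber) ON DEPARTMENT TO A5;
-- A5 may now declare foreign keys that refer to DEPARTMENT.Dnumber,
-- even though A5 has no privilege to SELECT from DEPARTMENT.

The column-restricted form of GRANT REFERENCES is part of standard SQL, though support varies across DBMSs.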

2. Specifying Privileges through the Use of Views


The mechanism of views is an important discretionary authorization mechanism in its own right. For example, if the owner A of a relation R wants another account B to be able to retrieve only some fields of R, then A can create a view V of R that includes only those attributes and then grant SELECT on V to B. The same applies to limiting B to retrieving only certain tuples of R: V can be defined by a query that selects only those tuples of R that A wants to allow B to access.
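A minimal sketch of this idea, using hypothetical names (relation R with attributes X, Y, and Z, owned by account A):

-- Expose only two columns of R through a view:
CREATE VIEW V AS
SELECT X, Y
FROM R;

-- B may query V but receives no privilege on the base relation R:
GRANT SELECT ON V TO B;

A tuple-restricted view works the same way: the view's defining query simply adds a WHERE clause selecting only the rows B may see.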

3. Revoking of Privileges
In some cases it is desirable to grant a privilege to a user temporarily. For example, the owner of a relation may want to grant
the SELECT privilege to a user for a specific task and then revoke that privilege once the task is completed. Hence, a
mechanism for revoking privileges is needed. In SQL a REVOKE command is included for the purpose of canceling
privileges.

4. Propagation of Privileges Using the GRANT OPTION


Whenever the owner A of a relation R grants a privilege on R to another account B, the privilege can be given to B
with or without the GRANT OPTION. If the GRANT OPTION is given, this means that B can also grant that privilege on R to
other accounts. Suppose that B is given the GRANT OPTION by A and that B then grants the privilege on R to a third
account C, also with the GRANT OPTION. In this way, privileges on R can propagate to other accounts without the
knowledge of the owner of R. If the owner account A now revokes the privilege granted to B, all the privileges
that B propagated based on that privilege should automatically be revoked by the system.
It is possible for a user to receive a certain privilege from two or more sources. For example, A4 may receive a certain UPDATE privilege on R from both A2 and A3. In such a case, if A2 revokes this privilege from A4, A4 will still continue to have the privilege by virtue of having been granted it from A3. If A3 later revokes the privilege from A4, A4 totally loses the privilege. Hence, a DBMS that allows propagation of privileges must keep track of how all the privileges were granted so that revoking of privileges can be done correctly and completely.
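Standard SQL exposes this bookkeeping through the CASCADE and RESTRICT options of the REVOKE statement. A brief sketch, reusing the hypothetical accounts and relation above:

-- Revoke A4's privilege and, transitively, every grant A4 made with it:
REVOKE UPDATE ON R FROM A4 CASCADE;

-- Alternatively, refuse the revoke if dependent grants exist:
REVOKE UPDATE ON R FROM A4 RESTRICT;

The exact behavior varies by DBMS; some systems cascade by default.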

5. An Example to Illustrate Granting and Revoking of Privileges


Suppose that the DBA creates four accounts—A1, A2, A3, and A4—and wants only A1 to be able to create base relations. To
do this, the DBA must issue the following GRANT command in SQL:

GRANT CREATETAB TO A1;

The CREATETAB (create table) privilege gives account A1 the capability to create new database tables (base relations) and
is hence an account privilege. This privilege was part of earlier versions of SQL but is now left to each individual system
implementation to define.

In SQL2 the same effect can be accomplished by having the DBA issue a CREATE SCHEMA command, as follows:

CREATE SCHEMA EXAMPLE AUTHORIZATION A1;

User account A1 can now create tables under the schema called EXAMPLE. To continue our example, suppose
that A1 creates the two base relations EMPLOYEE and DEPARTMENT shown in Figure 24.1; A1 is then the owner of these
two relations and hence has all the relation privileges on each of them.

Next, suppose that account A1 wants to grant to account A2 the privilege to insert and delete tuples in both of these relations.
However, A1 does not want A2 to be able to propagate these privileges to additional accounts. A1 can issue the following
command:

GRANT INSERT, DELETE ON EMPLOYEE, DEPARTMENT TO A2;

Notice that the owner account A1 of a relation automatically has the GRANT OPTION, allowing it to grant privileges on the
relation to other accounts. However, account A2 cannot grant INSERT and DELETE privileges on
the EMPLOYEE and DEPARTMENT tables because A2 was not given the GRANT OPTION in the preceding command.

Next, suppose that A1 wants to allow account A3 to retrieve information from either of the two tables and also to be able to
propagate the SELECT privilege to other accounts. A1 can issue the following command:

GRANT SELECT ON EMPLOYEE, DEPARTMENT TO A3 WITH GRANT OPTION;

The clause WITH GRANT OPTION means that A3 can now propagate the privilege to other accounts by using GRANT. For
example, A3 can grant the SELECT privilege on the EMPLOYEE relation to A4 by issuing the following command:
GRANT SELECT ON EMPLOYEE TO A4;
Notice that A4 cannot propagate the SELECT privilege to other accounts because the GRANT OPTION was not given to A4.
Now suppose that A1 decides to revoke the SELECT privilege on the EMPLOYEE relation from A3; A1 then can issue this
command:
REVOKE SELECT ON EMPLOYEE FROM A3;
The DBMS must now revoke the SELECT privilege on EMPLOYEE from A3, and it must also automatically
revoke the SELECT privilege on EMPLOYEE from A4. This is because A3 granted that privilege to A4, but A3 does not have
the privilege any more.
Next, suppose that A1 wants to give back to A3 a limited capability to SELECT from the EMPLOYEE relation and wants to
allow A3 to be able to propagate the privilege. The limitation is to retrieve only the Name, Bdate, and Address attributes and
only for the tuples with Dno = 5. A1 then can create the following view:
CREATE VIEW A3EMPLOYEE AS
SELECT Name, Bdate, Address
FROM EMPLOYEE
WHERE Dno = 5;
After the view is created, A1 can grant SELECT on the view A3EMPLOYEE to A3 as follows:
GRANT SELECT ON A3EMPLOYEE TO A3 WITH GRANT OPTION;
Finally, suppose that A1 wants to allow A4 to update only the Salary attribute of EMPLOYEE; A1 can then issue the
following command:
GRANT UPDATE ON EMPLOYEE (Salary) TO A4;
The UPDATE and INSERT privileges can specify particular attributes that may be updated or inserted in a relation. Other privileges (SELECT, DELETE) are not attribute specific, because this specificity can easily be controlled by creating the appropriate views that include only the desired attributes and granting the corresponding privileges on the views. However, because updating views is not always possible (see Chapter 5), the UPDATE and INSERT privileges are given the option to specify the particular attributes of a base relation that may be updated.

6. Specifying Limits on Propagation of Privileges


Techniques to limit the propagation of privileges have been developed, although they have not yet been implemented in most
DBMSs and are not a part of SQL. Limiting horizontal propagation to an integer number i means that an account B given
the GRANT OPTION can grant the privilege to at most i other accounts.
Vertical propagation is more complicated; it limits the depth of the granting of privileges. Granting a privilege with a vertical
propagation of zero is equivalent to granting the privilege with no GRANT OPTION. If account A grants a privilege to
account B with the vertical propagation set to an integer number j > 0, this means that the account B has the GRANT
OPTION on that privilege, but B can grant the privilege to other accounts only with a vertical propagation less than j. In effect,
vertical propagation limits the sequence of GRANT OPTIONS that can be given from one account to the next based on a single
original grant of the privilege.
We briefly illustrate horizontal and vertical propagation limits—which are not available currently in SQL or other relational
systems—with an example. Suppose that A1 grants SELECT to A2 on the EMPLOYEE relation with horizontal propagation
equal to 1 and vertical propagation equal to 2. A2 can then grant SELECT to at most one account because the horizontal
propagation limitation is set to 1. Additionally, A2 cannot grant the privilege to another account except with vertical
propagation set to 0 (no GRANT OPTION) or 1; this is because A2 must reduce the vertical propagation by at least 1 when
passing the privilege to others. In addition, the horizontal propagation must be less than or equal to the originally granted horizontal propagation. For example, if account A grants a privilege to account B with the horizontal propagation set to an integer
number j > 0, this means that B can grant the privilege to other accounts only with a horizontal propagation less than or equal
to j. As this example shows, horizontal and vertical propagation techniques are designed to limit the depth and breadth of
propagation of privileges.
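Since no such syntax exists in SQL, the following is a purely hypothetical notation, shown only to make the example above concrete:

-- HYPOTHETICAL syntax; not valid in SQL or in any shipping DBMS.
GRANT SELECT ON EMPLOYEE TO A2
WITH GRANT OPTION (HORIZONTAL 1, VERTICAL 2);

-- A2 may now pass the privilege to at most one account (horizontal 1),
-- and only with a vertical limit strictly less than 2:
GRANT SELECT ON EMPLOYEE TO A5
WITH GRANT OPTION (HORIZONTAL 1, VERTICAL 1);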

Mandatory Access Control and Role-Based Access Control for Multilevel Security

The discretionary access control technique of granting and revoking privileges on relations has traditionally been the main
security mechanism for relational database systems. This is an all-or-nothing method: A user either has or does not have a
certain privilege. In many applications, an additional security policy is needed that classifies data and users based on security
classes. This approach, known as mandatory access control (MAC), would typically be combined with the discretionary
access control mechanisms. It is important to note that most commercial DBMSs currently provide mechanisms only for
discretionary access control. However, the need for multilevel security exists in government, military, and intelligence
applications, as well as in many industrial and corporate applications. Some DBMS vendors—for example, Oracle—have
released special versions of their RDBMSs that incorporate mandatory access control for government use.
Typical security classes are top secret (TS), secret (S), confidential (C), and unclassified (U), where TS is the highest level
and U the lowest. Other more complex security classification schemes exist, in which the security classes are organized in a
lattice. For simplicity, we will use the system with four security classification levels, where TS ≥ S ≥ C ≥ U, to illustrate our
discussion. The commonly used model for multilevel security, known as the Bell-LaPadula model, classifies
each subject (user, account, program) and object (relation, tuple, column, view, operation) into one of the security
classifications TS, S, C, or U. We will refer to the clearance (classification) of a subject S as class(S) and to
the classification of an object O as class(O). Two restrictions are enforced on data access based on the subject/object
classifications:
1. A subject S is not allowed read access to an object O unless class(S) ≥ class(O). This is known as the simple security
property.
2. A subject S is not allowed to write an object O unless class(S) ≤ class(O). This is known as the star property (or *-
property).
The first restriction is intuitive and enforces the obvious rule that no subject can read an object whose security classification is
higher than the subject’s security clearance. The second restriction is less intuitive. It prohibits a subject from writing an object
at a lower security classification than the subject’s security clearance. Violation of this rule would allow information to flow
from higher to lower classifications, which violates a basic tenet of multilevel security. For example, a user (subject) with TS
clearance may make a copy of an object with classification TS and then write it back as a new object with classification U,
thus making it visible throughout the system.
To incorporate multilevel security notions into the relational database model, it is common to consider attribute values and
tuples as data objects. Hence, each attribute A is associated with a classification attribute C in the schema, and each attribute
value in a tuple is associated with a corresponding security classification. In addition, in some models, a tuple
classification attribute TC is added to the relation attributes to provide a classification for each tuple as a whole. The model
we describe here is known as the multilevel model, because it allows classifications at multiple security levels. A multilevel
relation schema R with n attributes would be represented as:
R(A1, C1, A2, C2, ..., An, Cn, TC)
where each Ci represents the classification attribute associated with attribute Ai.
The value of the tuple classification attribute TC in each tuple t, which is the highest of all attribute classification values Ci within t, provides a general classification for the tuple itself, whereas each attribute classification Ci provides a finer security classification for each attribute value within the tuple.
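As an illustrative sketch only (a multilevel DBMS maintains these classifications internally; the column names here are invented), the multilevel EMPLOYEE relation used in the discussion below could be pictured as:

CREATE TABLE EMPLOYEE_MLS (
    Name            VARCHAR(30),    -- attribute A1
    C_Name          CHAR(2),        -- classification C1 of Name: 'U', 'C', 'S', or 'TS'
    Salary          DECIMAL(10,2),  -- attribute A2
    C_Salary        CHAR(2),        -- classification C2 of Salary
    Job_performance VARCHAR(15),    -- attribute A3
    C_Jp            CHAR(2),        -- classification C3 of Job_performance
    TC              CHAR(2)         -- tuple classification: highest of C_Name, C_Salary, C_Jp
);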
The apparent key of a multilevel relation is the set of attributes that would have formed the primary key in a regular (single-
level) relation. A multilevel relation will appear to contain different data to subjects (users) with different clearance levels. In
some cases, it is possible to store a single tuple in the relation at a higher classification level and produce the corresponding
tuples at a lower-level classification through a process known as filtering. In other cases, it is necessary to store two or more
tuples at different classification levels with the same value for the apparent key.
This leads to the concept of polyinstantiation, where several tuples can have the same apparent key value but have different
attribute values for users at different clearance levels.
We illustrate these concepts with the simple example of a multilevel relation shown in Figure 24.2(a), where we display the
classification attribute values next to each attribute’s value. Assume that the Name attribute is the apparent key, and consider
the query SELECT * FROM EMPLOYEE. A user with security clearance S would see the same relation shown in Figure
24.2(a), since all tuple classifications are less than or equal to S. However, a user with security clearance C would not be
allowed to see the values for Salary of ‘Brown’ and Job_performance of ‘Smith’, since they have higher classification. The
tuples would be filtered to appear as shown in Figure 24.2(b), with Salary and Job_performance appearing as null. For a user
with security clearance U, the filtering allows only the Name attribute of ‘Smith’ to appear, with all the other attributes appearing as null (Figure 24.2(c)). Thus, filtering introduces null values for attribute values whose security
classification is higher than the user’s security clearance.
In general, the entity integrity rule for multilevel relations states that all attributes that are members of the apparent key must
not be null and must have the same security classification within each individual tuple. Additionally, all other attribute values
in the tuple must have a security classification greater than or equal to that of the apparent key. This constraint ensures that a
user can see the key if the user is permitted to see any part of the tuple. Other integrity rules, called null
integrity and interinstance integrity, informally ensure that if a tuple value at some security level can be filtered (derived)
from a higher-classified tuple, then it is sufficient to store the higher-classified tuple in the multilevel relation.
To illustrate polyinstantiation further, suppose that a user with security clearance C tries to update the value
of Job_performance of ‘Smith’ in Figure 24.2 to ‘Excellent’; this corresponds to the following SQL update being submitted
by that user:
UPDATE EMPLOYEE

SET Job_performance = ‘Excellent’

WHERE Name = ‘Smith’;


Since the view provided to users with security clearance C (see Figure 24.2(b)) permits such an update, the system should not reject it; otherwise, the user could infer that some nonnull value exists for the Job_performance attribute of ‘Smith’ rather than the null value that appears. This is an example of inferring information through what is known as a covert channel, which should not be permitted in highly secure systems. However, the user should not be allowed to overwrite the existing value of Job_performance at the higher classification level. The solution is to create a polyinstantiation for the ‘Smith’ tuple at the lower classification level C, as shown in Figure 24.2(d). This is necessary since the new tuple cannot be filtered from the existing tuple at classification S.
The basic update operations of the relational model (INSERT, DELETE, UPDATE) must be modified to handle this and
similar situations, but this aspect of the problem is outside the scope of our presentation. We refer the interested reader to the
Selected Bibliography at the end of this chapter for further details.
1. Comparing Discretionary Access Control and Mandatory Access Control
Discretionary access control (DAC) policies are characterized by a high degree of flexibility, which makes them suitable for a
large variety of application domains. The main drawback of DAC models is their vulnerability to malicious attacks, such as
Trojan horses embedded in application programs. The reason is that discretionary authorization models do not impose any
control on how information is propagated and used once it has been accessed by users authorized to do so. By contrast,
mandatory policies ensure a high degree of protection—in a way, they prevent any illegal flow of information. Therefore, they
are suitable for military and high security types of applications, which require a higher degree of protection. However,
mandatory policies have the drawback of being too rigid in that they require a strict classification of subjects and objects into
security levels, and therefore they are applicable to few environments. In many practical situations, discretionary policies are
preferred because they offer a better tradeoff between security and applicability.

2. Role-Based Access Control


Role-based access control (RBAC) emerged rapidly in the 1990s as a proven technology for managing and enforcing security
in large-scale enterprise-wide systems. Its basic notion is that privileges and other permissions are associated with
organizational roles, rather than individual users. Individual users are then assigned to appropriate roles. Roles can be created
using the CREATE ROLE and DESTROY ROLE commands. The GRANT and REVOKE commands discussed in Section
24.2 can then be used to assign and revoke privileges from roles, as well as for individual users when needed. For example, a
company may have roles such as sales account manager, purchasing agent, mailroom clerk, department manager, and so on.
Multiple individuals can be assigned to each role. Security privileges that are common to a role are granted to the role name,
and any individual assigned to this role would automatically have those privileges granted.
RBAC can be used with traditional discretionary and mandatory access controls; it ensures that only authorized users in their
specified roles are given access to certain data or resources. Users create sessions during which they may activate a subset of
roles to which they belong. Each session can be assigned to several roles, but it maps to one user or a single subject only. Many
DBMSs have allowed the concept of roles, where privileges can be assigned to roles.
Separation of duties is another important requirement in various commercial DBMSs. It is needed to prevent one user from doing work that requires the involvement of two or more people, thus preventing collusion. One method by which separation of duties can be successfully implemented is mutual exclusion of roles. Two roles are said to be mutually exclusive if both roles cannot be used simultaneously by the same user. Mutual exclusion of roles can be categorized into two types,
namely authorization time exclusion (static) and runtime exclusion (dynamic). In authorization time exclusion, two roles that
have been specified as mutually exclusive cannot be part of a user’s authorization at the same time. In runtime exclusion, both
these roles can be authorized to one user but cannot be activated by the user at the same time. Another variation in mutual
exclusion of roles is that of complete and partial exclusion.
The role hierarchy in RBAC is a natural way to organize roles to reflect the organization’s lines of authority and
responsibility. By convention, junior roles at the bottom are connected to progressively senior roles as one moves up the
hierarchy. The hierarchic diagrams are partial orders, so they are reflexive, transitive, and antisymmetric. In other words, if a
user has one role, the user automatically has roles lower in the hierarchy. Defining a role hierarchy involves choosing the type
of hierarchy and the roles, and then implementing the hierarchy by granting roles to other roles. Role hierarchy can be
implemented in the following manner:
GRANT ROLE full_time TO employee_type1
GRANT ROLE intern TO employee_type2
The above are examples of granting the roles full_time and intern to two types of employees.
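A slightly fuller sketch (the role, table, and user names are hypothetical, and exact role syntax varies by DBMS):

CREATE ROLE sales_manager;

-- Privileges common to the role are granted once, to the role name:
GRANT SELECT, UPDATE ON ACCOUNTS TO sales_manager;

-- Individual users acquire those privileges by being assigned the role:
GRANT sales_manager TO alice;
GRANT sales_manager TO bob;

-- Removing a user from the role withdraws all of them in one step:
REVOKE sales_manager FROM bob;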
Another issue related to security is identity management. Identity refers to a unique name of an individual person. Since the
legal names of persons are not necessarily unique, the identity of a person must include sufficient additional information to
make the complete name unique. Authorizing this identity and managing the schema of these identities is called Identity
Management. Identity Management addresses how organizations can effectively authenticate people and manage their access
to confidential information. It has become more visible as a business requirement across all industries affecting organizations
of all sizes. Identity Management administrators constantly need to satisfy application owners while keeping expenditures
under control and increasing IT efficiency.
Another important consideration in RBAC systems is the possible temporal constraints that may exist on roles, such as the
time and duration of role activations, and timed triggering of a role by an activation of another role. Using an RBAC model is
a highly desirable goal for addressing the key security requirements of Web-based applications. Roles can be assigned to
workflow tasks so that a user with any of the roles related to a task may be authorized to execute it and may play a certain role
only for a certain duration.
RBAC models have several desirable features, such as flexibility, policy neutrality, better support for security management
and administration, and other aspects that make them attractive candidates for developing secure Web-based applications.
These features are lacking in DAC and MAC models. In addition, RBAC models include the capabilities available in traditional
DAC and MAC policies. Furthermore, an RBAC model provides mechanisms for addressing the security issues related to the
execution of tasks and workflows, and for specifying user-defined and organization-specific policies. Easier deployment over
the Internet has been another reason for the success of RBAC models.

3. Label-Based Security and Row-Level Access Control


Many commercial DBMSs currently use the concept of row-level access control, where sophisticated access control rules can
be implemented by considering the data row by row. In row-level access control, each data row is given a label, which is used
to store information about data sensitivity. Row-level access control provides finer granularity of data security by allowing the
permissions to be set for each row and not just for the table or column. Initially the user is given a default session label by the
database administrator. Levels correspond to a hierarchy of data sensitivity to exposure or corruption, with the goal of maintaining privacy or security. Labels are used to prevent unauthorized users from viewing or altering certain data. A user having a low authorization level, usually represented by a low number, is denied access to data having a higher-level number. If no such label is given to a row, a row label is automatically assigned to it depending upon the user's session label.
A policy defined by an administrator is called a Label Security policy. Whenever data affected by the policy is accessed or
queried through an application, the policy is automatically invoked. When a policy is implemented, a new column is added to each protected table. The added column contains the label for each row, reflecting the sensitivity of the row as per the
policy. Similar to MAC, where each user has a security clearance, each user has an identity in label-based security. This user’s
identity is compared to the label assigned to each row to determine whether the user has access to view the contents of that
row. However, the user can write the label value himself, within certain restrictions and guidelines for that specific row. This
label can be set to a value that is between the user’s current session label and the user’s minimum level. The DBA has the
privilege to set an initial default row label.
The Label Security requirements are applied on top of the DAC requirements for each user. Hence, the user must satisfy the
DAC requirements and then the label security requirements to access a row. The DAC requirements make sure that the user is
legally authorized to carry out that operation on the schema. In most applications, only some of the tables need label-based security. For the majority of the application tables, the protection provided by DAC is sufficient.
Security policies are generally created by managers and human resources personnel. The policies are high-level, technology
neutral, and relate to risks. Policies are a result of management instructions to specify organizational procedures, guiding
principles, and courses of action that are considered to be expedient, prudent, or advantageous. Policies are typically
accompanied by a definition of penalties and countermeasures if the policy is transgressed. These policies are then interpreted
and converted to a set of label-oriented policies by the Label Security administrator, who defines the security labels for
data and authorizations for users; these labels and authorizations govern access to specified protected objects.
Suppose a user has SELECT privileges on a table. When the user executes a SELECT statement on that table, Label Security
will automatically evaluate each row returned by the query to determine whether the user has rights to view the data. For
example, if the user has a sensitivity of 20, then the user can view all rows having a security level of 20 or lower. The level
determines the sensitivity of the information contained in a row; the more sensitive the row, the higher its security label value.
Such Label Security can be configured to perform security checks on UPDATE, DELETE, and INSERT statements as well.
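As an illustrative sketch only (products such as Oracle Label Security attach the label column and perform the check automatically), the row-level comparison can be pictured as an ordinary view over a numeric label column:

-- Hypothetical tables: ORDERS carries a Row_label column added by the
-- policy; USER_CLEARANCE maps each user name to a numeric session label.
CREATE VIEW VISIBLE_ORDERS AS
SELECT O.*
FROM ORDERS O, USER_CLEARANCE U
WHERE U.User_name = CURRENT_USER
AND O.Row_label <= U.Session_label;  -- a user at level 20 sees labels <= 20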

4. XML Access Control


With the worldwide use of XML in commercial and scientific applications, efforts are under way to develop security standards.
Among these efforts are digital signatures and encryption standards for XML. The XML Signature Syntax and Processing
specification describes an XML syntax for representing the associations between cryptographic signatures and XML
documents or other electronic resources. The specification also includes procedures for computing and verifying XML signatures. An XML digital signature differs from other protocols for message signing, such as PGP (Pretty Good Privacy, a confidentiality and authentication service that can be used for electronic mail and file storage applications), in its support for signing only specific portions of the XML tree rather than the complete document. Additionally, the XML signature
specification defines mechanisms for countersigning and transformations—so-called canonicalization to ensure that two
instances of the same text produce the same digest for signing even if their representations differ slightly, for example, in
typographic white space.
The XML Encryption Syntax and Processing specification defines XML vocabulary and processing rules for protecting
confidentiality of XML documents in whole or in part and of non-XML data as well. The encrypted content and additional
processing information for the recipient are represented in well-formed XML so that the result can be further processed using
XML tools. In contrast to other commonly used technologies for confidentiality such as SSL (Secure Sockets Layer—a leading
Internet security protocol), and virtual private networks, XML encryption also applies to parts of documents and to documents
in persistent storage.

5. Access Control Policies for E-Commerce and the Web


Electronic commerce (e-commerce) environments are characterized by transactions that are done electronically. They
require elaborate access control policies that go beyond traditional DBMSs. In conventional database environments, access
control is usually performed using a set of authorizations stated by security officers or users according to some security policies.
Such a simple paradigm is not well suited for a dynamic environment like e-commerce. Furthermore, in an e-commerce
environment the resources to be protected are not only traditional data but also knowledge and experience. Such peculiarities
call for more flexibility in specifying access control policies. The access control mechanism must be flexible enough to support
a wide spectrum of heterogeneous protection objects.
A second related requirement is the support for content-based access control. Content-based access control allows one to
express access control policies that take the protection object content into account. In order to support content-based access
control, access control policies must allow inclusion of conditions based on the object content.
A third requirement is related to the heterogeneity of subjects, which requires access control policies based on user
characteristics and qualifications rather than on specific and individual characteristics (for example, user IDs). A possible
solution, to better take into account user profiles in the formulation of access control policies, is to support the notion of
credentials. A credential is a set of properties concerning a user that are relevant for security purposes (for example, age or
position or role within an organization). For instance, by using credentials, one can simply formulate policies such as: only permanent staff with five or more years of service can access documents related to the internals of the system.
XML is expected to play a key role in access control for e-commerce applications, because XML is
becoming the common representation language for document interchange over the Web, and is also becoming the language
for e-commerce. Thus, on the one hand there is the need to make XML representations secure, by providing access control
mechanisms specifically tailored to the protection of XML documents. On the other hand, access control information (that is,
access control policies and user credentials) can be expressed using XML itself. The Directory Services Markup
Language (DSML) is a representation of directory service information in XML syntax. It provides a foundation for a standard
for communicating with the directory services that will be responsible for providing and authenticating user credentials. The
uniform presentation of both protection objects and access control policies can be applied to policies and credentials
themselves. For instance, some credential properties (such as the user name) may be accessible to everyone, whereas other
properties may be visible only to a restricted class of users. Additionally, the use of an XML-based language for specifying
credentials and access control policies facilitates secure credential submission and export of access control policies.

SQL Injection
SQL injection is a code injection technique that might destroy your database.

SQL injection is one of the most common web hacking techniques.

SQL injection is the placement of malicious code in SQL statements, via web page input.

SQL in Web Pages


SQL injection usually occurs when you ask a user for input, like their username/userid, and instead of a name/id, the user gives you
an SQL statement that you will unknowingly run on your database.

Look at the following example which creates a SELECT statement by adding a variable (txtUserId) to a select string. The variable is
fetched from user input (getRequestString):

Example
txtUserId=getRequestString("UserId");
txtSQL = "SELECT * FROM Users WHERE UserId = " + txtUserId;

The rest of this chapter describes the potential dangers of using user input in SQL statements.
SQL Injection Based on 1=1 is Always True
Look at the example above again. The original purpose of the code was to create an SQL statement to select a user, with a given
user id.

If there is nothing to prevent a user from entering "wrong" input, the user can enter some "smart" input like this:

UserId: 105 OR 1=1

Then, the SQL statement will look like this:

SELECT * FROM Users WHERE UserId = 105 OR 1=1;

The SQL above is valid and will return ALL rows from the "Users" table, since OR 1=1 is always TRUE.

Does the example above look dangerous? What if the "Users" table contains names and passwords?

The SQL statement above is much the same as this:

SELECT UserId, Name, Password FROM Users WHERE UserId = 105 or 1=1;

A hacker might get access to all the user names and passwords in a database, by simply inserting 105 OR 1=1 into the input field.

SQL Injection Based on ""="" is Always True


Here is an example of a user login on a web site:

Username: John Doe
Password: myPass

Example
uName=getRequestString("username");
uPass=getRequestString("userpassword");
sql = 'SELECT * FROM Users WHERE Name ="' + uName + '" AND Pass ="' + uPass + '"'

Result
SELECT * FROM Users WHERE Name ="John Doe" AND Pass ="myPass"

A hacker might get access to user names and passwords in a database by simply inserting " OR ""=" into the user name or password
text box:

UserName: " OR ""="
Password: " OR ""="
The code at the server will create a valid SQL statement like this:

Result
SELECT * FROM Users WHERE Name ="" or ""="" AND Pass ="" or ""=""

The SQL above is valid and will return all rows from the "Users" table, since OR ""="" is always TRUE.

SQL Injection Based on Batched SQL Statements


Most databases support batched SQL statements.

A batch of SQL statements is a group of two or more SQL statements, separated by semicolons.

The SQL statement below will return all rows from the "Users" table, then delete the "Suppliers" table.

Example
SELECT * FROM Users; DROP TABLE Suppliers

Look at the following example:

Example
txtUserId = getRequestString("UserId");
txtSQL = "SELECT * FROM Users WHERE UserId = " + txtUserId;

And the following input:

User id: 105; DROP TABLE Suppliers

The valid SQL statement would look like this:

Result
SELECT * FROM Users WHERE UserId = 105; DROP TABLE Suppliers;

Use SQL Parameters for Protection


To protect a web site from SQL injection, you can use SQL parameters.

SQL parameters are values that are added to an SQL query at execution time, in a controlled manner.

ASP.NET Razor Example


txtUserId=getRequestString("UserId");
txtSQL = "SELECT * FROM Users WHERE UserId = @0";
db.Execute(txtSQL,txtUserId);

Note that parameters are represented in the SQL statement by a @ marker.


The SQL engine checks each parameter to ensure that it is correct for its column, and parameter values are treated literally, not as part of the SQL to be executed.

Another Example
txtNam=getRequestString("CustomerName");
txtAdd=getRequestString("Address");
txtCit=getRequestString("City");
txtSQL = "INSERT INTO Customers (CustomerName, Address, City) VALUES (@0, @1, @2)";
db.Execute(txtSQL, txtNam, txtAdd, txtCit);

Examples
The following examples show how to build parameterized queries in some common web languages.

SELECT STATEMENT IN ASP.NET:

txtUserId=getRequestString("UserId");
sql = "SELECT * FROM Customers WHERE CustomerId = @0";
command=new SqlCommand(sql);
command.Parameters.AddWithValue("@0",txtUserId);
command.ExecuteReader();
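Parameter binding can also be demonstrated in SQL itself. The sketch below uses MySQL's server-side prepared statements (the syntax is MySQL-specific); the entire user-supplied string is bound as a single value, so the injected OR clause is never parsed as SQL:

PREPARE stmt FROM 'SELECT * FROM Users WHERE UserId = ?';
SET @uid = '105 OR 1=1';    -- malicious input, bound as one literal value
EXECUTE stmt USING @uid;    -- the OR 1=1 is never executed as SQL
DEALLOCATE PREPARE stmt;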

Statistical Database Security


Certain databases may contain confidential or secret data about individuals, such as national identifiers (Aadhaar numbers, PAN card numbers), and such a database must be protected from unauthorized access. A database that contains details about a large population, and that is used mainly to produce statistics on that population, is called a statistical database. Users are allowed to retrieve only certain statistical information about the population, such as averages for a particular state or district, along with sums, counts, maximums, minimums, standard deviations, and so on.
Statistical users are not permitted to access individual data, such as the income, phone number, or debit card number of a specific person, because statistical database security techniques must prohibit the retrieval of individual data. It is the responsibility of the DBMS to preserve the confidentiality of data about individuals.
Statistical Queries:
The queries that allow only aggregate functions such as COUNT, SUM, MIN, MAX, AVERAGE, and STANDARD DEVIATION are called statistical queries. Statistical queries are mainly used for computing population statistics and, in companies and industries, for reporting over employee databases.
Example –
Consider the following examples of statistical queries, where EMP_SALARY is a confidential relation that contains the income of each employee of a company.
Query-1:
SELECT COUNT(*)
FROM EMP_SALARY
WHERE Emp-department = '3';
Query-2:
SELECT AVG(income)
FROM EMP_SALARY
WHERE Emp-id = '2';
Here, the WHERE condition can be manipulated by an attacker: if the attacker knows the id or name of a particular employee, a statistical query can expose that individual's income or other confidential data, as the sketch below shows.
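A sketch of that inference, using the relation above (the attacker needs no direct privilege on income):

-- Step 1: confirm the condition isolates exactly one employee.
SELECT COUNT(*) FROM EMP_SALARY WHERE Emp-id = '2';    -- returns 1

-- Step 2: an 'aggregate' over a population of one is simply the
-- individual's confidential income.
SELECT AVG(income) FROM EMP_SALARY WHERE Emp-id = '2';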
The possibility of accessing individual information through statistical queries is reduced by applying the following measures:
1. Partitioning of the database. The records must not be stored as one undivided collection; they must be divided into groups of some minimum size according to the confidentiality of the records. Queries may then refer to any complete group or set of groups, but never to subsets of records within a group, so an attacker can reach at most one or two groups, which limits the exposure.
2. Prohibiting statistical queries whenever the number of tuples in the population specified by the selection condition falls below some threshold (see the sketch after this list).
3. Prohibiting sequences of queries that refer repeatedly to the same population of tuples.
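A sketch of measure 2 in plain SQL: a HAVING clause suppresses the answer whenever the query set is smaller than a minimum size N (here N = 10, an arbitrary threshold):

SELECT AVG(income)
FROM EMP_SALARY
WHERE Emp-department = '3'
HAVING COUNT(*) >= 10;    -- no row is returned if fewer than 10 tuples match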

Flow Control

Measures of Control
The measures of control can be broadly divided into the following categories:
• Access Control − Access control includes security mechanisms in a database management system to protect against unauthorized access. A user can gain access to the database only after clearing the login process through a valid user account. Each user account is password protected.
• Flow Control − Distributed systems encompass a great deal of data flow from one site to another and also within a site. Flow control prevents data from being transferred in such a way that it can be accessed by unauthorized agents. A flow policy lists the channels through which information can flow. It also defines security classes for data as well as transactions.
• Data Encryption − Data encryption refers to coding data when sensitive data is to be communicated over public channels. Even if an unauthorized agent gains access to the data, he cannot understand it, since it is in an incomprehensible format.

Encryption and Public Key Infrastructures


The most distinct feature of Public Key Infrastructure (PKI) is that it uses a pair of keys to achieve the underlying security service. The key pair comprises a private key and a public key.
Since public keys are in the open domain, they are likely to be abused. It is therefore necessary to establish and maintain some kind of trusted infrastructure to manage these keys.

Key Management
It goes without saying that the security of any cryptosystem depends upon how securely its keys are managed. Without secure
procedures for the handling of cryptographic keys, the benefits of the use of strong cryptographic schemes are potentially
lost.
It is observed that cryptographic schemes are rarely compromised through weaknesses in their design. However, they are
often compromised through poor key management.
There are some important aspects of key management, which are as follows:
• Cryptographic keys are nothing but special pieces of data. Key management refers to the secure administration of cryptographic keys.
• Key management deals with the entire key lifecycle, from generation through distribution, storage, and use to eventual archival or destruction.
• There are two specific requirements of key management for public key cryptography:
o Secrecy of private keys. Throughout the key lifecycle, secret keys must remain secret from all parties except those who own them and are authorized to use them.
o Assurance of public keys. In public key cryptography, the public keys are in the open domain and are seen as public pieces of data. By default there are no assurances of whether a public key is correct, with whom it can be associated, or what it can be used for. Thus key management of public keys needs to focus much more explicitly on assurance of the purpose of public keys.
The most crucial requirement, assurance of public keys, can be achieved through the public-key infrastructure (PKI), a key management system for supporting public-key cryptography.

Public Key Infrastructure (PKI)


PKI provides assurance of public keys. It provides the identification of public keys and their distribution. The anatomy of a PKI comprises the following components:
• Public Key Certificate, commonly referred to as a 'digital certificate'.
• Private Key tokens.
• Certification Authority.
• Registration Authority.
• Certificate Management System.

Digital Certificate
For analogy, a certificate can be considered the ID card issued to a person. People use ID cards such as a driver's license or passport to prove their identity. A digital certificate does the same basic thing in the electronic world, but with one difference: digital certificates are not only issued to people; they can be issued to computers, software packages, or anything else that needs to prove identity in the electronic world.
• Digital certificates are based on the ITU (International Telecommunication Union) standard X.509, which defines a standard certificate format for public key certificates and certificate validation. Hence digital certificates are sometimes also referred to as X.509 certificates.
• The public key pertaining to the user client is stored in the digital certificate by the Certification Authority (CA), along with other relevant information such as client details, expiration date, usage, and issuer.
• The CA digitally signs this entire information and includes the digital signature in the certificate.
• Anyone who needs assurance about the public key and associated information of a client carries out the signature validation process using the CA's public key. Successful validation assures that the public key given in the certificate belongs to the person whose details are given in the certificate.
The process of obtaining a digital certificate is straightforward: the CA accepts an application from a client to certify his public key and, after duly verifying the identity of the client, issues a digital certificate to that client.

Certifying Authority (CA)


As discussed above, the CA issues a certificate to a client and assists other users in verifying the certificate. The CA takes responsibility for correctly identifying the client asking for a certificate to be issued, ensures that the information contained within the certificate is correct, and digitally signs it.

Key Functions of CA
The key functions of a CA are as follows:
• Generating key pairs − The CA may generate a key pair independently or jointly with the client.
• Issuing digital certificates − The CA could be thought of as the PKI equivalent of a passport agency: the CA issues a certificate after the client provides the credentials to confirm his identity. The CA then signs the certificate to prevent modification of the details contained in it.
• Publishing certificates − The CA needs to publish certificates so that users can find them. There are two ways of achieving this. One is to publish certificates in the equivalent of an electronic telephone directory. The other is to send your certificate out, by one means or another, to the people you think might need it.
• Verifying certificates − The CA makes its public key available in the environment to assist verification of its signature on clients' digital certificates.
• Revoking certificates − At times, a CA revokes a certificate it issued, due to reasons such as compromise of the private key by the user or loss of trust in the client. After revocation, the CA maintains a list of all revoked certificates that is available to the environment.

Classes of Certificates
There are four typical classes of certificate:
• Class 1 − These certificates can be easily acquired by supplying an email address.
• Class 2 − These certificates require additional personal information to be supplied.
• Class 3 − These certificates can only be purchased after checks have been made about the requestor's identity.
• Class 4 − They may be used by governments and financial organizations needing very high levels of trust.

Registration Authority (RA)


A CA may use a third-party Registration Authority (RA) to perform the necessary checks on the person or company requesting the certificate in order to confirm their identity. The RA may appear to the client as a CA, but it does not actually sign the certificate that is issued.
Certificate Management System (CMS)

The CMS is the management system through which certificates are published, temporarily or permanently suspended, renewed, or revoked. Certificate management systems do not normally delete certificates, because it may be necessary to prove their status at a point in time, perhaps for legal reasons. A CA, along with its associated RA, runs a certificate management system in order to track its responsibilities and liabilities.
Private Key Tokens

While the public key of a client is stored on the certificate, the associated private key can be stored on the key owner's computer. This method is generally not adopted: if an attacker gains access to the computer, he can easily gain access to the private key. For this reason, a private key is instead stored on a secure removable storage token, access to which is protected through a password.
Different vendors often use different, and sometimes proprietary, storage formats for storing keys. For example, Entrust uses the proprietary .epf format, while Verisign, GlobalSign, and Baltimore use the standard .p12 format.
What is an EPF file?

An .epf file is Entrust's password-protected profile format for storing a user's keys and certificates.
A .p12 file contains a digital certificate that uses PKCS#12 (Public Key Cryptography Standard #12) encryption. It is used as a portable format for transferring personal private keys and other sensitive information, and is read by various security and encryption programs.
Hierarchy of CA
With vast networks and the requirements of global communication, it is practically infeasible to have only one trusted CA from whom all users obtain their certificates. Moreover, having only one CA creates a single point of failure if that CA is compromised.
In such cases the hierarchical certification model is of interest, since it allows public key certificates to be used in environments where two communicating parties do not have a trust relationship with the same CA.
 The root CA is at the top of the CA hierarchy and the root CA's certificate is a self-signed certificate.
 The CAs, which are directly subordinate to the root CA (For example, CA1 and CA2) have CA certificates that are
signed by the root CA.
 The CAs under the subordinate CAs in the hierarchy (For example, CA5 and CA6) have their CA certificates signed
by the higher-level subordinate CAs.
Certificate authority (CA) hierarchies are reflected in certificate chains. A certificate chain traces a path of certificates from
a branch in the hierarchy to the root of the hierarchy.
The following illustration shows a CA hierarchy with a certificate chain leading from an entity certificate through two
subordinate CA certificates (CA6 and CA3) to the CA certificate for the root CA.
Verifying a certificate chain is the process of ensuring that a specific certificate chain is valid, correctly signed, and trustworthy. The following procedure verifies a certificate chain, beginning with the certificate that is presented for authentication; a minimal sketch of the loop appears after this list.
 The client whose authenticity is being verified supplies his certificate, generally along with the chain of certificates up to the root CA.
 The verifier takes the certificate and validates it using the public key of the issuer. The issuer's public key is found in the issuer's certificate, which sits next to the client's certificate in the chain.
 If the CA that signed the issuer's certificate is trusted by the verifier, verification succeeds and stops here.
 Otherwise, the issuer's certificate is verified in the same manner as the client's certificate in the steps above. This process continues until either a trusted CA is found along the way or the root CA is reached.
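The following sketch walks such a chain bottom-up. It assumes the Python "cryptography" package and the same toy dictionary certificates as the earlier snippet (real systems use X.509 objects); names like "CA6" are illustrative.

    # A sketch of the chain-walking procedure above, using toy dict
    # "certificates"; each certificate carries its subject's public key
    # and is signed by the issuer one step up the chain.
    import json
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.exceptions import InvalidSignature

    def pem(key):  # serialize a public key so it can live inside a certificate
        return key.public_key().public_bytes(
            serialization.Encoding.PEM,
            serialization.PublicFormat.SubjectPublicKeyInfo).decode()

    def issue(subject, subject_key, issuer, issuer_key):
        fields = {"subject": subject, "issuer": issuer, "public_key": pem(subject_key)}
        data = json.dumps(fields, sort_keys=True).encode()
        sig = issuer_key.sign(data, padding.PKCS1v15(), hashes.SHA256())
        return {"fields": fields, "signature": sig}

    def verify_chain(chain, trusted_roots):
        """chain[0] is the client certificate; each cert is signed by the next."""
        for cert, issuer_cert in zip(chain, chain[1:] + [chain[-1]]):
            issuer_key = serialization.load_pem_public_key(
                issuer_cert["fields"]["public_key"].encode())
            data = json.dumps(cert["fields"], sort_keys=True).encode()
            try:
                issuer_key.verify(cert["signature"], data,
                                  padding.PKCS1v15(), hashes.SHA256())
            except InvalidSignature:
                return False
            # Stop as soon as we reach an issuer the verifier already trusts.
            if cert["fields"]["issuer"] in trusted_roots:
                return True
        return False

    root_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    sub_key  = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    leaf_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

    root_cert = issue("Root CA", root_key, "Root CA", root_key)   # self-signed
    sub_cert  = issue("CA6", sub_key, "Root CA", root_key)
    leaf_cert = issue("alice", leaf_key, "CA6", sub_key)

    print(verify_chain([leaf_cert, sub_cert, root_cert], {"Root CA"}))  # True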
Preserving Data Privacy
Abstract

Incredible amounts of data are being generated by organizations such as hospitals, banks, e-commerce, retail and supply chain, etc. by virtue of digital technology. Not only humans but machines also contribute data, in the form of closed circuit television streams, web site logs, and so on. Tons of data are generated every minute by social media and smartphones. The voluminous data generated from these various sources can be processed and analyzed to support decision making. However, data analytics is prone to privacy violations. One application of data analytics is recommendation systems, which are widely used by e-commerce sites like Amazon and Flipkart to suggest products to customers based on their buying habits, and which can lead to inference attacks. Although data analytics is useful for decision making, it can lead to serious privacy concerns; hence privacy preserving data analytics has become very important. This paper examines various privacy threats and privacy preservation techniques and models with their limitations, and also proposes a data lake based modernistic privacy preservation technique to handle privacy preservation in unstructured data.
Introduction

There is an exponential growth in the volume and variety of data due to the diverse applications of computers in all domain areas. The growth has been driven by the affordable availability of computing, storage, and network connectivity. This large scale data, which also includes person-specific private and sensitive data such as gender, zip code, disease, caste, shopping cart, and religion, is being stored in the public domain. The data holder can release this data to a third party data analyst to gain deeper insights and identify hidden patterns, which are useful for making important decisions that may help in improving businesses, providing value added services to customers, prediction, forecasting, and recommendation. One of the prominent applications of data analytics is recommendation systems, widely used by e-commerce sites like Amazon and Flipkart to suggest products to customers based on their buying habits. Facebook suggests friends, places to visit, and even movies based on our interests. However, releasing user activity data may lead to inference attacks, such as identifying gender based on user activity. We have studied a number of privacy preserving techniques which are employed to protect against privacy threats, each with its own merits and demerits. This paper explores the merits and demerits of each of these techniques and also describes the research challenges in the area of privacy preservation. There always exists a trade-off between data utility and privacy. This paper also proposes a data lake based modernistic privacy preservation technique to handle privacy preservation in unstructured data with maximum data utility.
Privacy threats in data analytics

Privacy is the ability of an individual to determine what data can be shared and to employ access control. If the data is in the public domain, it is a threat to individual privacy, since the data is held by a data holder. A data holder can be a social networking application, website, mobile app, e-commerce site, bank, hospital, etc. It is the responsibility of the data holder to ensure the privacy of the users' data. Apart from the data held in the public domain, users themselves contribute to data leakage, knowingly or unknowingly. For example, most mobile apps seek access to our contacts, files, camera, etc., and we agree to all terms and conditions without reading the privacy statement, thereby contributing to data leakage.

Hence there is a need to educate smartphone users regarding privacy and privacy threats. Some of the key privacy threats include (1) surveillance; (2) disclosure; (3) discrimination; (4) personal embarrassment and abuse.

Surveillance
Many organizations, including retail, e-commerce, etc., study their customers' buying habits and try to come up with various offers and value added services. Based on opinion data and sentiment analysis, social media sites provide recommendations of new friends, places to visit, people to follow, and so on. This is possible only because they continuously monitor their customers' transactions. This is a serious privacy threat, as no individual accepts surveillance.

Disclosure
Consider a hospital holding patients' data, which includes zip code, gender, age, and disease. The data holder has released the data to a third party for analysis after anonymizing sensitive person-specific data so that the person cannot be identified. The third party data analyst can map this information to freely available external data sources, such as census data, and identify a person suffering from some disorder. This is how the private information of a person can be disclosed, which is considered a serious privacy breach.

Discrimination
Discrimination is the bias or inequality which can happen when some private information of a person is disclosed. For instance, suppose statistical analysis of electoral results shows that the people of one community were completely against the party that formed the government. The government can then neglect that community or be biased against it.

Personal embarrassment and abuse

Whenever some private information of a person is disclosed, it can even lead to personal embarrassment or abuse. For example, suppose a person was privately undergoing medication for some specific problem and was buying medicines on a regular basis from a medical shop. As part of its regular business model, the medical shop may send reminders and offers related to these medicines over the phone. If any family member notices this, it can lead to personal embarrassment and even abuse.

Data analytics activity can affect data privacy, and many countries are enforcing privacy preservation laws. Lack of awareness is also a major cause of privacy attacks; for example, many smartphone users are not aware of the information that is taken from their phones by many apps. Previous research shows that only 17% of smartphone users are aware of privacy threats.
Privacy preservation methods

Many privacy preserving techniques have been developed, but most of them are based on anonymization of data. The list of privacy preservation techniques is given below.

 K anonymity
 L diversity
 T closeness
 Randomization
 Data distribution
 Cryptographic techniques
 Multidimensional Sensitivity Based Anonymization (MDSBA).
K anonymity
Anonymization is the process of modifying data before it is given out for data analytics so that de-identification is not possible: any attempt to de-identify by mapping the anonymized data to external data sources leads to at least K indistinguishable records. K anonymity is prone to two attacks, namely the homogeneity attack and the background knowledge attack. Algorithms such as Incognito and Mondrian are applied to achieve anonymization. Here, K anonymity is applied to the patient data shown in Table 1, which shows the data before anonymization.

Table 1 Patient data, before anonymization
From: Privacy preservation techniques in big data analytics: a survey

Sno  Zip    Age  Disease
1    57677  29   Cardiac problem
2    57602  22   Cardiac problem
3    57678  27   Cardiac problem
4    57905  43   Skin allergy
5    57909  52   Cardiac problem
6    57906  47   Cancer
7    57605  30   Cardiac problem
8    57673  36   Cancer
9    57607  32   Cancer

The K anonymity algorithm is applied with k = 3 to ensure three indistinguishable records when an attempt is made to identify a particular person's data. K anonymity is applied to two attributes, viz. Zip and Age, from Table 1. The result of applying anonymization to the Zip and Age attributes is shown in Table 2.

Table 2 After applying anonymization on Zip and Age

Sno  Zip    Age   Disease
1    576**  2*    Cardiac problem
2    576**  2*    Cardiac problem
3    576**  2*    Cardiac problem
4    5790*  > 40  Skin allergy
5    5790*  > 40  Cardiac problem
6    5790*  > 40  Cancer
7    576**  3*    Cardiac problem
8    576**  3*    Cancer
9    576**  3*    Cancer

The above technique uses generalization to achieve anonymization. Suppose we know that John is 27 years old and lives in zip code 57677; then we can conclude that John has a cardiac problem even after anonymization, as shown in Table 2, because every record in his equivalence class carries that disease. This is called a homogeneity attack. Similarly, if John is 36 years old and it is known that John does not have cancer, then John must definitely have a cardiac problem. This is called a background knowledge attack. K anonymity can be achieved either by generalization or by suppression, and it can be optimized if minimal generalization can be done without huge data loss. Identity disclosure is the major privacy threat, and protection against it cannot be guaranteed by K anonymity. Personalized privacy is the most important aspect of individual privacy. A minimal sketch of generalization and a K anonymity check follows.
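The sketch below reproduces the Table 1 → Table 2 generalization in plain Python and checks the k = 3 property; the generalization rules are read off the tables above, and everything else is illustrative.

    # A minimal sketch of K anonymity via generalization, using the
    # records of Table 1 as plain Python tuples (no external libraries).
    from collections import Counter

    records = [
        ("57677", 29, "Cardiac problem"), ("57602", 22, "Cardiac problem"),
        ("57678", 27, "Cardiac problem"), ("57905", 43, "Skin allergy"),
        ("57909", 52, "Cardiac problem"), ("57906", 47, "Cancer"),
        ("57605", 30, "Cardiac problem"), ("57673", 36, "Cancer"),
        ("57607", 32, "Cancer"),
    ]

    def generalize(zip_code, age):
        """Coarsen the quasi-identifiers (Zip, Age) as in Table 2."""
        if age > 40:
            return (zip_code[:4] + "*", "> 40")
        return (zip_code[:3] + "**", str(age // 10) + "*")

    def is_k_anonymous(rows, k):
        """Every generalized quasi-identifier combination must occur >= k times."""
        classes = Counter(generalize(z, a) for z, a, _ in rows)
        return all(count >= k for count in classes.values())

    for z, a, d in records:
        print(generalize(z, a) + (d,))          # the rows of Table 2
    print("3-anonymous:", is_k_anonymous(records, 3))  # True for this data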
L diversity
To address the homogeneity attack, another technique called L diversity has been proposed. As per L diversity, there must be L well-represented values for the sensitive attribute (disease) in each equivalence class.

Implementing L diversity is not always possible because of the variety of data, and L diversity is also prone to the skewness attack: when the overall distribution of data is skewed into a few equivalence classes, attribute disclosure cannot be prevented. For example, if all the records fall into only three equivalence classes, the semantic closeness of their values may lead to attribute disclosure. L diversity may also lead to the similarity attack: from Table 3 it can be noticed that if we know John is 27 years old and lives in zip 576**, then John is definitely in the low-income group, because the salaries of all three persons in zip 576** are low compared to the others in the table. A minimal check of the L diversity property is sketched after Table 3.

Table 3 L diversity privacy preservation technique

Sno  Zip    Age   Salary  Disease
1    576**  2*    5k      Cardiac problem
2    576**  2*    6k      Cardiac problem
3    576**  2*    7k      Cardiac problem
4    5790*  > 40  20k     Skin allergy
5    5790*  > 40  22k     Cardiac problem
6    5790*  > 40  24k     Cancer
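A minimal check of L diversity over the equivalence classes of Table 3, in plain Python; the data is copied from the table and the function is illustrative.

    # Each row pairs its (generalized) equivalence class with the
    # sensitive value, mirroring Table 3.
    from collections import defaultdict

    rows = [
        (("576**", "2*"), "Cardiac problem"),
        (("576**", "2*"), "Cardiac problem"),
        (("576**", "2*"), "Cardiac problem"),
        (("5790*", "> 40"), "Skin allergy"),
        (("5790*", "> 40"), "Cardiac problem"),
        (("5790*", "> 40"), "Cancer"),
    ]

    def is_l_diverse(data, l):
        """Each equivalence class needs >= l distinct sensitive values."""
        sensitive = defaultdict(set)
        for eq_class, disease in data:
            sensitive[eq_class].add(disease)
        return all(len(values) >= l for values in sensitive.values())

    print(is_l_diverse(rows, 3))  # False: the 576**/2* class has one disease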

T closeness
Another improvement over L diversity is the T closeness measure: an equivalence class has T closeness if the distance between the distribution of the sensitive attribute within the class and the distribution of that attribute in the whole table is no more than a threshold T, and a table has T closeness if all of its equivalence classes have T closeness. T closeness can be calculated on every attribute with respect to the sensitive attribute.

From Table 4 it can be observed that even if we know John is 27 years old, it is still difficult to estimate whether John has a cardiac problem or whether he is in the low-income group. T closeness can ensure protection against attribute disclosure, but implementing it may not give a proper distribution of data every time. A simple distance check is sketched after Table 4.

Table 4 T closeness privacy preservation technique

Sno  Zip    Age   Salary  Disease
1    576**  2*    5k      Cardiac problem
2    576**  2*    16k     Cancer
3    576**  2*    9k      Skin allergy
4    5790*  > 40  20k     Skin allergy
5    5790*  > 40  42k     Cardiac problem
6    5790*  > 40  8k      Flu
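The distance test can be sketched as follows. As a simplification, this uses total variation distance where the T closeness literature typically uses the Earth Mover's Distance, so treat it as an illustration of the idea only.

    # (equivalence class, disease) pairs from Table 4
    from collections import Counter

    table = [
        ("576**/2*", "Cardiac problem"), ("576**/2*", "Cancer"),
        ("576**/2*", "Skin allergy"),    ("5790*/>40", "Skin allergy"),
        ("5790*/>40", "Cardiac problem"), ("5790*/>40", "Flu"),
    ]

    def distribution(pairs):
        counts = Counter(disease for _, disease in pairs)
        total = sum(counts.values())
        return {d: c / total for d, c in counts.items()}

    def tv_distance(p, q):
        keys = set(p) | set(q)
        return 0.5 * sum(abs(p.get(k, 0) - q.get(k, 0)) for k in keys)

    overall = distribution(table)
    for eq in ("576**/2*", "5790*/>40"):
        cls = distribution([r for r in table if r[0] == eq])
        print(eq, "distance to overall:", round(tv_distance(cls, overall), 3))
    # The table has T closeness for threshold T if every printed distance <= T.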

Randomization technique
Randomization is the process of adding noise to the data, generally drawn from some probability distribution. It is applied in surveys, sentiment analysis, etc. Randomization does not need knowledge of the other records in the data, it can be applied at data collection and pre-processing time, and it carries no anonymization overhead. However, applying randomization to large datasets is not practical because of time complexity and loss of data utility, as demonstrated in our experiment described below.

We loaded 10k records from an employee database into the Hadoop Distributed File System and processed them by executing a Map Reduce job that classifies the employees based on their salary and age groups. In order to apply randomization, we added noise in the form of 5k randomly generated records, making a database of 15k records, and made the following observations after running the Map Reduce job (a toy sketch of the noise step follows this list).

 More Mappers and Reducers were used as the data volume increased.
 Results before and after randomization were significantly different.
 Some records that are outliers remain unaffected by randomization and stay vulnerable to adversary attack.
 Privacy preservation at the cost of data utility is not acceptable, and hence randomization may not be suitable for privacy preservation, especially against attribute disclosure.
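A toy version of randomization in plain Python, with made-up record counts (not the 10k/15k HDFS experiment above). The experiment added whole noise records; this sketch perturbs a numeric attribute instead, which is the other common form of randomization, and shows why results drift.

    import random
    from statistics import mean

    random.seed(7)
    salaries = [random.randint(20_000, 90_000) for _ in range(1_000)]

    def randomize(values, sigma=10_000):
        """Perturb each value with Gaussian noise drawn per record."""
        return [v + random.gauss(0, sigma) for v in values]

    noisy = randomize(salaries)
    print("mean before:", round(mean(salaries)))
    print("mean after :", round(mean(noisy)))
    # Aggregates survive roughly, but per-record values (and any
    # classification built on them) drift; extreme outliers remain
    # recognizable despite the noise.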
Data distribution technique
In this technique, the data is distributed across many sites. Distribution of the data can be done in two ways:

1. Horizontal distribution of data
2. Vertical distribution of data

Horizontal distribution: When data is distributed across many sites with the same attributes, the distribution is said to be horizontal, as described in Fig. 1.
Distribution of sales data across different sites

Horizontal distribution of data can be applied only when some aggregate functions or operations are to be applied to the data without actually sharing the data. For example, if a retail store wants to analyse its sales across various branches, it may employ analytics that computes over aggregate data. However, as part of data analysis the data holder may need to share the data with a third party analyst, which may lead to a privacy breach. Classification and clustering algorithms can be applied to distributed data, but this does not ensure privacy. If the data is distributed across sites that belong to different organizations, the results of aggregate functions may help one party infer the data held by the other parties. In such situations, all participating sites are expected to be honest with each other.

Vertical distribution: When person-specific information is distributed across different sites under the custody of different organizations, the distribution is called vertical distribution, as shown in Fig. 2. For example, in crime investigations the police would like to know details of a particular suspect, including health, professional, financial, and personal information. All this information may not be available at one site. Such a distribution, where each site holds a few of a person's attributes, is called vertical distribution. When some analytics has to be done, the data has to be pooled from all these sites, creating a vulnerability to privacy breach.

Vertical distribution of person specific data


In order to perform data analytics on vertically distributed data, where the attributes are distributed across different sites under the custody of different parties, it is highly difficult to ensure privacy if the datasets are shared. For example, as part of a police investigation, the investigating officer may want to access information about the accused from his employer, health department, and bank to gain more insight into the character of the person. In this process some of the personal and sensitive information of the accused may be disclosed to the investigating officer, leading to personal embarrassment or abuse. Anonymization cannot be applied when entire records are not needed for analytics. Distribution of data by itself will not ensure privacy preservation, but it overlaps closely with cryptographic techniques.

Cryptographic techniques
The data holder may encrypt the data before releasing it for analytics, but encrypting large scale data using conventional encryption techniques is highly difficult and must be applied at data collection time. Differential privacy techniques have already been applied in which some aggregate computations on the data are performed without actually sharing the inputs. For example, if x and y are two data items, a function F(x, y) is computed to gain some aggregate information from both x and y without actually sharing x and y. This can be applied when x and y are held by different parties, as in the case of vertical distribution. However, if the data is at a single location under the custody of a single organization, this approach cannot be employed. Another similar technique, secure multiparty computation, has been used but has proved inadequate for privacy preservation. Data utility will be low if encryption is applied during data analytics; thus encryption is not only difficult to implement but also reduces data utility. A toy two-party computation of F(x, y) = x + y without input sharing is sketched below.
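A minimal sketch, in plain Python, of computing F(x, y) = x + y across two parties without revealing either raw input, using additive masking. This illustrates the idea of aggregate computation without input sharing; it is not a production secure multiparty computation protocol.

    import random

    def party_a_start(x):
        r = random.randrange(1 << 62)   # A's secret random mask
        return r, x + r                 # A keeps r, sends x + r to B

    def party_b_add(masked, y):
        return masked + y               # B never sees x, only x + r

    def party_a_finish(total_masked, r):
        return total_masked - r         # A removes the mask: x + y

    x, y = 41, 58                       # held by A and B respectively
    r, masked = party_a_start(x)
    total = party_a_finish(party_b_add(masked, y), r)
    print(total)  # 99 -- neither party disclosed its raw input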

Multidimensional Sensitivity Based Anonymization (MDSBA)

Bottom-up generalization and top-down generalization are the conventional methods of anonymization, and they were applied to well represented structured data records. However, applying them to large scale data sets is very difficult, leading to issues of scalability and information loss. Multidimensional Sensitivity Based Anonymization is an improved form of anonymization that has proved more effective than conventional anonymization techniques.

Multidimensional Sensitivity Based Anonymization is an improved anonymization technique that can be applied to large data sets with reduced loss of information, using predefined quasi-identifiers. As part of this technique, the Apache MapReduce framework is used to handle large data sets. In the conventional Hadoop Distributed File System, the data is divided into blocks of either 64 MB or 128 MB each and distributed across different nodes without considering the data inside the blocks. As part of MDSBA, the data is instead split into different bags based on the probability distribution of the quasi-identifiers, by making use of filters in the Apache Pig scripting language.

Multidimensional Sensitivity Based Anonymization makes use of bottom-up generalization, but on a set of attributes with certain class values, where the class represents a sensitive attribute. Data distribution is handled more effectively than with the conventional method of blocks. Data anonymization was done over four quasi-identifiers using Apache Pig.

Since the data is vertically partitioned into different groups, it can protect against the background knowledge attack if each bag contains only a few attributes. This method also makes it difficult to map the data to external sources to disclose any person-specific information.

In this method, the implementation was done using Apache Pig. Apache Pig is a scripting language, hence the development effort is low. However, the code efficiency of Apache Pig is relatively lower than that of a handwritten Map Reduce job, because every Apache Pig script is ultimately converted into a Map Reduce job. MDSBA is appropriate for large scale data, but only when the data is at rest; it cannot be applied to streaming data. A loose sketch of the bag-splitting idea follows.
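A loose, plain-Python illustration of splitting records into bags by quasi-identifier filters before per-bag generalization. This is only a toy picture of the idea, not the published MDSBA algorithm or its Apache Pig implementation; the records and predicates are made up.

    records = [
        {"zip": "57677", "age": 29, "disease": "Cardiac problem"},
        {"zip": "57905", "age": 43, "disease": "Skin allergy"},
        {"zip": "57673", "age": 36, "disease": "Cancer"},
        {"zip": "57909", "age": 52, "disease": "Cardiac problem"},
    ]

    # Pig-style filters: each bag is defined by a predicate on a
    # quasi-identifier, instead of splitting into fixed-size HDFS blocks.
    filters = {
        "young":  lambda r: r["age"] < 40,
        "senior": lambda r: r["age"] >= 40,
    }

    bags = {name: [r for r in records if keep(r)] for name, keep in filters.items()}
    for name, bag in bags.items():
        # Each bag can now be generalized independently, in parallel.
        print(name, [r["disease"] for r in bag])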
Analysis

Various privacy preservation techniques have been studied with respect to features including type of data, data utility, attribute preservation, and complexity. The comparison of these privacy preservation techniques is shown in Table 5.

Table 5 Comparison of privacy preservation techniques

Features                                Anonymization   Cryptographic   Data distribution   Randomization   MDSBA
                                                        techniques      techniques
Suitability for unstructured data       No              No              No                  No              Yes
Attribute preservation                  No              No              No                  Yes             Yes
Damage to data utility                  No              No              Yes                 No              Yes
Very complex to apply                   No              Yes             Yes                 Yes             Yes
Accuracy of results of data analytics   No              Yes             No                  No              No

Results and discussions

As part of a systematic literature review, it has been observed that all existing mechanisms of privacy preservation deal with structured data, while more than 80% of the data being generated today is unstructured. As such, there is a need to address the following challenges:

1. Develop concrete solutions to protect privacy in both structured and unstructured data.
2. Develop scalable and robust techniques to handle large scale heterogeneous data sets.
3. Allow data to stay in its native form, without the need for transformation, while data analytics is carried out in a privacy-preserving way.
4. Develop new techniques, beyond anonymization, that protect against key privacy threats such as identity disclosure, discrimination, and surveillance.
5. Maximize data utility while ensuring data privacy.

Conclusion

No concrete solution for unstructured data has been developed yet. Conventional data mining algorithms can be applied to classification and clustering problems but cannot be used for privacy preservation, especially when dealing with person-specific information. Machine learning and soft computing techniques can be used to develop new and more appropriate solutions to privacy problems, including identity disclosure, which can lead to personal embarrassment and abuse.

There is a strong need for law enforcement by the governments of all countries to ensure individual privacy; the European Union is making an attempt to enforce a privacy preservation law. Apart from technological solutions, there is a strong need to create awareness among people regarding privacy hazards so that they can safeguard themselves from privacy breaches. One serious privacy threat is the smartphone: a lot of personal information in the form of contacts, messages, chats, and files is accessed by many apps running on our smartphones without our knowledge. Most of the time, people do not even read the privacy statement before installing an app. Hence there is a strong need to educate people about the various vulnerabilities that can contribute to leakage of private information.

We propose a novel privacy preservation model based on the Data Lake concept to hold a variety of data from diverse sources. A data lake is a repository that holds data from diverse sources in raw format. Data ingestion from a variety of sources can be done using Apache Flume, and an intelligent machine-learning algorithm can be applied to identify sensitive attributes dynamically. The algorithm is trained on existing data sets with known sensitive attributes, and rigorous training of the model helps in predicting the sensitive attributes in a given data set; accuracy can be improved by adding more layers of training, leading to deep learning techniques. Advanced computing frameworks like Apache Spark, which provides distributed, massively parallel computing with in-memory processing, can be used to implement the privacy preserving algorithms and ensure very fast processing. The proposed model is shown in Fig. 3.

Fig. 3

A Novel privacy preservation model based on vertical distribution and tokenization


Data analytics is done on data collected from various sources. If an e-commerce site wants to perform data analytics, it needs transactional data, website logs, and customers' opinions from its social media pages. A data lake is used to collect data from these different sources: Apache Flume ingests data from social media sites and website logs into the Hadoop Distributed File System (HDFS), and relational data can be loaded into HDFS using Sqoop.

In the data lake the data can remain in its native form, whether structured or unstructured. When the data has to be processed, it can be transformed into Hive tables. A Hadoop Map Reduce job using machine learning can be executed on the data to classify the sensitive attributes. The data can then be vertically distributed to separate the sensitive attributes from the rest of the data, with tokenization applied to link the vertically distributed parts. The data without any sensitive attributes can be published for data analytics; a minimal sketch of this split-and-tokenize step follows.
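A minimal sketch of the vertical split plus tokenization, in plain Python with made-up customer records; the token simply links the published, non-sensitive part to a restricted store holding the sensitive attributes.

    import secrets

    customers = [
        {"name": "Alice", "city": "Boston",  "disease": "Cardiac problem"},
        {"name": "Bob",   "city": "Chicago", "disease": "Cancer"},
    ]

    published, vault = [], {}          # analyst-facing table, restricted table
    for row in customers:
        token = secrets.token_hex(8)   # random link, meaningless on its own
        vault[token] = {"name": row["name"], "disease": row["disease"]}
        published.append({"token": token, "city": row["city"]})

    print(published)  # carries no sensitive attributes; safe to release
    # Re-identification requires the vault, which stays under access control:
    some_token = published[0]["token"]
    print(vault[some_token])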
Abbreviations

CCTV: closed circuit television
MDSBA: Multidimensional Sensitivity Based Anonymization

Challenges to Maintaining Database Security

Given the vast increase in the volume and speed of threats to databases and other information assets, research efforts need to be devoted to issues such as data quality, intellectual property rights, and database survivability. Let's discuss them one by one.
1. Data quality –
 The database community needs techniques and organizational solutions to assess and attest the quality of data. These techniques may include simple mechanisms such as quality stamps that are posted on websites. We also need techniques that provide more effective integrity semantics verification tools for the assessment of data quality, based on techniques such as record linkage.
 We also need application-level recovery techniques to automatically repair incorrect data.
 The ETL (extract, transform, load) tools widely used for loading data into data warehouses are presently grappling with these issues.
2. Intellectual property rights –
As the use of the Internet and intranets increases day by day, the legal and informational aspects of data are becoming major concerns for many organizations. To address these concerns, watermarking techniques are used, which help protect content from unauthorized duplication and distribution by giving provable ownership of the content. Traditionally, such techniques depend on the availability of a large domain within which the objects can be altered while retaining their essential or important properties. However, research is needed to assess the robustness of such techniques and to investigate different approaches aimed at preventing intellectual property rights violations.
3. Database survivability –
Database systems need to operate and continue their functions, even with reduced capabilities, despite disruptive events such as information warfare attacks.
A DBMS, in addition to making every effort to prevent an attack and to detect one in the event of occurrence, should be able to do the following:
 Confinement:
Take immediate action to eliminate the attacker's access to the system and to isolate or contain the problem to prevent further spread.
 Damage assessment:
Determine the extent of the problem, including failed functions and corrupted data.
 Recovery:
Recover corrupted or lost data and repair or reinstall failed functions to reestablish a normal level of operation.
 Reconfiguration:
Reconfigure to allow operation to continue in a degraded mode while recovery proceeds.
 Fault treatment:
To the extent possible, identify the weaknesses exploited in the attack and take steps to prevent a recurrence.

Database Survivability
Survivability of a system is the capability of a system to fulfill its mission in a timely manner in the
presence of attacks, failures, or accidents.
Survivability of Systems in the Cloud
Survivability is the ability of a system to adapt to and recover from a serious failure, or more generally the ability to retain service functionality in the face of threats. This could relate to small local events, such as equipment failures, from which the system reconfigures itself essentially automatically over a time scale of seconds to minutes. Survivability could also relate to major events, such as a large natural disaster or a capable attack, on a time scale of hours to days or even longer. Another important aspect of survivability is robustness: while survivability concerns coping with the impact of events, robustness concerns reducing that impact in the first place. Assigning probabilities to potential dangers is difficult because of uncertainty, and there are no effective measures to actually assess the performance of the Internet. Because of these and other issues, dependability is based on statistical measures of historical outages, faults, and failures.
At every level of the interconnection system of the Internet, there is little global information available, and what is available is incomplete and of unknown accuracy. Specifically, there are no maps of physical connections, traffic, and interconnections between ASes (autonomous systems). There are a number of reasons for this lack of information. One is the physical complexity of the network fibers around the world, which change from time to time. Another is that probes have limited paths and only reveal something about the path between two points in the Internet at the time of the probe. A security concern also exists: if the physical layout of the Internet were mapped, it could be dangerous material in the hands of certain individuals and groups. Some groups have a commercial incentive to encourage Internet anonymity and to keep the networks unmapped. Another reason is that networks lack motivation to gather such information, because doing so is costly and does not seem to serve them directly. Finally, there are no metrics for the network as a whole, and stakeholders must look closely at the idiosyncrasies of the specific subsystems in use.
Oracle Label-Based Security

The need for more sophisticated controls on access to sensitive data is becoming increasingly important as organizations address emerging security requirements around data consolidation, privacy and compliance. Maintaining separate databases for highly sensitive customer data is costly and creates unnecessary administrative overhead. However, consolidating databases sometimes means combining sensitive financial, HR, medical or project data from multiple locations into a single database for reduced costs, easier management and better scalability. Oracle Label Security provides the ability to tag data with a data label or a data classification so that the database inherently knows what data a user or role is authorized for, and allows data from different sources to be combined in the same table as a larger data set without compromising security.

Access to sensitive data is controlled by comparing the data label with the requesting user's label or access clearance. A user label or access clearance can be thought of as an extension to standard database privileges and roles. Oracle Label Security is centrally enforced within the database, below the application layer, providing strong security and eliminating the need for complicated application views.

What is Oracle Label Security?
Oracle Label Security (OLS) is a security option for the Oracle Database Enterprise Edition that mediates access to data rows by comparing labels attached to data rows in application tables (sensitivity labels) with a set of user labels (clearance labels).



Who should consider Oracle Label Security?
Sensitivity labels are used in some form in virtually every industry, including the health care, law enforcement, energy, retail, national security, and defense industries. Examples of label use include:

 Separating individual branch store, franchisee, or region data

 Financial companies with customers that span multiple countries with strong government privacy controls

 Consolidating and securing sensitive R&D projects

 Minimizing access to individual health care records

 Protecting HR data from different divisions

 Securing classified data for Government and Defense use

 Complying with U.S. State Department’s International Traffic in Arms (ITAR) regulations

 Supporting multiple customers in a multi-tenant SaaS application

 Restricting data processing, tracking consent, and handling right-to-erasure requests under the EU GDPR

What can Oracle Label Security do for my security needs?
Oracle Label Security can be used to label data and restrict access with a high degree
of granularity. This is particularly useful when multiple organizations, companies or
users share a single application. Sensitivity labels can be used to restrict application
users to a subset of data within an organization, without having to change the
application. Data privacy is important to consumers and stringent regulatory measures
continue to be enacted. Oracle Label Security can be used to implement privacy
policies on data, restricting access to only those who have a need-to-know.

COMPONENTS AND FEATURES



What are the main components of Oracle Label Security?
Label Security provides row-level data access controls for application users. It is called Label Security because each user and each data record has an associated security label.

The user label consists of three components: a level, zero or more compartments, and zero or more groups. This label is assigned as part of the user authorization and is not modifiable by the user.

The session label consists of the same three components and can differ from the user label based on the session that the user established. For example, if the user has a Top Secret level component but logged in from a Secret workstation, the session label level would be Secret.

Data security labels have the same three components as the user and session labels. The three label components are level, compartment, and group.

 Levels indicate the sensitivity of the data and the authorization of a user to access sensitive data. The user (and session) level must be equal to or greater than the data level to access the record.

 Data can be part of zero or more compartments. The user/session label must contain every compartment that the data record has for the user to successfully retrieve the record. For example, if the data label compartments are A, B and C, the session label must contain at least A, B and C to access that data record.

 Data can have zero or more groups in the group component. The user/session label needs at least one group that matches a data record's group(s) to access the record. For example, if the data record has Boston, Chicago and New York as groups, the session label needs only Boston (or one of the other two groups) to access the data.

 Protected objects are tables with labeled records.

 Label Security policies are a combination of user labels, data labels, and protected objects. (A minimal sketch of the access rules above follows this list.)
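To make the three rules concrete, here is a minimal sketch in Python, with labels modeled as (level, compartments, groups) tuples; the level names and data are illustrative, and this models only the comparison logic, not Oracle's enforcement engine.

    LEVELS = {"PUBLIC": 0, "CONFIDENTIAL": 1, "SECRET": 2, "TOP SECRET": 3}

    def can_read(session_label, data_label):
        s_level, s_comps, s_groups = session_label
        d_level, d_comps, d_groups = data_label
        if LEVELS[s_level] < LEVELS[d_level]:
            return False                      # level must dominate
        if not set(d_comps) <= set(s_comps):
            return False                      # need ALL data compartments
        if d_groups and not set(s_groups) & set(d_groups):
            return False                      # need at least ONE matching group
        return True

    session = ("SECRET", {"A", "B", "C"}, {"Boston"})
    row1 = ("SECRET", {"A", "B", "C"}, {"Boston", "Chicago", "New York"})
    row2 = ("TOP SECRET", {"A"}, {"Boston"})

    print(can_read(session, row1))  # True: level ok, compartments ok, group matches
    print(can_read(session, row2))  # False: data level exceeds session level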

Does Oracle Label Security provide column-level access control?
No, Oracle Label Security is not column-aware.

A column-sensitive Virtual Private Database (VPD) policy can determine access to a specific column by evaluating OLS user labels. An example of this type of OLS and VPD integration is available as a white paper on the OLS OTN webpage.

A VPD policy can be written so that it only becomes active when a certain column (the 'sensitive' column) is part of a SQL statement against a protected table. With the 'column sensitivity' switch on, VPD either returns only those rows that include information in the sensitive column the user is allowed to see, or it returns all rows, with all cells in the sensitive column empty except for the values the user is allowed to see.

Can I base Secure Application Roles on Oracle Label Security?
Yes, the procedure that determines whether the 'set role' command is executed can evaluate OLS user labels. In this case, the OLS policy does not need to be applied to a table, since row labels are not part of this solution. An example of this can be found as a white paper on the OLS OTN webpage.



What are Trusted Stored Program Units?
Stored procedures, functions and packages execute with the system and object privileges (DAC) of the definer. If the invoker is a user with OLS user clearances (labels), the procedure executes with a combination of the definer's DAC privileges and the invoker's security clearances.

Trusted stored procedures are procedures that have been granted the OLS privilege 'FULL' or 'READ'. When a trusted stored program unit is carried out, the policy privileges in force are the union of the invoking user's privileges and the program unit's privileges.

Are there any administrative tools available for Oracle Label Security?
Beginning with Oracle Database 11gR1, the functionality of Oracle Policy Manager (and most other security-related administration tools) has been made available in Oracle Enterprise Manager Cloud Control, enabling administrators to create and manage OLS policies in a modern, convenient and integrated environment.



DEPLOYMENT AND ADMINISTRATION

Where can I find Oracle Label Security?
Oracle Label Security is an option that is part of Oracle Database Enterprise Edition. Oracle Label Security is installed as part of the database and just needs to be enabled.

Should I use Oracle Label Security to protect all my tables?
The traditional Oracle discretionary access control (DAC) object privileges SELECT, INSERT, UPDATE, and DELETE, combined with database roles and stored procedures, are sufficient for most tables. Furthermore, before applying OLS to your sensitive tables, some considerations need to be taken into account; they are described in a white paper titled Oracle Label Security – Multi-Level Security Implementation, found on the OLS OTN webpage.

Are there any guidelines for using Oracle Label Security and defining sensitivity labels?
Yes, a comprehensive Label Security Administrator's Guide is available online. In addition, examples are available in a white paper and under technical resources on the Oracle Technology Network, which walk you through a list of recommended implementation guidelines. In most cases, the security mechanisms provided at no cost with Oracle Enterprise Edition (system and object privileges, database roles, Secure Application Roles) will be sufficient to address security requirements. Oracle Label Security should be considered when security is required at the individual row level.

Can Oracle Label Security policies and user labels (clearances) be stored centrally in Oracle Identity Management?
Not only can your database users be stored and managed centrally in Oracle Identity Management using Enterprise User Security, but Oracle Label Security policies and user clearances can be stored and managed in Oracle Identity Management as well. This greatly simplifies policy management in distributed environments and enables security administrators to automatically attach user clearances to all centrally managed users.

How can I maintain the performance of my applications after applying Label Security access control policies?
As a best practice:

 Only apply sensitivity labels to those tables that really need protection. When multiple tables are joined to retrieve sensitive data, look for the driving table.

 Do not apply OLS policies to schemas.

 Usually there is only a small set of different data classification labels; if the table is mostly used for READ operations, try building a bitmap index over the (hidden) OLS column, and add this index to the existing indexes on that table.

 Review the Oracle Label Security white paper available on the product OTN webpage, as it contains a thorough discussion of performance considerations with Oracle Label Security.

Can I use Oracle Label Security with Oracle Database Vault, Real Application Security and Data Redaction?
Yes. Oracle Label Security can provide user labels to be used as factors within Oracle Database Vault, and security labels can be assigned to Real Application Security users. It also integrates with Oracle Advanced Security Data Redaction, enabling security clearances to be used in data redaction policies.
