Oral Questions 2021 - Database

Database

Uploaded by

Maruf
Copyright
© All Rights Reserved

Oral Questions 2021

Database

1. What is deadlock? How do we get rid of it?


Deadlock occurs when each transaction T in a set of two or more transactions is waiting for
some item that is locked by some other transaction T′ in the set.
Deadlock can be prevented by the following protocols:
1. Conservative two-phase locking, which requires every transaction to lock all the items it
needs in advance (not practical, and it limits concurrency).
2. Ordering all the items in the database and making sure that a transaction that needs several
items locks them according to that order (not practical, and it limits concurrency).
3. Transaction timestamps TS(T), where TS(T) is a unique identifier assigned to each transaction.
The timestamps are typically based on the order in which transactions are started, and schemes
such as wait-die and wound-wait use them to decide whether a transaction waits or is aborted.
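
Timestamps prevent deadlock only when they are combined with a rule for resolving a lock conflict. Two such rules are wait-die and wound-wait. The sketch below is illustrative: the function names and return strings are assumptions, and a smaller timestamp is taken to mean an older transaction.

```python
def wait_die(ts_requester: int, ts_holder: int) -> str:
    """Wait-die: an older requester waits; a younger requester is aborted (dies)."""
    if ts_requester < ts_holder:      # requester is older than the lock holder
        return "requester waits"
    return "abort requester"          # restarted later with the same timestamp


def wound_wait(ts_requester: int, ts_holder: int) -> str:
    """Wound-wait: an older requester aborts (wounds) the holder; a younger one waits."""
    if ts_requester < ts_holder:      # requester is older than the lock holder
        return "abort holder"         # holder is restarted later with the same timestamp
    return "requester waits"


# T1 started first (timestamp 1), T2 started later (timestamp 2).
older_requests = (wait_die(1, 2), wound_wait(1, 2))
younger_requests = (wait_die(2, 1), wound_wait(2, 1))
```

In both schemes only the younger transaction is ever aborted, so no cycle of waiting transactions can form.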

2. What is two-phase locking?


In two-phase locking, all locking operations (read_lock, write_lock) must precede the first
unlock operation in the transaction. A transaction can be divided into two phases: an expanding
or growing (first) phase, during which new locks on items can be acquired but none can be
released; and a shrinking (second) phase, during which existing locks can be released but no
new locks can be acquired.
If every transaction in a schedule follows the two-phase locking protocol, the schedule is
guaranteed to be serializable, obviating the need to test for serializability of schedules.
Basic 2PL, however, does not prevent deadlocks.
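
A minimal sketch of the two-phase rule, assuming a transaction is given as a list of operation strings such as read_lock(X) and unlock(X). This representation and the function name follows_2pl are illustrative, not from the source.

```python
def follows_2pl(operations):
    """Return True if no lock is acquired after the first unlock."""
    shrinking = False                         # becomes True at the first unlock
    for op in operations:
        if op.startswith("unlock"):
            shrinking = True                  # the shrinking phase has begun
        elif op.startswith(("read_lock", "write_lock")) and shrinking:
            return False                      # a new lock in the shrinking phase
    return True


# Locks Y, releases it, then locks X: violates the protocol.
t_bad = ["read_lock(Y)", "read(Y)", "unlock(Y)",
         "write_lock(X)", "write(X)", "unlock(X)"]

# Acquires both locks before releasing either: obeys the protocol.
t_good = ["read_lock(Y)", "read(Y)", "write_lock(X)",
          "unlock(Y)", "write(X)", "unlock(X)"]

results = (follows_2pl(t_bad), follows_2pl(t_good))
```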

3. Compare fragmentation and replication.


Fragmentation is used to break up the database into logical units, called fragments, which may
be assigned for storage at the various sites.
Replication permits certain data to be copied and stored in more than one site.
The choice of sites and the degree of replication depend on the performance and availability
goals of the system and on the types and frequencies of transactions submitted at each site.

4. Is availability satisfied by replication or fragmentation?

Availability is satisfied by replication. A fragment is a subset of the database that is allocated to
one site of the DDBS. Replication, on the other hand, stores a fragment at more than one site,
which improves the availability of the data.

5. Where do trigger updates usually happen?


Trigger updates happen in active databases. A trigger is a technique for specifying certain types of
active rules, that is, actions that are automatically triggered by certain events.
It uses the Event-Condition-Action (ECA) model, which has three components:
1. The event(s) that trigger the rule. These events are usually database update operations,
e.g. (inserting a new employee tuple)
2. The condition that determines whether the rule action should be executed,
e.g. (the new employee is assigned to a department, i.e. the Dno attribute is not NULL)
3. The action to be taken (a sequence of SQL statements),
e.g. (automatically update the value of Total_sal for the employee’s department)
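
The three ECA components can be sketched with an SQLite trigger run through Python's sqlite3 module. The table layout, the trigger name total_sal_on_insert, and the sample values are assumptions made for illustration; only the Dno and Total_sal attributes come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    dno       INTEGER PRIMARY KEY,
    total_sal REAL NOT NULL DEFAULT 0
);
CREATE TABLE employee (
    ssn    TEXT PRIMARY KEY,
    salary REAL NOT NULL,
    dno    INTEGER REFERENCES department(dno)
);

CREATE TRIGGER total_sal_on_insert
AFTER INSERT ON employee              -- event: a new employee tuple is inserted
WHEN NEW.dno IS NOT NULL              -- condition: the employee has a department
BEGIN                                 -- action: update the department total
    UPDATE department
    SET total_sal = total_sal + NEW.salary
    WHERE dno = NEW.dno;
END;
""")

conn.execute("INSERT INTO department (dno) VALUES (5)")
conn.execute("INSERT INTO employee VALUES ('111', 30000, 5)")     # fires the action
conn.execute("INSERT INTO employee VALUES ('222', 40000, NULL)")  # condition is false

total = conn.execute(
    "SELECT total_sal FROM department WHERE dno = 5").fetchone()[0]
```

Only the first insert changes Total_sal, because the second employee has no department.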

6. What is durability or permanency?


Durability (permanency) means that the changes applied to the database by a committed transaction
must persist in the database. These changes must not be lost because of any failure.
The durability property is the responsibility of the recovery subsystem of the DBMS.

7. What is a critical section?


The lock_item and unlock_item operations must be implemented as indivisible units (known as
critical sections in operating systems); that is, no interleaving should be allowed once a lock or
unlock operation is started until the operation terminates or the transaction waits. (Isolation)
A critical section, CS, is a section of code in which a process accesses shared resources. To avoid
race conditions, the execution of critical sections must be mutually exclusive (i.e., at most one
process can be in its critical section at any time).
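
Mutual exclusion can be sketched with threads and a lock from Python's threading module. The shared counter, the deposit function, and the thread counts are illustrative assumptions.

```python
import threading

counter = 0                      # the shared resource
lock = threading.Lock()


def deposit(times):
    global counter
    for _ in range(times):
        with lock:               # enter the critical section
            counter += 1         # only one thread at a time runs this line
        # the lock is released on leaving the with-block


threads = [threading.Thread(target=deposit, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because every update happens while holding the lock, the final value of counter is the total number of increments.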

8. How would you secure a database, whether it is centralized or distributed?

To protect databases, it is common to implement four kinds of control measures:
1. Access control. (Preventing unauthorized persons from accessing the system by creating
user accounts and passwords.)
2. Inference control. (Statistical databases must ensure that information about individuals
cannot be accessed; only summary statistics are permitted.)
3. Flow control. (Preventing information from flowing in such a way that it reaches unauthorized
users.)
4. Encryption. (Protecting sensitive data, such as credit card numbers, that is transmitted via some
type of communications network.)
There are two types of database security mechanisms:
1. Discretionary security mechanisms. These are used to grant privileges to users, including the
capability to access specific data files in a specified mode (such as read or update).
2. Mandatory security mechanisms. These are used to enforce multilevel security by classifying
the data and users into various security classes (or levels).
Role-based access control (RBAC) is a technology for managing and enforcing security in
large-scale enterprise-wide systems. Its basic notion is that privileges and other permissions are
associated with organizational roles, rather than individual users. Individual users are then
assigned to appropriate roles.
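
The RBAC idea can be sketched as two mappings: roles to privileges, and users to roles. All role names, user names, and privileges below are hypothetical.

```python
# Privileges are attached to roles, not to individual users.
ROLE_PRIVILEGES = {
    "clerk": {("EMPLOYEE", "SELECT")},
    "hr_manager": {("EMPLOYEE", "SELECT"), ("EMPLOYEE", "UPDATE")},
}

# Users are assigned to roles.
USER_ROLES = {
    "alice": {"hr_manager"},
    "bob": {"clerk"},
}


def is_allowed(user, table, action):
    """A user may act only if one of the user's roles carries the privilege."""
    for role in USER_ROLES.get(user, set()):
        if (table, action) in ROLE_PRIVILEGES.get(role, set()):
            return True
    return False


checks = {
    "alice_update": is_allowed("alice", "EMPLOYEE", "UPDATE"),
    "bob_update": is_allowed("bob", "EMPLOYEE", "UPDATE"),
    "bob_select": is_allowed("bob", "EMPLOYEE", "SELECT"),
}
```

Changing what a clerk may do means editing one role entry rather than every clerk's account.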

9. Compare security and privacy. Which one requires the other?

There is a considerable overlap between issues related to access to resources (security) and
issues related to appropriate use of information (privacy).
In summary, security involves technology to ensure that information is appropriately protected.
Security is a required building block for privacy to exist.
Security: protecting a system from unauthorized use, including authentication of
users, information encryption, access control, firewall policies, and intrusion detection.
Privacy: the ability of individuals to control the terms under which their personal
information is acquired and used. The concept of privacy goes beyond security: it also covers
preventing the storage of personal information and ensuring its appropriate use.

10. What are the availability and reliability of data? What is the difference between them?

Reliability and availability are two of the most common potential advantages cited for
distributed databases.
Availability is broadly defined as the probability that a system is running (not down) at a certain
time point, whereas reliability is the probability that the system runs continuously, without failure,
during a time interval.
We can directly relate reliability and availability of the database to the faults, errors, and
failures associated with it.
A failure can be described as a deviation of a system’s behavior from that which is specified in
order to ensure correct execution of operations.
Errors constitute that subset of system states that causes the failure.
Fault is the cause of an error.

11. When do we choose a centralized or a distributed database?


1. Location: if we have a company with many branches, we need a distributed database.
2. Size of data: if there is a high load on one central DB, we can use a distributed DB instead to
distribute the load and increase performance.

12. What is normalization? What is the issue of redundancy (replication)?

Normalization is a database design technique that organizes tables in a manner that reduces
data redundancy and undesirable dependencies.
Minimizing redundancy reduces the need for multiple updates to maintain consistency across
multiple copies of the same information.
The issues of redundancy are:
1. An Insert Anomaly occurs when certain attributes cannot be inserted into the database
without the presence of other attributes.
2. An Update Anomaly exists when one or more instances of duplicated data are updated, but
not all.
3. A Delete Anomaly exists when certain attributes are lost because of the deletion of other
attributes.
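
The update anomaly can be sketched by comparing a table that repeats the department name in every employee row with a normalized pair of tables. The table names, column names, and sample rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Unnormalized: the department name is repeated in every employee row.
CREATE TABLE emp_dept (ename TEXT, dno INTEGER, dname TEXT);
INSERT INTO emp_dept VALUES ('Smith', 5, 'Research');
INSERT INTO emp_dept VALUES ('Wong', 5, 'Research');

-- Normalized: the department name is stored exactly once.
CREATE TABLE department (dno INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE employee (ename TEXT, dno INTEGER REFERENCES department(dno));
INSERT INTO department VALUES (5, 'Research');
INSERT INTO employee VALUES ('Smith', 5);
INSERT INTO employee VALUES ('Wong', 5);
""")

# Update anomaly: only one of the duplicated values is changed.
conn.execute("UPDATE emp_dept SET dname = 'R&D' WHERE ename = 'Smith'")
names_flat = {row[0] for row in conn.execute(
    "SELECT dname FROM emp_dept WHERE dno = 5")}

# Normalized design: one update changes the single stored copy.
conn.execute("UPDATE department SET dname = 'R&D' WHERE dno = 5")
names_norm = {row[0] for row in conn.execute(
    "SELECT d.dname FROM employee e JOIN department d ON e.dno = d.dno")}
```

After the partial update, department 5 has two different names in the unnormalized table, while the normalized tables still give a single name.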

13. Why do we replicate the data?


To increase the reliability and availability of data.

14. Compare replication and mirroring.


The difference is that mirroring refers to copying a database to another location, whereas
replication copies data and database objects from one database to another
database.

15. What is the relation of availability and consistency with fragmentation and replication?

Fragmentation and replication enhance availability but produce consistency complications.
They improve availability because the system can continue to operate as long as at least one
copy is up. However, they can slow down update operations, since a single logical update must be
performed on every copy of the database to keep the copies consistent.
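
This trade-off can be sketched with a read-one, write-all policy. The policy, the class ReplicatedItem, and its method names are assumptions chosen for illustration; the source does not prescribe a specific replica control protocol.

```python
class ReplicatedItem:
    """One logical data item stored as a copy at each of several sites."""

    def __init__(self, sites, value):
        self.copies = {site: value for site in sites}
        self.up = set(sites)                  # sites that are currently reachable

    def fail(self, site):
        self.up.discard(site)

    def read(self):
        # Reads stay available as long as at least one copy is up.
        for site, value in self.copies.items():
            if site in self.up:
                return value
        raise RuntimeError("no copy available")

    def write(self, value):
        # One logical update must be applied to every copy.
        if self.up != set(self.copies):
            raise RuntimeError("write-all requires every copy to be reachable")
        for site in self.copies:
            self.copies[site] = value


item = ReplicatedItem(["A", "B", "C"], 10)
item.write(20)            # three physical writes for one logical update
item.fail("A")
value_after_failure = item.read()

try:
    item.write(30)
    write_blocked = False
except RuntimeError:
    write_blocked = True
```

With one site down, the read still succeeds, but the strict write-all update cannot proceed.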

16. What do we mean by sensitive data and by a financial transaction?


Sensitivity of data is a measure of the importance assigned to the data by its owner, for the
purpose of denoting its need for protection. Some databases contain only sensitive data while
other databases may contain no sensitive data at all.
A financial transaction is an agreement, or communication, carried out between a buyer and a
seller to exchange an asset for payment. It involves a change in the status of the finances of two
or more businesses or individuals. Financial data is often considered confidential, and only
authorized persons are allowed to access it.

17. Name at least one security protocol used between two ends.

Encryption techniques used to secure communication between two ends include:
1. Symmetric keys (one key for both encryption and decryption).
2. Asymmetric keys (two keys, public and private, are used for encryption and decryption).
3. Digital certificates (combine a public key with the identity of the person who holds the
corresponding private key; they are issued by a certification authority (CA)).
---------------------------------------------
1. Big Data:
Big Data is a collection of data that is huge in volume and keeps growing exponentially with time.
Its size and complexity are so large that no traditional data management tool can
store or process it efficiently. In short, big data is still data, but of huge size.
Big Data can be 1) structured, 2) unstructured, or 3) semi-structured.
The characteristics of Big Data are: Volume (big size), Variety (heterogeneous sources and
nature of data: structured and unstructured), Velocity (the speed of generation), and Variability
(inconsistency).
The advantages of Big Data are: improved customer service, better operational efficiency, and better
decision making.

2. What is the main issue in a distributed database?

The main issues are consistency and communication among sites.
DDBs are complex and need extra functions:
1. Keeping track of data distribution.
2. The ability to access remote sites and transmit queries.
3. Distributed transaction management.
4. Maintaining the consistency of copies of a replicated data item.
5. Distributed database recovery. There are new types of failures (such as the failure of
communication links).
6. Security.

3. What are the advantages of a distributed database?


1. Improved ease and flexibility of application development.
2. Increased reliability and availability.
3. Improved performance. Data localization reduces the contention for CPU and I/O
services and simultaneously reduces access delays.
4. Easier expansion.

4. What are the well-known replication techniques?


1. Full replication: the whole database is stored at every site. This can improve availability and
the performance of retrieval for global queries. The disadvantage is that it can slow down
update operations, since every copy must be kept consistent, and it makes the concurrency control
and recovery techniques more expensive.
2. No replication: each fragment is stored at exactly one site.
3. Partial replication: some fragments of the database may be replicated whereas others may
not.
The choice of the degree of replication depends on the performance and availability goals of the
system and on the types and frequencies of transactions submitted at each site. It is a complex
optimization problem.

5. What are polymorphism, abstraction, and inheritance?


Data abstraction generally refers to the suppression of details of data organization and storage,
and the highlighting of the essential features for an improved understanding of data.
Operator polymorphism (or overloading) is an OO concept that refers to an operation's ability
to be applied to different types of objects; i.e., one operation name may refer to several distinct
implementations, depending on the type of object it is applied to.
Inheritance allows the definition of new types (subtypes) based on other predefined types
(supertypes), leading to a type (or class) hierarchy.
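
The three concepts can be sketched in a few classes. The shape hierarchy below is a hypothetical example, not taken from the source.

```python
import math
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstraction: callers see only area(), not how a shape is stored."""

    @abstractmethod
    def area(self) -> float:
        ...


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


class Square(Rectangle):
    """Inheritance: a subtype that reuses the supertype's area()."""

    def __init__(self, side):
        super().__init__(side, side)


# Polymorphism: one operation name, a different implementation per type.
shapes = [Circle(1), Rectangle(2, 3), Square(2)]
areas = [shape.area() for shape in shapes]
```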

6. What are the benefits of consistency and how is it achieved?


Consistency: A consistent state of the database satisfies the constraints specified in the schema
as well as any other constraints on the database that should hold.
Consistency preservation. A transaction should be consistency preserving, meaning that if it is
completely executed from beginning to end without interference from other transactions, it
should take the database from one consistent state to another.
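
Consistency preservation can be sketched with a CHECK constraint and a transaction that is rolled back when the constraint is violated. The account table, the transfer function, and the amounts are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"   # schema constraint
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(amount):
    """Move money from A to B as one transaction; undo it on a violation."""
    try:
        with conn:   # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = 'B'", (amount,))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = 'A'", (amount,))
        return True
    except sqlite3.IntegrityError:
        return False


first = transfer(30)     # valid: both updates are committed
second = transfer(500)   # would make A negative: both updates are undone
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

The failed transfer leaves no partial credit on account B, so the database moves only between consistent states.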

7. What are the fragmentation types? Which one is better?


1. Horizontal Fragmentation. It divides a relation horizontally by grouping rows to create
subsets of tuples, where each subset has a certain logical meaning. For example, we may want to
store the database information relating to each department at the computer site for
that department.
2. Vertical Fragmentation. It divides a relation “vertically” by columns. A vertical fragment
of a relation keeps only certain attributes of the relation. It is necessary to include the
primary key or some candidate key attribute in every vertical fragment so that the full
relation can be reconstructed from the fragments.
3. Mixed (Hybrid) Fragmentation. We can intermix the two types of fragmentation,
yielding a mixed fragmentation.
Neither type is better in general; the choice depends on the requirements of the system.
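
Horizontal and vertical fragmentation, and the reconstruction of the full relation, can be sketched on a small in-memory relation. The EMPLOYEE rows and the helper names horizontal and vertical are illustrative assumptions.

```python
EMPLOYEE = [
    {"ssn": "111", "name": "Smith", "salary": 30000, "dno": 5},
    {"ssn": "222", "name": "Wong", "salary": 40000, "dno": 5},
    {"ssn": "333", "name": "Zelaya", "salary": 25000, "dno": 4},
]


def horizontal(relation, predicate):
    """Horizontal fragment: the subset of rows that satisfy a condition."""
    return [row for row in relation if predicate(row)]


def vertical(relation, attrs, key="ssn"):
    """Vertical fragment: the chosen columns, always including the key."""
    columns = [key] + [a for a in attrs if a != key]
    return [{c: row[c] for c in columns} for row in relation]


# One horizontal fragment per department; their union rebuilds the relation.
dept5 = horizontal(EMPLOYEE, lambda row: row["dno"] == 5)
dept4 = horizontal(EMPLOYEE, lambda row: row["dno"] == 4)
rebuilt_h = sorted(dept5 + dept4, key=lambda row: row["ssn"])

# Two vertical fragments; joining them on the key rebuilds the relation.
personal = vertical(EMPLOYEE, ["name"])
work = vertical(EMPLOYEE, ["salary", "dno"])
work_by_key = {row["ssn"]: row for row in work}
rebuilt_v = [{**p, **work_by_key[p["ssn"]]} for p in personal]
```

The join works only because every vertical fragment carries the key attribute.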

8. What is security in a distributed database?


Distributed transactions must be executed with the proper management of the security of the
data and the authorization/access privileges of users.

9. What is a relational DB? How is it represented?


A relational database is a collection of data items with pre-defined relationships between them.
These items are organized as a set of tables with columns and rows.

Q. What is the drawback of redundancy?

Minimizing redundancy reduces the need for multiple updates to maintain consistency across
multiple copies of the same information.
The issues of redundancy are:
1. An Insert Anomaly occurs when certain attributes cannot be inserted into the database
without the presence of other attributes.
2. An Update Anomaly exists when one or more instances of duplicated data are updated, but
not all.
3. A Delete Anomaly exists when certain attributes are lost because of the deletion of other
attributes.

Q. What is a reliable database?


• Databases which perform consistently without giving any technical issues are said to be
reliable.
• Reliability depends on data integrity, data safety, and data recoverability.
• Ensuring the reliability of database is an important element of effective application and
utilization of computing resources.
