DATABASE ASSIGNMENT 2
The relational model of data represents all data as tuples, grouped into relations in the
form of tables.
The relational data model is the primary data model and is used widely around the world for
data storage and processing.
The format is as follows:
Tables − In the relational data model, relations are stored as tables. This format
stores the relation among entities. A table has rows and columns, where rows represent
records and columns represent attributes.
Tuple − A single row of a table, which contains a single record for that relation, is called a
tuple.
Relation instance − A finite set of tuples in the relational database system represents a relation
instance. Relation instances do not have duplicate tuples.
Relation schema − A relation schema describes the relation name (table name), attributes,
and their names.
Relation key − Each row has one or more attributes, known as the relation key, which can identify
the row in the relation (table) uniquely.
Attribute domain − Every attribute has some pre-defined value scope, known as the attribute
domain.
Constraints
Every relation has some conditions that must hold for it to be a valid relation. These conditions
are called Relational Integrity Constraints. There are three main integrity constraints −
Key constraints
Domain constraints
Referential integrity constraints
Key Constraints
There must be at least one minimal subset of attributes in the relation that can identify a
tuple uniquely. This minimal subset of attributes is called the key for that relation. If there is
more than one such minimal subset, these are called candidate keys.
Key constraints require that −
in a relation with a key attribute, no two tuples can have identical values for the key
attributes.
a key attribute cannot have NULL values.
Key constraints are also referred to as Entity Constraints.
Domain Constraints
Attributes take specific values in real-world scenarios. For example, age can only be a positive
integer. The same kind of constraint is applied to the attributes of a relation: every
attribute is bound to a specific range of values. For example, age cannot be less than zero
and telephone numbers cannot contain a character outside the digits 0-9.
Referential Integrity Constraints
Referential integrity constraints work on the concept of Foreign Keys. A foreign key is a key
attribute of a relation that can be referred to in another relation.
The referential integrity constraint states that if a relation refers to a key attribute of a different or
the same relation, then that key element must exist.
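The three kinds of constraint can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch: the department and employee tables and their columns are hypothetical and were made up for illustration. Note that SQLite checks foreign keys only after PRAGMA foreign_keys = ON, and that its key columns are declared NOT NULL here explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical tables, invented for this illustration.
conn.execute("""
    CREATE TABLE department (
        dept_code TEXT NOT NULL PRIMARY KEY              -- key constraint
    )""")
conn.execute("""
    CREATE TABLE employee (
        emp_code  TEXT NOT NULL PRIMARY KEY,             -- key constraint
        age       INTEGER NOT NULL CHECK (age >= 0),     -- domain constraint
        dept_code TEXT NOT NULL
                  REFERENCES department(dept_code)       -- referential integrity
    )""")
conn.execute("INSERT INTO department VALUES ('D1')")
conn.execute("INSERT INTO employee VALUES ('E1', 30, 'D1')")


def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "duplicate key": violates("INSERT INTO employee VALUES ('E1', 41, 'D1')"),
    "NULL key": violates("INSERT INTO employee VALUES (NULL, 41, 'D1')"),
    "negative age": violates("INSERT INTO employee VALUES ('E2', -5, 'D1')"),
    "unknown department": violates("INSERT INTO employee VALUES ('E3', 25, 'D9')"),
}
for case, rejected in results.items():
    print(case, "rejected" if rejected else "accepted")
```

Each of the four inserts breaks exactly one constraint, so the only row left in employee is the valid one inserted at the start.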
1b)
(i) The entity integrity constraint states that a primary key value cannot be null. This is
because the primary key value is used to identify individual rows in a relation, and if
the primary key has a null value, we cannot identify those rows. A table can
contain null values in fields other than the primary key.
(ii) A foreign key relationship allows you to declare that an index in one table is
related to an index in another and allows you to place constraints on what may be
done to the table containing the foreign key.
Referential integrity is the state of a database in which all values of all foreign
keys are valid. A foreign key is a column or a set of columns in a table whose
values are required to match at least one primary key or unique key value of a row
in its parent table.
A referential constraint is the rule that the values of the foreign key are valid only
if one of the following conditions is true:
They appear as values of a parent key.
Some component of the foreign key is null.
(iii) Semantic integrity ensures that data entered into a row reflects an allowable value
for that row. The value must be within the domain, or allowable set of values, for
that column. For example, the quantity column of the items table permits only
numbers. If a value outside the domain can be entered into a column, the semantic
integrity of the data is violated.
1c)
(i)
Functional Dependency (FD) describes the relation of one attribute to another
attribute in a database management system (DBMS). Functional
dependency helps you to maintain the quality of data in the database. A functional
dependency is denoted by an arrow →. The functional dependency of Y on X is
represented by X → Y: the value of X determines the value of Y. Functional
dependency plays a vital role in telling good database design from bad. A
functional dependency is a relationship between two attributes, typically between
the PK and other non-key attributes within a table.
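A functional dependency X → Y can be tested against a table instance: no two rows may agree on X and differ on Y. The sketch below assumes rows are given as Python dicts; the students data and the name fd_holds are hypothetical. An instance can only show that a dependency fails; it cannot prove that the dependency holds for the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in this instance.

    rows is a list of dicts (one per tuple); lhs and rhs are lists of
    attribute names.
    """
    seen = {}  # maps each X-value to the Y-value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # same X, different Y: the dependency is violated
    return True


# Hypothetical instance, made up for illustration.
students = [
    {"student_id": 1, "name": "Ann", "course": "DB"},
    {"student_id": 1, "name": "Ann", "course": "OS"},
    {"student_id": 2, "name": "Bob", "course": "DB"},
]

id_determines_name = fd_holds(students, ["student_id"], ["name"])
id_determines_course = fd_holds(students, ["student_id"], ["course"])
print(id_determines_name, id_determines_course)
```

Here student_id → name holds, while student_id → course does not, because student 1 appears with two different courses.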
(ii) A relational schema R is in Boyce–Codd normal form if and only if for every one
of its dependencies X → Y, at least one of the following conditions holds:
⁕ X → Y is a trivial functional dependency
⁕ X is a superkey for schema R
Boyce-Codd Normal Form (BCNF) is one of the forms of database normalization.
A database table is in BCNF if and only if there are no non-trivial functional
dependencies of attributes on anything other than a superset of a candidate key.
(iii) 3NF states that all data in a table must depend only on that table’s primary key,
and not on any other field in the table. A relation is said to be in Third Normal
Form (3NF) if it is in 2NF and no non-key attribute is transitively dependent
on the primary key, i.e., there is no transitive dependency.
2a)
A functional dependency is a constraint between two sets of attributes in a relation.
When an attribute is functionally dependent on an entire composite key, and not on any
proper subset of the composite key, it is said to be fully functionally dependent.
2b)
2b (ii)
AB → C, DE → A
2b (iii)
DE → A
2b (iv) R(A, B, C, D, E)
AB⁺ = {A, B, C, D, E}
BC⁺ = {B, C, D, E, A}
CD⁺ = {C, D, E, A}
DE⁺ = {D, E, A}
3a)
R(A, B, C)
F = {AB → C, C → B}
(i) The schema is in 3NF: AB and AC are the candidate keys, so every attribute is prime
and there are no non-prime attributes that could violate 3NF.
(ii) The schema is not in BCNF, since C is not a superkey and we have C → B.
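The answer to 3a can be checked mechanically by computing attribute closures for R(A, B, C) with F = {AB → C, C → B}. The following is a small sketch, not a general-purpose normalization tool; the function names are made up for illustration, and the brute-force key search is only practical for a handful of attributes.

```python
from itertools import combinations

ATTRS = frozenset("ABC")
FDS = [(frozenset("AB"), frozenset("C")), (frozenset("C"), frozenset("B"))]


def closure(attrs, fds):
    """Return the closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def candidate_keys(attrs, fds):
    """Return all minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            subset = frozenset(combo)
            is_minimal = not any(key <= subset for key in keys)
            if is_minimal and closure(subset, fds) == attrs:
                keys.append(subset)
    return keys


keys = candidate_keys(ATTRS, FDS)
prime = frozenset().union(*keys)  # attributes that occur in some candidate key

# BCNF: every non-trivial dependency must have a superkey on its left side.
bcnf_violations = [
    (lhs, rhs) for lhs, rhs in FDS
    if not rhs <= lhs and closure(lhs, FDS) != ATTRS
]
# 3NF additionally tolerates a violation whose right side is all prime.
third_nf_violations = [
    (lhs, rhs) for lhs, rhs in bcnf_violations if not (rhs - lhs) <= prime
]

print(sorted("".join(sorted(k)) for k in keys))
print(len(bcnf_violations), len(third_nf_violations))
```

The search finds the candidate keys AB and AC, one BCNF violation (C → B), and no 3NF violation, because B is a prime attribute.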
3b)
(i)
(ii)
A dependency preserving decomposition means that if we decompose a relation R into relations
R1 and R2, every dependency of R must either be part of R1 or R2 or be derivable from
the combination of the functional dependencies (FDs) of R1 and R2.
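Dependency preservation can be tested without listing every projected dependency: for each X → Y, grow X using only closures restricted to one decomposed relation at a time, then check whether Y was reached. The sketch below applies this to an assumed example, the schema R(A, B, C) with F = {AB → C, C → B} split into R1(A, C) and R2(B, C). That decomposition is an assumption chosen for illustration, not necessarily the one intended in 3b.

```python
def closure(attrs, fds):
    """Return the closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_preserved(lhs, rhs, fds, decomposition):
    """Return True if lhs -> rhs follows from the dependencies projected
    onto the relations of the decomposition."""
    reached = set(lhs)
    changed = True
    while changed:
        changed = False
        for relation in decomposition:
            # Attributes derivable while staying inside this one relation.
            gained = closure(reached & relation, fds) & relation
            if not gained <= reached:
                reached |= gained
                changed = True
    return rhs <= reached


FDS = [(frozenset("AB"), frozenset("C")), (frozenset("C"), frozenset("B"))]
DECOMPOSITION = [frozenset("AC"), frozenset("BC")]  # assumed split on C -> B

lost = [
    (lhs, rhs) for lhs, rhs in FDS
    if not is_preserved(lhs, rhs, FDS, DECOMPOSITION)
]
print([("".join(sorted(l)), "".join(sorted(r))) for l, r in lost])
```

For this assumed split, C → B survives in R2 but AB → C is lost, since no single decomposed relation contains A, B and C together.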
3c)
4a)
Projection (π)
Projection is used to select the required columns of data from a relation. Note that projection
Selection (σ)
Selection is used to select the required tuples of data from a relation. During selection, we can
Rename (ρ)
Cross product (✕)
Cross product is used to combine data from two different relations into one combined relation.
If we consider two relations, A with n tuples and B with m tuples, A ✕ B will consist
of n × m tuples.
Natural join (⋈)
Natural join between two or more relations will result in all the combination of tuples where
Conditional join
Conditional join is similar to the natural join but in the conditional join, we can specify any
join condition with the operators greater than, less than, equal or not equal. You can combine
Union (⋃)
The union operation in RA is very similar to that of set theory. However, for the union of two
relations, both the relations must have the same set of attributes.
Intersection (⋂)
The intersection operation in RA is very similar to that of set theory. However, for the
intersection of two relations, both the relations must have the same set of attributes.
Set difference (−)
The set difference operation in RA is very similar to that of set theory. However, for the set
difference between two relations, both the relations must have the same set of attributes.
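The operators above can be imitated with Python sets. In this sketch a relation is a set of tuples and each tuple is a frozenset of (attribute, value) pairs, so duplicates disappear automatically. The helper names and the sample relations A and B are hypothetical, and the natural join shown uses the standard definition: tuples are combined when they agree on all shared attributes.

```python
def relation(rows):
    """Build a relation (a set of tuples) from a list of dicts."""
    return {frozenset(row.items()) for row in rows}


def project(rel, attrs):
    """Keep only the named columns; duplicate tuples collapse."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}


def select(rel, predicate):
    """Keep only the tuples for which predicate(tuple_as_dict) is true."""
    return {t for t in rel if predicate(dict(t))}


def rename(rel, mapping):
    """Rename attributes according to mapping (old name -> new name)."""
    return {frozenset((mapping.get(a, a), v) for a, v in t) for t in rel}


def cross(r, s):
    """Pair every tuple of r with every tuple of s (attribute names must differ)."""
    return {t | u for t in r for u in s}


def natural_join(r, s):
    """Combine tuples that agree on all attributes the relations share."""
    joined = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                joined.add(t | u)
    return joined


# Hypothetical relations: A has n = 3 tuples, B has m = 2 tuples.
A = relation([
    {"emp": "Ann", "dept": "D1"},
    {"emp": "Bob", "dept": "D2"},
    {"emp": "Cy", "dept": "D1"},
])
B = relation([
    {"dept": "D1", "city": "Oslo"},
    {"dept": "D2", "city": "Rome"},
])

depts = project(A, {"dept"})
in_d1 = select(A, lambda t: t["dept"] == "D1")
product = cross(A, rename(B, {"dept": "b_dept"}))
joined = natural_join(A, B)

# Union, intersection and difference need relations over the same attributes.
other = relation([{"dept": "D1"}, {"dept": "D3"}])
union, common, only_in_a = depts | other, depts & other, depts - other
print(len(depts), len(in_d1), len(product), len(joined))
```

Union, intersection and set difference map directly onto Python's |, & and - operators, which is why both operands must be built over the same set of attributes.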
4b)
5a)
(i)
SELECT name
FROM MovieStar
WHERE gender = 'Male';
(ii)
SELECT title
FROM Movie
WHERE
6a)
(i)
However, the person who completes the authentication process first will be able
to get the money. In this case, the OLTP system makes sure that the withdrawn
amount is never more than the amount present in the bank. The key point to
note here is that OLTP systems are optimized for transactional superiority
instead of data analysis.
Online banking
Online airline ticket booking
Sending a text message
Order entry
(ii)
(iv)
On rolling up, the data is aggregated by ascending the location hierarchy from the level of
city to the level of country.
The roll-up operation groups the data by levels of temperature. The roll down operation
(also called drill down) is the reverse of roll up. It navigates from less detailed data to more
detailed data. It can be realized by either stepping down a concept hierarchy for a
dimension or introducing additional dimensions.
The slice operation selects one particular dimension from a given cube and provides a new
sub-cube.
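Roll-up and slice can be illustrated on a tiny cube held in a Python dict. Everything in this sketch is hypothetical: the cells, the city-to-country hierarchy and the function names were invented for illustration, with cells keyed by (city, quarter, item) and holding units sold.

```python
from collections import defaultdict

# Hypothetical cube cells: (city, quarter, item) -> units sold.
CUBE = {
    ("Toronto", "Q1", "Phone"): 10,
    ("Vancouver", "Q1", "Phone"): 5,
    ("Chicago", "Q1", "Phone"): 7,
    ("Toronto", "Q2", "Phone"): 4,
    ("Chicago", "Q2", "Modem"): 3,
}
# Hypothetical concept hierarchy for the location dimension.
CITY_TO_COUNTRY = {"Toronto": "Canada", "Vancouver": "Canada", "Chicago": "USA"}


def roll_up_location(cube):
    """Aggregate by climbing the location hierarchy from city to country."""
    rolled = defaultdict(int)
    for (city, quarter, item), units in cube.items():
        rolled[(CITY_TO_COUNTRY[city], quarter, item)] += units
    return dict(rolled)


def slice_quarter(cube, quarter):
    """Fix the time dimension to one quarter, giving a smaller sub-cube."""
    return {
        (location, item): units
        for (location, q, item), units in cube.items()
        if q == quarter
    }


by_country = roll_up_location(CUBE)
q1_only = slice_quarter(CUBE, "Q1")
print(by_country[("Canada", "Q1", "Phone")], len(q1_only))
```

Drill-down is the reverse direction: going from by_country back to CUBE requires the more detailed city-level cells, which is why the detailed data has to be kept.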
6b)
The process of creating a star schema involves distilling down our full schema into just
relevant features for a particular analytic purpose. The general structure of the star schema is
as follows.
1. Facts: Metrics of a business process. These are generally numeric and additive (e.g.
amount of an invoice or the number of invoices), or quantitative. The fact table also
contains keys pointing to relevant dimension tables. There is just one fact table at the
2. Dimensions: The where, when, what, etc. (e.g. date/time, locations, goods sold). These
typically contain qualitative information. There are multiple dimension tables in the
schema, all of which are related to the fact table.
6c)
Atomicity
By this, we mean that either the entire transaction takes place at once or doesn’t happen at
all. There is no midway i.e. transactions do not occur partially. Each transaction is
considered as one unit and either runs to completion or is not executed at all. It involves the
following two operations.
Consistency
This means that integrity constraints must be maintained so that the database is consistent
before and after the transaction. It refers to the correctness of a database.
Isolation
This property ensures that multiple transactions can occur concurrently without leading to
an inconsistent database state. Transactions occur independently without interference.
Changes made in a particular transaction will not be visible to any other transaction until
that change has been committed. This
property ensures that executing transactions concurrently results in a state that is
equivalent to a state achieved if they were executed serially in some order.
Durability
This property ensures that once the transaction has completed execution, the updates and
modifications to the database are stored in and written to disk and they persist even if a
system failure occurs. These updates now become permanent and are stored in non-volatile
memory. The effects of the transaction, thus, are never lost.
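Atomicity and consistency can be observed with Python's built-in sqlite3 module. In this minimal sketch, the account table, the names and the balances are hypothetical. A transfer credits one account and debits another inside a single transaction, and a CHECK constraint forbids negative balances. The database lives in memory, so durability is not demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the CHECK constraint keeps balances non-negative.
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer(conn, "alice", "bob", 30)       # both updates are applied
failed = transfer(conn, "alice", "bob", 500)  # debit breaks the CHECK constraint
print(ok, failed, balances(conn))
```

In the second transfer the credit to bob has already been executed when the debit fails, yet the rollback undoes it as well, so the transaction takes effect either completely or not at all.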