CS3492-QB (1)
3 0 0 3
OBJECTIVES
To learn the fundamentals of data models, relational algebra and SQL
To represent a database system using ER diagrams and to learn normalization techniques
To understand the fundamental concepts of transaction, concurrency and recovery processing
To understand the internal storage structures using different file and indexing techniques which will
help in physical DB design
To have an introductory knowledge about the Distributed databases, NOSQL and database
security
UNIT 3 TRANSACTIONS
Transaction Concepts – ACID Properties – Schedules – Serializability – Transaction support in
SQL – Need for Concurrency – Concurrency control – Two Phase Locking – Timestamp –
Multiversion – Validation and Snapshot isolation – Multiple Granularity locking – Deadlock
Handling – Recovery Concepts – Recovery based on deferred and immediate update – Shadow
paging – ARIES Algorithm
TEXT BOOKS:
1. Abraham Silberschatz, Henry F. Korth, S. Sudarshan, "Database System Concepts", Seventh
Edition, McGraw Hill, 2020.
2. Ramez Elmasri, Shamkant B. Navathe, "Fundamentals of Database Systems", Seventh Edition,
Pearson Education, 2017.
REFERENCES:
1. C. J. Date, A. Kannan, S. Swamynathan, "An Introduction to Database Systems", Eighth
Edition, Pearson Education, 2006.
COURSE OUTCOME
At the end of the course, students will be able to
PART A
1. What is a database? (R)
A database is an organized collection of interrelated data, stored electronically so that it can be
retrieved efficiently. Databases are organized by fields, records and files or tables.
14. What are the different data models? (R) (MAY 2012)
Relational Data Model
The Entity-Relationship Data Model
Object-Based Data Model
Semi structured Data Model
15. What is a relational model? (R) (MAY 2010)
The relational model uses a collection of tables to represent both data and the relationships
among those data. Each table has multiple columns, and each column has a unique name.
31. What is the use of embedded SQL? (U) (MAY 2012) (NOV 2014)
A fundamental principle underlying embedded SQL, which we call the dual-mode
principle, is that any SQL statement that can be used interactively can also be embedded in an
application program.
32. Write short note on OPEN, FETCH and CLOSE statements. (R)
OPEN:
EXEC SQL OPEN < Cursor name>;
Opens the specified cursor. A set of rows is thus identified and becomes the current active set for
the cursor.
FETCH:
EXEC SQL FETCH <Cursor name>
INTO <host variable reference commalist>;
Advances the specified cursor to the next row in the active set.
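CLOSE:
EXEC SQL CLOSE <Cursor name>;
Closes the specified cursor, which then no longer has a current active set.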
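The OPEN, FETCH and CLOSE cycle can also be illustrated outside embedded SQL. The sketch below uses Python's built-in sqlite3 module instead of EXEC SQL in C, so it is an analogy and not embedded SQL itself. The table student and its three rows are made-up sample data, and names_above is a hypothetical helper name. Executing the query plays the role of OPEN, fetchone() plays the role of FETCH, and close() plays the role of CLOSE.

```python
import sqlite3

def names_above(threshold):
    """Return names of students whose mark exceeds threshold, one row at a time."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE student (student_no INTEGER, name TEXT, mark INTEGER)")
    conn.executemany(
        "INSERT INTO student VALUES (?, ?, ?)",
        [(1, "Asha", 95), (2, "Ravi", 72), (3, "Meena", 91)],  # hypothetical rows
    )
    cur = conn.cursor()
    # "OPEN": executing the query identifies the active set of rows
    cur.execute(
        "SELECT name FROM student WHERE mark > ? ORDER BY student_no", (threshold,)
    )
    names = []
    while True:
        row = cur.fetchone()   # "FETCH": advance to the next row of the active set
        if row is None:        # the active set is exhausted
            break
        names.append(row[0])
    cur.close()                # "CLOSE": release the cursor
    conn.close()
    return names
```

Calling names_above(90) walks the active set row by row, which mirrors how a host program would loop over FETCH until no more rows are returned.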
49. Give the usage of the rename operation with an example. (MAY 2010) (U)
1. Lossless Decomposition
2. Dependency Preservation
3. Lack of Data Redundancy
Key A key is a single or combination of multiple fields. Its purpose is to access or retrieve
1. Explain the system structure of a database system with neat block diagram. (R) (DEC 2007),
(MAY 2010) (MAY 2012)
2. What is a data model? Explain various data model for describing the design of a database at the
logical level. (U) (APRIL 2008), (MAY 2010)
3. Explain the difference between physical and logical data independence with an example. (U) (APRIL 2008)
5. Describe the three-schema architecture. Why do we need mapping between schema Levels? How do
different schema definition languages support this architecture? (R) (DEC 2008)
1. What is the notation used in E-R diagram? Explain the E-R model structure with Example. (U) (NOV 2014)
2. Explain the role and functions of the database administrator. (R)
iv. Views
4. Develop an Entity Relationship model for a library management system. Clearly State
the problem Definition, Description, Business Rule and any assumption you make. (AP)
(MAY 2009) (NOV 2014)
12. (i) State and explain the DDL, DML and DCL commands with suitable examples. (7)
(ii) Justify the need for embedded SQL. Consider the relation
Student (student no, name, mark and grade). Write the embedded SQL statements in C language to
retrieve all the student records whose mark is more than 90. (6) (NOV/DEC 2017)
13. Discuss about Tuple Relational Calculus and Domain Relational Calculus. (R) (DEC 2008)
(MAY 2012)
14. What are aggregate functions? Explain five built-in aggregate functions. (U) (MAY 2008)
15. Consider the following relations for a company database Application.(AP)(MAY 2009)
(iii) Develop a View that will keep track of the department number, the number of
employees in the department, and the total basic pay expenditure for each
department.
(iv) Develop an SQL query to list the details of employees who have worked in more
than three projects on a day.
16. Explain how dangling tuple may arise and also explain problems that they Cause. (U)
(MAY 2008)
17. Briefly present a survey on Integrity and Security. (U) (MAY 2012)
18. Create an EMPLOYEE table and write the steps for various functions in an SQL Like
add, update, delete, save and join the various attributes of an EMPLOYEE Table.
(C)(MAY 2007)
19. Explain the use of trigger with your own example. (U) (MAY 2010)
20. What is a view? How can it be created? Explain with an example. (R) (MAY 2010)
21. Briefly present a survey on Integrity and Security. (U) (MAY 2012)
Course File
Professor File
Professor Number Name Office
Registration File
Consider a suitable example of tuples/records for the above mentioned tables and
write DML statements (SQL) to answer for the queries listed below:
(iv) For a specific student number, in which courses is the student registered and what is
his/her name?
23. Let relations r1(A, B, C) and r2(C, D, E) have the following properties: r1 has 20000 tuples, r2
has 45000 tuples, 25 tuples of r1 fit on one block and 30 tuples of r2 fit on one block. Estimate
the number of block transfers and seeks required, using each of the following join strategies for
r1 ⋈ r2:
(i) Nested loop join
(ii) Block nested loop join
(iii) Merge join
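The arithmetic for this kind of estimate can be sketched as follows. The formulas are the standard worst-case textbook cost formulas, under assumptions the question does not state: memory holds only one block of each relation, r1 is taken as the outer relation, and for merge join both relations are already sorted on C with one buffer block each. The function names are illustrative only.

```python
from math import ceil

def blocks(tuples, tuples_per_block):
    """Number of disk blocks needed to hold the relation."""
    return ceil(tuples / tuples_per_block)

def nested_loop(n_outer, b_outer, b_inner):
    """(block transfers, seeks): the inner relation is scanned once per outer tuple."""
    return n_outer * b_inner + b_outer, n_outer + b_outer

def block_nested_loop(b_outer, b_inner):
    """(block transfers, seeks): the inner relation is scanned once per outer block."""
    return b_outer * b_inner + b_outer, 2 * b_outer

def merge_join(b_r, b_s, buffer_blocks=1):
    """(block transfers, seeks), assuming both inputs are already sorted."""
    return b_r + b_s, ceil(b_r / buffer_blocks) + ceil(b_s / buffer_blocks)

# Figures taken from the question
n_r1, n_r2 = 20000, 45000
b_r1 = blocks(n_r1, 25)   # 800 blocks
b_r2 = blocks(n_r2, 30)   # 1500 blocks

nl = nested_loop(n_r1, b_r1, b_r2)      # 30,000,800 transfers, 20,800 seeks
bnl = block_nested_loop(b_r1, b_r2)     # 1,200,800 transfers, 1,600 seeks
mj = merge_join(b_r1, b_r2)             # 2,300 transfers, 2,300 seeks
```

Swapping the roles of r1 and r2, or allowing more buffer blocks, changes the numbers, so the assumptions should be stated alongside any answer.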
Hill.(5)
(ii) Find the names of employees who have borrowed all books published by McGraw-Hill. (5)
UNIT II
DATABASE DESIGN
Entity-Relationship model – E-R Diagrams – Enhanced-ER Model – ER-to-Relational
Mapping – Functional Dependencies – Non-loss Decomposition – First, Second, Third
Normal Forms, Dependency Preservation – Boyce/Codd Normal Form – Multi-valued
Dependencies and Fourth Normal Form – Join Dependencies and Fifth Normal Form
PART A
1. What is an entity? (R)
An entity is a 'thing' in the real world with an independent existence.
II / III 129
8. What do you mean by functional dependencies? (U)
(MAY 2010) (NOV 2013) (DEC 2015)
Let R be a relation variable and let X and Y be arbitrary subsets of the
set of attributes of R. Then we say that Y is functionally dependent on X, in
symbols
X → Y,
if and only if each X value in R has associated with it precisely one Y value in R.
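A functional dependency X → Y requires that each X value be associated with exactly one Y value. The sketch below checks that condition against one sample relation value; the function name fd_holds and the supplier rows are hypothetical. Passing the check only shows that this particular set of rows does not violate the dependency, since a functional dependency is a constraint on every legal value of the relvar.

```python
def fd_holds(rows, X, Y):
    """Return True if no two rows agree on attributes X but differ on attributes Y."""
    seen = {}
    for row in rows:
        x_value = tuple(row[a] for a in X)
        y_value = tuple(row[a] for a in Y)
        # remember the first Y value seen for this X value and compare later ones
        if seen.setdefault(x_value, y_value) != y_value:
            return False
    return True

# Hypothetical sample data
suppliers = [
    {"sno": "S1", "city": "Chennai", "status": 20},
    {"sno": "S2", "city": "Madurai", "status": 10},
    {"sno": "S3", "city": "Chennai", "status": 30},
]
```

In this sample, {sno} → {city} holds because every supplier number appears once, while {city} → {status} fails because Chennai is paired with two different status values.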
17. What is multi-valued dependence? (R)
Let R be a relvar, and let A, B and C be subsets of the attributes of R.
Then we say that B is multi-dependent on A, in symbols A →→ B
*{A, B, …, Z}
if and only if every legal value of R is equal to the join of its projections on A, B,
…, Z.
24. What are the aspects of relational model?(R)
The Three aspects of relational model are:
Structural aspect: The data in the database is perceived by
the user as tables, and nothing but tables.
Integrity aspect: Those tables satisfy certain integrity constraints.
Manipulative aspect: The operators available to the user for
manipulating those tables- for example, for purposes of data
retrieval – are operators that derive tables from tables. Of
those operators, three particularly important ones are select,
project and join.
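The three operators named above each take tables and return a table. A minimal sketch, assuming a relation is represented as a list of dictionaries keyed by attribute name (a representation chosen here only for illustration, with made-up employee and department rows):

```python
def select(rel, predicate):
    """Keep the tuples that satisfy the predicate."""
    return [t for t in rel if predicate(t)]

def project(rel, attrs):
    """Keep only the named attributes and drop duplicate tuples."""
    result = []
    for t in rel:
        p = {a: t[a] for a in attrs}
        if p not in result:   # a relation contains no duplicate tuples
            result.append(p)
    return result

def natural_join(r, s):
    """Combine tuples of r and s that agree on all attributes they share."""
    result = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                result.append({**t, **u})
    return result

# Hypothetical sample tables
emp = [
    {"eno": 1, "name": "Asha", "dno": 10},
    {"eno": 2, "name": "Ravi", "dno": 20},
    {"eno": 3, "name": "Meena", "dno": 10},
]
dept = [{"dno": 10, "dname": "Sales"}, {"dno": 20, "dname": "HR"}]
```

Because every operator returns a table, results can be fed into further operators, for example project(natural_join(emp, dept), ["name", "dname"]).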
25. Mention the important points about relational databases. (R)
A set of important points about relational databases:
a. Relational databases store data in the form of tables (logically).
b. The rows of a table are called tuples.
c. The columns of a table are known as attributes.
d. Every attribute has a data type associated with it.
e. Every attribute has a domain, which is the set of all possible
values that can be stored for that attribute.
f. Tables are called relations.
g. Table names are called relation variables (relvars).
28. What is the difference between Data Integrity and Data Security? (NOV 2013)
(U)
Data integrity and data security are two different aspects that make sure
the usability of data is preserved all the time. Main difference between integrity
and security is that integrity deals with the validity of data, while security deals
with protection of data. Backing up, designing suitable user interfaces and error
detection/correction in data are some of the means to preserve integrity, while
authentication/authorization, encryptions and masking are some of the popular
means of data security. Suitable control mechanisms can be used for both
security and integrity.
29. Which operators are called unary operators, and why are they called so?
(NOV 2013) (U)
A unary operator is an operator that takes a single operand in an expression or a statement.
In relational algebra, select, project and rename are called unary operators because each of
them operates on a single relation.
PART B
1. What is the notation used in E-R diagram? Explain the E-R model structure with
(U)(NOV 2014)
2. Develop an Entity Relationship model for a library management system. Clearly
UNIT III
TRANSACTIONS
Transaction Concepts – ACID Properties – Schedules – Serializability –
Transaction support in SQL – Need for Concurrency – Concurrency
control – Two Phase Locking – Timestamp – Multiversion – Validation and
Snapshot isolation – Multiple Granularity locking – Deadlock Handling –
Recovery Concepts – Recovery based on deferred and immediate update
– Shadow paging – ARIES Algorithm
PART – A
Recovery in a database system means, primarily, recovering the database itself; that
is, restoring the database to a correct state after some failure has rendered the
current state incorrect, or at least suspect.
d. Durability: Once a transaction commits, its updates persist in the
database, even if there is a subsequent system crash.
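The effect of commit can be sketched with Python's built-in sqlite3 module. In this sketch a committed update is still visible after the database file is closed and reopened, while an update that was rolled back is not. Reopening the file only stands in for a restart and does not simulate a real crash; the account table, its values and the function name are hypothetical.

```python
import os
import sqlite3
import tempfile

def balance_after_reopen():
    """Commit one update, roll back another, then reopen the file and read the balance."""
    path = os.path.join(tempfile.mkdtemp(), "bank.db")   # throwaway database file
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES (1, 100)")
    conn.commit()        # the insert is now committed
    conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
    conn.rollback()      # the uncommitted update is undone
    conn.close()

    conn = sqlite3.connect(path)   # reopen, standing in for a restart
    (balance,) = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()
    conn.close()
    return balance
```

The value read back is the committed balance of 100, not the rolled-back value of 0.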
12. Define concurrency (MAY 2012) (R)
Concurrency refers to the fact that DBMSs typically allow many
transactions to access the same database at the same time.
13. What are the three problems that any concurrency control mechanism must
address? (R)
The three problems are:
The two-phase locking theorem states: "If all transactions obey the two-phase
locking protocol, then all possible interleaved schedules are serializable."
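Whether a single transaction obeys the two-phase locking protocol can be checked mechanically: once it has released any lock, it must not acquire another. The sketch below assumes a transaction is given as a list of ("lock", item) and ("unlock", item) pairs, a representation invented here for illustration.

```python
def is_two_phase(operations):
    """Return True if no lock is acquired after the first unlock."""
    shrinking = False                 # becomes True once the first lock is released
    for action, _item in operations:
        if action == "unlock":
            shrinking = True
        elif action == "lock" and shrinking:
            return False              # a lock request in the shrinking phase
    return True

# Hypothetical transactions
t_ok = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
t_bad = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
```

Here t_ok has a growing phase followed by a shrinking phase, whereas t_bad locks B after unlocking A and so violates the protocol.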
acquire any more locks.
20. What is an isolation level? (R)
The isolation level that applies to a given transaction might be defined
as the degree of interference the transaction in question is prepared to tolerate
on the part of concurrent transactions.
Hold and Wait: There must exist a process that is holding at least one resource
and is waiting to acquire additional resources that are currently being held by other
processes.
Circular Wait: There must exist a set {p0, p1, ..., pn} of waiting processes
such that p0 is waiting for a resource which is held by p1, p1 is waiting for a
resource which is held by p2, ..., pn-1 is waiting for a resource which is held
by pn, and pn is waiting for a resource which is held by p0.
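The circular wait condition corresponds to a cycle in a wait-for graph, where an edge from p to q means that p is waiting for a resource held by q. A minimal sketch of cycle detection by depth-first search follows; the dictionary representation and the process names are assumptions made for illustration.

```python
def has_circular_wait(waits_for):
    """waits_for maps each process to the processes it is waiting on."""
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {}

    def visit(p):
        state[p] = ACTIVE                      # p is on the current search path
        for q in waits_for.get(p, ()):
            s = state.get(q, UNSEEN)
            if s == ACTIVE:                    # reached a process already on the path
                return True
            if s == UNSEEN and visit(q):
                return True
        state[p] = DONE
        return False

    return any(state.get(p, UNSEEN) == UNSEEN and visit(p) for p in list(waits_for))

# Hypothetical wait-for graphs
cycle = {"p0": ["p1"], "p1": ["p2"], "p2": ["p0"]}
chain = {"p0": ["p1"], "p1": ["p2"]}
```

The graph cycle contains the circular chain p0, p1, p2, p0, while chain ends at p2, which is waiting for nothing.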
24. What is a shadow copy scheme?
The shadow copy scheme is a simple but inefficient recovery scheme. It is based on
making copies of the database, called shadow copies, and assumes that only one transaction is active
at a time. The scheme also
25. What type of locking is needed for insert and delete operations? (APRIL/MAY 2017)
When you execute an INSERT, UPDATE, or DELETE statement, the database server uses
exclusive locks. An exclusive lock means that no other users can update or delete the
item until the
PART – B
1. Why is recovery needed? Discuss any two recovery techniques. (U)
(MAY 2009)
(MAY 2012)
2. With a relevant example, discuss two-phase locking. (U) (MAY 2009)
3. What is deadlock? List and discuss the four conditions for deadlock. (R)
UNIT IV
IMPLEMENTATION TECHNIQUES
RAID – File Organization – Organization of Records in Files – Data dictionary Storage –
Column Oriented Storage– Indexing and Hashing –Ordered Indices – B+ tree Index Files –
B tree Index Files – Static Hashing – Dynamic Hashing – Query Processing Overview –
Algorithms for Selection, Sorting and join operations – Query optimization using Heuristics
- Cost Estimation.
PART – A
Cache
Main memory
Flash memory
Magnetic disk
Optical disk
Magnetic tapes
9. What is NAS?(R)
Network attached storage (NAS) is an alternative to SAN. NAS is
much like SAN, except that instead of the networked storage appearing to be
a large disk, it provides a file system interface using networked file system
protocols such as NFS or CIFS.
11. Define seek time.(R)
To access data on a given sector of a disk, the arm first must move so
that it is positioned over the correct track, and then must wait for the sector to
appear under it as the disk rotates. The time for repositioning the arm is
called the seek time.
12. Define average seek time.(R)
The average seek time is the average of the seek times, measured over a
sequence of random requests.
13. Define rotational latency time.(U)
Once the head has reached the desired track, the time spent
waiting for the sector to be accessed to appear under the head is
called the rotational latency time
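The two components defined above add up to the time needed to reach a sector. A small arithmetic sketch follows, using the standard result that the average rotational latency is half of one full rotation; the disk speeds and seek time used in the example are hypothetical.

```python
def rotation_time_ms(rpm):
    """Time for one full rotation, in milliseconds."""
    return 60_000 / rpm

def average_rotational_latency_ms(rpm):
    """On average the desired sector is half a rotation away."""
    return rotation_time_ms(rpm) / 2

def average_access_time_ms(average_seek_ms, rpm):
    """Average seek time plus average rotational latency."""
    return average_seek_ms + average_rotational_latency_ms(rpm)

# Hypothetical disk: 6000 rpm with an 8 ms average seek time
example_access_ms = average_access_time_ms(8, 6000)   # 8 + 5 = 13 ms
```

A faster spindle shortens only the latency term, so seek time tends to dominate on disks with long arm movements.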
17. What are the factors to be taken into account in choosing a RAID level?(R)
The factors to be taken into account in choosing a RAID level are
46. What is an indexed nested-loop join?(U)
Indexed nested loop join can be used with existing indices, as well as with
temporary indices created for the sole purpose of evaluating the join.
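A minimal sketch of the idea, assuming relations are lists of dictionaries and the index is an in-memory dictionary from join-attribute value to matching tuples. When no existing index is passed in, a temporary one is built just for this join. All names and sample rows are hypothetical.

```python
from collections import defaultdict

def build_index(relation, attr):
    """Map each value of attr to the list of tuples that carry it."""
    index = defaultdict(list)
    for t in relation:
        index[t[attr]].append(t)
    return index

def indexed_nested_loop_join(outer, inner, attr, index=None):
    """For each outer tuple, look up matching inner tuples through the index."""
    if index is None:
        index = build_index(inner, attr)   # temporary index, used only for this join
    result = []
    for t in outer:
        for u in index.get(t[attr], []):   # index lookup replaces a scan of inner
            result.append({**t, **u})
    return result

# Hypothetical sample data
orders = [{"ono": 1, "cno": 7}, {"ono": 2, "cno": 9}, {"ono": 3, "cno": 7}]
customers = [{"cno": 7, "cname": "Asha"}, {"cno": 8, "cname": "Ravi"}]
```

Each outer tuple costs one index lookup instead of a full scan of the inner relation, which is where the method gains over a plain nested-loop join.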
51. What is a histogram?(U)
In a histogram, the values for the attribute are divided into a number of
ranges, and with each range the histogram associates the number of tuples whose
attribute value lies in that range.
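A minimal sketch of an equi-width histogram, in which all ranges have the same width. The attribute values, the bounds and the number of ranges below are made-up sample inputs, and other histogram types such as equi-depth would divide the ranges differently.

```python
def equi_width_histogram(values, low, high, buckets):
    """Count the values falling in each of `buckets` equal-width ranges over [low, high)."""
    width = (high - low) / buckets
    counts = [0] * buckets
    for v in values:
        if low <= v < high:
            # clamp guards against floating-point rounding at the upper edge
            i = min(int((v - low) // width), buckets - 1)
            counts[i] += 1
    return counts

# Hypothetical attribute values, e.g. ages of seven tuples
ages = [1, 5, 12, 18, 25, 29, 30]
age_histogram = equi_width_histogram(ages, 0, 40, 4)   # ranges 0-9, 10-19, 20-29, 30-39
```

A query optimizer can use such counts to estimate how many tuples satisfy a range condition without scanning the relation.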
UNIT IV
PART B
2. What are the steps involved in query processing? How would you estimate
the cost of the query? (U) (MAY 2007)
9. Explain how the RAID systems improve performance and reliability. (U)(DEC
2007)
10. Describe the structure of B+ tree and list the characteristics of a B+tree.(U)
13. Describe in detail about how records are represented in a file and how to
organize them in a file. (AP)(MAY 2012).
14. Explain about spatial and mobile database (U)(NOV 2014)(NOV 2016)
18. Describe the benefits and drawbacks of a source-driven architecture
for gathering of data at a data warehouse, as compared to a
destination-driven architecture. (U) (NOV 2016)
UNIT V
ADVANCED TOPICS
Distributed Databases: Architecture, Data Storage, Transaction Processing, Query
processing and optimization – NOSQL Databases: Introduction – CAP Theorem –
Document Based systems – Key value Stores – Column Based Systems – Graph
Databases. Database Security: Security issues – Access control based on privileges –
Role Based access control – SQL Injection – Statistical Database security – Flow
control – Encryption and Public Key infrastructures – Challenges
PART-A
Request(source text)
Terminal from which the operation was invoked
User who invoked the operation
Date and time of the operation
Relvar(s), tuples(s),attribute(s) affected
Before images(old values)
After images(new values)
5. What is entity integrity? (U)
Entity integrity is the rule that no component of the primary key of any base relvar is
allowed to accept nulls.
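The rule can be observed with Python's built-in sqlite3 module. The sketch assumes a made-up employee table; NOT NULL is written explicitly on the key column because SQLite, unlike most SQL systems, does not reject nulls in a non-integer primary key unless that constraint is declared.

```python
import sqlite3

def null_key_rejected():
    """Try to insert a row whose primary key is NULL and report whether it was refused."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (emp_no TEXT PRIMARY KEY NOT NULL, name TEXT)"
    )
    try:
        conn.execute("INSERT INTO employee VALUES (NULL, 'Asha')")
    except sqlite3.IntegrityError:
        return True      # the DBMS enforced entity integrity
    finally:
        conn.close()
    return False
```

The insert fails with an integrity error, so no tuple without a key value ever reaches the table.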
16. What is meant by Data warehousing? (NOV 2014)(U)
A data warehouse is a relational database that is designed for query and
analysis rather than for transaction processing. It usually contains historical data
derived from transaction data, but it can include data from other sources. It
separates analysis workload from transaction workload and enables an
organization to consolidate data from several sources.
20. Define Crawling. (NOV 2014)(R)
Web crawling is the process by which search engines comb through web
pages in order to index them properly. These "web crawlers" systematically
crawl pages and look at the keywords contained on the page, the kind of content,
and all the links on the page, and then return that information to the search engine's
server for indexing. Then they follow all the hyperlinks on the website to get to
other websites. When a search engine user enters a query, the search engine will
go to its index and return the most relevant search results based on the keywords
in the search term. Web crawling is an automated process and provides quick,
up-to-date data.
26. Define Statistical database.(R)
A statistical database is a database used for statistical analysis
purposes. It is an OLAP (online analytical processing), instead of OLTP
(online transaction processing) system. Modern decision, and classical
statistical databases are often closer to the relational model than the
multidimensional model commonly used in OLAP systems.
28. Write about the four types (Star, Snowflake, Galaxy and Fact
constellation) of data warehouse schemas. (DEC 2015) (R)
1. STAR SCHEMA:Centralized Fact table connect the one or more denormalized data
Applications:
Cross-Marketing
Basket Data Analysis
Catalog design
PART B
8. Neatly write the K-means algorithm and show the intermediate results in
clustering the below given points into two clusters using K-means algorithm.
P1:(0,0), P2:(1,10), P3:(2,20), P4:(1,15), P5:(1000,2000), P6:(1500,1500),
P7:(1000,1250). (U) (DEC 2015)
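The intermediate results asked for can be traced with a short sketch of the K-means loop. It reads the seven points as P1:(0,0), P2:(1,10), P3:(2,20), P4:(1,15), P5:(1000,2000), P6:(1500,1500) and P7:(1000,1250), and it assumes P1 and P5 as the two initial centres, a choice the question does not specify. A different initial choice can lead to different intermediate steps.

```python
def kmeans(points, centres, max_iter=100):
    """Plain K-means on 2-D points; returns final clusters, centres and per-iteration history."""
    history = []
    clusters = []
    for _ in range(max_iter):
        # assignment step: each point joins the cluster of its nearest centre
        clusters = [[] for _ in centres]
        for p in points:
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centres]
            clusters[d.index(min(d))].append(p)
        # update step: each centre moves to the mean of its cluster
        new_centres = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centres)
        ]
        history.append((clusters, new_centres))
        if new_centres == centres:   # converged: centres did not move
            break
        centres = new_centres
    return clusters, centres, history

P = [(0, 0), (1, 10), (2, 20), (1, 15), (1000, 2000), (1500, 1500), (1000, 1250)]
final_clusters, final_centres, steps = kmeans(P, [P[0], P[4]])   # assumed initial centres
```

With these initial centres the first pass already groups P1 to P4 together and P5 to P7 together, and the second pass confirms that the centres no longer move.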
11. Suppose that you have been hired as a consultant to choose a database system
for your client's application. For each of the following applications, state what
type of database system (relational, persistent programming language based
OODB, object relational; do not specify a commercial product) you would
recommend. Justify your recommendation.
12. Trace the results of using the Apriori algorithm on the grocery store example with
support threshold s = 33.34% and confidence threshold c = 60%. Show the candidate
and frequent itemsets for each database scan. Enumerate all the final frequent
itemsets. Also indicate the association rules that are generated and highlight the
strong ones, sorted by confidence.
Transaction Id Items
T1 HotDogs,Buns,Ketchup
T2 HotDogs,Buns
T3 HotDogs,Coke,Chips
T4 Chips,Coke
T5 Chips,Ketchup
T6 HotDogs,Coke,Chips
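The trace can be checked with a compact sketch of the Apriori level-wise search over the six transactions T1 to T6 listed above. The function names and data layout are illustrative; the thresholds are the ones given in the question. With six transactions, a support of 33.34% means an itemset must appear in at least three of them.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset whose support meets min_support."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while candidates:
        # one database scan per level: count each candidate
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # join step: unions of frequent itemsets that have exactly k items
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # prune step: every (k-1)-subset of a candidate must itself be frequent
        candidates = [c for c in sorted(joined, key=sorted)
                      if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

def strong_rules(frequent, min_confidence):
    """Return (lhs, rhs, confidence) for rules meeting min_confidence, best first."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                confidence = count / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((sorted(lhs), sorted(itemset - lhs), confidence))
    return sorted(rules, key=lambda rule: -rule[2])

transactions = [
    {"HotDogs", "Buns", "Ketchup"},   # T1
    {"HotDogs", "Buns"},              # T2
    {"HotDogs", "Coke", "Chips"},     # T3
    {"Chips", "Coke"},                # T4
    {"Chips", "Ketchup"},             # T5
    {"HotDogs", "Coke", "Chips"},     # T6
]
frequent_itemsets = apriori(transactions, 0.3334)
rules = strong_rules(frequent_itemsets, 0.60)
```

Under these thresholds the frequent itemsets are {HotDogs}, {Coke}, {Chips} and {Coke, Chips}, and the two rules that survive are Coke ⇒ Chips with confidence 100% and Chips ⇒ Coke with confidence 75%.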