Cs403 Short Notes

Uploaded by

Tanveer Abbas

Ch: 19
Q: Normalization is based on the concept of …….?
Functional dependency

Functional dependency:
If A and B are attributes of a relation R, then B is functionally dependent on
A if each value of A in R is associated with exactly one value of B in R.
Functional dependency notation (A -> B):
1. "A functionally determines B" or "A determines B".
2. B is functionally dependent on A.
Determinant:
The set of attributes on the left side is called the determinant (A).
Dependents:
The set of attributes on the right side is called the dependents (B).
Super key:
A set of columns that identifies an entity uniquely.
If the determinant of a functional dependency determines all attributes of that
relation, then it is definitely a super key.
Candidate key:
A minimal super key is a candidate key: a super key is a candidate key if no
proper subset of its determinant is itself a super key.
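
The definition of functional dependency above can be checked mechanically on sample data. The following is a minimal sketch, not part of the course material; the function name holds and the student rows are hypothetical.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    """
    seen = {}
    for row in rows:
        a = tuple(row[x] for x in lhs)
        b = tuple(row[x] for x in rhs)
        # Each value of A must be associated with exactly one value of B.
        if seen.setdefault(a, b) != b:
            return False
    return True


# Hypothetical sample data, not taken from the notes.
students = [
    {"stId": 1, "stName": "Ali", "prName": "MCS"},
    {"stId": 2, "stName": "Sara", "prName": "BCS"},
    {"stId": 3, "stName": "Ali", "prName": "BCS"},
]

id_determines_name = holds(students, ["stId"], ["stName"])
name_determines_program = holds(students, ["stName"], ["prName"])
print(id_determines_name, name_determines_program)
```

Here stId -> stName holds, while stName -> prName does not, because the name "Ali" is associated with two different programs.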

Inference Rules:
Called inference axioms or Armstrong axioms.
Names and rules:
1. Reflexivity: If B is a subset of A, then A -> B.
This also implies that A -> A always holds. Functional dependencies of
this type are called trivial dependencies.
2. Augmentation: If we have A -> B then AC -> BC.
3. Transitivity: If we have A -> B and B -> C then A -> C.
4. Additivity or Union: If we have A -> B and A -> C then A -> BC.
5. Projectivity or Decomposition: If we have A -> BC then A -> B and A -> C.
6. Pseudo transitivity: If we have A -> B and CB -> D then AC -> D.

Normalization
Advantages of Normalization:
1. Removes or reduces redundancy.
2. Organizes data efficiently.
Two goals of normalization:
1. Eliminate redundant data.
2. Ensure that data dependencies make sense.
There are two types of data dependency:
1. Row level.
2. Column level.
Q: How many types of normalization are there?
ANS:
There are three types of normalization:
1. 1st Normal Form
2. 2nd Normal Form
3. 3rd Normal Form

First Normal Form: (Lecture 52) (21 by gate….)


A relation is in first normal form (1NF) iff every attribute in every tuple contains:
1. An atomic value.
2. No multivalued data.
Each attribute in each row, or each cell of the table, contains only one value.
Q: In first normal form for every …….there is a ….. value?
i): Tuple and unique ii): Column and unique
A nonkey attribute does not uniquely identify an instance of an entity.

Second Normal Form: (Lecture no 53) (24)


A relation is in second normal form (2NF) if and only if:
1. It is in first normal form.
2. All the non-key attributes are fully functionally dependent on the key.
3. There is no partial dependency.
If a relation is in 1NF and the key consists of a single attribute, the relation is
automatically in 2NF.
The only time we have to be concerned about 2NF is when the key is
composite.

Ch: 20
Q: What are the anomalies in DBMS?
ANS:
An anomaly is an inconsistent, incomplete or incorrect state of the database.
• Redundancy
• Insertion anomaly
• Deletion anomaly
• Update anomaly

Third Normal Form: (Lecture no 53)


A relational table is in third normal form (3NF) if:
1. It is already in 2NF.
2. Every non-key column is non-transitively dependent upon its primary key.
In other words, all non-key attributes are functionally dependent only
upon the primary key.
Transitive Dependency: ( Lecture No 51)

Transitive dependency occurs when one non-key attribute determines another
non-key attribute.
Transitive dependencies cause insertion, deletion, and update anomalies.

Boyce - Codd Normal Form: (lecture 55)


A relation is in Boyce-Codd normal form (BCNF) if and only if every determinant is a
candidate key.
A relation R is said to be in BCNF if whenever X -> A holds in R, and A is not
in X, then X is a candidate key for R.
BCNF: Boyce-Codd Normal Form (deals with candidate keys).
4NF: deals with multivalued dependencies.
5NF: deals with possible lossless decompositions.
DKNF: Domain Key Normal Form;
reduces further chances of any possible inconsistency.
A relational table is in 3NF if and only if all non-key columns are
(a) mutually independent and
(b) fully dependent upon the primary key.
Mutual independence means that no non-key column is dependent upon any
combination of the other columns.
Ch: 21
FDs: Functional Dependencies.
2NF: concerned with functional dependency.
2NF and 3NF:
1. 2NF: no partial dependency
2. 3NF: no transitive dependency
Normalization is performed through an analysis or a synthesis process.
In all we have the following four FDs:
1) empId -> salary, empName, empMgr, empDept
2) projName, empId -> rating, hours
3) projName -> projMgr, budget, startDate
4) empDept -> empMgr
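
The four FDs above can be used to compute an attribute closure, which is one way to check whether a set of attributes is a super key. This is a minimal sketch under the assumption that the relation contains exactly the attributes named in these FDs; the function name closure and the constants FDS and ALL_ATTRS are illustrative only.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole determinant is already known, add its dependents.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The four FDs from the notes, written as (determinant, dependents) pairs.
FDS = [
    (("empId",), ("salary", "empName", "empMgr", "empDept")),
    (("projName", "empId"), ("rating", "hours")),
    (("projName",), ("projMgr", "budget", "startDate")),
    (("empDept",), ("empMgr",)),
]

# Assumption: the relation has exactly the attributes mentioned in the FDs.
ALL_ATTRS = {a for lhs, rhs in FDS for a in lhs + rhs}

key_closure = closure({"empId", "projName"}, FDS)
emp_closure = closure({"empId"}, FDS)
print(key_closure == ALL_ATTRS)  # is {empId, projName} a super key?
print(sorted(emp_closure))
```

Under that assumption {empId, projName} determines every attribute, while empId alone determines only the employee attributes. This is why FDs 1) and 3) are partial dependencies on the composite key.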
In physical database design, however, the focus shifts from storage
efficiency to efficiency in execution.

Ch: 22
For the physical database design we need to check the usage of the data in
terms of its size and frequency.
Attributes are grouped in a logical order.
Similar records are arranged together in secondary memory (hard disk).

DESIGNING FIELDS:
Fields mean columns.
Field is the smallest unit of application data.
Data types:
1. the structure defined for placing data in the attributes
2. Set of values along with the operations that can be performed.
Data Types
Data type | Description | Size
VARCHAR | Now deprecated. VARCHAR is a synonym for VARCHAR2, but this usage may change in future versions. | -
CHAR(size) | Fixed-length character data of length size bytes. This should be used for fixed-length data, such as codes A100, B102... | 1 to 32767
NUMBER(p, s) | Magnitude 1E-130 .. 10E125, maximum precision of 126 binary digits, which is roughly equivalent to 38 decimal digits. The scale s can range from -84 to 127. For floating point don't specify p, s. REAL has a maximum precision of 63 binary digits, which is roughly equivalent to 18 decimal digits. | -84 to 127
LONG | Character data of variable length (a bigger version of the VARCHAR2 data type). | 32760
DATE | Valid date range. | 4712 BC to 9999 AD
RAW(size) | Raw binary data of length size bytes. You must specify size for a RAW value. | 32767
LONG RAW | Raw binary data of variable length (not interpreted by PL/SQL). | 32760
BLOB | Binary Large Object. | 4 gigabytes

Data type | Size
VARCHAR2(size) | 1 to 32767
VARCHAR | -
CHAR(size) | 1 to 32767
NUMBER(p, s) | -84 to 127
LONG | 32760
DATE | 4712 BC to 9999 AD
RAW(size) | 32767
LONG RAW | 32760
BLOB | 4 gigabytes
BLOB: Binary Large Object

CODING AND COMPRESSION TECHNIQUES:

Coding:
Coding techniques are also useful for compressing the data values appearing in
the data: by replacing those data values with smaller-sized codes we can
further reduce the space needed by the data for storage in the database.
Compression:
With the help of compression, data values are replaced with smaller-sized codes, so we can
further reduce the space needed by the data for storage in the database.
Benefits by the use of data type:
1. Default value
2. Range Control
3. Null Value Control
4. Referential Integrity
Default value:
1. Is associated with a specific attribute.
2. Reduces the chances of inserting incorrect values.
3. Prevents the attribute value from being left empty.
Range Control:
Range control over the data can be achieved very easily by using any data type,
as the data type enforces the entry of data in the field according to the limitations of the data type.

Null Value Control:
A null value is an empty value and is distinct from zero and spaces.
Databases can implement null value control by using the different data types or their built-in
mechanisms.

Referential Integrity:
Keeps the input values for a specific attribute within specific limits in comparison to
any other attribute of the same or any other relation.
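
The four controls above (default value, range control, null value control, referential integrity) can be seen together in one table definition. The sketch below uses Python's built-in sqlite3 module rather than the DBMS used in the course; the table names, column names and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE program (prName TEXT PRIMARY KEY);
CREATE TABLE student (
    stId   INTEGER PRIMARY KEY,
    stName TEXT NOT NULL,                                   -- null value control
    cgpa   REAL DEFAULT 0.0 CHECK (cgpa BETWEEN 0.0 AND 4.0), -- default + range control
    prName TEXT REFERENCES program(prName)                  -- referential integrity
);
INSERT INTO program VALUES ('MCS');
""")


def try_insert(sql):
    """Run one INSERT and report whether the DBMS accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


ok_default = try_insert("INSERT INTO student (stId, stName, prName) VALUES (1, 'Ali', 'MCS')")
bad_range = try_insert("INSERT INTO student VALUES (2, 'Sara', 4.5, 'MCS')")
bad_null = try_insert("INSERT INTO student VALUES (3, NULL, 3.0, 'MCS')")
bad_ref = try_insert("INSERT INTO student VALUES (4, 'Omar', 3.0, 'BBA')")
default_cgpa = conn.execute("SELECT cgpa FROM student WHERE stId = 1").fetchone()[0]
print(ok_default, bad_range, bad_null, bad_ref, default_cgpa)
```

Only the first insert is accepted; it leaves cgpa out, so the default value is stored. The other three are rejected by the range check, the NOT NULL rule and the foreign key respectively.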

Ch: 23
Denormalization: move from higher to lower normal forms of database
modeling in order to speed up database access.
Denormalization process is applied for deriving a physical data model from a
logical form.
Logical DB design: accessed by same PK
Physical DB design: accessed by DBMS
Q: …… may decompose one logical relation into separate physical records,
combine some or do both?
i): Denormalization ii): Normalization
Many-to-many binary relationships are mapped to three relations.
One-to-one binary relationship:
Merge the two entity types into one.
One-to-many binary relationship: when the entity type (ET) on the one side does not participate in any other relationship,
the many-side ET is appended with the reference data rather than the FK.

Join is an expensive operation from the execution point of view.
De-normalization leads to merging different relations.
Partitioning splits the same relation into two.
De-normalization: Merging
Partitioning: Split
Q: Give three reasons of partitioning in the process of de-normalization?
ANS:
1. Reduce workload (e.g. data access, communication costs, search space)
2. Balance workload
3. Speed up the rate of useful work (e.g. frequently accessed objects in main
memory)
There are two types of partitioning:-
1. Horizontal Partitioning (Rows)
2. Vertical Partitioning (Columns)
Horizontal Partitioning
Table is split on the basis of rows, which means a larger table is split into
smaller tables.
Horizontal partitioning types:
1. Range partitioning (range is imposed)
2. Hash partitioning (particular algorithm is applied)
3. List partitioning (values are specified)
Range Partitioning:
In this type of partitioning range is imposed on any particular attribute
Partitions may become unbalanced in:
► Range partitioning (Page 189)
►Hash partitioning
► List partitioning
► Vertical partitioning
Hash Partitioning:
In this type a particular algorithm is applied, and the DBMS knows that algorithm,
so hash partitioning reduces the chances of unbalanced partitions to a large
extent.
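
The difference between range and hash partitioning can be sketched in a few lines. This is only an illustration: the keys, the range bounds and the modulo hash are assumptions, and a real DBMS uses its own hashing algorithm.

```python
def range_partition(keys, bounds):
    """Place each key in the first partition whose upper bound exceeds it."""
    parts = [[] for _ in bounds]
    for k in keys:
        for i, upper in enumerate(bounds):
            if k < upper:
                parts[i].append(k)
                break
    return parts


def hash_partition(keys, n):
    """Place each key in partition (key mod n), a stand-in for the DBMS hash."""
    parts = [[] for _ in range(n)]
    for k in keys:
        parts[k % n].append(k)
    return parts


# Hypothetical skewed data: most keys fall into the first range.
keys = list(range(1, 91)) + [150, 250]

range_sizes = [len(p) for p in range_partition(keys, [100, 200, 300])]
hash_sizes = [len(p) for p in hash_partition(keys, 3)]
print(range_sizes)  # [90, 1, 1]
print(hash_sizes)   # [31, 31, 30]
```

With the same data the range partitions are badly unbalanced, while the hash partitions are nearly equal in size.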
Q: There is no range involved in ……?
List Partitioning
Range partitioning: partitions may become unbalanced.
Hash partitioning: reduces the chances of unbalanced
partitions.
List partitioning: values are specified.
Ch: 24
Vertical partitioning is done on the basis of attributes (Columns).
General reasoning for horizontal or vertical partitioning is.
► Increasing the consistency
► Decreasing the query response time
►Smaller tables are more efficient to process as compared to the larger
tables.
► All of the above
Q: Why is the primary key repeated in vertical partitions?
ANS:
The primary key is repeated in all vertical partitions of a table so that the original
table can be reconstructed.
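
A small sketch of why the repeated primary key matters: the original rows are rebuilt by joining the vertical partitions on it. The example uses Python's sqlite3 module, and the tables student_basic and student_contact are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Two vertical partitions of one student table; stId is repeated in both.
CREATE TABLE student_basic   (stId INTEGER PRIMARY KEY, stName TEXT);
CREATE TABLE student_contact (stId INTEGER PRIMARY KEY, phone TEXT);
INSERT INTO student_basic   VALUES (1, 'Ali'), (2, 'Sara');
INSERT INTO student_contact VALUES (1, '111'), (2, '222');
""")

# Joining on the shared primary key reconstructs the original table.
original = conn.execute("""
    SELECT b.stId, b.stName, c.phone
    FROM student_basic b INNER JOIN student_contact c ON b.stId = c.stId
    ORDER BY b.stId
""").fetchall()
print(original)
```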
Replication:
copying a portion of the database from one environment to another and
keeping subsequent copies of the data in synchronization with the original
source.
Q: …… is the final form of denormalization?
ANS:
Replication
In replication entire table or part of table can be replicated.
Clustering Files:
Clustering places records from different tables in adjacent physical locations
called clusters.
In which of the following situations is clustering suitable:
• Relatively static
• Relatively dynamic
Data definition language (DDL): DDL skills are needed to translate the physical design into
actual database objects.
Logical to Physical transformation consists of the following things:
• Transforming entities into tables
• Transforming attributes into columns
• Transforming domains into data types and constraints

Ch: 25
SQL-92.
SQL: Structured Query Language
Also pronounced as “Sequel”
A de-facto standard for relational DBMS
Standard accepted by bodies like ANSI and ISO
SQL is an ANSI standard computer language for accessing and manipulating
databases.
Standard SQL commands such as: "Select", "Insert", "Update", "Delete",
"Create", and "Drop"
Rules of SQL Format:
read, write, and remove.

SQL Commands:
Three types of commands:
1. DDL: Data Definition Language (create, alter, drop and
destroy/delete)
2. DML: Data Manipulation Language (insert, retrieve, delete and
modify)
3. DCL: Data Control Language (GRANT and REVOKE)
DDL: CADDD
DML: IRDM
Following are the rules for writing the commands in SQL:
• Reserved words are written in capitals, like SELECT or INSERT.
• User-defined identifiers are written in lowercase.
• Identifiers should be valid, which means that they can start with @,
letters, or numbers.
The maximum length is 256. Reserved words should not be used
as identifiers.
[ ] means the item is optional.
{ } means the item is necessary.
| means a choice.
Data Types in SQL Server:

1. Integer

Name | Range
bigint | -2^63 (-9,223,372,036,854,775,808) to 2^63 - 1 (9,223,372,036,854,775,807)
int | -2^31 (-2,147,483,648) to 2^31 - 1 (2,147,483,647)
smallint | -2^15 (-32,768) to 2^15 - 1 (32,767)
tinyint | 0 to 255
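
The integer ranges above follow directly from the number of bits used for storage. As a quick check, here is a small sketch; the helper signed_range and the dictionary RANGES are illustrative names only.

```python
def signed_range(bits):
    """Smallest and largest value of a signed integer stored in the given bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


RANGES = {
    "bigint": signed_range(64),    # 8 bytes
    "int": signed_range(32),       # 4 bytes
    "smallint": signed_range(16),  # 2 bytes
    "tinyint": (0, 2 ** 8 - 1),    # 1 byte, unsigned
}

for name, (low, high) in RANGES.items():
    print(f"{name}: {low:,} to {high:,}")
```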

2. bit
Integer data with either a 1 or 0 value.

3. Decimal and Numeric
• Decimal
Fixed precision and scale numeric data from -10^38 + 1 through 10^38 - 1.
• Numeric
Functionally equivalent to decimal.

4. Text:
It handles textual data.
Following are the different data types:
• char: by default 30 characters, max 8000
• varchar: variable-length text, max 8000
• text: variable length, sized automatically
• nchar, nvarchar, ntext: Unicode counterparts of char, varchar and text

5. Money:
It handles monetary data.
o smallmoney: 6 digits, 4 decimal
o money: 15 digits, 4 decimal

6. Floating point
o Float
o Real

7. Date
o Smalldatetime
o datetime
Data dictionary or Data directory:
DDL statements are compiled, resulting in a set of tables stored in a special
file called a data dictionary or data directory.
Another name for the data dictionary is metadata.
Two types of DML:
Procedural: (What to do and how to do)
in which the user specifies what data is needed and how to get it.
Nonprocedural: (What to do)
in which the user only specifies what data is needed
Ch: 26

Q: What is the first data management step in any database?


i) Create Database ii) Alter Database iii): Drop Database
Syntax of CREATE DATABASE:
Syntax: CREATE DATABASE databasename;
EX: CREATE DATABASE mydb;
Q: To create a temporary table the "…." attribute must be specified?
ANS:
AS TEMP.
If no file size is given for a disk-based table, the table will be pre-allocated to
1MB.
If no filegrowth is given, the default is 50%.
There are two approaches for creating tables:
1. Through the SQL CREATE command
2. Through Enterprise Manager
The check constraint checks the values for any particular attribute.

Ch: 27
Alter Table Statement: (Columns)
Makes changes in the definition of a table already created through the Create
statement.
Q: Write the use of the alter table statement?
ANS:
It modifies the design of an existing table. It can:
1. Add a column
2. Change or modify a column
3. Drop a column (attribute)
4. Add or drop constraints, and activate or deactivate constraints
Syntax: (See lecture 56)
ALTER TABLE table
{
ADD [COLUMN] column type [(size)] [DEFAULT default] |
ALTER [COLUMN] column type [(size)] [DEFAULT default] |
ALTER [COLUMN] column SET DEFAULT default |
DROP [COLUMN] column |
RENAME [COLUMN] column TO columnNew
}
Use RENAME COLUMN to rename an existing column.
We cannot add, delete or modify more than one column at a time.
TRUNCATE, DELETE and DROP:
TRUNCATE: used to delete all the rows of a table, but the table itself still exists.
DELETE: used to delete one or many records/rows.
DROP: used to drop the complete table from the database.
Drop table: the table is deleted completely.
Truncate table: only the data inside the table is deleted.
The Data Manipulation Language (DML) components or commands are:
1. Insert: to add new rows to tables.
2. Select: to retrieve rows from tables.
3. Update: to modify the rows of tables.
DML is not used to:
a. Add new rows to tables
b. Retrieve rows from table
c. Modify the rows of tables
d. Alter a table definition
Q: SQL is a ……….language?
i) non-procedural ii) Procedural
There are two types of DML:
1. First is procedural in which: the user specifies what data is needed and
how to get it.
2. Second is nonprocedural in which: the user only specifies what data is
needed.

Insert Statement: (DML)


Q: The INSERT command in SQL is used to add …….to an existing table?
i): Row/Record/Tuple ii): Column
INSERT?
a. ONTO
b. INTO
You must follow three rules when inserting data into a table with the
INSERT...VALUES statement:
1. The values must have the same data type as the columns they are inserted into.
2. The data's size must be within the column's size.
3. The data's location in the VALUES list must correspond to the location
in the column list.
The INSERT statement has two variations:
1. INSERT...VALUES: inserts a set of values into one record.
2. INSERT...SELECT: used in combination with a SELECT statement to insert
multiple records into a table based on the contents of one or more
tables.
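
The two variations can be tried out as follows. This is a sketch using Python's sqlite3 module instead of the course DBMS; the column names crCode, crName and prName follow the notes, while the mcs_course table and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course     (crCode TEXT PRIMARY KEY, crName TEXT, prName TEXT);
CREATE TABLE mcs_course (crCode TEXT PRIMARY KEY, crName TEXT);
""")

# INSERT...VALUES: one record per statement; the values follow the column list order.
conn.execute("INSERT INTO course (crCode, crName, prName) "
             "VALUES ('CS403', 'Database Management Systems', 'MCS')")
conn.execute("INSERT INTO course (crCode, crName, prName) "
             "VALUES ('CS101', 'Introduction to Computing', 'BCS')")
conn.execute("INSERT INTO course (crCode, crName, prName) "
             "VALUES ('CS614', 'Data Warehousing', 'MCS')")

# INSERT...SELECT: multiple records copied from the result of a query.
conn.execute("INSERT INTO mcs_course (crCode, crName) "
             "SELECT crCode, crName FROM course WHERE prName = 'MCS'")

copied = conn.execute("SELECT crCode FROM mcs_course ORDER BY crCode").fetchall()
print(copied)
```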
This updated value can also be the result of an expression or calculation.

Ch: 28
The INSERT statement can:
1. Insert a single record.
2. Insert multiple records into a table.

INSERT has two formats:
INSERT INTO table-1 [(column-list)] VALUES (value-list) (single record)
and
INSERT INTO table-1 [(column-list)] (query-specification) (multiple records)
Unlisted columns are set to null.
If the optional column-list is missing, the default column list is substituted.
The VALUES clause in the INSERT statement provides a set of values to
place in the columns of a new row.
It has the following general format:
VALUES (value-1 [, value-2]...)

Select Statement: (DML)


Q: Select…..?
1. Rows
2. Columns
The basic SELECT statement has 3 clauses:
• SELECT
• FROM
• WHERE
The SELECT clause specifies the table columns that are retrieved.
The FROM clause specifies the tables accessed.
The WHERE clause specifies which table rows are used.
If the WHERE clause is omitted from SELECT, all rows are selected.
If the WHERE clause is omitted from DELETE, all rows are deleted.
The WHERE clause is optional.
The SELECT clause is mandatory.
The FROM clause follows the SELECT clause.
The WHERE clause follows the FROM clause.
The syntax for the SELECT statement is:
SELECT {*|col_name[,….n]} FROM table_name
* means all the attributes of the table are retrieved.
An alias is used to temporarily rename a table or a column.
Which of the following prevents duplicate values to be displayed as a result of
an SQL statement?
a. DISTINCT b. DELETE c. UPDATE d. ALTER
Which of the following is used to filter rows according to some condition(s)?
a. SELECT b. FROM c. WHERE d. UPDATE
The ….. keyword is used in a SELECT statement to return different values.
a. LIKE b. IN c. DISTINCT d. WHERE
Predicate:
Following the WHERE keyword is a logical expression, also known as a
predicate.
The predicate evaluates to a SQL logical value --
true, false or unknown.
Q: Write the most basic predicate?
ANS:
The most basic predicate is a comparison:
Color = 'Red'
This predicate returns:
• True -- If the color column contains the string value -- 'Red',
• False -- If the color column contains another string value (not 'Red'), or
• Unknown -- If the color column contains null.
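
The three logical values can be observed with a small experiment. This sketch uses Python's sqlite3 module; the item table and its rows are hypothetical. A row is returned only when the predicate is true, so a null color is excluded both by the comparison and by its negation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO item VALUES (?, ?)",
    [("apple", "Red"), ("leaf", "Green"), ("ghost", None)],
)


def names(where):
    """Return the item names for which the predicate evaluates to true."""
    sql = "SELECT name FROM item WHERE " + where + " ORDER BY name"
    return [row[0] for row in conn.execute(sql)]


red = names("color = 'Red'")            # predicate is true only for 'apple'
not_red = names("NOT (color = 'Red')")  # unknown stays unknown under NOT
no_color = names("color IS NULL")       # nulls must be tested with IS NULL
print(red, not_red, no_color)
```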
Ch: 29
The WHERE clause works with the following SQL statements:
select, insert, update, or delete.
The format of the WHERE clause is as under:
SELECT [ALL|DISTINCT]
{*|column_list [alias][,…..n]} FROM table_name
[WHERE <search_condition>]
WHERE is given in square brackets, which means it is optional.
Q: Display all courses of the MCS program.
SELECT crCode, crName, prName
FROM course
WHERE prName = 'MCS'
Q: List the course names offered to programs other than MCS?
SELECT crCode, crName, prName
FROM course
WHERE NOT (prName = 'MCS')

NOT operator
Inverts the predicate's value.

BETWEEN Operator
The BETWEEN condition allows you to retrieve values within a specific
range.
The syntax for the BETWEEN condition is:
SELECT columns
FROM tables
WHERE column1 BETWEEN value1 AND value2;
The BETWEEN condition can be used in any valid SQL statement:
select, insert, update, or delete.

IN operator:
The IN operator helps reduce the need to use multiple OR conditions.
It is used to check a value against a list of values.
The syntax for the IN operator is:
SELECT columns
FROM tables
WHERE column1 IN (value1, value2,.... value_n);

Like Operator
Allows you to use wildcards in the WHERE clause of an SQL statement.
Performs pattern matching.
The patterns that you can choose from are:
1. %: matches a string of any length (including zero length)
2. _: matches a single character

ORDER BY clause (Geeky 44 45)


The ORDER BY clause allows you to sort the records in your result set.
Q: The ORDER BY clause can only be used in…..?
i): SELECT statements ii): INSERT statements iii): DELETE
statements
The syntax for the ORDER BY clause is:
SELECT columns
FROM tables
WHERE predicates
ORDER BY column [ASC | DESC];
The ORDER BY clause sorts the result set based on the …..specified?
1. Columns
2. Rows
ASC indicates ascending order. (By Default)
DESC indicates descending order.
Note:
The LIKE, IN and BETWEEN operators can be used in all valid SQL statements:
Insert
Select
Update
Delete
The ORDER BY clause is only used with SELECT.
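
The operators of this chapter can be compared side by side on one small table. This is a sketch using Python's sqlite3 module; the student table and its rows are hypothetical, and the helper q is just a convenience for reading one column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stName TEXT, cgpa REAL, prName TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("Ali", 3.2, "MCS"), ("Sara", 2.4, "BCS"), ("Asad", 3.8, "MBA"), ("Omar", 1.9, "MCS")],
)


def q(sql):
    """Return the first column of every row produced by the query."""
    return [row[0] for row in conn.execute(sql)]


in_range = q("SELECT stName FROM student WHERE cgpa BETWEEN 2.0 AND 3.5 ORDER BY stName")
in_list = q("SELECT stName FROM student WHERE prName IN ('MCS', 'MBA') ORDER BY stName")
starts_a = q("SELECT stName FROM student WHERE stName LIKE 'A%' ORDER BY stName")
one_char = q("SELECT stName FROM student WHERE stName LIKE '_ara'")
by_cgpa = q("SELECT stName FROM student ORDER BY cgpa DESC")
print(in_range, in_list, starts_a, one_char, by_cgpa)
```

BETWEEN includes both end values, % matches any number of characters, _ matches exactly one, and DESC reverses the default ascending order.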

Ch: 30

Functions in SQL
A function is a one-word command that returns a single value.
The value of a function can be determined by input parameters.
There are normally two types of functions:
1. Built-in: provided by a specific tool or language.
2. User-defined: defined by the user.
Categories of functions depend on:
1. The arguments
2. The return value
Functions are categorized as under:
– Mathematical (ABS, ROUND, SIN, SQRT)
– String (LOWER, UPPER, SUBSTRING, LEN)
– Date (DATEDIFF, DATEPART, GETDATE())
– System (USER, DATALENGTH, HOST_NAME)
– Conversion (CAST, CONVERT)
Aggregate Functions:
Functions that operate on a set of rows and return a single value.
If an aggregate function is used among other expressions in the item list of a SELECT statement,
the SELECT must have a GROUP BY clause.
No GROUP BY clause is required if the aggregate function is the only value
retrieved by the SELECT statement.
Some of the aggregate functions are AVG, SUM, COUNT, MIN and MAX.

GROUP BY Clause: (Geeky 92)
Used in a SELECT statement to collect data across multiple records and
group the results by one or more columns.
HAVING Clause (Geeky 93)
The HAVING clause is used in combination with the GROUP BY clause. It
can be used in a SELECT statement to filter the records that a GROUP BY
returns.
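
A short sketch of aggregate functions with GROUP BY and HAVING, using Python's sqlite3 module. The course table, the crHours column and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (crCode TEXT, prName TEXT, crHours INTEGER)")
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [("CS403", "MCS", 3), ("CS614", "MCS", 3), ("CS101", "BCS", 4), ("MGT101", "MBA", 3)],
)

# The aggregate is the only value retrieved, so no GROUP BY is needed.
total = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]

# Aggregates next to a plain column: group by that column.
per_program = conn.execute(
    "SELECT prName, COUNT(*), SUM(crHours) FROM course "
    "GROUP BY prName ORDER BY prName"
).fetchall()

# HAVING filters the groups that GROUP BY returns.
busy = conn.execute(
    "SELECT prName, COUNT(*) FROM course "
    "GROUP BY prName HAVING COUNT(*) > 1"
).fetchall()
print(total, per_program, busy)
```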
Referential integrity constraint plays an important role in gathering data
from multiple tables.
Following are the methods of accessing data from different tables:
• Cartesian product
• Inner join
• Outer join
• Full outer join
• Semi join
• Natural join
Cartesian product:
A Cartesian join gives a Cartesian product.
A Cartesian join is when you join every row of one table to every row of
another table.

Ch: 31
Inner Join:
Only those rows from the two tables are joined that have the same value in the
common attribute.
The common attributes:
1. Need not have the same name in both tables.
2. Must have the same domain in both tables.
Q: Write the technique of inner join?
ANS:
1. SELECT * FROM course INNER JOIN program
ON course.prName = program.prName
2. SELECT * FROM course c INNER JOIN program p
ON c.prName = p.prName
3. SELECT * FROM course, program
WHERE course.prName = program.prName
Suppose we have two tables T1 and T2. Tuples of T1 that do not match some
row in T2 will not appear in ….?
a. Outer join
b. Inner join
c. Both a and b
d. None of the above
Outer Join:
Join operations that rely on null values are called outer joins.
COURSE rows without a matching PROGRAM row appear exactly once in
the result.
Several variants of the outer join:
1. Left outer join
2. Right outer join
3. Full outer join
In a right outer join: COURSE rows without a matching PROGRAM row
appear in the result, but not vice versa.
In a left outer join: PROGRAM rows without a matching COURSE row
appear in the result, but not vice versa.
In a full outer join: both COURSE and PROGRAM rows without a match
appear in the result.
(Of course, rows with a match always appear in the result, for all these
variants, just like the usual joins or inner joins).
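
The difference between an inner join and a left outer join can be seen on two small tables. This sketch uses Python's sqlite3 module with hypothetical rows, and PROGRAM is placed on the left of the join, which is an assumption about the table order intended above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE program (prName TEXT PRIMARY KEY);
CREATE TABLE course  (crCode TEXT PRIMARY KEY, prName TEXT);
INSERT INTO program VALUES ('MCS'), ('BCS'), ('MBA');
INSERT INTO course  VALUES ('CS403', 'MCS'), ('CS101', 'BCS');
""")

# Inner join: only rows with the same value in the common attribute.
inner = conn.execute("""
    SELECT p.prName, c.crCode
    FROM program p INNER JOIN course c ON c.prName = p.prName
    ORDER BY p.prName
""").fetchall()

# Left outer join: PROGRAM rows without a matching COURSE row also appear,
# with null in the course columns.
left = conn.execute("""
    SELECT p.prName, c.crCode
    FROM program p LEFT OUTER JOIN course c ON c.prName = p.prName
    ORDER BY p.prName
""").fetchall()
print(inner)
print(left)
```

The MBA program has no course, so it is missing from the inner join and appears once, padded with a null, in the left outer join.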
Semi Join:
Another form of join that involves two operations.
First inner join is performed on the participating tables and then resulting
table is projected on the attributes of one table.
Following are the methods of accessing data from different tables:
• Cartesian product
• Inner join (relies on common attributes)
• Outer join (relies on NULL values)
• Full outer join (rows without a match appear in the result)
• Semi join (two operations: inner join + projection)
• Natural join
Sub query (nested query):
A nested query is a query that has another query embedded within it; the embedded query is
called a subquery.
A subquery typically appears within the WHERE clause of a query.
A subquery sometimes appears in the FROM clause or the HAVING clause.
If the subquery returns a single value, then we can use operators like =, <, > etc.
If the subquery returns multiple values, then we can use operators like IN,
ANY, ALL etc.
The IN operator allows us to test whether a value is in a given set of elements.
The subquery can be nested to any level.
The queries are evaluated in the reverse order:
1. the innermost is evaluated first,
2. then the outer one,
3. finally the outermost.
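
The two kinds of subquery can be sketched as follows, again with Python's sqlite3 module. The tables, the prLevel column and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE program (prName TEXT PRIMARY KEY, prLevel TEXT);
CREATE TABLE student (stName TEXT, cgpa REAL, prName TEXT);
INSERT INTO program VALUES ('MCS', 'master'), ('MBA', 'master'), ('BCS', 'bachelor');
INSERT INTO student VALUES ('Ali', 3.2, 'MCS'), ('Sara', 2.4, 'BCS'),
                           ('Asad', 3.8, 'MBA'), ('Omar', 1.9, 'MCS');
""")

# Single-value subquery: the inner query returns one number, so > can be used.
above_avg = [r[0] for r in conn.execute("""
    SELECT stName FROM student
    WHERE cgpa > (SELECT AVG(cgpa) FROM student)
    ORDER BY stName
""")]

# Multi-value subquery: the inner query returns a set of values, so IN is used.
masters = [r[0] for r in conn.execute("""
    SELECT stName FROM student
    WHERE prName IN (SELECT prName FROM program WHERE prLevel = 'master')
    ORDER BY stName
""")]
print(above_avg, masters)
```

In both statements the inner query is evaluated first and its result is then used by the outer WHERE clause.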

DCL: Data Control Language


1. GRANT
2. REVOKE
ACCESS CONTROL
SQL-92 supports access control through the GRANT and REVOKE
commands.
The GRANT command gives users privileges to base tables and views.
The syntax of this command is as follows:
GRANT privileges ON object TO users [ WITH GRANT OPTION ]
The syntax of the REVOKE command is as follows:
REVOKE [GRANT OPTION FOR] privileges ON object FROM users
{RESTRICT | CASCADE}

Ch: 32
Programs written to perform different requirements created by the users /
organization are called …
a. System program
b. Chip Program
c. Application Program
d. Low level program
Which of the following are the general activities, which are performed during
the development of application programs?
► Data input programs ► Processing ► Editing ► Display ► All of given
(Page 238)
Which of the following is true about application programs?
► develop before the database design
► Tools selection is made after the development database
► meant to perform different operations by the user (Page 238)
► must to design before the designing and developing database

User interface:
Types of user interface (two):
• Text based (UI)
• Graphical User Interface (GUI), most commonly called a form
Text Based User Interface (UI):
In a text-based user interface, certain keyboard numbers are assigned to operations.
For example:
Adding a Record --------- 1
Deleting a Record --------- 2
Enrollment --------- 3
Result Calculation --------- 4
Exit --------- 5

An effective user interface minimizes the……. users require to learn and
implement the system.
a. Space b. Time c. Complexity d. Efficiency
Forms:
Nowadays used extensively in application programs.
Q: Write the types of forms?
ANS:
Following are the different types of forms:
1. Browser based (web based)
2. Non-browser/simple
Browser Based:
These are web-based forms. They are developed in HTML, a scripting language or
Front Page.
Non-Browser/Simple:
Developed in Visual Basic, Developer, MS Access.
User Friendly Interface:
Q: Write the types of user?
ANS:
• Beginners (What and How) (task-oriented)
• Intermediate (What) (use the index as their primary access mechanism)
• Experts (Both) (keyboard-oriented)
Type of users | Type of forms
Beginners (What and How) (task-oriented) | Browser based (web based)
Intermediate (What) (use the index as their primary access mechanism) | Non-browser/simple: Visual Basic, Developer, MS Access

"It should be user friendly and the user must not search for required buttons or
text boxes": this statement is about
a. Application program
b. User friendly interface
c. Language
d. System program
Which of the following is a feature of a good interface:
► User friendly ► Consistency ► Process based
Windows Controls: (take input and display output)
like buttons, checkboxes etc.
Numbers, Dates and Text:
Normally text boxes are used for the display of dates.

Ch: 33
Following things must be ensured for input forms:
• Forms should be user friendly
• Data integrity must be ensured,
• Checks can be applied within the tables definition
How is a database created in MS Access?
a. First we will run MS Access and select New option from the file. Next it
will ask the name of database.
b. After running MS Access, create database option is selected directly
c. Go to the tool option of MS Access and select create Database option with
suitable name
d. Click on the file menu, select database option and then click it
The ……. should be user friendly. Data integrity must be ensured and checks can be
applied within the tables.
a. Input forms
b. Computer language
c. Application programming
d. Database
Forms must be designed and arranged in a systematic manner.
Which of the following is not true about input forms?
► Provide an easy, effective, efficient way to enter data into a table
► Especially useful when the person entering the data is not familiar with the
inner workings
► Provide different controls to add data into the tables
► One input form can populate one table at a time (Page 246)

Ch: 34
Classification of Physical Storage Media:
• Media are classified according to three characteristics:
  • Speed of access
  • Cost per unit of data
  • Reliability
• We can also differentiate storage as either:
  • Volatile storage (contents are lost when power is off)
  • Non-volatile storage (contents are not lost when power is off)
• Typical media available are:
  • Cache
  • Main memory
  • Flash memory
  • Magnetic disk
  • Optical storage (CD or DVD)
  • Tape storage

Cache
Cache: pronounced "cash".
Two types of caching are commonly used in personal computers:
1. Memory caching (RAM)
2. Disk caching
A memory cache, sometimes called a cache store or ......?
a. ROM
b. DROM
c. RAM
d. Hard disk
The most recently accessed data from the disk (as well as adjacent
sectors) is stored in a …….?
a. Memory buffer
b. EPROM
c. ROM
d. Hard disk
Cache is a portion of memory made of high-speed ……. instead of the slower and cheaper ……. used for
main memory.
a. RAM,ROM
b. ROM,RAM
c. PROM,EPROM
d. Static RAM(SRAM),DRAM
SRAM: high-speed static RAM (SRAM)
DRAM: slower and cheaper DRAM
The Intel 80486 microprocessor contains an 8K memory cache.
The Pentium contains a 16K memory cache.
Level 1 (L1) caches:
Such internal caches are often called Level 1 (L1) caches.
Level 2 (L2) cache:
External cache.
L1 and L2 caches sit between the…. and the…..?
CPU, DRAM
Q: Accessing RAM is ………. than accessing a hard disk?
ANS:
Thousands of times faster
Cache hit:
When data is found in the cache.
Main memory: known as RAM, standing for Random Access Memory.
Main memory: constructed from integrated circuits (ICs).
The access time to read or write any particular byte is independent of
where in the memory that byte is, and currently is approximately 50
nanoseconds (a nanosecond is a thousand millionth of a second).
Main memory is expensive compared to external memory.
The CPU transfers data to and from main memory in groups of two, four or
eight bytes.
Flash memory is a form of EEPROM.
1. Non-volatile
2. Stores information on a silicon chip
FAMOS: Floating-Gate Avalanche-Injection Metal Oxide Semiconductor.
Magnetic disk is round plate on which data can be encoded.
Two basic types of disks:
1. Magnetic disks
2. Optical disks
Types of Magnetic disks:
1. Floppy Disk:
2. Hard Disk

Floppy Disk: A typical 5¼-inch floppy disk can hold 360K or 1.2MB
(megabytes).
3½-inch floppies normally store 720K, 1.2MB or 1.44MB of data.
Hard disks can store anywhere from 20MB to more than 10GB.
Q: Which disk speed is faster?
i): Hard disk ii): Floppy disk
Hard disks are also from 10 to 100 times faster than floppy disks.
Optical disks come in three basic forms:
1. CD-ROM:
2. WORM
3. Erasable Optical (EO)
CD-ROM: Most optical disks are read-only.
WORM: Stands for write-once, read-many.
Erasable optical (EO): EO disks can be read from, written to, and erased just
like magnetic disks.
Laser records data by burning microscopic holes in the surface of the disk
with a
► Optical disk (Page 257)
► Floppy disk
The machine that spins a disk is called a disk drive.
Each disk drive is one or more heads (often called read/write heads) that
actually read and write data.
Disks: Non Volatile
RAID: Redundant Array of Independent (or Inexpensive) Disks.
Striping:
Fundamental to RAID is "striping", a method of concatenating multiple
drives into one logical storage unit.
A stripe can be as small as one sector (512 bytes) or as large as several
megabytes.
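As a rough illustration of striping, the sketch below maps a logical block number to a drive and an offset on that drive. It assumes a simple round-robin layout with one block per stripe unit; the function name `locate_block` and the three-drive array are hypothetical.

```python
def locate_block(logical_block: int, num_drives: int) -> tuple:
    """Map a logical block number to (drive index, block offset on that drive)."""
    drive = logical_block % num_drives      # consecutive blocks rotate across drives
    offset = logical_block // num_drives    # position of the block on its drive
    return drive, offset


# Six consecutive logical blocks spread over a hypothetical 3-drive array.
layout = [locate_block(b, 3) for b in range(6)]
print(layout)
```

Consecutive blocks land on different drives, which is why a large sequential transfer can keep all drives busy at once.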
RAID levels:
RAID 0: not redundant (data loss if a drive fails); referred to as striping.
RAID 1: redundant (data is not lost if a drive fails); referred to as
mirroring; only two drives are required.
RAID 2: uses Hamming error correction codes; SCSI drives support built-in
error detection.
RAID 3: stripes data at a byte level across several drives; byte-level
striping requires hardware support for efficient use; suited to single-user
environments.
RAID 4: stripes data at a block level across several drives.
RAID 5: the best choice in multi-user environments; at least three, and more
typically five, drives are required for RAID-5 arrays.
Access methods:
The techniques used to find, retrieve and store records are called access
methods.
Sequence field:
Records are arranged on storage devices in some sequence based on the value
of some field, called sequence field.
Sequence field is often the key field that identifies the record.

Ch: 35
File Organizations:
In this scheme, all the records have the:
1. the same size
2. the same field format
3. the fields having fixed size
Different records have different keys.
Log files are structured as: a pile.
Log file is also called…..?
b. Transaction file c. Temporary file d. Hash file
Files store data and programs.
The method of access which uses key transformation is known as
A. Direct
B. Hash
C. Random
D. Sequential
File protection:
When multiple users have access to files, it may be desirable to control by
whom and in what ways files may be accessed. This control is known as file
protection.
Types of Direct Access File Organization: (Two)
1. Indexed Sequential
2. Direct File Organization
Indexed Sequential:
The simplest indexing structure is the single-level one.
The index is a file whose records are key-pointer pairs, where the pointer
is the position in the data file of the record with the given key.
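A minimal sketch of a single-level index, assuming the data file is a Python list whose positions stand in for disk addresses. The names `data_file`, `index` and `lookup` and the sample records are made up for illustration.

```python
import bisect

# Data file: records in arrival order; the list position stands in for a disk address.
data_file = [("K30", "Saad"), ("K10", "Ali"), ("K20", "Sara")]

# Index file: (key, pointer) pairs kept sorted by key.
index = sorted((key, pos) for pos, (key, _) in enumerate(data_file))
index_keys = [key for key, _ in index]


def lookup(key):
    """First access: binary search in the index. Second access: fetch from the data file."""
    i = bisect.bisect_left(index_keys, key)
    if i < len(index_keys) and index_keys[i] == key:
        _, pointer = index[i]
        return data_file[pointer]
    return None


print(lookup("K10"))
```

The two steps inside `lookup` correspond to the double file access (index + data).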

Double file access: (index + data)


1. The decrease in access time with respect to a sequential file is
significant.
An index is a sequential file.
When multiple indexes are used the concept of sequentiality of the records
within the file is useless.
Types of indexes:
Two types of indexes are usually found in applications:
1. The exhaustive type: an entry for each record in the main file, in the
order given by the indexed key.
2. The partial type: an entry for all those records that contain the chosen
key field (for variable records only).
Defining Keys:
An indexed sequential file must have at least one key.
The first (primary) key is always numbered 0.
Indexed sequential file can have up to 255 keys
For file-processing efficiency it is recommended that you define no more than
7 or 8 keys.
Which of the following are features of indexed sequential files?
► Records are stored in sequence and index is maintained.
► Dense and nondense types of indexes are maintained.
► Track overflow and file overflow areas are ensured.
► Cylinder index increases the efficiency
► All

Ch: 36
….provides rapid, non-sequential , direct access to records/Rows?
a. Hashing
b. Collisions handling
c. Non hashing
d. Sequential
Q: Hashing ….?
i): Direct access ii): sequential access
Hashing:
A key record field is used to calculate the record address by subjecting it to
some calculation; a process called hashing.
A hash function h is a function from the set of all search key values K to the
set of all bucket addresses B.
A good hash function gives an average-case lookup time that is a small
constant, O(1).
The worst hash function maps all keys to the same address/bucket.
The best hash function maps all keys to distinct addresses.
Following are the major characteristics of hash function:
• No indexes to search or maintain
• Very fast direct access
• Inefficient sequential access
• Use when direct access is needed, but sequential access is not.
Hashing Algorithms:
Two types
1. Prime division/remainder
2. Folding method
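The two algorithms can be sketched as below. The table size 97 (a prime), the three-digit parts used for folding, and the function names are all assumptions chosen for illustration; keys are assumed to be non-negative integers.

```python
def division_hash(key: int, table_size: int = 97) -> int:
    """Prime division/remainder: bucket address is the key modulo a prime table size."""
    return key % table_size


def folding_hash(key: int, table_size: int = 97, part_len: int = 3) -> int:
    """Folding: split the key's digits into parts, add the parts, then take the remainder."""
    digits = str(key)
    parts = [int(digits[i:i + part_len]) for i in range(0, len(digits), part_len)]
    return sum(parts) % table_size


print(division_hash(12345))      # 12345 mod 97
print(folding_hash(123456789))   # (123 + 456 + 789) mod 97
```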
Perfect hashing function:
The direct address approach requires that the function, h(k), is a one-to-one
mapping from each k to integers in (1,m). Such a function is known as a
perfect hashing function
A perfect hash function maps each key to a distinct location, giving a search time of
► O(1) Page no 266
► O(n)
► O(n+1)
► O(n-1)
Handling the Collisions:
Various techniques are used to manage this problem:
• Chaining,
• Overflow areas,
• Re-hashing,
• Using neighboring slots (linear probing),
• Quadratic probing,
• Random probing, ...
Which of the following is a disadvantage/trade-off of the chaining technique
to handle the collisions?
► Unlimited Number of elements
► Fast re-hashing
► Overhead of multiple linked lists (Page 269)
If there is a further collision, we re-hash until an empty "slot" in the table is
found.
Advantage of Re-Hashing technique to handle the collisions?
► Collisions don’t use primary table space
► Unlimited number of elements
► Fast access through use of main table space (Page269 )
One of the simplest re-hashing functions is +1 (or -1).
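Linear probing: re-hashes with a fixed step of +1 (or -1), i.e. it looks in
the neighbouring slot. Quadratic probing instead implements a quadratic
re-hash function.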
Overflow area:
Another scheme will divide the pre-allocated table into two sections:
1. The primary area to which keys are mapped
2. An area for collisions, normally termed the overflow area.
Method for handling collisions:
• Open addressing (array of (key, value) pairs.)
• Chaining
The hash table is an array of (key, value) pairs.
Q: Write the different ways of choosing an alternative location?
ANS:
1. Linear probing
2. Double hashing
3. Rehashing
Linear Probing:
The simplest probing method is called linear probing.
In linear probing, the probe sequence is simply:
the sequence of consecutive locations, beginning with the hash value of the
key.
If the end of the table is reached: the probe sequence wraps around and
continues at location 0.
Only if the table is completely full will the search fail.
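The probe sequence described above can be sketched as a small open-addressing table. This is a simplified illustration: the class name `LinearProbingTable` and the table size 7 are assumptions, small integer keys are used so that `hash(key)` is the key itself, and deletion is not handled.

```python
class LinearProbingTable:
    def __init__(self, size: int = 7):
        self.size = size
        self.slots = [None] * size   # each slot holds a (key, value) pair or None

    def _probe(self, key):
        """Yield consecutive slots starting at the hash value, wrapping to slot 0."""
        start = hash(key) % self.size
        for step in range(self.size):
            yield (start + step) % self.size

    def put(self, key, value):
        """Store the pair in the first free (or matching) slot and return its position."""
        for i in self._probe(key):
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return i
        raise OverflowError("table is completely full")

    def get(self, key):
        for i in self._probe(key):
            if self.slots[i] is None:
                return None          # an empty slot ends the probe sequence
            if self.slots[i][0] == key:
                return self.slots[i][1]
        return None                  # every slot was probed without a match


table = LinearProbingTable(7)
positions = [table.put(5, "a"), table.put(12, "b"), table.put(19, "c")]
print(positions)
```

Keys 5, 12 and 19 all hash to slot 5 in a table of size 7, so the second goes to slot 6 and the third wraps around to slot 0.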
Summary Hash Table Organization:
Chaining:
Advantages: unlimited number of elements; unlimited number of collisions.
Disadvantages: overhead of multiple linked lists.
Re-hashing:
Advantages: fast re-hashing; fast access through use of main table space.
Disadvantages: maximum number of elements must be known.
Overflow area:
Advantages: fast access; collisions don't use primary table space.
Disadvantages: two parameters which govern performance need to be estimated.

Ch: 37
An index on a table is a disk-based data structure (stored as a file) that
speeds up selections on the search key fields for the index.
An index contains a collection of data entries.
There can be multiple (different) indexes per file.
An index in a book is a list of words with the page numbers that contain each
word.
An index in a database is a list of values in a table with the storage locations of
rows in the table that contain each value.
Indexes can be created on either:
1. A single column
2. A combination of columns in a table
Inverted files or inversions, linked lists, B+ trees: these are the three
implementation approaches of
a. Sequential Access
b. Hashing
c. Non sequential
d. Indexes
Following are the three-implementation approaches of Indexes:
• Inverted files or inversions
• Linked lists
• B+ Trees
Can be defined even when there is no data in the table, existing values
are checked on execution of this command, supports selections of the form
field <operator> constant: these are the major properties of
a. Sequential approach
b. Direct approach
c. view
d. indexes
Index Classification:
• Clustered vs. Un-clustered Indexes
• Single Key vs. Composite Indexes
• Tree-based, inverted files, pointers
Primary Index:
• If the index is created on the basis of the primary key of the table, then
it is known as primary indexing.
• Primary keys are unique to each record and have a 1:1 relation with the
records.
• As primary keys are stored in sorted order, the performance of the
searching operation is quite efficient.
• Two types of primary index:
1. Dense index
2. Sparse index

Secondary Indexes:
Users often need to access data on the basis of non-key or non-unique
attribute.
Secondary Index is a two column file, which stores the address of every tuple
of the table.
Inverted files or inversions:
More than one inversion can be created, even on all attributes.
Inverted file contains the values of attribute in sorted order and the relative
address of the record.
Each node stores key values and pointers; nodes are connected through links.
B+ Trees:
• Number of pointers in a node is one more than the number of keys
• Number of pointers is called the order of the tree
• Distance of all leaf nodes from the root is the same
Index can also be created on composite attributes.
Properties of Indexes:
• Indexes can be defined even when there is no data in the table
• Existing values are checked on execution of this command
• Support selections of the form: field <operator> constant
• Support equality selections: either “tree” or “hash” indexes help here.
Indexes support range selections (“hash” indexes don’t work for these):
1. <, >
2. <=
3. >=
4. BETWEEN
To select the range while creating Indexes, the operator that is not used is ____.
► Equals to (==)
► Less than (<)
► BETWEEN
Indexes are logically and physically independent of the data in the associated
table.
Indexes, being independent structures, require storage space.

Ch: 38
Fast random access: an index structure may be used.
Primary index (or clustering index): the index whose search key specifies
the sequential order of the records in the file.
Secondary index (or non-clustering index): an index whose search key
specifies an order different from the sequential order of the file.

How many clustered index(es) can each database table have?
►2
►3
►5
►1
Clustered:
• If order of data records is the same as order of index data entries,
then called clustered index.
A file can be clustered on at most …….search key.
►2
►3
►5
►1
Q: A clustered index determines the …….. of data in a table?
Storage order
A ____ index determines the storage order of data in a table
a. Primary
b. Clustered
c. Dense
d. Secondary
The index can comprise multiple columns (a composite index).
A telephone directory is organized by last name and first name.
Note: PRIMARY KEY constraints create clustered indexes automatically if no clustered index already
exists on the table and a nonclustered index is not specified when you create the PRIMARY KEY
constraint.
Non-clustered Indexes:
Nonclustered indexes have the same B-tree structure as clustered indexes, with two significant
differences:
1. The data rows are not sorted and stored in order based on their nonclustered keys.
2. The leaf layer of a nonclustered index does not consist of the data pages.
Nonclustered indexes can be defined on either a table with a clustered index or a heap.
NOTE: See cluster and nocluster again
Because nonclustered indexes store clustered index keys as their row locators, it is important to
keep clustered index keys as small as possible.
Do not choose large columns as the keys to clustered indexes if a table also has nonclustered
indexes.

There are two types of ordered indices:
1. Dense
2. Sparse
An index which has entries for only some of the key values is classified as
a. Linear index
b. Dense index
c. Non dense index
d. Cluster index
To locate a record, we find the index record with the largest key value ≤
the search key value.
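A small sketch of a lookup through a sparse index, assuming the data file is a list of sorted blocks and the index holds only the first key of each block. The names `blocks`, `sparse_index` and `find` are hypothetical.

```python
import bisect

# Data file as sorted blocks; the sparse index keeps only the first key of each block.
blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
sparse_index = [block[0] for block in blocks]   # [10, 40, 70]


def find(search_key):
    """Pick the index entry with the largest key <= search_key, then scan that block."""
    i = bisect.bisect_right(sparse_index, search_key) - 1
    if i < 0:
        return None                      # smaller than every key in the file
    return (i, search_key) if search_key in blocks[i] else None


print(find(50))
```

A dense index would hold all nine keys and skip the scan inside the block, which is the speed-versus-space trade-off between the two kinds.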
Dense indices are faster in general.
Sparse indices require less space.
No overflow blocks: we can use binary search. This reads as many as
1 + log2(b) blocks (as many as 7 for our 100 blocks).
Overflow blocks: then sequential search is typically used, reading all b
index blocks.
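The block counts can be checked with a short calculation, taking the binary search bound as 1 + floor(log2(b)), an assumption that reproduces the figure of 7 for 100 blocks. The function names are made up for this sketch.

```python
def binary_search_block_reads(b: int) -> int:
    """Worst-case blocks read by binary search over b index blocks.

    For b >= 1, b.bit_length() equals 1 + floor(log2(b)) exactly.
    """
    return b.bit_length()


def sequential_search_block_reads(b: int) -> int:
    """With overflow blocks, a sequential search may read all b index blocks."""
    return b


print(binary_search_block_reads(100), sequential_search_block_reads(100))
```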
Indices must be updated at all levels when insertions or deletions require it.
Indexes Using Composite Search Keys:
The search key for an index can contain several fields; such keys are called
composite search keys or concatenated keys.
Two-level sparse index:
Deletion: Find (look up) the record
Insertion: Find place to insert.

Ch: 39
Views are generally used to focus, simplify and customize the perceptions each
user has of the
a. Database
b. Program
c. Operating system
d. View
• A view is defined to combine certain data from one or more tables for
different reasons
• Views have other options such as totals and subtotals.
• A "view" is essentially a dynamically generated result table.
Q: There are two ways to create a new view in your database:
1. Create a new view from scratch.
2. Or, make a copy of an existing view and then modify it.
There are ____ ways to create a new view in your database
a. Three
b. Two
c. Five
Q: Types of views?
ANS:
1. Materialized View
2. Simple Views
3. Complex View
4. Dynamic Views.
Materialized View:
A materialized view is a replica of a target master from a single point in time.
Materialized views are updated from one or more masters through individual
batch updates, known as refreshes, from a single master site or master
materialized view site.

Simple view:
As defined earlier simple views are created from tables and are used for
creating secure manipulation over the tables or structures of the database.
Views make the manipulations easier to perform on the database.
Dynamic Views:
In which data is not stored and the expressions used to build the view are used
to collect the data dynamically.
Dynamic views generally are:
1. complex views
2. views of views
3. views of multiple tables
Views can be referred in SQL statements like tables.
Which of the following is INCORRECT about VIEWS?
► It is not possible to leave out the data which is not required for a specific
view. (Page 280)
► A database view displays one or more database records on the same page.
► Views can be used as security mechanisms
► Views are generally used to focus the perception each user has of the
database.
Deleting Views:
A view can be dropped using the DROP VIEW command.
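The following sketch uses Python's built-in sqlite3 module to show a view being created, queried like a table, and dropped. The `student` table, its columns and the view name `good_students` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, cgpa REAL, phone TEXT);
    INSERT INTO student VALUES (1, 'Ali', 3.4, '111'), (2, 'Sara', 2.1, '222');
    -- The view leaves out the phone column and the low-CGPA rows.
    CREATE VIEW good_students AS
        SELECT id, name FROM student WHERE cgpa >= 3.0;
""")

# A view is referred to like a table; its rows are produced when the query runs.
rows = conn.execute("SELECT * FROM good_students").fetchall()

conn.execute("DROP VIEW good_students")   # removes the view, not the base table
remaining = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rows, remaining)
```

Hiding the phone column from users of the view is an example of a view acting as a security mechanism.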

Ch: 41
Views are virtual tables.
Other name of materialized views:
also called indexed views created through clustered index.
Indexes: cannot be accessed directly using a SELECT statement.
Materialized views: can be accessed directly using a SELECT statement.
Materialized views and Indexes are NOT similar in the following way(s).
► They must be refreshed when the data in their master tables changes.
► They can be accessed directly using a SELECT statement (Page 291)
► All of given
Materialized views in data warehouses are typically referred to as summaries.
A materialized view can be partitioned.
You can define a materialized view on a partitioned table and one or more
indexes on the materialized view.
Simple view:
View on single table is simple.
Complex view:
Involving multiple tables is called complex view.
Transaction Management:
A transaction is an indivisible unit of work comprising several operations.
ACID: Atomicity, Consistency, Isolation, and Durability.
In which of the following, Materialized Views are suitable
► Data warehousing
► Decision support
► Mobile computing
► All of the Given (Page 290)
Which of the following should be a property of a database transaction?
► Atomicity
► Consistency
► Isolation
► Durability
► All
A transaction can thus end (terminate) in two ways:
1. Commit (successful execution)
2. Rollback or Abort (Error)
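The two endings can be sketched with Python's built-in sqlite3 module. The `account` table, the CHECK constraint and the `transfer` function are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(amount):
    """Move money from account 1 to account 2 as one indivisible unit of work."""
    try:
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = 2", (amount,))
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = 1", (amount,))
        conn.commit()          # successful execution: both updates become permanent
        return True
    except sqlite3.IntegrityError:
        conn.rollback()        # error: every operation of the transaction is undone
        return False


def balances():
    return conn.execute("SELECT balance FROM account ORDER BY id").fetchall()


ok = transfer(30)
failed = transfer(500)   # would make account 1 negative, so it is rolled back
print(ok, failed, balances())
```

In the failed transfer the credit to account 2 has already been executed when the debit fails, and the rollback removes it again, which is the atomicity property.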

Ch: 42
• To read a database object (disk to main memory): it is first brought into main memory from disk,
and then its value is copied into a program variable.
• To write a database object (main memory to disk): the in-memory copy of the object is first
modified and then written to disk.

Database objects: are the units in which programs read or write information.
Transaction:
a set of actions that are partially ordered.
Q: A transaction is seen by the DBMS as a……, or list, of actions?
► Series
► Parallel
Notation for actions:
A transaction T reading an object O is denoted RT(O), and writing is denoted
WT(O), where T is written as a subscript.
When the transaction T is clear from the context, we will omit the subscript.
A schedule is a list of actions (reading, writing, aborting, or committing) from
a set of transactions.
The order in which two actions of a transaction T appear in a schedule must
be the same as the order in which they appear in T.
I/O activity can be done in parallel with CPU activity in a computer.
Strict Two-Phase Locking (Strict 2PL): Has two rules.
Deadlock:
A cycle of transactions waiting for locks to be released is called a
deadlock.
The second rule in Strict 2PL is:
All locks held by a transaction are released when the transaction is completed.
We can prevent deadlocks: by giving each transaction a priority.
The lower the timestamp, the higher the transaction's priority

The oldest transaction has the highest priority.


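Deadlock Prevention (Ti requests a lock held by Tj):
Wait-die: if Ti has higher priority, Ti waits; otherwise Ti is aborted.
Wound-wait: if Ti has higher priority, Tj is aborted; otherwise Ti waits.
In the wait-die scheme: lower-priority transactions can never wait for
higher-priority transactions.
In the wound-wait scheme: higher-priority transactions can never wait for
lower-priority transactions.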
Wait-die scheme is non-preemptive.
Wound-wait scheme is preemptive.
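The two policies can be written as small decision functions, assuming each transaction carries a timestamp and a lower timestamp means higher priority. The function names and the returned strings are hypothetical.

```python
def wait_die(ts_requester: int, ts_holder: int) -> str:
    """Non-preemptive: an older requester waits, a younger requester dies (aborts)."""
    return "requester waits" if ts_requester < ts_holder else "requester aborts"


def wound_wait(ts_requester: int, ts_holder: int) -> str:
    """Preemptive: an older requester wounds (aborts) the holder, a younger one waits."""
    return "holder aborts" if ts_requester < ts_holder else "requester waits"


# T1 (timestamp 1) is older than T2 (timestamp 2).
print(wait_die(1, 2), "|", wait_die(2, 1))
print(wound_wait(1, 2), "|", wound_wait(2, 1))
```

Only wound-wait ever aborts the transaction that already holds the lock, which is why it is the preemptive scheme.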
Wait for graph:
The lock manager maintains a structure called a waits-for graph.
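Deadlock detection on a waits-for graph amounts to looking for a cycle. The sketch below assumes the graph is a dictionary mapping each transaction to the transactions it is waiting for; the function name `has_deadlock` is made up, and a real lock manager would keep this structure up to date as locks are requested and released.

```python
def has_deadlock(waits_for: dict) -> bool:
    """Return True if the waits-for graph contains a cycle (depth-first search)."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    nodes = set(waits_for) | {t for targets in waits_for.values() for t in targets}
    colour = {n: WHITE for n in nodes}

    def visit(node):
        colour[node] = GREY
        for nxt in waits_for.get(node, ()):
            if colour[nxt] == GREY:
                return True        # reached a node already on the path: a cycle
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in nodes)


print(has_deadlock({"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}))
```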
…… is used to detect the deadlock?
► Wait-for graph Page no 320
► Cross Reference Matrix
► Inner Join
► Clustered Index
Wait-for graph is maintained by……..?
► Lock manager Page no 320
► Index Manager
► View Manager
► Constraint Manager
In prevention-based schemes, the abort mechanism is used preemptively in
order to avoid deadlocks.
In detection-based schemes, the transactions in a deadlock cycle hold locks
that prevent other transactions from making progress.
Conservative 2PL:
A variant of 2PL called Conservative 2PL can also prevent deadlocks.
Conservative 2PL can reduce the time

Ch: 43
Log entry is made only for the ….. operation?
► Write
► Read
Q: For the write sequence X = X + 10, Write X, the data entry in the log
file will be ……?
ANS:
<Tn, X, 33>

The assignment operation and any other mathematical or relational operation
is executed in RAM.
When the ‘commit’ statement is executed then, first, the database buffer is updated.
After execution of the “commit” statement, ____ is updated first.
► Database Buffer Page no 303
► Clustered-Index
► Non-Clustered Index
► Application Program
Q: While recovering data, which of the following files does a recovery
manager examine first?
► A system file
► Log file (Page 303)
► Data dictionary
► Metadata
Which of the following serves as milestone or reference point in the log file?
► Constraints
► Relations
► Check points
► Relationships
Checkpoint:
Checkpoint is also a record or an entry in the log file.
Log File:
• Tool used for database recovery
• Transaction Record:
<T, starts>
<T, commits> or <T, aborts>
Transaction Operations (of concern to the DBMS):
Read (X)
X = X + 5
Write (X)
Y = Y * 3
Write (Y)
Commit
Write operations are the concern of the Recovery Manager, so the log file
contains entries only for write operations.
For a write operation entry is made in log file in RAM.
Log File Entries:
Supposing X = 50 and Y = 10:
Operation        Log file entry
                 <T, starts>
Read (X)
X = X + 5
Write (X)        <T, X, 55>
Y = Y * 3
Write (Y)        <T, Y, 30>
Commit           <T, commit>

The structure of log file entry for immediate update technique is
< Tr, object, old_value, new_value >
RM: Recovery manager.
Redone (in forward order): transactions with both < Tr, begin > and
< Tr, commit > in the log.
Undone (in reverse order): transactions with < Tr, begin > and < Tr, abort >,
or with only < Tr, begin >.
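A simplified sketch of how a recovery manager might apply these rules under the immediate update technique. The log is assumed to be a list of tuples, either (Tr, "begin"), (Tr, "commit"), (Tr, "abort") or (Tr, object, old_value, new_value); the function name `recover` and the sample data are hypothetical, and checkpoints are ignored.

```python
def recover(log, db):
    """Redo committed transactions in forward order, undo the rest in reverse order."""
    committed = {e[0] for e in log if len(e) == 2 and e[1] == "commit"}

    # Redo: forward scan, re-apply the new values written by committed transactions.
    for entry in log:
        if len(entry) == 4 and entry[0] in committed:
            _, obj, _old, new = entry
            db[obj] = new

    # Undo: reverse scan, restore old values for aborted or unfinished transactions.
    for entry in reversed(log):
        if len(entry) == 4 and entry[0] not in committed:
            _, obj, old, _new = entry
            db[obj] = old
    return db


log = [
    ("T1", "begin"), ("T1", "X", 50, 55), ("T1", "commit"),
    ("T2", "begin"), ("T2", "Y", 10, 30),          # T2 never committed
]
# State on disk after a crash: T1's write was lost, T2's write reached the disk.
database = recover(log, {"X": 50, "Y": 30})
print(database)
```

The undo step is only possible because each entry stores the old value as well as the new one.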
Concurrency Control:
The objective of concurrency control (CC) is to control access to the
database by multiple users at the same time, called concurrent access.
Q: …….. occurs when multiple users want to update same object at the same
time?
► Uncommitted Update Problem
► Inconsistent Analysis Problem
► Lost of Joins Problem
► Lost Update Problem
Problems due to Concurrent Access:
• Lost Update Problem
• Uncommitted Update Problem
• Inconsistent Analysis Problem
Lost Update Problem:
This problem occurs when multiple users want to update same object at the
same time.
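A lost update can be reproduced by hand with an interleaving of two transactions, written here as plain Python statements. The balance of 100, the deposit of 50 and the withdrawal of 30 are made-up values.

```python
balance = 100

# Both transactions read the same starting value ...
read_by_t1 = balance
read_by_t2 = balance

# ... then each writes back its own result, unaware of the other.
balance = read_by_t1 + 50      # T1 deposits 50, balance becomes 150
balance = read_by_t2 - 30      # T2 withdraws 30, balance becomes 70: T1's update is lost

serial_result = 100 + 50 - 30  # what any serial execution would give
print(balance, serial_result)
```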
Main tool used for recovery is the log file.
The deferred update approach stores only the new value.
The immediate update approach stores the previous as well as new value.
Ch: 44
Serial Execution:
transactions are executed in a sequential order.
A transaction may consist of many operations.
A Schedule or History: (order of execution)
is a list of operations from one or more transactions.
Serial schedule:
Schedule for a serial execution is called a serial schedule.
A serial schedule always leaves the database in a consistent state.
There are two major approaches to implement serializability:
• Locking
• Timestamping
Locking:
Locking is maintained by the transaction manager or, more precisely, by the
lock manager.
Transactions perform two types of operations on objects:
1. Read
2. Write
Two types of operations, two types of locks
1. Read or Shared
2. Write or exclusive
The compatibility of locks means whether two locks from two different
transactions may exist on the same object at the same time.
Only two shared locks are compatible with each other; none of the remaining
combinations are compatible.
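The compatibility rule can be written down as a small lookup table. The dictionary `COMPATIBLE` and the function `can_grant` are hypothetical names for this sketch.

```python
# (lock already held, lock requested) -> may both exist at the same time?
COMPATIBLE = {
    ("shared", "shared"): True,
    ("shared", "exclusive"): False,
    ("exclusive", "shared"): False,
    ("exclusive", "exclusive"): False,
}


def can_grant(requested: str, held_by_others: list) -> bool:
    """A lock is granted only if it is compatible with every lock held by other transactions."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)


print(can_grant("shared", ["shared", "shared"]))
print(can_grant("exclusive", ["shared"]))
```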

Ch: 45
Interaction is in the process of changing system state.
Transaction isolation levels.
Two primary modes for taking locks:
1. Optimistic
2. Pessimistic.
Pessimistic locking guarantees that the first transaction can always apply a
change to the data it first accessed.
Optimistic locking mode: the first transaction accesses data but does not take
a lock on it.
Types of Locks:
1. Shared 2. Update 3. Exclusive locks
Q: Shared locks are used when data is …..(usually in pessimistic locking
mode).
a. Read b. Write
Deadlock is a situation when two transactions are waiting for each other to
release a lock.
Wait – for Graph:
Q: ….. is used for the detection of deadlock.
ANS:
Wait – for Graph
Arrowhead represents that a transaction has locked a particular data item.
Two Phase Locking:
• The locks are granted and released in two phases, the growing and
shrinking phase
• The finer the granularity, the more concurrency but the more overhead
When a lock at lower level is applied, compatibility is checked upward.
The granularity of locks: how much of the data is locked at one time.
