The document provides an overview of a MOOC course on database management systems. It describes the course outline, contents, and key concepts covered in each of the 8 weeks including SQL, data modeling, database design, indexing, transactions, and concurrency control.
Submitted by Siddhant Srivastava (RA2111003030095) and Aditya Gupta (RA2111003030087)
under the guidance of MS. SHRUTHY GOVINDAN
Under the governing NPTEL body of Bachelor of Technology in COMPUTER SCIENCE AND ENGINEERING of FACULTY OF ENGINEERING AND TECHNOLOGY

Index (columns: S. No., Programs, Date of Experiment, Date of Submission, Signature)

NPTEL certificate: ADITYA GUPTA, Data Base Management System, Jan-Mar 2024 (8 week course)
Figures as printed on the certificate: 50, 18.96/25, 30.75/75, 6225
Roll No: NPTEL24CS21S657900519
No. of credits recommended: 2 or 3

Course Outline
• Week 1: Introduction to DBMS & Relational Model
• Week 2: SQL - Introductory & Intermediate
• Week 3: Advanced SQL and ER Modelling
• Week 4: RDBMS Design – Dependency & Normal Form
• Week 5: Application Design and Storage
• Week 6: Indexing and Hashing
• Week 7: Transaction and Concurrency Control
• Week 8: Recovery and Query Processing & Optimization

WEEK 1 CONTENTS :
1. Introduction to DBMS
2. Levels of Abstraction
3. Data Models
4. DDL and DML
5. Transactions
6. Introduction to Relational Models & Relational Query Languages

Introduction to Database Management System
A DBMS contains information about a particular enterprise:
● Collection of interrelated data
● Set of programs to access the data
● An environment that is both convenient and efficient

Abstraction in Database Management System
Data abstraction in a DBMS is the process of hiding unnecessary details about the data from users while still letting them access and manipulate the data as needed. This is done by creating multiple layers of abstraction, each of which exposes a different level of detail about the data.

Levels of Abstraction
There are three layers of abstraction:
1. Physical Layer
2. Logical Layer
3. View Layer

Data Models
A collection of tools for describing:
1. Data
2. Data relationships
3. Data semantics
4. Data constraints

DDL & DML
● Data Definition Language (DDL): specification notation for defining the database schema.
  Example:
    create table instructor (
        ID char(5),
        name varchar(20),
        dept_name varchar(20),
        salary numeric(8,2))
● Data Manipulation Language (DML): language for accessing and manipulating the data organized by the appropriate data model.

Transaction Management
A transaction is a collection of operations that performs a single logical function in a database application.
● The transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.
● The concurrency-control manager controls the interaction among concurrent transactions to ensure the consistency of the database.

WEEK 2 CONTENTS :
1. SQL
2. Basic Query Structure
3. Basic Operations
4. Modification of the Database

SQL
The SQL data-definition language (DDL) allows the specification of information about relations, including:
● The schema for each relation
● The domain of values associated with each attribute
● Integrity constraints

Domain Types in SQL
● char(n): fixed-length character string of length n
● varchar(n): variable-length character string of maximum length n
● int: integer (a machine-dependent finite subset of the integers)
● numeric(p,d): fixed-point number with p digits in total, of which d are to the right of the decimal point
● float(n): floating-point number with a precision of at least n digits

Basic Query Structure
A typical SQL query has the form:
    select A1, A2, ..., An
    from R1, R2, ..., Rm
    where P
Here,
1. Ai represents an attribute
2. Ri represents a relation
3. P is a predicate

Various Operations
Various DBMS operations include:
1. Cartesian product
2. Rename operation
3. Aggregate functions

Modification of the Database
Modification of the database includes:
1. Deletion of tuples
2. Insertion of new tuples
3. Updating of values

WEEK 3 CONTENTS
● Accessing SQL from a Programming Language
● Triggers
● Relational Algebra
● Design Process
● E-R Model
● Design Issues

Accessing SQL from a Programming Language
An API (application-program interface) lets a program interact with a database server. The application makes calls to:
● Connect with the database server
● Send SQL commands to the database server
● Fetch tuples of the result one by one into program variables
Some examples include JDBC, ODBC (Open Database Connectivity), and Embedded SQL.

TRIGGERS
A trigger is a statement that is executed automatically by the system as a side effect of a modification to the database.
To design a trigger mechanism, we must:
● Specify the conditions under which the trigger is to be executed.
● Specify the actions to be taken when the trigger executes.
The triggering event can be insert, delete or update.

Relational Algebra
The operators take one or two relations as inputs and produce a new relation as a result.
Six basic operators:
● select: σ
● project: Π
● union: ∪
● set difference: −
● Cartesian product: ×
● rename: ρ

Entity-Relationship Model
● An entity is an object that exists and is distinguishable from other objects. Example: specific person, company, event, plant
● An entity set is a set of entities of the same type that share the same properties. Example: set of all persons, companies, trees, holidays
● An entity is represented by a set of attributes, i.e., descriptive properties possessed by all members of an entity set.

WEEK 4 CONTENTS
● Combine Schemas
● Functional Dependencies
● Normalization or Schema Refinement
● Desirable Properties of Decomposition
● Database Design Process

Combine Schemas
Suppose we combine the schemas holding instructor and department information into a single schema. Department information such as building and budget is then repeated for every instructor of that department, which leads to duplicated data.
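The repetition caused by combining the instructor and department schemas can be made concrete with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database. The table name inst_dept, the column sizes for building and budget, and all sample rows are assumptions made up for illustration; they do not come from the course material.

```python
import sqlite3


def combined_schema_redundancy():
    """Load a combined instructor+department table and report, per department,
    how many times its building/budget facts are stored."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical combined schema: instructor columns plus department columns.
    conn.execute(
        "create table inst_dept ("
        " ID char(5), name varchar(20), salary numeric(8,2),"
        " dept_name varchar(20), building varchar(15), budget numeric(12,2))"
    )
    # Illustrative rows only: three instructors share one department.
    rows = [
        ("10101", "Instructor A", 65000, "Comp. Sci.", "Building X", 100000),
        ("45565", "Instructor B", 75000, "Comp. Sci.", "Building X", 100000),
        ("83821", "Instructor C", 92000, "Comp. Sci.", "Building X", 100000),
        ("22222", "Instructor D", 95000, "Physics", "Building Y", 70000),
    ]
    conn.executemany("insert into inst_dept values (?, ?, ?, ?, ?, ?)", rows)
    # Each department's building and budget are stored once per instructor.
    result = conn.execute(
        "select dept_name, count(*) from inst_dept"
        " group by dept_name order by dept_name"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for dept_name, copies in combined_schema_redundancy():
        print(dept_name, "department facts stored", copies, "time(s)")
```

In this sketch the building and budget of the first department are stored three times, so changing the budget would require updating three rows consistently. Keeping department facts in their own table stores them once.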

Functional Dependencies
● Require that the value for a certain set of attributes uniquely determines the value for another set of attributes
● A functional dependency is a generalization of the notion of a key
● K is a superkey for relation schema R if and only if K → R
● Functional dependencies allow us to express constraints that cannot be expressed using superkeys

Example of functional dependencies

Normalization or Schema Refinement
1. Normalization, or schema refinement, is a technique for organizing the data in the database.
2. It is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics:
● Insertion Anomaly
● Update Anomaly
● Deletion Anomaly
3. Normalization serves two main purposes:
● Eliminating redundant (repeated) data
● Ensuring data dependencies make sense, that is, data is logically stored

WEEK 5 CONTENTS
● Application Architecture
● Application Security
● Application Performance
● File Storage

Application Architecture

Application Performance
● Performance is an issue for popular Web sites
● Caching techniques are used to reduce the cost of serving pages by exploiting commonalities between requests
→ At the server site:
● Caching of JDBC connections between servlet requests (a.k.a. connection pooling)
● Caching results of database queries; cached results must be updated if the underlying database changes
→ At the client's network:
● Caching of pages

Application Security
→ Databases are regularly attacked by outsiders for the valuable information they store
→ Databases are regularly subjected to SQL injection
→ Various precautionary measures taken are:
● Passwords are not stored in plain text
● Application authentication
● Application authorization

Physical Storage Media

WEEK 6 CONTENTS
● Basic Concepts of Indexing
● Balanced Binary Search Tree
● Hashing

Basic Concepts of Indexing
→ Indexing mechanisms are used to speed up access to desired data.
→ An index file consists of records (called index entries) of the form (search-key, pointer)
→ Search key: an attribute or set of attributes used to look up records in a file
→ Index files are typically much smaller than the original files
→ Ordered indices: search keys are stored in sorted order
→ Hash indices: search keys are distributed uniformly across buckets using a hash function

Binary Search Tree

Static Hashing
● A bucket is a unit of storage containing one or more records (a bucket is typically a disk block)
● In a hash file organization we obtain the bucket of a record directly from its search-key value using a hash function
● Hash function h is a function from the set of all search-key values K to the set of all bucket addresses B
● The hash function is used to locate records for access, insertion, and deletion
● Records with different search-key values may be mapped to the same bucket, so the entire bucket has to be searched sequentially to locate a record

WEEK 7 & WEEK 8 CONTENTS
● Transaction
● Serializability
● Conflicts

Transaction - Failure Classification
● Logical errors: the transaction cannot complete due to some internal error condition
● System errors: the database system must terminate an active transaction due to an error
● System crash: a power failure or other hardware or software failure causes the system to crash

Recovery Methods
→ Log-Based Recovery
→ Transaction Commit
→ Undo and Redo Operations
→ Checkpoints

Implementing Locking
→ A lock manager can be implemented as a separate process to which transactions send lock and unlock requests
→ The lock manager replies to a lock request by sending a lock grant message (or a message asking the transaction to roll back, in case of a deadlock)
→ The requesting transaction waits until its request is answered

Deadlock Handling
→ A system is deadlocked if there is a set of transactions such that every transaction in the set is waiting for another transaction in the set
→ Deadlock prevention protocols ensure that the system will never enter a deadlock state
→ Some prevention strategies:
● Require that each transaction locks all its data items before it begins execution
● Impose a partial ordering on all data items and require that a transaction lock data items only in the order specified by the partial order
● Timeout-based schemes
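
The ordering strategy above can be sketched in a few lines. This is a minimal illustration, not a database lock manager: Python thread locks stand in for data-item locks, a total order by item name is used as a simple special case of a partial order, and the names OrderedLocks and run_transfers are hypothetical.

```python
import threading


class OrderedLocks:
    """Hypothetical helper: one lock per data item, always taken in name order."""

    def __init__(self, item_names):
        self._locks = {name: threading.Lock() for name in item_names}

    def acquire_all(self, names):
        # Sorting fixes one global order, so no two threads can each hold
        # a lock that the other is waiting for.
        ordered = sorted(set(names))
        for name in ordered:
            self._locks[name].acquire()
        return ordered

    def release_all(self, ordered):
        for name in reversed(ordered):
            self._locks[name].release()


def run_transfers(rounds=200):
    """Run two opposing transfer loops concurrently and return the final balances."""
    balances = {"A": 100, "B": 100}
    locks = OrderedLocks(balances)

    def transfer(src, dst, amount):
        for _ in range(rounds):
            # Both items are locked before the update, in the fixed order.
            held = locks.acquire_all([src, dst])
            try:
                balances[src] -= amount
                balances[dst] += amount
            finally:
                locks.release_all(held)

    t1 = threading.Thread(target=transfer, args=("A", "B", 1))
    t2 = threading.Thread(target=transfer, args=("B", "A", 1))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return balances


if __name__ == "__main__":
    print(run_transfers())
```

Without the sort, one thread could lock A then wait for B while the other locks B then waits for A, which is exactly the waiting cycle described under Deadlock Handling. With the fixed order, both threads request A first, so such a cycle cannot form.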