Advanced Database Concepts

This document provides an introduction to advanced database concepts. It discusses key terminologies like databases, database management systems, and examples of well-known DBMSs. It also addresses how to represent and operate on data using relational algebra and SQL queries. The document notes that indexing can speed up relational operations through precomputed data structures like B-trees and hash tables. It introduces transactions and the ACID properties of atomicity, consistency, isolation, and durability. Finally, it briefly discusses how transactions are implemented using locking and timestamp mechanisms and mentions that understanding these concepts is useful for database programming.

Uploaded by

Bilal Tasneem

ADVANCED DATABASE CONCEPTS

INTRODUCTION

TERMINOLOGIES

A database is an organized collection of data. It is the collection of schemas, tables, queries, reports, views, and other objects.

A database management system (DBMS) is a computer software application that interacts with the user, other applications, and the database itself to capture and analyze data. A general-purpose DBMS is designed to allow the definition, creation, querying, update, and administration of databases.

Well-known DBMSs include MySQL, Microsoft SQL Server, and Oracle.

HOW TO REPRESENT DATA?

HOW TO OPERATE ON DATA?

Given the data, say, a set of tables, how to answer queries?

Difficulty: Queries may depend crucially on the data in all tables.

SQL

Find all products priced under 200 that are manufactured in Japan:

SELECT x.PName, x.Price
FROM Product x, Company y
WHERE x.Manufacturer = y.CName
  AND y.Country = 'Japan'
  AND x.Price < 200
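
To try the query above, here is a minimal sketch that runs it against an in-memory SQLite database using Python's standard sqlite3 module. The table contents and the helper name cheap_japanese_products are hypothetical, and the price condition is assumed to be x.Price < 200, following the wording "under price 200".

```python
import sqlite3


def cheap_japanese_products(conn, max_price=200):
    """Return (PName, Price) rows for products under max_price made in Japan."""
    query = """
        SELECT x.PName, x.Price
        FROM Product x, Company y
        WHERE x.Manufacturer = y.CName
          AND y.Country = 'Japan'
          AND x.Price < ?
        ORDER BY x.PName
    """
    return conn.execute(query, (max_price,)).fetchall()


# Hypothetical sample data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (CName TEXT PRIMARY KEY, Country TEXT);
    CREATE TABLE Product (PName TEXT, Price INTEGER, Manufacturer TEXT);
    INSERT INTO Company VALUES ('Canon', 'Japan'), ('GizmoWorks', 'USA');
    INSERT INTO Product VALUES
        ('SingleTouch', 150, 'Canon'),
        ('MultiTouch', 204, 'Canon'),
        ('Gizmo', 20, 'GizmoWorks');
""")

rows = cheap_japanese_products(conn)
print(rows)
```

Note that the answer depends on both tables: the price comes from Product, while the country comes from Company, so the join condition on Manufacturer and CName is what ties them together.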

HOW TO SPEED UP THE OPERATION?


Relational operations can sometimes be computed much faster if we have precomputed a suitable data structure on the data. This is called indexing.

Most notably, two kinds of index structures are essential to database performance:

B-trees.

External hash tables.

For example, hash tables may speed up relational operations that involve finding all occurrences in a relation of a particular value.
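
The hash-table example can be sketched in a few lines of Python. This is an in-memory illustration only, not an external (disk-based) hash table as a real DBMS would use; the relation contents and the function names build_hash_index and scan_lookup are hypothetical.

```python
from collections import defaultdict


def build_hash_index(relation, attribute):
    """Precompute a mapping from each attribute value to the rows holding it."""
    index = defaultdict(list)
    for row in relation:
        index[row[attribute]].append(row)
    return index


def scan_lookup(relation, attribute, value):
    """Baseline without an index: examine every row of the relation."""
    return [row for row in relation if row[attribute] == value]


# Hypothetical relation, invented for illustration only.
product = [
    {"PName": "SingleTouch", "Manufacturer": "Canon"},
    {"PName": "Gizmo", "Manufacturer": "GizmoWorks"},
    {"PName": "MultiTouch", "Manufacturer": "Canon"},
]

by_manufacturer = build_hash_index(product, "Manufacturer")  # built once
indexed = by_manufacturer.get("Canon", [])                   # one hash probe
scanned = scan_lookup(product, "Manufacturer", "Canon")      # full scan
```

Both lookups return the same rows. The difference is cost: the scan touches every row on every query, while the index is built once and each later lookup takes expected constant time plus the size of the result.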

HOW TO DEAL WITH TRANSACTIONS?


Transactions with the ideal ACID properties resolve the semantic problems that arise when many concurrent users access and change the same database.

ACID (atomicity, consistency, isolation, and durability) is an acronym for the four primary attributes that a transaction manager (also called a transaction monitor) ensures for any transaction. These attributes are described below.

ACID PROPERTIES

Atomicity. In a transaction involving two or more discrete pieces of information, either all of the pieces are committed or none.

Consistency. A transaction either creates a new and valid state of data, or, if any failure occurs, returns all data to its state before the transaction was started.

Isolation. A transaction in process and not yet committed must remain isolated from any other transaction.

Durability. Committed data is saved by the system such that, even in the event of a failure and system restart, the data is available in its correct state.
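
Atomicity and consistency can be observed with a small sketch using Python's standard sqlite3 module. The Account table, its CHECK constraint, and the transfer function are hypothetical, chosen only to force a failure halfway through a transaction.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Account SET Balance = Balance + ? WHERE Name = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above
            # is rolled back as well.
            conn.execute(
                "UPDATE Account SET Balance = Balance - ? WHERE Name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (Name TEXT PRIMARY KEY, "
    "Balance INTEGER CHECK (Balance >= 0))"
)
conn.execute("INSERT INTO Account VALUES ('alice', 100), ('bob', 50)")
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds: both updates commit
failed = transfer(conn, "alice", "bob", 500)  # debit fails: credit is undone
balances = dict(conn.execute("SELECT Name, Balance FROM Account"))
print(ok, failed, balances)
```

In the second call the credit to bob has already been applied when the debit fails, yet after the rollback neither change is visible, so the data returns to its state before the transaction started.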

We will talk later about how transactions are implemented using locking and timestamp mechanisms.

This knowledge is useful in database programming: in some cases it makes it possible to avoid (or reduce) rollbacks of transactions, and more generally to make transactions wait less for each other.
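
As a preview of the timestamp mechanism, here is a minimal sketch of the basic timestamp-ordering rule as it is commonly described in textbooks; it is not necessarily the variant covered later in this course. Each transaction is assumed to carry a unique timestamp, and the names Item, Rollback, read, and write are hypothetical.

```python
class Rollback(Exception):
    """Raised when an operation arrives too late and its transaction must abort."""


class Item:
    """A data item plus the largest timestamps that have read and written it."""

    def __init__(self, value):
        self.value = value
        self.read_ts = 0
        self.write_ts = 0


def read(item, ts):
    """Read by the transaction with timestamp ts."""
    if ts < item.write_ts:  # a younger transaction already overwrote the item
        raise Rollback("read too late")
    item.read_ts = max(item.read_ts, ts)
    return item.value


def write(item, ts, value):
    """Write by the transaction with timestamp ts."""
    if ts < item.read_ts or ts < item.write_ts:  # a younger one already used it
        raise Rollback("write too late")
    item.value = value
    item.write_ts = ts


x = Item(10)
first_read = read(x, ts=2)   # allowed: nothing has written x yet

try:
    write(x, ts=1, value=99)  # older than the reader with ts=2
    late_write_aborted = False
except Rollback:
    late_write_aborted = True

write(x, ts=3, value=20)      # allowed: newer than every earlier access
```

No transaction ever waits here; conflicts are resolved by rolling back the late transaction. Locking makes the opposite trade: transactions wait for each other, which avoids many of these rollbacks.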

SUMMARIZE

And you need math!!

Data Representation, Relational Algebra, SQL (Datalog), etc.

Indexing, Query Optimization, Concurrency Control, etc.

INSTRUCTOR

Syed Jaseemuddin
Email: [email protected]
Contact: 021-99261261-2287

GRADING [40 MARKS]

Assignments (25%). Solutions should be typed and unique.

One reading assignment (25%); see the next slide for details. Selected or volunteer students will give presentations.

Mid-term (50%).

READING ASSIGNMENT
Individually or in a group of two, read one or more papers, surveys, or articles and write a report (4 pages for an individual, 8 pages for a group of two) on what you think of the articles you read, not just a repeat of what they say.

Topics can be found in the Redbook:

http://redbook.cs.berkeley.edu/bib4.html

Want more topics related to the course? Search for the papers and surveys yourself.

Selected students or groups will give 25-minute talks in class (20 minutes of presentation plus 5 minutes of Q&A).

The best individual or group will get a bonus in their final grades.

A penalty will be given if you agree to give a talk but do not deliver it; the quality of the talk is irrelevant to the penalty.

THE GOAL OF THIS COURSE

Open / change your views of the world (of databases)


Seriously, it is not just SQL programming.

BIG DATA

Big data: extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions.

"Much IT investment is going towards managing and maintaining big data."

Big data is everywhere:

: an index of over 19 billion web pages

: over 40 billion pictures

SOURCE AND CHALLENGE


Source:

Retailer databases: Amazon, Walmart

Logistics, financial & health data: Stock prices

Social network: Facebook, Twitter

Pictures by mobile devices: iPhone

Internet traffic: IP addresses

New forms of scientific data: Large Synoptic Survey Telescope

Challenge:

Volume

Velocity

Variety (documents, stock records, personal profiles, photographs, audio & video, 3D models, location data, ...)

The main technical challenges
