ECI 4166 Data Modelling and Database Systems UNIT 03 DATABASE OPERATIONS
Introduction
Relational database management systems are composed of objects, or relations, which are managed by operations and governed by data integrity constraints. To access the database, a user or program executes a Structured Query Language (SQL) statement. Both the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO) have accepted SQL as the standard language for relational databases. The language contains a large set of operators for partitioning and combining relations, and the database can be modified by using SQL statements. In this unit you will learn the basic SQL statements needed to operate and manage a relational database, building on an adequate knowledge of set theory.
Objectives:
After going through this unit, you should be able to:
* Define concepts of set theory and apply fundamental set operations to a given problem
* Define and manipulate data in a database using primary SQL statements
* Retrieve data from the database using SQL
* Access a database through a general purpose programming language
* Describe library procedures and functions to access databases
References: Elmasri, Navathe, Somayajulu, Gupta, Fundamentals of Database Systems, 4th ed., Pearson Education, 2008, Chapter 07
Contents
9.1 Basic concepts of Set Theory
    9.1.1. Set Theory
    9.1.2. Set Operations
9.2 Set operations with SQL
9.3 Introduction to relational algebra and calculus
Further reading
Introduction
The relational model is based on the concept of sets and the relationships between these sets. In fact, a data model needs three concepts: structure, rules and manipulation. In this session we will look at the structure and some of the rules, but to use the database effectively we also need to be able to manipulate it, i.e. insert, update, delete and retrieve data.
Objectives
After going through this session you should be able to:
* Define basic concepts of set theory
* Apply basic operations of sets to a given problem
9.1. Basic concepts of Set Theory
The concept of a set is fundamental to mathematics and computer science, and much of mathematics is built on sets. For example, a relationship between two objects is represented as a set of ordered pairs of objects, and the ordered pair is itself defined using sets. The natural numbers, which are the basis of other numbers, are also defined using sets. A function, being a special type of relation, is based on sets, and graphs and digraphs, consisting of lines and points, are described as an ordered pair of sets.
In this session you will learn the following basic concepts of set theory:
i. subset
ii. union
iii. intersection
iv. complement
v. symmetric difference
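The five concepts listed above can be tried out directly with Python's built-in set type. This is only an illustrative sketch: the sets `a`, `b` and `universe` are made-up data, and a universe is assumed so that a complement can be computed.

```python
# Hypothetical example data; the universe is needed to define a complement.
universe = set(range(1, 11))
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

is_subset = {1, 2} <= a              # subset: every element of {1, 2} is in a
union = a | b                        # union: elements in a or in b
intersection = a & b                 # intersection: elements in both a and b
complement_a = universe - a          # complement: elements of the universe not in a
symmetric_difference = a ^ b         # elements in exactly one of a and b

print(is_subset, union, intersection, complement_a, symmetric_difference)
```

The symmetric difference is the union minus the intersection, which is why `a ^ b` contains neither 3 nor 4.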
9.1.1. Set Theory
The approach to set theory taken here is called "naive set theory", as opposed to the more rigorous "axiomatic set theory". Set theory was first developed by the German mathematician Georg Cantor at the end of the 19th century.
9.1.2. Set Operations
One of the big advantages of relational databases is that data can be retrieved from any set of relational tables using simple relational operations. Because the model is based on set theory, queries can also be expressed using a theoretical notation. In this session you will learn the following three main set operators:
i. UNION operator
ii. INTERSECT operator
iii. EXCEPT operator
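The three operators can be illustrated with a small sketch that runs SQL through Python's standard `sqlite3` module against an in-memory database. The `depositor` and `borrower` tables and their contents are hypothetical, and the helper `names` exists only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE depositor (customer_name TEXT);
    CREATE TABLE borrower  (customer_name TEXT);
    INSERT INTO depositor VALUES ('Hayes'), ('Jones'), ('Smith');
    INSERT INTO borrower  VALUES ('Jones'), ('Smith'), ('Curry');
""")

def names(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# UNION: customers who appear in either table (duplicates removed).
in_either = names(
    "SELECT customer_name FROM depositor UNION SELECT customer_name FROM borrower")

# INTERSECT: customers who appear in both tables.
in_both = names(
    "SELECT customer_name FROM depositor INTERSECT SELECT customer_name FROM borrower")

# EXCEPT: customers in the first table but not in the second.
only_depositor = names(
    "SELECT customer_name FROM depositor EXCEPT SELECT customer_name FROM borrower")

print(in_either, in_both, only_depositor)
```

Both operands of a set operator must produce the same number of columns, which is why each SELECT here returns only `customer_name`.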
9.2 Set operations with SQL
The relational algebra, which you learned in the previous unit, is an example of such a theoretical notation. Set theory provides the basis for all relational query languages, such as SQL. Please refer to Unit 03 of the supplementary material.
Refer: Elmasri, Navathe, Somayajulu, Gupta, Fundamentals of Database Systems, 4th ed., Pearson Education, Chapter 07
Contents
10.1 Basic SQL Operations
10.2 Data Definition Language (DDL)
Further reading
Introduction
SQL statements are divided into two major categories: data definition language (DDL) and data manipulation language (DML). Both categories contain far more statements than we can present in this session, and each statement is more complex than this introduction shows. If you want to master this area, we strongly recommend that you find an SQL reference for your own database software as a supplement to this session. Please refer to Unit 03 of the supplementary material for more details.
Objectives
At the end of this session, you will be able to:
* categorise and define SQL statements
* describe the database management capabilities provided by a DBMS tool
* use the DBMS database tool to access a database
10.1 Basic SQL Operations
SQL originated as the Sequel language, developed at IBM as part of the System R project at the IBM San Jose Research Laboratory. It was later renamed Structured Query Language (SQL). The ANSI and ISO SQL standards are:
* SQL-86
* SQL-89
* SQL-92
* SQL:1999 (language name became Y2K compliant!)
* SQL:2003
Commercial systems offer most, if not all, SQL-92 features, plus varying feature sets from later standards and special proprietary features.
10.2. Data Definition Language (DDL)
A Data Definition Language (DDL) is a computer language for defining data structures. DDL statements are used to build and modify the structure of your tables and other objects in the database. In many database systems a DDL statement takes effect immediately when it is executed. DDL allows the specification of not only a set of relations but also information about each relation, including:
* The schema for each relation.
* The domain of values associated with each attribute.
* Integrity constraints.
* The set of indices to be maintained for each relation.
* Security and authorization information for each relation.
* The physical storage structure of each relation on disk.
DDL has a specified notation for defining the database schema.
Example:
    create table account (
        account_number char(10),
        balance integer)
The DDL compiler generates a set of tables stored in a data dictionary. The data dictionary contains metadata (i.e., data about data):
* Database schema
* Data storage and definition language
  o Specifies the storage structure and access methods used
* Integrity constraints
  o Domain constraints
  o Referential integrity (references constraint in SQL)
  o Assertions
* Authorization
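To make the idea of a data dictionary concrete, the following sketch creates the `account` table in an in-memory SQLite database and then reads back the metadata that the DBMS recorded. It assumes SQLite, whose catalog is the `sqlite_master` table and the `PRAGMA table_info` command; other systems expose their dictionary differently. The PRIMARY KEY, NOT NULL and CHECK constraints are additions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: the constraints here are illustrative additions to the account example.
conn.execute("""
    CREATE TABLE account (
        account_number CHAR(10) PRIMARY KEY,
        balance        INTEGER NOT NULL CHECK (balance >= 0)
    )
""")

# The catalog now holds an entry describing the new table.
catalog_rows = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE name = 'account'").fetchall()

# PRAGMA table_info returns (cid, name, type, notnull, default, pk) per column.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]

# A domain constraint is enforced by the DBMS, not by the application.
try:
    conn.execute("INSERT INTO account VALUES ('A-101', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(catalog_rows, columns, rejected)
```

The negative balance is refused because it violates the CHECK constraint stored with the table definition.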
You will learn the following three main commands, which apply to databases, tables and views:
* CREATE command
* ALTER command
* DROP command
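A short sketch of the three commands, run through Python's `sqlite3` module, is given here. The `branch` table, the `branch_cities` view and the two helper functions are hypothetical. SQLite has no CREATE DATABASE statement (connecting creates the database), so only tables and views are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def object_names():
    """List the tables and views currently recorded in the catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type IN ('table', 'view') ORDER BY name")
    return [row[0] for row in rows]

def column_names(table):
    """List the column names of a table (row[1] of PRAGMA table_info)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

# CREATE: define a table and a view over it.
conn.execute("CREATE TABLE branch (branch_name TEXT PRIMARY KEY, city TEXT)")
conn.execute("CREATE VIEW branch_cities AS SELECT DISTINCT city FROM branch")
after_create = object_names()

# ALTER: change the structure of an existing table.
conn.execute("ALTER TABLE branch ADD COLUMN assets INTEGER")
after_alter = column_names("branch")

# DROP: remove the objects and their definitions.
conn.execute("DROP VIEW branch_cities")
conn.execute("DROP TABLE branch")
after_drop = object_names()

print(after_create, after_alter, after_drop)
```

The view is dropped before the table so that no object is left referring to a table that no longer exists.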
Further reading
Introduction
This session introduces a selection of join algorithms that can be used in various situations to reduce the complexity of query-oriented joins. Joins are one of the basic constructions of SQL and of databases in general: they combine records from two or more database tables into one row source, that is, one set of rows with the same columns. These columns can originate from either of the joined tables, or be formed using expressions and built-in or user-defined functions.
Objectives
At the end of this session, you will be able to:
* Process simple and complex SELECT statements including subqueries, temporary tables, and unions
* Identify ways to optimize queries
* Define and use transactions
* Recognize the importance of concurrency control
* Define a transaction
* Use the BEGIN, COMMIT, and ROLLBACK statements in a transaction
* Describe the types of concurrency control in a database
* Describe the four levels of locking granularity
11.1
Most of the actions you need to perform on a database are done with SQL statements. Some database systems require a semicolon at the end of each SQL statement. The semicolon is the standard way to separate SQL statements in database systems that allow more than one statement to be executed in the same call to the server.
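As a small illustration of semicolon-separated statements sent in one call, the sketch below uses the `executescript` method of Python's `sqlite3` module, which accepts a script containing several statements. The table `t` is a throwaway example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three statements in a single call; the semicolons mark where each one ends.
script = """
    CREATE TABLE t (n INTEGER);
    INSERT INTO t VALUES (1);
    INSERT INTO t VALUES (2);
"""
conn.executescript(script)

total = conn.execute("SELECT SUM(n) FROM t").fetchone()[0]
print(total)
```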
11.1.1. SELECT ~ FROM ~ WHERE
The SELECT statement has many optional clauses:
* WHERE specifies which rows to retrieve.
* GROUP BY groups rows sharing a property so that an aggregate function can be applied to each group.
* HAVING selects among the groups defined by the GROUP BY clause.
* ORDER BY specifies an order in which to return the rows.
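The following sketch combines all four optional clauses in one query, run through Python's `sqlite3` module. The `account` table and its rows are hypothetical example data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (account_number TEXT, branch_name TEXT, balance INTEGER);
    INSERT INTO account VALUES
        ('A-101', 'Downtown', 500),
        ('A-102', 'Downtown', 700),
        ('A-201', 'Uptown',   900),
        ('A-202', 'Uptown',    50),
        ('A-301', 'Midtown',  400);
""")

query = """
    SELECT branch_name, SUM(balance) AS total
    FROM account
    WHERE balance >= 100          -- keep only rows with at least 100
    GROUP BY branch_name          -- one group per branch
    HAVING SUM(balance) > 500     -- keep only groups whose total exceeds 500
    ORDER BY total DESC           -- largest total first
"""
result = conn.execute(query).fetchall()
print(result)
```

WHERE filters individual rows before grouping, while HAVING filters whole groups afterwards. Here the 50 balance is discarded by WHERE, and the Midtown group (total 400) is discarded by HAVING.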
11.1.2. Basic query operation: the join in SQL In most queries, we will want to see data from two or more tables. To do this, we need to join the tables in a way that matches up the right information from each one to the other. The easiest way to understand the join is to think of the database software looking one-by-one at each pair of rows from the two tables.
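The pair-by-pair view of a join described above can be written out as a nested loop. This is a conceptual sketch in plain Python, not how a DBMS necessarily executes a join. The `customers` and `orders` rows and the function `nested_loop_join` are hypothetical.

```python
customers = [
    {"customer_id": 1, "name": "Hayes"},
    {"customer_id": 2, "name": "Jones"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 2},
    {"order_id": 12, "customer_id": 1},
]

def nested_loop_join(left, right, key):
    """Examine every (left, right) pair and keep those whose key values match."""
    joined = []
    for left_row in left:
        for right_row in right:
            if left_row[key] == right_row[key]:
                # Merge the two matching rows into one output row.
                joined.append({**left_row, **right_row})
    return joined

rows = nested_loop_join(customers, orders, "customer_id")
pairs = [(row["name"], row["order_id"]) for row in rows]
print(pairs)
```

With 2 customers and 3 orders the loop examines 6 pairs and keeps the 3 that match on `customer_id`.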
11.1.3. SQL technique: join types
An SQL JOIN clause combines records from two tables in a database. It creates a set that can be saved as a table or used as is. A JOIN is a means of combining fields from two tables by using values common to each. ANSI standard SQL specifies five types of JOIN: INNER, LEFT OUTER, RIGHT OUTER, FULL OUTER, and CROSS.
Inner Join
An inner join combines the records from two tables (A and B) based on a given join predicate. Only pairs of records that satisfy the predicate appear in the result, so a record with no matching record in the other table is left out.
INNER JOIN - only rows satisfying the selection criteria from both joined tables are selected.
Outer joins
An outer join does not require each record in the two joined tables to have a matching record. The joined table retains each record even if no other matching record exists. Outer joins subdivide further into left outer joins, right outer joins, and full outer joins, depending on which table(s) the rows are retained from (left, right, or both).
LEFT OUTER JOIN - rows satisfying the selection criteria from both joined tables are selected, and all remaining rows from the left table are kept, with NULLs in place of the right table's values.
RIGHT OUTER JOIN - rows satisfying the selection criteria from both joined tables are selected, and all remaining rows from the right table are kept, with NULLs in place of the left table's values.
FULL OUTER JOIN - rows satisfying the selection criteria from both joined tables are selected, and all remaining rows from both the left and the right table are kept, with NULLs in place of the other table's values.
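The difference between inner and outer joins can be seen in a minimal sketch using Python's `sqlite3` module. The `employee` and `department` tables are hypothetical. Because RIGHT and FULL OUTER JOIN are available only in more recent SQLite releases, the right outer join is emulated here by swapping the tables of a LEFT OUTER JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee   (name TEXT, dept_id INTEGER);
    CREATE TABLE department (dept_id INTEGER, dept_name TEXT);
    INSERT INTO employee   VALUES ('Perera', 1), ('Silva', 2), ('Fernando', NULL);
    INSERT INTO department VALUES (1, 'Sales'), (2, 'Engineering'), (3, 'Marketing');
""")

# INNER JOIN: only employees that have a matching department.
inner = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e INNER JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.name
""").fetchall()

# LEFT OUTER JOIN: every employee is kept; missing departments become NULL (None).
left = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e LEFT OUTER JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.name
""").fetchall()

# Right outer join emulated by swapping the tables: every department is kept.
right_emulated = conn.execute("""
    SELECT e.name, d.dept_name
    FROM department AS d LEFT OUTER JOIN employee AS e ON e.dept_id = d.dept_id
    ORDER BY d.dept_name
""").fetchall()

print(inner, left, right_emulated)
```

Fernando has no department and Marketing has no employee, so each appears only in the outer join that preserves its own table.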
Natural join A natural join offers a further specialization of equi-joins. The join predicate arises implicitly by comparing all columns in both tables that have the same column-name in the joined tables. The resulting joined table contains only one column for each pair of equally-named columns.
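A natural join can be sketched with the same kind of hypothetical `employee` and `department` tables, again through Python's `sqlite3` module. The only column name the two tables share is `dept_id`, so that column becomes the implicit join predicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee   (name TEXT, dept_id INTEGER);
    CREATE TABLE department (dept_id INTEGER, dept_name TEXT);
    INSERT INTO employee   VALUES ('Perera', 1), ('Silva', 2);
    INSERT INTO department VALUES (1, 'Sales'), (2, 'Engineering');
""")

# No ON clause: columns with the same name are compared implicitly.
cursor = conn.execute("SELECT * FROM employee NATURAL JOIN department ORDER BY name")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(columns, rows)
```

The result has a single `dept_id` column rather than one from each table. Because the predicate depends on column names, renaming or adding a column can silently change what a natural join matches on.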
11.2
Sometimes, the information that we need is not actually stored in the database, but has to be computed in some way from the stored data. There are many functions in any implementation of SQL. Unfortunately, many of the functions are defined quite differently in different database packages, so you should always consult a reference manual for your specific software. We can compute values from information that is in a table simply by showing the computation in the SELECT clause. Each computation creates a new column in the output table, just as if it were a named attribute. You will learn the following SQL functions:
* AVG (average)
* COUNT
* MAX
* MIN
* SUM
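The sketch below shows a computed column and the five aggregate functions, run through Python's `sqlite3` module. The `order_line` table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_line (item TEXT, unit_price REAL, quantity INTEGER);
    INSERT INTO order_line VALUES ('pen', 1.5, 10), ('book', 12.0, 2), ('bag', 30.0, 1);
""")

# A computation in the SELECT clause creates a new output column (line_total).
computed = conn.execute("""
    SELECT item, unit_price * quantity AS line_total
    FROM order_line
    ORDER BY item
""").fetchall()

# Aggregate functions collapse all rows into a single summary row.
summary = conn.execute("""
    SELECT COUNT(*), MIN(unit_price), MAX(unit_price), SUM(quantity), AVG(quantity)
    FROM order_line
""").fetchone()

print(computed)
print(summary)
```

The column `line_total` is not stored anywhere; it is recomputed each time the query runs.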
11.3 Transactions and Concurrency Control
Further reading: Please refer to Unit 3 of the supplementary material.
Refer: Elmasri, Navathe, Somayajulu, Gupta, Fundamentals of Database Systems, 4th ed., Pearson Education
Contents
12.1 Data manipulation language
12.2 Database manipulation commands
Further reading
Introduction
Data Manipulation Language (DML) is used to retrieve, insert, delete and update data in a database. The most popular DML is that of SQL, which is used to retrieve and manipulate data in a relational database.
Objectives
At the end of this session, you will be able to use SQL Data Manipulation Language (DML) statements to change the contents of a database.
12.1. Data manipulation language
Data Manipulation Language (DML) is a language for accessing and manipulating the data organized by the appropriate data model. DML is also known as a query language. These languages can be classified into two types:
* Procedural: the user specifies what data is required and how to get that data.
* Declarative (nonprocedural): the user specifies what data is required without specifying how to get that data.
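The contrast between the two styles can be sketched in Python with the `sqlite3` module. The first version spells out how to find the rows by looping and testing each one, and the second states only what is wanted and leaves the method to the DBMS. The `account` table is hypothetical, and the procedural version is written in Python purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (account_number TEXT, balance INTEGER);
    INSERT INTO account VALUES ('A-101', 500), ('A-102', 1500), ('A-103', 2500);
""")

# Procedural style: the program says HOW: read every row, test it, collect it.
procedural = []
for account_number, balance in conn.execute(
        "SELECT account_number, balance FROM account"):
    if balance > 1000:
        procedural.append(account_number)
procedural.sort()

# Declarative style: the query says WHAT is required; the DBMS decides how.
declarative = [row[0] for row in conn.execute(
    "SELECT account_number FROM account WHERE balance > 1000 ORDER BY account_number")]

print(procedural, declarative)
```

Both produce the same answer, but only the declarative form gives the DBMS the freedom to choose an access method, for example an index on `balance`.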
SQL is the most widely used query language. DML is a family of computer languages used by computer programs or database users to retrieve, insert, delete and update data in a database. DML statements are used to work with the data in tables.
12.2. Database manipulation commands
DML statements have their functional capability organized by the initial word of the statement, which is usually a verb. In the case of SQL, these verbs are:
* Insert
* Update
* Delete
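A minimal sketch of the three verbs follows, run through Python's `sqlite3` module on a hypothetical `account` table. The `snapshot` helper exists only to show the table contents after each step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER)")

def snapshot():
    """Return the current contents of the account table."""
    return conn.execute(
        "SELECT account_number, balance FROM account ORDER BY account_number").fetchall()

# INSERT adds new rows.
conn.execute("INSERT INTO account VALUES ('A-101', 500)")
conn.execute("INSERT INTO account VALUES ('A-102', 700)")
after_insert = snapshot()

# UPDATE changes values in the rows selected by WHERE.
conn.execute("UPDATE account SET balance = balance + 100 WHERE account_number = 'A-101'")
after_update = snapshot()

# DELETE removes the rows selected by WHERE.
conn.execute("DELETE FROM account WHERE account_number = 'A-102'")
after_delete = snapshot()

print(after_insert, after_update, after_delete)
```

Omitting the WHERE clause makes UPDATE and DELETE apply to every row of the table, so it is worth checking the condition before running either statement.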
Refer: Elmasri, Navathe, Somayajulu, Gupta, Fundamentals of Database Systems, 4th ed., Pearson Education, Chapter 07
Contents
13.1 Access database via general purpose programming language
13.2 Library procedures and functions to access database
13.3 Block structures and Control structures
Further reading
Introduction
Application programs are written in some programming language. A database programming language, in addition to the normal control structures and functions of a general purpose programming language, is also aware of the database and provides the means to work with it: database operations are built into the language.
Objectives
At the end of this session, you should be able to:
* access a database through a general purpose programming language
* describe library procedures and functions to access a database
13.1. Access database via general purpose programming language
One problem in programming database applications is how to embed the database operations into the programs. Common general purpose programming languages (C, Cobol, Pascal, etc.) do not include databases in their kernel structures. The alternatives in programming database applications are either to use special database programming languages that are aware of databases or to use add-ons for general purpose programming languages. Two add-on techniques are commonly used:
* embedded SQL
* application programming interfaces (APIs)
Embedded SQL
In embedded SQL, SQL statements are mixed with statements of the host programming language, for example C or Cobol. To distinguish the SQL statements from the statements of the host language, they are marked as belonging to SQL (with EXEC SQL markers). A pre-compiler is needed to analyze the embedded SQL statements and to transform them into calls to functions defined in a database interface library. These functions execute the actual database operations. The output of the pre-compiler is then compiled using the compiler of the host programming language.
Database interface libraries, like the pre-compilers, are specific to a database management system. However, the way in which database operations are embedded in programs is standardized. Thus it should be possible to change the database management system and simply recompile the program to work with the new DBMS.
Application programming interfaces (APIs)
A database application programming interface (database API) is a library of functions to carry out database operations. Each database management system provides a library of its own. For example, Oracle has a library called the Oracle Call Interface (OCI). This is the kind of library used in the calls generated by the embedded SQL pre-compiler. Libraries that are specific to a database management system are called native APIs.
13.2 Library procedures and functions to access database
There are also libraries, for example ODBC (Open Database Connectivity, originally developed by Microsoft) and JDBC (for Java), that are independent of database management systems. These libraries make it possible to use various database management systems, even within one program. From the programmer's point of view these libraries provide a common interface for database programming. However, to use a certain type of database, a driver for that particular type is needed. For example, an Oracle driver is needed in order to use Oracle databases. In the following we outline the use of a database in the Java programming language using the JDBC API.
For database operations it provides the means to:
* use SQL datatypes as datatypes in the program
* utilize the database schema in defining data types
* use loops for processing query results
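The general pattern of library-based database access (connect, send SQL with parameters, loop over the result) can be sketched as follows. Note that this sketch does not use JDBC: it uses Python's standard `sqlite3` module, which follows Python's DB-API, as an analogous call-level interface. The `student` table and its rows are hypothetical.

```python
import sqlite3

# Opening a connection corresponds to connecting through a driver.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (reg_no TEXT PRIMARY KEY, name TEXT, gpa REAL)")

# Placeholders (?) pass program values to SQL without building strings by hand.
students = [("S001", "Amal", 3.4), ("S002", "Nimali", 3.8), ("S003", "Kasun", 2.9)]
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", students)
conn.commit()

# A cursor represents the running query and its result set.
cursor = conn.cursor()
cursor.execute(
    "SELECT reg_no, name, gpa FROM student WHERE gpa >= ? ORDER BY reg_no", (3.0,))

# The result is processed row by row in an ordinary loop; SQL values
# arrive as native program values (str and float here).
honours = []
for reg_no, name, gpa in cursor:
    honours.append(f"{reg_no} {name} {gpa:.1f}")

conn.close()
print(honours)
```

The same three steps appear in most call-level interfaces, although the function names and the way drivers are loaded differ from one library to another.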
13.3 Block structures and Control structures
"BEGIN ... END" statement structure is used to group multiple statements into a single statement block, which can be used in other statement structures as a single statement. For example, a statement block can be used in an "IF ... ELSE ..." statement structure as a single statement.
The "IF ... ELSE IF ... ELSE ..." statement structure is used to select one of the specified statements to be executed, based on specified Boolean conditions.
The structure of PL/pgSQL is fairly simple, mainly because each portion of code is designed to exist as a function. While it may not look immediately familiar, PL/pgSQL's structure is similar to that of other programming languages such as C, in which each portion of code acts (and is created) as a function, all variables must be declared before being used, and code segments accept arguments when called and return values at their end. PL/pgSQL is a block-structured language. The complete text of a function definition must be a block, which consists of an optional declaration section followed by a sequence of statements enclosed between BEGIN and END.
Stored procedures are one of the oldest methods of encapsulating database logic, but they are not the only method available. Many relational databases nowadays have views, constraints, referential integrity with cascading update and delete, stored functions, triggers and the like.
Stored procedures are one of numerous mechanisms for encapsulating logic in the database. They are similar to regular programming language procedures in that they take arguments, do something, and sometimes return results. They can even change the values of the arguments they take when those arguments are declared as output parameters.
Stored functions are very similar to stored procedures in that both can return data; however, unlike stored functions, stored procedures cannot be used in queries.
Views are among the most useful features of a relational database. The main beauty of a view is that it can be used like a table in most situations but, unlike a table, it can encapsulate very complex calculations and commonly used joins.
Triggers are objects, generally tied to a table or view, that run code in response to certain events, such as inserting, updating or deleting data, either before or after the event happens. Triggers are especially useful for one particular situation: implementing INSTEAD OF logic.
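Views and triggers can be illustrated with a short sketch using Python's `sqlite3` module. SQLite has no stored procedures, so only a view, an AFTER UPDATE trigger and an INSTEAD OF trigger are shown. All table, view and trigger names here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account   (account_number TEXT PRIMARY KEY, balance INTEGER);
    CREATE TABLE audit_log (account_number TEXT, old_balance INTEGER, new_balance INTEGER);

    -- A view encapsulates a query and can be read like a table.
    CREATE VIEW large_account AS
        SELECT account_number, balance FROM account WHERE balance >= 1000;

    -- A trigger tied to a table: runs after each update of balance.
    CREATE TRIGGER log_balance_change AFTER UPDATE OF balance ON account
    BEGIN
        INSERT INTO audit_log VALUES (OLD.account_number, OLD.balance, NEW.balance);
    END;

    -- INSTEAD OF logic: an insert into the view is redirected to the base table.
    CREATE TRIGGER large_account_insert INSTEAD OF INSERT ON large_account
    BEGIN
        INSERT INTO account VALUES (NEW.account_number, NEW.balance);
    END;
""")

conn.execute("INSERT INTO account VALUES ('A-101', 500)")
conn.execute("INSERT INTO large_account VALUES ('A-102', 2000)")  # handled by the trigger
conn.execute("UPDATE account SET balance = 1500 WHERE account_number = 'A-101'")

view_rows = conn.execute("SELECT * FROM large_account ORDER BY account_number").fetchall()
log_rows = conn.execute("SELECT * FROM audit_log").fetchall()
print(view_rows, log_rows)
```

A view is normally read-only in SQLite. The INSTEAD OF trigger is what makes the insert into `large_account` possible, by stating what should happen to the base table in its place.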
Refer: Elmasri, Navathe, Somayajulu, Gupta, Fundamentals of Database Systems, 4th ed., Pearson Education, Chapter 07