Introduction To Databases: Lecture One
Relational and Object-Oriented Database Concepts
Introduction to Database Design and Techniques
File-Based Systems
Collection of application programs that perform services for the end users (e.g. reports). Each program defines and manages its own data.
Duplication of data
The same data is held by different programs. This wastes space and can lead to different values and/or different formats for the same item.
Database Approach
Arose because:
Definition of data was embedded in application programs, rather than being stored separately and independently.
No control over access and manipulation of data beyond that imposed by application programs.
Result:
the database and Database Management System (DBMS).
Database
Shared collection of logically related data (and a description of this data), designed to meet the information needs of an organization.
System catalog (metadata) provides description of data to enable program-data independence.
Logically related data comprises entities, attributes, and relationships of an organization's information.
Database Approach
A view mechanism.
Provides users with only the data they want or need to use.
Views
Allows each user to have his or her own view of the database. A view is essentially some subset of the database.
Benefits include:
Reduce complexity; Provide a level of security; Provide a mechanism to customize the appearance of the database; Present a consistent, unchanging picture of the structure of the database, even if the underlying database is changed.
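A minimal sketch of the view mechanism, using Python's built-in sqlite3 module. The employee table, its columns, and the sales_directory view are hypothetical names chosen for illustration; they do not come from the lecture. The view exposes a subset of rows and hides the salary column, which illustrates the "reduce complexity" and "level of security" benefits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical base table; salary is treated as sensitive.
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept    TEXT NOT NULL,
        salary  INTEGER NOT NULL
    );
    INSERT INTO employee VALUES (1, 'Ada', 'Sales', 52000);
    INSERT INTO employee VALUES (2, 'Ben', 'Sales', 48000);
    INSERT INTO employee VALUES (3, 'Cy',  'IT',    61000);

    -- The view is a subset of the database: Sales staff only, no salary.
    CREATE VIEW sales_directory AS
        SELECT emp_id, name FROM employee WHERE dept = 'Sales';
""")

# Users query the view exactly as they would query a table.
view_rows = conn.execute(
    "SELECT emp_id, name FROM sales_directory ORDER BY emp_id"
).fetchall()

# The columns visible through the view.
cursor = conn.execute("SELECT * FROM sales_directory")
view_columns = [col[0] for col in cursor.description]

print(view_rows)
print(view_columns)
```

If the employee table later gains columns, queries against sales_directory keep working unchanged, which is the "consistent picture" benefit in the list above.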
Second generation
Relational
Third generation
Object-Oriented
Terminology
Database: persistent collection of data
Database Management System (DBMS): software that controls access to the database
Database Administrator (DBA): person who controls database
Data Model: general structure of the data in the database
Data Language: commands used to define the data model and give users access to the database
Utility of Databases
Data has value independent of use
Organized approach to data management
Eliminate redundancy in data
Share data
Archive data
Security of data
Integrity of data
Database access is a key feature of current enterprise computing
Relational DB: tables
To link/merge tables and extract/write information:
Structured Query Language (SQL): the language of virtually all modern relational databases (but with many dialects)
SQL is transparent; it operates with statements like SELECT, INSERT, DELETE, etc.
SQL provides its result sets in table format
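The statements named above can be sketched as follows, again with Python's sqlite3 module. The item table and its contents are hypothetical. Note that the result of SELECT is itself a table: named columns and rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (item_id INTEGER PRIMARY KEY, description TEXT, price REAL)"
)

# INSERT adds rows.
conn.executemany(
    "INSERT INTO item (item_id, description, price) VALUES (?, ?, ?)",
    [(1, "Pen", 1.5), (2, "Notebook", 4.0), (3, "Stapler", 9.0)],
)

# DELETE removes the rows that match a condition.
conn.execute("DELETE FROM item WHERE item_id = ?", (3,))

# SELECT returns a result set in table format.
cursor = conn.execute(
    "SELECT description, price FROM item WHERE price < 5 ORDER BY price"
)
result_columns = [col[0] for col in cursor.description]
result_rows = cursor.fetchall()

print(result_columns)
print(result_rows)
```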
One vs. multiple user access
Internet browsers make it easy to access database programs (compared with traditional client/server programs)
Database
Database is a collection of tables (relations)
Data are stored in tables
Tables
Each table has a name
Each table has a set of columns (fields) and rows of data (records)
Each table has a fixed number of columns
Each table has an arbitrary number of rows
Columns
Each column has a name
Columns are accessed by name
No standard column ordering
Data in a column belongs to a particular domain
Columns are the attributes of the dataset
Each value in a column is from the same domain
Each value in a column is of the same data type
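A small sketch of "columns are accessed by name" and "data in a column belongs to a particular domain". The student table is hypothetical, and the CHECK constraint is one assumed way of expressing a domain (grades from 0 to 100); other DBMSs offer other mechanisms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets Python read columns by name

conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        grade      INTEGER CHECK (grade BETWEEN 0 AND 100)  -- the column's domain
    )
""")

# Columns are named in the statement, so their order in the table is irrelevant.
conn.execute("INSERT INTO student (name, grade, student_id) VALUES ('Ada', 91, 1)")

row = conn.execute("SELECT grade, name FROM student").fetchone()
name_value = row["name"]
grade_value = row["grade"]

# A value outside the domain is rejected by the DBMS.
try:
    conn.execute("INSERT INTO student (student_id, name, grade) VALUES (2, 'Ben', 150)")
    domain_enforced = False
except sqlite3.IntegrityError:
    domain_enforced = True

print(name_value, grade_value, domain_enforced)
```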
Rows
Each row entry is either a simple value or empty ("null")
Rows are sets of values for the columns (attribute values)
Primary key: a set of columns that uniquely identifies each row
Each row must be unique given the primary key (no duplicates)
Rows are referenced by the primary key
Row order cannot be determined by the user
It does not make sense to say "the fourth row" as it does in a paper table or spreadsheet
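The row rules above can be sketched with a hypothetical enrollment table whose primary key is the pair (student_id, course). The sketch shows a null entry, a duplicate key being rejected, a row being referenced by its key, and an explicit ORDER BY, since there is no built-in row order to rely on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE enrollment (
        student_id  INTEGER NOT NULL,
        course      TEXT NOT NULL,
        final_grade TEXT,                    -- may be empty (NULL)
        PRIMARY KEY (student_id, course)     -- a set of columns as the key
    )
""")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [(2, "DB101", "B"), (1, "DB101", "A"), (1, "CS200", None)],
)

# A second row with the same primary key is rejected (no duplicates).
try:
    conn.execute("INSERT INTO enrollment VALUES (1, 'DB101', 'C')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Rows are referenced by primary key, not by position.
one_row = conn.execute(
    "SELECT final_grade FROM enrollment WHERE student_id = ? AND course = ?",
    (1, "DB101"),
).fetchone()

# Any required order must be requested explicitly.
ordered = conn.execute(
    "SELECT student_id, course FROM enrollment ORDER BY student_id, course"
).fetchall()

print(duplicate_rejected, one_row, ordered)
```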
Data Types
Database Design
Database design deals with how to design a database
Importance of Good Design
Poor design results in unwanted data redundancy
Poor design generates errors leading to bad decisions
Practical Approach
Focus on principles and concepts of database design
Importance of logical design
Create a balanced design which is good for all users
Based on a set of assumptions about the world being modeled
Determine the data to be stored
Determine the relations among the data
Determine the operations to be performed
Specify the structure of the tables
1. Identify all the objects, entities, and attributes
2. Identify all the dependencies, draw a dependency diagram
3. Design tables to represent the data items and dependencies
4. Verify the design
5. Implement the database
6. Design the queries
7. Test and revise
Determine the objects of your Database
For each object, describe each entity to be stored
example: better to store first name and last name separately
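One reason for the first name / last name advice, sketched with a hypothetical person table: with separate columns, sorting by last name is a plain ORDER BY, and the full name can still be assembled when needed. Splitting a single stored name string reliably would be much harder for names such as "Mary Ann Lee" or "Juan De La Cruz".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (person_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Mary Ann", "Lee"), (2, "Juan", "De La Cruz"), (3, "Ada", "Byron")],
)

# Sorting by last name needs no string parsing.
by_last_name = conn.execute(
    "SELECT last_name, first_name FROM person ORDER BY last_name, first_name"
).fetchall()

# The combined form can be computed on demand.
full_names = [
    row[0]
    for row in conn.execute(
        "SELECT first_name || ' ' || last_name FROM person ORDER BY person_id"
    )
]

print(by_last_name)
print(full_names)
```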
[Figures: dependency diagram examples relating PERSON and BIRTHDATE; STUDENT and CLASSES; PERSON and SISTERS; APPOINTMENT, CLIENT, and TIME]
A student has one and only one final grade for a course
Choose an attribute at the end of a path
Follow the chain of arrows upwards
  each multi-valued dependency on the path becomes a primary key for the table
  combine all single-valued attributes at first level up into a single table
  all attributes on the path should be included in the table
  stop when you reach a bubble that has no arrows coming into it
  each path becomes a separate table
Mark off your traversed path
Repeat until all paths have been traversed
Do you have too many tables? too few?
If your design does not appear correct
  go back to step 1
  you must repeat all steps of process in order
  do not try to rearrange dependency diagram to give you the tables you think you should have
Assuming the order of rows and columns is known
  this is not a spreadsheet!
  do not assume sorted order unless you explicitly sort
Guessing the design, not following the process
Storing what you can compute (when the value will change)
  transitive dependency
  e.g., do not store age if you are already storing birth date
Representing multi-valued dependencies in fixed-size sets
  if you know that there are exactly X of something, create X single-valued dependencies; otherwise use a multi-valued dependency
Adding a key when a unique value exists
  e.g., adding an ID number for each person when you are already storing their social security number
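The "do not store age" point can be sketched as follows. Only the birth date is stored; age is computed when needed. The person table, the sample date, and the age_on helper are hypothetical, and the reference date is passed in explicitly so the result is reproducible.

```python
import sqlite3
from datetime import date


def age_on(birth_date: date, today: date) -> int:
    """Whole years elapsed between birth_date and today."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (0 if had_birthday else 1)


conn = sqlite3.connect(":memory:")
# Store the birth date only; a stored age would become wrong as time passes.
conn.execute(
    "CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT, birth_date TEXT)"
)
conn.execute("INSERT INTO person VALUES (1, 'Ada', '1990-06-15')")

stored = conn.execute(
    "SELECT birth_date FROM person WHERE person_id = 1"
).fetchone()[0]
birth = date.fromisoformat(stored)

age_before_birthday = age_on(birth, date(2024, 6, 14))
age_on_birthday = age_on(birth, date(2024, 6, 15))

print(age_before_birthday, age_on_birthday)
```

The same stored row yields a different age on different days, which is why the computed value should not be a column.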
Results
Databases with these characteristics are called 3NF (Third Normal Form) databases
Normalization to be discussed later in the course
Design
Worst: Compensate for poor design and limited SQL with programming.
Application Development
Feasibility: Identify scope, costs, and schedule
Analysis: Gather information from users
Design: Define tables, relationships, forms, reports
Development: Create forms, reports, and help; test
Implementation: Transfer data, install, train, review
[Figure: the phases above plotted as tasks against time]
DBMS Features/Components
Database engine
  Storage
  Retrieval
  Update
Report writer
Forms generator (input screens)
Application generator
Communications
Programming Interface
Database Engine
Data Tables
Data Dictionary
  Product: ItemID (Integer, Unique); Description (Text, 100 char)
  Customer: CustomerID (Integer, Unique); Name (Text, 50 char)
Concurrency and Lock Manager
Security: User Identification, Access Rights
Utilities: Backup and Recovery
Administration
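A rough sketch of a data dictionary in practice, using SQLite through Python. The Product and Customer definitions follow the slide; sqlite_master and PRAGMA table_info are SQLite's own catalog facilities, and other DBMSs expose their catalogs differently. SQLite records the declared type VARCHAR(100) but does not enforce the length, which is a SQLite-specific behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (
        ItemID      INTEGER PRIMARY KEY,   -- Integer, Unique
        Description VARCHAR(100)           -- Text, 100 char
    );
    CREATE TABLE Customer (
        CustomerID  INTEGER PRIMARY KEY,   -- Integer, Unique
        Name        VARCHAR(50)            -- Text, 50 char
    );
""")

# The catalog lists the tables the engine knows about.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# For each table, the catalog describes its columns: (name, declared type).
product_columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(Product)")
]

print(table_names)
print(product_columns)
```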
DBMS Components
Communication Network
3GL Connector
Program
Relational Database
Customer(CustomerID, Name, …)
Order(OrderID, CustomerID, OrderDate, …)
ItemsOrdered(OrderID, ItemID, Quantity, …)
Items(ItemID, Description, Price, …)
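A sketch of how these four relations link together, using only the attributes visible on the slide (the attribute lists are cut off in the source, so anything beyond them is left out). The column types, the composite key on ItemsOrdered, and the sample rows are assumptions. ORDER is a reserved word in SQL, so the table name is quoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE "Order" (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER REFERENCES Customer(CustomerID),
        OrderDate  TEXT
    );
    CREATE TABLE Items (ItemID INTEGER PRIMARY KEY, Description TEXT, Price REAL);
    CREATE TABLE ItemsOrdered (
        OrderID  INTEGER REFERENCES "Order"(OrderID),
        ItemID   INTEGER REFERENCES Items(ItemID),
        Quantity INTEGER,
        PRIMARY KEY (OrderID, ItemID)   -- assumed key: one line per item per order
    );

    -- Hypothetical sample data.
    INSERT INTO Customer VALUES (1, 'Ada');
    INSERT INTO "Order" VALUES (10, 1, '2024-01-05');
    INSERT INTO Items VALUES (100, 'Pen', 1.5), (101, 'Notebook', 4.0);
    INSERT INTO ItemsOrdered VALUES (10, 100, 4), (10, 101, 1);
""")

# Shared key columns let the tables be joined back together.
order_lines = conn.execute("""
    SELECT c.Name, o.OrderID, i.Description, io.Quantity * i.Price AS LineTotal
    FROM Customer AS c
    JOIN "Order" AS o       ON o.CustomerID = c.CustomerID
    JOIN ItemsOrdered AS io ON io.OrderID = o.OrderID
    JOIN Items AS i         ON i.ItemID = io.ItemID
    ORDER BY i.ItemID
""").fetchall()

print(order_lines)
```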
Object-Oriented DBMS
Order
  Properties: OrderID, CustomerID
  Methods: NewOrder, DeleteOrder
Customer
  Properties: CustomerID, Name
  Methods: Add Customer, Drop Customer, Change Address
OrderItem
  Properties: OrderID, ItemID
  Methods: OrderItem, DropOrderItem
Item
  Properties: ItemID, Description
  Methods: New Item, Sell Item, Buy Item
Objects
Object Definition
  Object Name
  Properties
  Methods
  Inheritance
Combine into one table.
Use multiple tables and link by primary key. More efficient. Need to add rows to many tables.
Inheritance
Commercial: Contact, VolumeDiscount, ComputeDiscount
Government: Contact, BalanceDue, BillLateFees
AddCustomer
Separate inherited classes. Link by primary key. Adding a new customer requires new rows in each table.
CommercialCustomer(CustomerID, Contact, VolumeDiscount)
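The "link by primary key" approach can be sketched as two tables that share CustomerID. The base Customer table here holds only CustomerID and Name; the column types, the sample values, and the add_commercial_customer helper are hypothetical. The helper shows why adding one customer means adding a row to each table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    -- The subclass table reuses the base table's primary key.
    CREATE TABLE CommercialCustomer (
        CustomerID     INTEGER PRIMARY KEY REFERENCES Customer(CustomerID),
        Contact        TEXT,
        VolumeDiscount REAL
    );
""")


def add_commercial_customer(customer_id, name, contact, discount):
    """Insert one row per table, inside a single transaction."""
    with conn:
        conn.execute("INSERT INTO Customer VALUES (?, ?)", (customer_id, name))
        conn.execute(
            "INSERT INTO CommercialCustomer VALUES (?, ?, ?)",
            (customer_id, contact, discount),
        )


add_commercial_customer(1, "Acme Ltd", "J. Smith", 0.1)

# Joining on the shared key reassembles the full object.
full_record = conn.execute("""
    SELECT c.CustomerID, c.Name, cc.Contact, cc.VolumeDiscount
    FROM Customer AS c
    JOIN CommercialCustomer AS cc ON cc.CustomerID = c.CustomerID
""").fetchone()

# A subclass row without a matching base row is rejected.
try:
    conn.execute("INSERT INTO CommercialCustomer VALUES (99, 'Nobody', 0.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(full_record, orphan_rejected)
```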
OO Difficulties: Methods
[Figure: a Customer database object (properties: Name, Address, Phone; method: Add New Customer, with its program code) shared among an IBM server, a Unix server, and an application on a personal computer]
How can a method run on different computers?
Different processors use different code.
Possibility: Java
End of Lecture