Notes Chapter 1.1 Lecture 1.4(Referential Data Structure, Schema, Instances and Keys)
CHAPTER 1.1
Lecture-1.4 (Referential Data Structure, Schema, Instances and
Keys)
The relational data model is the primary data model, used widely around the world
for data storage and processing. The model is simple, and it has all the
properties and capabilities required to process data with storage efficiency.
Concepts
Tables − In the relational data model, relations are saved in the format of tables.
This format stores the relation among entities. A table has rows and columns, where
rows represent records and columns represent attributes.
Tuple − A single row of a table, which contains a single record for that relation, is
called a tuple.
Relation schema − A relation schema describes the relation name (table name), its
attributes, and their names.
Relation key − Each relation has one or more attributes, known as the relation key,
which can identify each row in the relation (table) uniquely.
Attribute domain − Every attribute has a predefined range of permitted values, known
as the attribute domain.
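As a rough illustration of these concepts, the following Python sketch uses the standard sqlite3 module to declare a small relation. The table Student, its three attributes and the age range 0 to 150 are assumptions chosen for the example; the relational model itself does not prescribe them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Relation schema: the relation name, its attribute names, and a domain
# for each attribute (all names and ranges here are illustrative).
conn.execute(
    """
    CREATE TABLE Student (
        Stu_Id   INTEGER PRIMARY KEY,                       -- relation key
        Stu_Name TEXT NOT NULL,
        Stu_Age  INTEGER CHECK (Stu_Age BETWEEN 0 AND 150)  -- attribute domain
    )
    """
)

# One tuple (row) of the relation.
conn.execute("INSERT INTO Student VALUES (101, 'Steve', 23)")

# The attribute names come from the schema, not from the stored data.
attributes = [row[1] for row in conn.execute("PRAGMA table_info(Student)")]

# A value outside the declared domain is rejected by the DBMS.
try:
    conn.execute("INSERT INTO Student VALUES (102, 'John', -5)")
    domain_enforced = False
except sqlite3.IntegrityError:
    domain_enforced = True
```

In this sketch the CHECK constraint plays the role of the attribute domain, and the PRIMARY KEY clause marks the relation key.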
Database Schema
A database schema is the skeleton structure that represents the logical view of the
entire database. It defines how the data is organized and how the relations among
them are associated. It formulates all the constraints that are to be applied on the
data.
A database schema defines its entities and the relationship among them. It contains
a descriptive detail of the database, which can be depicted by means of schema
diagrams. It’s the database designers who design the schema to help programmers
understand the database and make it useful.
A database schema can be divided broadly into two categories: the physical schema
and the logical schema.
For example, in the following diagram we have a schema that shows the
relationship between three tables: Course, Student and Section. The diagram shows
only the design of the database; it does not show the data present in those tables.
A schema is only a structural view (design) of a database, as shown in the diagram
below.
The design of a database at the physical level is called the physical schema; how
the data is stored in blocks of storage is described at this level.
The design of a database at the view level is called the view schema. This generally
describes end-user interaction with database systems.
Database Instance
A database instance is the state of an operational database, with its data, at any
given time. It contains a snapshot of the database. Database instances tend to change
with time. A DBMS ensures that every instance (state) is valid by diligently enforcing
all the validations, constraints, and conditions that the database designers have
imposed.
Definition of instance: The data stored in a database at a particular moment in time
is called an instance of the database. The database schema defines the variable
declarations in the tables that belong to a particular database; the value of these
variables at a given moment is called the instance of that database.
For example, let's say we have a single table student in the database. Today the
table has 100 records, so today the instance of the database has 100 records. Let's
say we add another 100 records to this table by tomorrow, so the instance of the
database tomorrow will have 200 records in the table. In short, the data stored in
the database at a particular moment is called the instance, and it changes over time
as we add or delete data.
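The difference between schema and instance can be sketched with Python's built-in sqlite3 module. The table layout (id, name) and the generated names are assumptions made for this sketch; only the record counts 100 and 200 come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")


def schema_text(conn):
    """Return the stored CREATE TABLE statement, i.e. the schema."""
    query = "SELECT sql FROM sqlite_master WHERE name = 'student'"
    return conn.execute(query).fetchone()[0]


def instance_size(conn):
    """Return the number of records in the current instance."""
    return conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]


schema_before = schema_text(conn)

# "Today": the instance holds 100 records.
today = [(i, f"student{i}") for i in range(1, 101)]
conn.executemany("INSERT INTO student VALUES (?, ?)", today)
size_today = instance_size(conn)

# "Tomorrow": another 100 records are added, so the instance changes.
tomorrow = [(i, f"student{i}") for i in range(101, 201)]
conn.executemany("INSERT INTO student VALUES (?, ?)", tomorrow)
size_tomorrow = instance_size(conn)

schema_after = schema_text(conn)
```

Here size_today is 100 and size_tomorrow is 200, while schema_before and schema_after are identical: the instance changed, the schema did not.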
Keys
Keys play an important role in a relational database; they are used for identifying
unique rows in a table. They also establish relationships among tables.
Super Key: – A super key is a set of one or more columns (attributes) that uniquely
identifies rows in a table.
Alternate Key: – Out of all candidate keys, only one gets selected as the primary key;
the remaining keys are known as alternate or secondary keys.
Composite Key: – A key that consists of more than one attribute to uniquely identify
rows (also known as records or tuples) in a table is called a composite key.
Foreign Key: – Foreign keys are the columns of a table that point to the primary
key of another table. They act as a cross-reference between tables.
Let's take an example to understand the concept of a primary key. In the following
table, there are three attributes: Stu_Id, Stu_Name & Stu_Age. Out of these three
attributes, one attribute or a set of more than one attribute can be a primary key.
Attribute Stu_Name alone cannot be a primary key, as more than one student
can have the same name.
Attribute Stu_Age alone cannot be a primary key, as more than one student
can have the same age.
Attribute Stu_Id alone is a primary key, as each student has a unique id that
can identify the student record in the table.
Note: In some cases a single attribute cannot uniquely identify a record in a table.
In that case we try to find a set of attributes that can uniquely identify a row in
the table. An example of this is given later in these notes.
Stu_Id Stu_Name Stu_Age
101 Steve 23
102 John 24
103 Robert 28
104 Steve 29
105 Carl 29
In the above example, we already had a table with data and were trying to
understand the purpose and meaning of a primary key. In practice, however, the
primary key is generally defined during table creation. It can be defined later as
well, but that rarely happens in real-world scenarios.
Let's say we want to create the table that we have discussed above with Stu_Id
as the primary key. We can do that in SQL like this:
CREATE TABLE Student
(
Stu_Id int primary key,
Stu_Name varchar(255),
Stu_Age int
);
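As a minimal sketch of what the primary key buys us, the snippet below uses Python's sqlite3 module to load the five student rows from the table above and then tries to insert a second row with an existing id. The table name Student and the extra row for 'Maria' are assumptions made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    "Stu_Id INTEGER PRIMARY KEY, Stu_Name TEXT, Stu_Age INTEGER)"
)

rows = [
    (101, "Steve", 23),
    (102, "John", 24),
    (103, "Robert", 28),
    (104, "Steve", 29),
    (105, "Carl", 29),
]
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)", rows)

# Names and ages repeat, so neither column could serve as a key.
distinct_names = conn.execute(
    "SELECT COUNT(DISTINCT Stu_Name) FROM Student"
).fetchone()[0]
distinct_ages = conn.execute(
    "SELECT COUNT(DISTINCT Stu_Age) FROM Student"
).fetchone()[0]

# Reusing an existing Stu_Id violates the primary key and is rejected.
try:
    conn.execute("INSERT INTO Student VALUES (101, 'Maria', 30)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
```

The table still holds five rows afterwards, with only four distinct names and four distinct ages, which matches the reasoning about Stu_Name and Stu_Age above.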
Definition of Super Key in DBMS: A super key is a set of one or more attributes
(columns) that can uniquely identify a row in a table. Super keys and candidate
keys are often confused, so this section also discusses the candidate key and its
relation to the super key.
How is a candidate key different from a super key? Candidate keys are selected from
the set of super keys; the only requirement when selecting a candidate key is that it
must not have any redundant attribute. That is why candidate keys are also termed
minimal super keys.
Table: Employee (attributes Emp_SSN, Emp_Number and Emp_Name)
Super keys: The Employee table has the following super keys. Each of the following
sets is able to uniquely identify a row of the Employee table.
{Emp_SSN}
{Emp_Number}
{Emp_SSN, Emp_Number}
{Emp_SSN, Emp_Name}
{Emp_SSN, Emp_Number, Emp_Name}
{Emp_Number, Emp_Name}
Candidate keys: Of these, only {Emp_SSN} and {Emp_Number} are candidate keys, as all
the other sets have redundant attributes that are not necessary for unique
identification.
The difference between super key and candidate key is a frequent source of confusion,
so here is a clear explanation.
1. First, all candidate keys are super keys. This is because the candidate keys are
chosen out of the super keys.
2. How do we choose candidate keys from the set of super keys? We look for those keys
from which we cannot remove any field without losing uniqueness. In the above example,
{Emp_SSN, Emp_Name} is not chosen as a candidate key because {Emp_SSN} alone can
identify a unique row in the table, so Emp_Name is redundant.
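The selection of candidate keys from super keys can be sketched in plain Python. The three employee rows below are hypothetical, made up only so that Emp_Name repeats while Emp_SSN and Emp_Number stay unique. Strictly speaking, a key is a property of the schema and must hold for every possible instance, so a check against sample rows only shows which attribute sets are consistent with this particular data.

```python
from itertools import combinations

# Hypothetical sample rows (not from the notes).
rows = [
    {"Emp_SSN": "SSN-001", "Emp_Number": 226, "Emp_Name": "Steve"},
    {"Emp_SSN": "SSN-002", "Emp_Number": 227, "Emp_Name": "Carl"},
    {"Emp_SSN": "SSN-003", "Emp_Number": 228, "Emp_Name": "Steve"},
]


def is_super_key(attrs, rows):
    """True if no two rows agree on all attributes in attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def super_keys(rows):
    """All attribute sets that identify every sample row uniquely."""
    attributes = list(rows[0])
    return [
        set(combo)
        for size in range(1, len(attributes) + 1)
        for combo in combinations(attributes, size)
        if is_super_key(combo, rows)
    ]


def candidate_keys(rows):
    """Super keys with no redundant attribute (minimal super keys)."""
    found = super_keys(rows)
    # A super key is minimal if no other super key is a proper subset of it.
    return [key for key in found if not any(other < key for other in found)]
```

For these rows, super_keys(rows) yields the six sets listed above and candidate_keys(rows) keeps only {Emp_SSN} and {Emp_Number}.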
Primary Key:
A primary key is selected from the set of candidate keys. This is done by the
database administrator or database designer. For the table Employee, either
{Emp_SSN} or {Emp_Number} can be chosen as the primary key.
Let's take the example of a table “Employee”. This table has three attributes: Emp_Id,
Emp_Number & Emp_Name. Here Emp_Id & Emp_Number have unique values, and Emp_Name
can have duplicate values, as more than one employee can have the same name.
The super keys of this table are:
1. {Emp_Id}
2. {Emp_Number}
3. {Emp_Id, Emp_Number}
4. {Emp_Id, Emp_Name}
5. {Emp_Id, Emp_Number, Emp_Name}
6. {Emp_Number, Emp_Name}
Let's select the candidate keys from the above set of super keys.
{Emp_Id}
{Emp_Number}
Note: A primary key is selected from the set of candidate keys. That means we can
have either Emp_Id or Emp_Number as the primary key. The decision is made by the DBA
(database administrator).
Foreign key example:
In the example below, the Stu_Id column in the Course_enrollment table is a foreign
key, as it points to the primary key of the Student table.
Course_enrollment table:
Course_Id Stu_Id
C01 101
C02 102
C03 101
C05 102
C06 103
C07 102
Student table:
Stu_Id Stu_Name Stu_Age
101 Chaitanya 22
102 Arya 26
103 Bran 25
104 Jon 21
Note: Practically, a foreign key does not depend on the primary key tag of another
table; if it points to a unique column (not necessarily a primary key) of another
table, it is still a foreign key. So, a more precise definition of foreign key
would be: foreign keys are the columns of a table that point to a candidate
key of another table.
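The cross-reference between the two tables above can be sketched with Python's sqlite3 module. The column types and the rejected row ('C08', 999) are assumptions made for this illustration; the other rows are the ones listed above. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute(
    "CREATE TABLE Student ("
    "Stu_Id INTEGER PRIMARY KEY, Stu_Name TEXT, Stu_Age INTEGER)"
)
conn.execute(
    "CREATE TABLE Course_enrollment ("
    "Course_Id TEXT, "
    "Stu_Id INTEGER REFERENCES Student (Stu_Id))"  # foreign key
)

students = [
    (101, "Chaitanya", 22),
    (102, "Arya", 26),
    (103, "Bran", 25),
    (104, "Jon", 21),
]
conn.executemany("INSERT INTO Student VALUES (?, ?, ?)", students)

enrollments = [
    ("C01", 101), ("C02", 102), ("C03", 101),
    ("C05", 102), ("C06", 103), ("C07", 102),
]
conn.executemany("INSERT INTO Course_enrollment VALUES (?, ?)", enrollments)

# Stu_Id 999 does not exist in Student, so this enrollment is rejected.
try:
    conn.execute("INSERT INTO Course_enrollment VALUES ('C08', 999)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

enrollment_count = conn.execute(
    "SELECT COUNT(*) FROM Course_enrollment"
).fetchone()[0]
```

Every Stu_Id stored in Course_enrollment must match an existing Stu_Id in Student, which is exactly the cross-reference role described above.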
Definition of Composite key: A key that has more than one attribute is known as a
composite key. It is also known as a compound key.
Note: Any key, such as a super key, primary key or candidate key, can be called a
composite key if it has more than one attribute.
Composite key Example
Let's consider a table Sales. This table has four columns (attributes) – cust_Id,
order_Id, product_code & product_count.
Table – Sales
None of these columns alone can play the role of a key in this table.
Column cust_Id alone cannot be a key, as the same customer can place multiple
orders, thus the same customer can have multiple entries.
Column order_Id alone cannot be a key, as the same order can contain
multiple products, thus the same order_Id can be present multiple times.
Column product_code alone cannot be a key, as more than one customer can
place an order for the same product.
Column product_count alone cannot be a key, because two orders can be
placed for the same product count.
Based on this, it is safe to assume that the key must consist of more than one
attribute.
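A composite key on the Sales table can be sketched as follows with Python's sqlite3 module. The sketch assumes, purely for illustration, that the pair (cust_Id, product_code) is chosen as the key; the column types and all sample rows are hypothetical as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Sales (
        cust_Id       TEXT,
        order_Id      TEXT,
        product_code  TEXT,
        product_count INTEGER,
        PRIMARY KEY (cust_Id, product_code)  -- composite key (assumed choice)
    )
    """
)

sales = [
    ("C01", "O001", "P007", 23),
    ("C02", "O123", "P007", 19),  # same product, different customer: allowed
    ("C02", "O123", "P230", 82),  # same customer, different product: allowed
    ("C01", "O001", "P890", 42),
]
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?)", sales)

# The same (cust_Id, product_code) pair a second time violates the key.
try:
    conn.execute("INSERT INTO Sales VALUES ('C01', 'O002', 'P007', 5)")
    pair_rejected = False
except sqlite3.IntegrityError:
    pair_rejected = True

sales_count = conn.execute("SELECT COUNT(*) FROM Sales").fetchone()[0]
```

Each column on its own repeats in the sample rows, yet the combination of the two key columns stays unique, which is what makes the key composite.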
As we have seen in the discussion of candidate keys, a table can have multiple
candidate keys. Among these candidate keys, only one gets selected as the primary
key; the remaining keys are known as alternate or secondary keys.
Table: Employee
Candidate keys:
{Emp_Id}
{Emp_Number}
The DBA (database administrator) can choose either of the above keys as the primary
key. Let's say Emp_Id is chosen as the primary key; Emp_Number then becomes the
alternate key.
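In SQL, an alternate key is commonly declared with a UNIQUE constraint next to the chosen primary key. The sketch below shows this with Python's sqlite3 module; the column types and the sample employees are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Employee (
        Emp_Id     INTEGER PRIMARY KEY,      -- chosen primary key
        Emp_Number INTEGER NOT NULL UNIQUE,  -- alternate key
        Emp_Name   TEXT                      -- may repeat
    )
    """
)
employees = [(1, 2264, "Steve"), (2, 2278, "Ajeet"), (3, 2288, "Steve")]
conn.executemany("INSERT INTO Employee VALUES (?, ?, ?)", employees)


def rejected(sql):
    """Run one statement and report whether a constraint blocked it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Both keys are enforced: reusing either value is refused.
dup_primary = rejected("INSERT INTO Employee VALUES (1, 9999, 'Carl')")
dup_alternate = rejected("INSERT INTO Employee VALUES (4, 2264, 'Carl')")

employee_count = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
```

Both candidate keys still guarantee uniqueness; the only difference is which one the designer has labelled as primary.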
OTHER REFERENCES