A Step-By-Step Guide To Normalization in DBMS With Examples
In this article, I’ll explain what normalisation in a DBMS is and how to do it, in simple terms.
By the end of the article, you’ll know all about it and how to do it.
Table of Contents
1. What Is Database Normalization?
2. Why Normalize a Database?
3. What Are The Normal Forms?
4. What Is First Normal Form?
5. What Is Second Normal Form?
6. What Is Third Normal Form?
7. Fourth Normal Form and Beyond
What Is Database Normalization?
Normalization is something a person does manually, as opposed to a system or a tool doing it. It’s commonly done by database developers and database administrators.
It can be done on any relational database, where data is stored in tables that are linked to each other. This means that normalization in a DBMS (Database
Management System) can be done in Oracle, Microsoft SQL Server, MySQL, PostgreSQL and any other type of database.
To perform the normalization process, you start with a rough idea of the data you want to store, and apply certain rules to it in order to get it to a more
efficient form.
https://www.databasestar.com/database-normalization/ 1/28
3/9/24, 12:02 PM A Step-By-Step Guide to Normalization in DBMS With Examples
Why Normalize a Database?
Normalization in a DBMS is done to achieve these points. Without normalization, a database can be slow, and its data can be incorrect and messy.
Data Anomalies
An anomaly is where there is an issue in the data that is not meant to be there. This can happen if a database is not normalised.
Let’s take a look at the different kinds of data anomalies that can occur and that can be prevented with a normalised database.
Our Example
We’ll be using a student database as an example in this article, which records student, class, and teacher information.
Student ID | Student Name | Fees Paid | Course Name | Class 1 | Class 2 | Class 3
2 | Maria Griffin | 500 | Computer Science | Biology 1 | Business Intro | Programming 2
This is not a normalised table, and there are a few issues with this.
Insert Anomaly
An insert anomaly happens when we try to insert a record into this table without knowing all the data we need to know.
For example, we might want to add a new student but not know their course name yet.
Student ID | Student Name | Fees Paid | Course Name | Class 1 | Class 2 | Class 3
2 | Maria Griffin | 500 | Computer Science | Biology 1 | Business Intro | Programming 2
5 | Jared Oldham | 0 | ? | | |
We would be adding incomplete data to our table, which can cause issues when trying to analyse this data.
Update Anomaly
An update anomaly happens when we want to update data, and we update some of the data but not other data.
For example, let’s say the class Biology 1 was changed to “Intro to Biology”. We would have to check every column that could hold a class (Class 1, Class 2, and Class 3) and rename each match that was found.
Student ID | Student Name | Fees Paid | Course Name | Class 1 | Class 2 | Class 3
1 | John Smith | 200 | Economics | Economics 1 | Intro to Biology |
2 | Maria Griffin | 500 | Computer Science | Intro to Biology | Business Intro | Programming 2
There’s a risk that we miss out on a value, which would cause issues.
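To make the update anomaly concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name student_flat and the snake_case column names are assumptions made for illustration; the two rows are the ones shown above, before the rename.

```python
import sqlite3

# In-memory database holding the unnormalised table (names are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE student_flat (
        student_id INTEGER, student_name TEXT, fees_paid INTEGER,
        course_name TEXT, class_1 TEXT, class_2 TEXT, class_3 TEXT
    )"""
)
conn.executemany(
    "INSERT INTO student_flat VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "John Smith", 200, "Economics", "Economics 1", "Biology 1", None),
        (2, "Maria Griffin", 500, "Computer Science",
         "Biology 1", "Business Intro", "Programming 2"),
    ],
)

# The rename has to be repeated for every column that can hold a class name.
# The column names come from a fixed list, so building the SQL text is safe here.
for column in ("class_1", "class_2", "class_3"):
    conn.execute(
        f"UPDATE student_flat SET {column} = 'Intro to Biology' "
        f"WHERE {column} = 'Biology 1'"
    )

rows = conn.execute(
    "SELECT class_1, class_2, class_3 FROM student_flat ORDER BY student_id"
).fetchall()
print(rows)
```

If the loop skipped one of the three columns, some rows would keep the old class name, which is exactly the risk described above.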
Delete Anomaly
A delete anomaly occurs when we want to delete data from the table, but we end up deleting more than what we intended.
For example, let’s say Susan Johnson quits and her record needs to be deleted from the system. We could delete her row:
Student ID | Student Name | Fees Paid | Course Name | Class 1 | Class 2 | Class 3
2 | Maria Griffin | 500 | Computer Science | Biology 1 | Business Intro | Programming 2
But, if we delete this row, we lose the record of the Biology 2 class, because it’s not stored anywhere else. The same can be said for the Medicine course.
We should be able to delete one type of data or one record without having impacts on other records we don’t want to delete.
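The delete anomaly can be sketched the same way. This is a trimmed-down illustration, assuming a hypothetical student_flat table with only three of the columns; Susan Johnson's course (Medicine) and class (Biology 2) are taken from the example data in this article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A cut-down version of the unnormalised table (column names are illustrative).
conn.execute(
    "CREATE TABLE student_flat (student_name TEXT, course_name TEXT, class_1 TEXT)"
)
conn.executemany(
    "INSERT INTO student_flat VALUES (?, ?, ?)",
    [
        ("Maria Griffin", "Computer Science", "Biology 1"),
        ("Susan Johnson", "Medicine", "Biology 2"),
    ],
)

# We only want to remove the student...
conn.execute("DELETE FROM student_flat WHERE student_name = 'Susan Johnson'")

# ...but the class and the course she was the only record of are gone as well.
remaining_classes = [r[0] for r in conn.execute("SELECT class_1 FROM student_flat")]
remaining_courses = [r[0] for r in conn.execute("SELECT course_name FROM student_flat")]
print(remaining_classes, remaining_courses)
```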
What Are The Normal Forms?
There are three main normal forms that you should consider. (There are actually six normal forms in total, but the first three are the most common.)
Whenever the first rule is applied, the data is in “first normal form“. Then, the second rule is applied and the data is in “second normal form“. The third rule
is then applied and the data is in “third normal form“.
Fourth and fifth normal forms are then achieved from their specific rules.
Alright, so there are three main normal forms that we’re going to look at. I’ve written a post on designing a database, but let’s see what is involved in
getting to each of the normal forms in more detail.
Let’s start with a sample database. In this case, we’re going to use a student and teacher database at a school. We mentioned this earlier in the article
when we spoke about anomalies, but here it is again.
We have a set of data we want to capture in our database, and this is how it currently looks. It’s a single table called “student” with a lot of columns.
Student Name | Fees Paid | Date of Birth | Address | Subject 1 | Subject 2 | Subject 3 | Subject 4 | Teacher Name | Teacher Address | Course Name
Susan Johnson | 03-Feb-01 | 13-Jan-91 | 21 Arrow Street, South Boston 56128 | Biology 2 (Science) | | | | Sarah Francis | | Medicine
What Is First Normal Form?
To apply first normal form to a database, we look at each table, one by one, and ask ourselves the following questions of it:
1. Does the combination of all columns make a unique row every single time?
2. What field can be used to uniquely identify the row?
Does the combination of all columns make a unique row every single time?
No. Two different students could have exactly the same values in every column (even though it is rare), and they would still need to be stored as separate rows.
What field can be used to uniquely identify the row?
Is this the student name? No, as there could be two students with the same name.
If there is no unique field, we need to create a new field. This is called a primary key, and is a database term for a field that is unique to a single row.
(Related: The Complete Guide to Database Keys)
When we create a new primary key, we can call it whatever we like, but it should be obvious and consistently named between tables. I prefer using the ID
suffix, so I would call it student ID.
Student (student ID, student name, fees paid, date of birth, address, subject 1, subject 2, subject 3, subject 4, teacher name, teacher
address, course name)
The way I have written this is a common way of representing tables in text format. The table name is written, and all of the columns are shown in brackets,
with the primary key underlined.
This example is still in one table, but it’s been made a little better by adding a unique value to it.
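As a rough illustration of what the new primary key gives us, here is a small sketch using Python's built-in sqlite3 module. The snake_case names are assumptions, and the other student columns are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,  -- the new field that uniquely identifies a row
        student_name TEXT NOT NULL
    )"""
)

# Two different students who happen to share a name can now be told apart,
# because SQLite assigns each row its own student_id.
conn.execute("INSERT INTO student (student_name) VALUES ('John Smith')")
conn.execute("INSERT INTO student (student_name) VALUES ('John Smith')")
student_ids = [
    row[0] for row in conn.execute("SELECT student_id FROM student ORDER BY student_id")
]

# Reusing an existing student_id is rejected by the primary key.
try:
    conn.execute("INSERT INTO student (student_id, student_name) VALUES (1, 'Maria Griffin')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(student_ids, duplicate_rejected)
```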
Want a tool that creates these kinds of diagrams? I’ve listed 76 of them in this guide to Data Modeling Tools, along with reviews, prices, and other features. So if you’re looking for one to use, take a look at that list.
What Is Second Normal Form?
Second normal form means that the first normal form rules have been applied. It also means that each field that is not the primary key is determined by that primary key, so it is specific to that record. This is what “functional dependency” means.
Student (student ID, student name, fees paid, date of birth, address, subject 1, subject 2, subject 3, subject 4, teacher name, teacher
address, course name)
Are all of these columns dependent on and specific to the primary key?
The primary key is student ID, which represents the student. Let’s look at each column:
student name: Yes, this is dependent on the primary key. A different student ID means a different student name.
fees paid: Yes, this is dependent on the primary key. Each fees paid value is for a single student.
date of birth: Yes, it’s specific to that student.
address: Yes, it’s specific to that student.
subject 1: No, this column is not dependent on the student. More than one student can be enrolled in one subject.
subject 2: No, same rule as subject 1.
subject 3: No, same rule as subject 1.
subject 4: No, same rule as subject 1.
teacher name: No, the teacher name is not dependent on the student.
teacher address: No, the teacher address is not dependent on the student.
course name: No, the course name is not dependent on the student.
We have a mix of Yes and No here. Some fields are dependent on the student ID, and others are not.
Subject
First, the subject 1 column. It is not dependent on the student, as more than one student can have a subject, and the subject isn’t a part of the definition of a student.
So, we can move it to a new table:
Subject (subject name)
I’ve called it subject name because that’s what the value represents. When we are writing queries on this table or looking at diagrams, it’s clearer what subject name is instead of using subject.
Now, is this field unique? Not necessarily. Two subjects could have the same name and this would cause problems in our data.
So, what do we do? We add a primary key column, just like we did for student. I’ll call this subject ID, to be consistent with the student ID.
Subject (subject ID, subject name)
This means we have a student table and a subject table. We can do this for all four of our subject columns in the student table, removing them from the
student table so it looks like this:
Student (student ID, student name, fees paid, date of birth, address, teacher name, teacher address, course name)
The subject table is not linked to the student table yet. We’ll cover that shortly. For now, let’s keep going with our student table.
Teacher
The next column we marked as No was the Teacher Name column. The teacher is separate to the student so should be captured separately. This means
we should move it to its own table.
We should also move the teacher address to this table, as it’s a property of the teacher. I’ll also rename teacher address to be just address.
Just like with the subject table, the teacher name and address are not unique. Sure, in most cases they would be, but to avoid duplication we should add a primary key. Let’s call it teacher ID.
Teacher (teacher ID, teacher name, address)
Course
The last column we have to look at was the Course Name column. This indicates the course that the student is currently enrolled in.
While the course is related to the student (a student is enrolled in a course), the name of the course itself is not dependent on the student.
So, we should move it to a separate table. This is so any changes to courses can be made independently of students.
Course (course ID, course name)
We now have our tables created from columns that were in the student table. Our database so far looks like this:
Student (student ID, student name, fees paid, date of birth, address)
Subject (subject ID, subject name)
Teacher (teacher ID, teacher name, address)
Course (course ID, course name)
Using the data from the original table, our data could look like this:
Student
student ID | student name | fees paid | date of birth | address
Subject
subject ID | subject name
1 | Economics 1 (Business)
2 | Biology 1 (Science)
4 | Programming 2 (IT)
5 | Biology 2 (Science)
Teacher
teacher ID | teacher name | address
2 | Sarah Francis |
Course
course ID | course name
1 | Computer Science
2 | Dentistry
3 | Economics
4 | Medicine
How do we link these tables together? We still need to know which subjects a student is taking, which course they are in, and who their teachers are.
We have four separate tables, capturing different pieces of information. We need to capture that students are enrolled in certain courses, take certain subjects, and have teachers. But the data is in different tables.
We can link them using foreign keys. A foreign key is a column in one table that refers to the primary key in another table. Related: The Complete Guide to Database Keys.
It’s used to link one record to another based on its unique identifier, without having to store the additional information about the linked record.
Student (student ID, student name, fees paid, date of birth, address)
To link the two tables using a foreign key, we need to put the primary key (the underlined column) from one table into the other table.
Let’s start with a simple one: students taking courses. For our example scenario, a student can only be enrolled in one course at a time, and a course can
have many students.
We need to either:
Add the course ID from the course table into the student table
Add the student ID from the student table into the course table
In this situation, I ask myself a question to work out which way it goes:
Does a table1 have many table2s, or does a table2 have many table1s?
If it’s the first, then table1 ID goes into table 2, and if it’s the second then table2 ID goes into table1.
Does a course have many students, or does a student have many courses?
Based on our rules, the first statement is true: a course has many students.
This means that the course ID goes into the student table.
Student (student ID, course ID, student name, fees paid, date of birth, address)
I’ve italicised it to indicate it is a foreign key – a value that links to a primary key in another table.
When we actually populate our tables, instead of having the course name in the student table, the course ID goes in the student table. The course name
can then be linked using this ID.
student ID | course ID | student name | fees paid | date of birth | address
4 | 2 | Matt Long | 850 | 25 Apr 1992 | 14 Milk Lane, South Boston 56128
This also means that the course name is stored in one place only, and can be added/removed/updated without impacting other tables.
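Here is a minimal sketch of this one-to-many link using Python's built-in sqlite3 module. The snake_case names are assumptions, most student columns are omitted, and the new course name “Dental Science” is made up purely to show an update. Note that SQLite only enforces foreign keys when the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

conn.execute(
    "CREATE TABLE course (course_id INTEGER PRIMARY KEY, course_name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        course_id INTEGER NOT NULL REFERENCES course (course_id),  -- foreign key
        student_name TEXT NOT NULL
    )"""
)
conn.executemany(
    "INSERT INTO course VALUES (?, ?)", [(1, "Computer Science"), (2, "Dentistry")]
)
conn.execute("INSERT INTO student VALUES (4, 2, 'Matt Long')")

# The course name lives in one place, so a rename touches a single row.
# ("Dental Science" is a hypothetical new name.)
conn.execute("UPDATE course SET course_name = 'Dental Science' WHERE course_id = 2")
course_for_matt = conn.execute(
    """SELECT c.course_name
       FROM student s
       JOIN course c ON c.course_id = s.course_id
       WHERE s.student_id = 4"""
).fetchone()[0]

# A student cannot point at a course that does not exist.
try:
    conn.execute("INSERT INTO student VALUES (5, 99, 'Jared Oldham')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(course_for_matt, orphan_rejected)
```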
I’ve created a YouTube video that explains how to identify and diagram one-to-many relationships like this.
Teacher
We’ve linked the student to the course. Now let’s look at the teacher.
How are teachers related? Depending on the scenario, they could be related in one of a few ways:
A student can have one teacher that teaches them all subjects
A subject could have a teacher that teaches it
A course could have a teacher that teaches all subjects in a course
In our scenario, a teacher is related to a course. We need to relate these two tables using a foreign key.
Does a teacher have many courses, or does a course have many teachers?
In our scenario, the first statement is true. So the teacher ID goes into the course table:
Student (student ID, course ID, student name, fees paid, date of birth, address)
Course (course ID, teacher ID, course name)
Teacher (teacher ID, teacher name, address)
Course
course ID | teacher ID | course name
1 | 1 | Computer Science
2 | 3 | Dentistry
3 | 1 | Economics
4 | 2 | Medicine
Teacher
teacher ID | teacher name | address
1 | James Peterson | 44 March Way, Glebe 56100
2 | Sarah Francis |
This allows us to change the teacher’s information without impacting the courses or students.
So we’ve linked the course, teacher, and student tables together so far. That leaves the subject table.
Does a subject have many students, or does a student have many subjects?
Both are true: a student can be enrolled in many subjects at a time, and a subject can have many students in it.
How can we represent that? We could try to put one table’s ID in the other table:
4 | 2 | Matt Long | 850 | 25 Apr 1992 | 14 Milk Lane, South Boston 56128
But if we do this, we’re storing many pieces of information in one column, possibly separated by commas.
If we have this kind of relationship, one that goes both ways, it’s called a many to many relationship. It means that many records in one table can be related to many records in the other table.
A many to many relationship is common in databases. Some examples where it can happen are:
If we can’t represent this relationship by putting a foreign key in each table, how can we represent it?
We use a joining table. This is a table that is created purely for storing the relationships between the two tables.
Student (student ID, course ID, student name, fees paid, date of birth, address)
The joining table has two columns. Student ID is a foreign key to the student table, and subject ID is a foreign key to the subject table.
student ID | subject ID
1 | 1
1 | 2
2 | 2
2 | 3
2 | 4
3 | 5
And so on.
A joining table like this has several advantages:
It allows us to store many subjects for each student, and many students for each subject.
It separates the data that describes the records (subject name, student name, address, etc.) from the relationship of the records (linking ID to ID).
It allows us to add and remove relationships easily.
It allows us to add more information about the relationship. We could add an enrolment date, for example, to this table, to capture when a student
enrolled in a subject.
You might be wondering, how do we see the data if it’s in multiple tables? How can we see the student name and the name of the subjects they are
enrolled in?
Well, that’s where the magic of SQL comes in. We use a SELECT query with JOINs to show the data we need. But that’s outside the scope of this article –
you can read the articles on my Oracle Database page to find out more about writing SQL.
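As a small taste of what that looks like, here is a sketch using Python's built-in sqlite3 module. It builds a student table, a subject table, and a joining table, then uses a SELECT with two JOINs to list who is enrolled in what. The snake_case table and column names are assumptions, and only a subset of the example rows is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the foreign keys
conn.executescript(
    """
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name TEXT NOT NULL);
    CREATE TABLE subject (subject_id INTEGER PRIMARY KEY, subject_name TEXT NOT NULL);
    -- Joining table: one row per (student, subject) pair.
    CREATE TABLE subject_enrolment (
        student_id INTEGER NOT NULL REFERENCES student (student_id),
        subject_id INTEGER NOT NULL REFERENCES subject (subject_id),
        PRIMARY KEY (student_id, subject_id)
    );
    """
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?)", [(1, "John Smith"), (2, "Maria Griffin")]
)
conn.executemany(
    "INSERT INTO subject VALUES (?, ?)",
    [(1, "Economics 1 (Business)"), (2, "Biology 1 (Science)"), (4, "Programming 2 (IT)")],
)
conn.executemany(
    "INSERT INTO subject_enrolment VALUES (?, ?)", [(1, 1), (1, 2), (2, 2), (2, 4)]
)

# Follow the IDs in the joining table back to the names in the other two tables.
query = """
    SELECT st.student_name, su.subject_name
    FROM subject_enrolment se
    JOIN student st ON st.student_id = se.student_id
    JOIN subject su ON su.subject_id = se.subject_id
    ORDER BY st.student_id, su.subject_id
"""
enrolments = conn.execute(query).fetchall()
for student_name, subject_name in enrolments:
    print(student_name, "-", subject_name)
```

The composite primary key on the joining table also stops the same student being enrolled in the same subject twice.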
One final thing I have seen added to these joining tables is a primary key of its own. An ID field that represents the record. This is an optional step – a
primary key on a single new column works in a similar way to defining the primary key on the two ID columns. I’ll leave it out in this example.
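Student (student ID, course ID, student name, fees paid, date of birth, address)
Subject (subject ID, subject name)
Subject Enrolment (student ID, subject ID)
Teacher (teacher ID, teacher name, address)
Course (course ID, teacher ID, course name)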
I’ve called the table Subject Enrolment. I could have left it as the concatenation of both of the related tables (student subject), but I feel it’s better to
rename the table to what it actually captures – the fact a student has enrolled in a subject. This is something I recommend in my SQL Best Practices post.
I’ve also underlined both columns in this table, as together they form the primary key. Each one is also a foreign key, which is why they are also italicised.
This database structure is in second normal form. We almost have a normalised database.
What Is Third Normal Form?
Third normal form means that every attribute that is not the primary key must depend on the primary key and the primary key only.
For example, suppose column A determines column B, and column B determines column C.
This means that column A determines column C only through column B. This is a transitive functional dependency, and it should be removed. Column C should be in a separate table.
Student (student ID, course ID, student name, fees paid, date of birth, address)
Do any of the non-primary-key fields depend on something other than the primary key?
No, none of them do. However, if we look at the address, we can see something interesting:
address
We can see that there is a relationship between the ZIP code and the city or suburb. This is common with addresses, and you may have noticed this if you
have filled out any forms for addresses online recently.
How are they related? The ZIP code, or postal code, determines the city, state, and suburb.
In this case, 56128 is South Boston, and 56125 is North Boston. (I just made this up so this is probably inaccurate data).
This falls into the pattern we mentioned earlier: A determines B which determines C.
Student determines the address ZIP code which determines the suburb.
We can move the ZIP code to another table, along with everything it identifies, and link to it from the student table.
Student (student ID, course ID, student name, fees paid, date of birth, street address, address code ID)
Address Code (address code ID, ZIP code, suburb, city, state)
I’ve created a new table called Address Code, and linked it to the student table. I created a new column for the address code ID, because the ZIP code may
refer to more than one suburb. This way we can capture that fact, and it’s in a separate table to make sure it’s only stored in one place.
Both of these tables have no columns that aren’t dependent on the primary key.
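To see the effect of this split, here is a sketch using Python's built-in sqlite3 module. The snake_case names are assumptions, the city and state are left empty because the example data does not give them, and student ID 3 for Susan Johnson is also an assumption. Both students share one address code row, so the suburb is stored only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE address_code (
        address_code_id INTEGER PRIMARY KEY,
        zip_code TEXT NOT NULL,
        suburb TEXT NOT NULL,
        city TEXT,
        state TEXT
    );
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        student_name TEXT NOT NULL,
        street_address TEXT,
        address_code_id INTEGER REFERENCES address_code (address_code_id)
    );
    """
)
conn.execute(
    "INSERT INTO address_code (address_code_id, zip_code, suburb) "
    "VALUES (1, '56128', 'South Boston')"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (3, "Susan Johnson", "21 Arrow Street", 1),  # student ID 3 is assumed
        (4, "Matt Long", "14 Milk Lane", 1),
    ],
)

# The suburb and ZIP code are looked up through the address code ID.
rows = conn.execute(
    """SELECT s.student_name, s.street_address, a.suburb, a.zip_code
       FROM student s
       JOIN address_code a ON a.address_code_id = s.address_code_id
       ORDER BY s.student_id"""
).fetchall()
address_code_count = conn.execute("SELECT COUNT(*) FROM address_code").fetchone()[0]
print(rows, address_code_count)
```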
The teacher table has the same issue as the student table when we look at the address. We can, and should, use the same approach for storing the address.
Teacher (teacher ID, teacher name, street address, address code ID)
Address Code (address code ID, ZIP code, suburb, city, state)
It uses the same Address Code table as mentioned above. We aren’t creating a new address code table.
The course table is OK. The course name is dependent on the course ID.
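Student (student ID, course ID, student name, fees paid, date of birth, street address, address code ID)
Address Code (address code ID, ZIP code, suburb, city, state)
Subject (subject ID, subject name)
Subject Enrolment (student ID, subject ID)
Teacher (teacher ID, teacher name, street address, address code ID)
Course (course ID, teacher ID, course name)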
So, that’s how third normal form could look if we had this example.
For most database normalisation exercises, stopping after achieving Third Normal Form is enough.
It satisfies good relationship rules and will greatly improve your data structure compared with having no normalisation at all.
Fourth Normal Form and Beyond
There are a couple of steps after third normal form that are optional. I’ll explain them here so you can learn what they are.
A multivalued dependency is probably better explained with an example, which I’ll show you shortly. It means that there are other attributes in the table
that are not dependent on the primary key, and can be moved to another table.
Student (student ID, course ID, student name, fees paid, date of birth, street address, address code ID)
Address Code (address code ID, ZIP code, suburb, city, state)
Teacher (teacher ID, teacher name, street address, address code ID)
However, let’s take a look at the address fields: street address and address code.
There are a lot of “what if” questions here. There is a way we can resolve them and improve the quality of the data.
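We can move the street address into its own table:
Address (address ID, street address, address code ID)
The address can then be linked to the teacher and student tables.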
In this table, we have a primary key of address ID, and we have stored the street address here. The address code table stays the same.
We need to link this to the student and teacher tables. How do we do this?
Do we also want to capture the fact that a student or teacher can have multiple addresses? It may be a good idea to future proof the design. It’s
something you would want to confirm in your organisation.
For this example, we will design it so there can be multiple addresses for a single student.
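Student (student ID, course ID, student name, fees paid, date of birth)
Student Address (student ID, address ID)
Address (address ID, street address, address code ID)
Address Code (address code ID, ZIP code, suburb, city, state)
Subject (subject ID, subject name)
Subject Enrolment (student ID, subject ID)
Teacher (teacher ID, teacher name)
Teacher Address (teacher ID, address ID)
Course (course ID, teacher ID, course name)
The address code ID has been removed from the Student table, because the relationship between student and address is now captured in the joining table called Student Address.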
The teacher’s address is also captured in the joining table Teacher Address, and the address code ID has been removed from the Teacher table. I
couldn’t think of a better name for each of these tables.
Address still links to address code ID
So, that’s how you can achieve fourth normal form on this database.
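The Student Address joining table can be sketched like this, again with Python's built-in sqlite3 module. The snake_case names and the joining table's two columns are assumptions, the address code table is left out for brevity, and the second address (“7 Example Road”) is invented purely to show one student with two addresses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name TEXT NOT NULL);
    CREATE TABLE address (
        address_id INTEGER PRIMARY KEY,
        street_address TEXT NOT NULL,
        address_code_id INTEGER  -- would reference the address code table, omitted here
    );
    -- Joining table: lets one student have many addresses, and vice versa.
    CREATE TABLE student_address (
        student_id INTEGER NOT NULL REFERENCES student (student_id),
        address_id INTEGER NOT NULL REFERENCES address (address_id),
        PRIMARY KEY (student_id, address_id)
    );
    """
)
conn.execute("INSERT INTO student VALUES (4, 'Matt Long')")
conn.executemany(
    "INSERT INTO address (address_id, street_address) VALUES (?, ?)",
    [(1, "14 Milk Lane"), (2, "7 Example Road")],  # second address is hypothetical
)
conn.executemany("INSERT INTO student_address VALUES (?, ?)", [(4, 1), (4, 2)])

matt_addresses = [
    row[0]
    for row in conn.execute(
        """SELECT a.street_address
           FROM student_address sa
           JOIN address a ON a.address_id = sa.address_id
           WHERE sa.student_id = 4
           ORDER BY a.address_id"""
    )
]
print(matt_addresses)
```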
There are a few enhancements you can make to this design, but it depends on your business rules:
Combine the student and teacher tables into a person table, as they are both effectively people, but teachers teach a class and students take a
class. This table could then link to subjects and define that relationship as “teaches” or “is enrolled in”, to cater for this.
Relate a course to a subject, so you can see which subjects are in a course
Split the address into separate fields for unit number, street number, address line 1, address line 2, and so on.
Split the student name and teacher name into first and last names to help with displaying data and sorting.
These changes could improve the design, but I haven’t detailed them in any of these steps as they aren’t required for fourth normal form.
I hope this explanation has helped you understand what the normal forms are and what normalization in DBMS is. Do you have any questions on this
process? Share them in the section below.
JAMES SHALLOW
APRIL 9, 2018 AT 6:19 AM
Thanks so much for explaining this concept Ben. To me as a learner, this is the best way to grab this concept.
Reply
BEN
APRIL 10, 2018 AT 4:40 AM
Reply
NICKY
SEPTEMBER 21, 2023 AT 8:25 PM
Absolutely, the best and easiest explanation I have seen. Very helpful.
Reply
RONALD LÓPEZ
AUGUST 22, 2018 AT 8:01 AM
Greetings Ben, good post. Could you please review the notation you used for the relationship between the Student and Course tables, since you said above that “the course ID goes into the student table.” Let me know whether the relationship should be: Course -< Student
Reply
BEN
AUGUST 25, 2018 AT 8:48 AM
Hi Ronald,
Sure, I’ll check this and update the post.
(Google Translate: Greetings Ben, good post. Could you please check the symbology you used in the Student and Course table relationship,
since you commented on the lines above “it means that the course identification enters the student table.” Comment if the relationship would
be: Course – < Student)
Reply
RONALD LÓPEZ
AUGUST 24, 2018 AT 3:32 AM
Greetings Ben, the notation for Courses and Student, as you described it, is “1 to n”. Please check whether it should be “-<”.
Good post.
Reply
BEN
AUGUST 25, 2018 AT 8:48 AM
Thanks Ronald!
(Google Translate: Greetings Ben, the symbology of Courses and Student as you raised is “1 to n” verifies if it would be “- <" Good post.)
Reply
KOK
OCTOBER 23, 2018 AT 2:58 PM
thank you for sharing these things to us , damn i really love it. You guys are really awesome
Reply
BOLARINWA
NOVEMBER 7, 2018 AT 7:48 AM
Reply
BLESSEDJENY
FEBRUARY 4, 2019 AT 6:18 PM
Reply
BEN
FEBRUARY 14, 2019 AT 3:53 AM
Reply
DJ
APRIL 26, 2019 AT 2:55 PM
This is a nice compilation and a good illustration of how to carry out table normalization. I wish you can provide script samples that represent the ERD you
presented here. It will be so much helpful.
Reply
BEN
MAY 18, 2019 AT 3:40 PM
Hi DJ,
Glad you like the article. Good idea – I’ll create some sample scripts and add them to the post.
Thanks,
Ben
Reply
NATI
APRIL 28, 2019 AT 7:11 AM
Good job! This is a great way explaining this topic. You made it look easy to understand. But, one question I have for you is where is a best scenario in real
life used the fourth normal form?
Reply
BEN
MAY 18, 2019 AT 3:43 PM
Hi Nati,
Thanks, I’m glad you like the article.
I’m not sure what would be a realistic example of using fourth normal form. I think many 3NF databases can be turned into 4NF, depending on
how to want to record values (either a value in a table or an ID to another table). I haven’t used it that often.
Reply
SUBZAR AHMED
MAY 15, 2019 AT 5:25 PM
Dear Sir’
Can we call the Fourth Normal Form as a BCNF (Boyce Codd Normal Form).
Or not?
Reply
BEN
MAY 18, 2019 AT 3:23 PM
Hi Subzar, I think Fourth Normal Form is slightly different to BCNF. I haven’t used either method but I know they are a little different. I think 4NF
comes after BCNF.
Reply
AJAY
JUNE 5, 2019 AT 11:59 AM
Hey Ben,
Your notes are too helpful. Will recommend my other friends for sure.
Thanks a lot :)
Reply
SCOTTPLETCHER
JULY 24, 2019 AT 5:16 AM
You’ve done a truly superb job on the majority of this. You’ve used far more details than most people who provide examples do.
Particularly good is the splitting of addresses into a separate table. That is just not done nearly enough.
Reasoning (condensed):
(1) A student must be able to enroll first, without yet specifying a course. The student’s name and other enrollment data are not related to any specific
course.
(2) A course has data of its own not related to the teacher: # of credit/lab hours, cost(s), first term offered, last term offered, etc.. Should the only teacher
currently teaching a course withdraw from it, the course data should not be lost. Courses have prerequisites, sometimes complex ones, that have nothing
to do with who is teaching the course.
Reply
BEN
JULY 25, 2019 AT 5:28 AM
Thanks for the feedback Scott and glad you like the post!
I understand your reasoning, and yes ideally there would be those two intersection tables. These scenarios are things that we, as developers,
would clarify during the design.
The example I went with was a simple one where a student must have a course and a course must have a teacher, but in the real world it would
cater to your scenarios.
Thanks again,
Ben
Reply
TATYI CHEUNG
NOVEMBER 4, 2019 AT 9:36 PM
Good
Reply
PATRICK STAR
NOVEMBER 30, 2019 AT 12:20 AM
bery good
Reply
Reply
BEN
JANUARY 7, 2020 AT 12:31 PM
Reply
ANJUM
MARCH 22, 2020 AT 5:44 AM
As a grade 11 teacher, I am well aware of the complexities students face when teaching/explaining the concepts within this topic. This article is brilliant
and breaks down such a confusing topic quite nicely with examples along the way. My students will definitely benefit from this, thank you so much.
Reply
MASUNDA BUNDU
APRIL 30, 2020 AT 7:03 AM
I am just starting out in SQL and this was quite helpful. Thanks.
Reply
MANI CHANDU
MAY 31, 2020 AT 2:52 AM
what a tremendous insights about the normalisation you have explained and its gave me a lot of confedence Thank u some much ben
my deepest gratitude for sharing knowledge . you truly proved sharing is caring
with regards
chandu, india
Reply
TIM REEKS
JUNE 20, 2020 AT 9:13 PM
Hi
Came across your material while searching for Normalisation material, as wanting to use my time to improve my Club Membership records, having gained
a ‘Database qualification’ some 20 to 30 years ago, I think, I needed to refresh my memory! Anyway – some queries:
1. Shouldn’t the Student Name be broken down or decomposed into StudentForename and StudentSurname, since doesn’t 1NF require this?
2. Shouldn’t Teacher Name be converted as per 121 above as well?
Reply
BEN
JUNE 21, 2020 AT 2:15 PM
Hi Tim, yes that’s a good point and it would be better to break it into two names for that purpose.
Reply
TOBI
JUNE 22, 2020 AT 4:26 AM
Wow, this is the very first website where I finally, thoroughly understood normalization. Thanks a lot.
Reply
TIM REEKS
JUNE 25, 2020 AT 6:56 PM
Hello again
I have thought through the data I need to record, I think; time will tell, I suspect. Anyway, we run up to 18 competitions, played by teams of 1, 2, 3 or 4 members,
thus I think that there may be a Many to Many relationship between the Member and Competition tables, as in Many Member records may be related to Many
Competition records [potential for a Pairs or Triples or Fours team to win all the Competitions they enter]. Am I correct?
Also should I design the Competition table as CompID, Year, Comp1, Comp 2, Comp3, each Header having the name of the Competition, then I presume a
table that links the two, along the lines of:
Regards, Tim
Reply
TIM
JUNE 28, 2020 AT 6:33 PM
Hello again, thinking further, I presume that I could create 18 tables, one per Competition to capture the annual results.
Again though, I presume my single Comp table (see above) shouldn’t have a column per comp, as this is a repeating group.
So do I create a ‘joining table’ that records Year and Comp, another that records Member and Comp, and one that records Member and Year?
I may be overthinking it, but as there is no Comp this year, I think I need to be able to record this.
Reply
BEN
JUNE 29, 2020 AT 5:46 AM
Hi Tim, good question. You could create one table per competition, but you would have 18 tables which have the same structure and different
data, which is quite repetitive. I would suggest having a single table for comp, and a column that indicates which competition it is for. It’s OK for
a table to have a value that is repeated like this, as it identifies which competition something refers to.
Yes you could create joining tables for Year and Comp (if there is a many to many relationship between them) and Member and Comp as well.
What’s the relationship between Member and Year?
Reply
TIM REEKS
JULY 1, 2020 AT 12:45 AM
Hello Ben
Thanks for reply, however, would it be easier to say create a Comp table of 18 records, a Comp Type table which has 2 records, that is Annual and One Day,
another table for Comp Year, which will record the annual competition results based on:
2019 – 1 – 4 – 1
2019 – 1 – 4 – 2
2019 – 1 – 4 – 3
2019 – 1 – 4 – 4
2019 – 1 – 3 – 1
2019 – 1 – 3 – 2
2019 – 1 – 3 – 3
2019 – 2 – 1 – 1
so members 1 to 4 won the 2019 one day comp Bickford Fours; and
members 1 to 3 won the 2019 one day comp Arthur Johnson Triples; and
member 1 won the 2019 annual singles championship
Reply
BEN
JULY 7, 2020 AT 7:40 AM
Hi Tim, yes I think that would work! Storing the comp data separately from the comp type (and so on) will ensure the data is not repeated and is
only stored in one place. Good luck!
Reply
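For readers following this thread, here is a minimal sketch of the design Tim and Ben settle on, written with Python's built-in sqlite3 module. It assumes Tim's rows read as year, comp type, comp, member, and that type 1 is One Day and type 2 is Annual (inferred from his summary of the rows). All table and column names, and the name "Singles Championship", are hypothetical. The comp type is stored once on the comp table rather than on every result row, which is one way to keep it "only stored in one place".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce REFERENCES

# Hypothetical schema: one comp table for every competition, plus lookup
# and joining tables, instead of one table per competition.
conn.executescript("""
CREATE TABLE comp_type (
    comp_type_id INTEGER PRIMARY KEY,
    type_name    TEXT NOT NULL UNIQUE
);
CREATE TABLE comp (
    comp_id      INTEGER PRIMARY KEY,
    comp_name    TEXT NOT NULL,
    comp_type_id INTEGER NOT NULL REFERENCES comp_type (comp_type_id)
);
CREATE TABLE member (
    member_id INTEGER PRIMARY KEY
);
-- Joining table: one row per winning member, per comp, per year.
CREATE TABLE comp_result (
    result_year INTEGER NOT NULL,
    comp_id     INTEGER NOT NULL REFERENCES comp (comp_id),
    member_id   INTEGER NOT NULL REFERENCES member (member_id),
    PRIMARY KEY (result_year, comp_id, member_id)
);
""")

conn.executemany("INSERT INTO comp_type VALUES (?, ?)",
                 [(1, "One Day"), (2, "Annual")])
conn.executemany("INSERT INTO comp VALUES (?, ?, ?)", [
    (1, "Singles Championship", 2),    # assumed name for the annual singles comp
    (3, "Arthur Johnson Triples", 1),
    (4, "Bickford Fours", 1),
])
conn.executemany("INSERT INTO member VALUES (?)", [(1,), (2,), (3,), (4,)])

# Tim's eight rows, with the comp type column dropped (it lives on comp).
conn.executemany("INSERT INTO comp_result VALUES (?, ?, ?)", [
    (2019, 4, 1), (2019, 4, 2), (2019, 4, 3), (2019, 4, 4),
    (2019, 3, 1), (2019, 3, 2), (2019, 3, 3),
    (2019, 1, 1),
])


def winners(year, comp_name):
    """Return the member IDs that won the named comp in the given year."""
    rows = conn.execute(
        """
        SELECT r.member_id
        FROM comp_result AS r
        JOIN comp AS c ON c.comp_id = r.comp_id
        WHERE r.result_year = ? AND c.comp_name = ?
        ORDER BY r.member_id
        """,
        (year, comp_name),
    ).fetchall()
    return [member_id for (member_id,) in rows]


print(winners(2019, "Bickford Fours"))
```

A year in which a comp was not played simply has no rows in comp_result for that year and comp; whether that is enough to "record" a cancelled year is a design question this sketch does not settle.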
ANDREW J. CHRUSCICKI
JULY 22, 2020 AT 1:31 AM
Ben,
Thank you so much! I was only able to grasp the concept of normalization in one hour or so because of how you simplified the concepts through a simple
running example.
Reply
PRIDE
AUGUST 6, 2020 AT 3:10 AM
Thanks a lot, sir Daniel. I have really understood this; you are a great teacher.
Reply
WOJTEK
NOVEMBER 6, 2020 AT 8:05 AM
Firstly: thank you for your website. Secondly: I still have a problem understanding the second normal form. For example, you have written: “student
name: Yes, this is dependent on the primary key. A different student ID means a different student name.” So in your design, am I not allowed to have 2
students with the same name? What will happen when this not so uncommon situation occurs?
Reply
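On the question above, a small sketch may help. It assumes the usual reading of a functional dependency: student ID determines student name, meaning each ID maps to exactly one name, while nothing requires names to be unique across IDs. The table and column names here are hypothetical, and sqlite3 is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_id   INTEGER PRIMARY KEY,  -- one row, and so one name, per ID
        student_name TEXT NOT NULL         -- deliberately NOT declared UNIQUE
    )
""")

# Two different students who happen to share a name.
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(1, "Maria Griffin"), (2, "Maria Griffin")],
)


def name_for(student_id):
    """The dependency student_id -> student_name: an ID gives exactly one name."""
    row = conn.execute(
        "SELECT student_name FROM student WHERE student_id = ?", (student_id,)
    ).fetchone()
    return row[0]


def ids_for(student_name):
    """The reverse direction is not a dependency: a name may match many IDs."""
    rows = conn.execute(
        "SELECT student_id FROM student WHERE student_name = ? ORDER BY student_id",
        (student_name,),
    ).fetchall()
    return [student_id for (student_id,) in rows]


print(name_for(1), ids_for("Maria Griffin"))
```

Under this reading, both inserts succeed, and the two students stay distinguishable by ID even though their names match.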
BEN
NOVEMBER 6, 2020 AT 2:26 PM
Reply
OLIVER OVIEDO
NOVEMBER 9, 2020 AT 6:04 AM
Fantastic
This is the best explanation on why and how to normalize Tables… excellent work, maybe the best explanation out there….
Reply
SIDIEU DELPHIN
NOVEMBER 16, 2020 AT 9:46 AM
Hi, thanks for the post. That is exactly what I was looking for. But I have a question: how would I insert into student address and teacher address? Best
regards
Reply
KAT
JULY 1, 2021 AT 4:22 PM
This is amazing, very well explained. Simple example made it easy to understand. Thank you so much!
Reply
ELBI
AUGUST 4, 2021 AT 6:38 PM
Reply
ROB
JULY 16, 2022 AT 6:52 AM
Sorry but your example is not in 1NF. 1NF dictates that you cannot have a repeating group. The subjects are a repeating group. Not saying you got the
design wrong at the end just that you failed to remove the repeating group in 1NF. When you went to 2NF you made rules for the repeating group multiple
times and took care of it but it should have been taken care of in 1NF.
Reply
JOEL
AUGUST 4, 2022 AT 8:04 PM
Reply
DELINDA ROSS
AUGUST 27, 2022 AT 2:05 AM
This is great!
You make it easy and simple to learn, in a way I can understand.
Reply
BRU
SEPTEMBER 23, 2022 AT 12:21 AM
If you’re going to break address into atoms (unit, street, city, zip, etc.), would you also break down telephone numbers (country code, area code,
exchange, unit number) into separate fields in another table? How elemental do you go?
Reply
MOHAMMAD TAUFEEQ
NOVEMBER 5, 2022 AT 10:30 PM
Clearly explained :)
Reply
MADRIEN
NOVEMBER 17, 2022 AT 11:06 PM
Wow, this is really very useful; it has just added to my knowledge. Thank you very much for this article.
Reply
DC
DECEMBER 5, 2022 AT 3:17 PM
Hello,
Great post!
Just wanted to check – for the ERD in the 2NF example, wouldn’t the relationship between Course and Student be the other way around? A
course can have many students, but a student can only be enrolled in one course. So the crow’s foot symbol should be on the Student table.
Reply
PC
FEBRUARY 15, 2024 AT 1:03 PM
Hi Ben,
I am hoping you can respond to the question above. I noticed the same and I was hoping to find a correction or explanation in the notes.
I second everyone else’s compliments on a great post, too.
Reply
INNOCENT ISAACK
JANUARY 8, 2023 AT 11:17 AM
In the table Student (student ID, course ID, student name, fees paid, date of birth, street address, address code ID),
why did you include a “street address” column, given that “address code ID” is available to identify the student’s address? I don’t understand this part.
Reply
ALINA
JANUARY 10, 2023 AT 10:20 PM
Does the combination of all columns make a unique row every single time?
No. There could be the same combination of data, and it would represent a different row. There could be the same values for this row and it would be a
separate row (even though it is rare).
Reply
YOUVRAJ POTHEEGADOO
FEBRUARY 9, 2023 AT 10:28 PM
I stumbled on this article, which explains database normalization so succinctly. I must admit that it helped me understand the concept using an
example far better than just theory.
Will recommend it to my friends. I am in the process of putting this knowledge into an Employee Management System.
Reply
KELVIN
FEBRUARY 20, 2023 AT 1:10 AM
DAN
MARCH 7, 2023 AT 7:35 AM
The model breaks as soon as you add a 2nd course for a student. CourseID should not be in the student table. You need a separate table that ties
studentID and courseID together
Reply
BEN
MARCH 8, 2023 AT 7:45 AM
That’s a good point. Yes you would need a separate table for that scenario.
Reply
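The separate table Dan describes can be sketched as follows. This is an illustration only, not the article's design: the table name student_course, the column names, and the second course "Medicine" are all hypothetical, and sqlite3 is used because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce REFERENCES

conn.executescript("""
CREATE TABLE student (
    student_id   INTEGER PRIMARY KEY,
    student_name TEXT NOT NULL
);
CREATE TABLE course (
    course_id   INTEGER PRIMARY KEY,
    course_name TEXT NOT NULL
);
-- Joining table: course_id no longer lives on student, so a student can
-- have any number of courses and a course any number of students.
CREATE TABLE student_course (
    student_id INTEGER NOT NULL REFERENCES student (student_id),
    course_id  INTEGER NOT NULL REFERENCES course (course_id),
    PRIMARY KEY (student_id, course_id)
);
""")

conn.execute("INSERT INTO student VALUES (2, 'Maria Griffin')")
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(10, "Computer Science"), (11, "Medicine")])
# Adding a second course is just one more row in the joining table.
conn.executemany("INSERT INTO student_course VALUES (?, ?)",
                 [(2, 10), (2, 11)])


def courses_for(student_id):
    """Return the names of all courses a student is enrolled in."""
    rows = conn.execute(
        """
        SELECT c.course_name
        FROM student_course AS sc
        JOIN course AS c ON c.course_id = sc.course_id
        WHERE sc.student_id = ?
        ORDER BY c.course_name
        """,
        (student_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(courses_for(2))
```

The composite primary key on (student_id, course_id) stops the same enrolment from being recorded twice.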
DINO
JUNE 2, 2023 AT 9:54 AM
The ERDs for the 2nd / 3rd / 4th normal form show the crows foot at the COURSE table, but the STUDENT table receives the FK course_id. The crows
foot represents the many end of the relationship, but there is no representation in the course table of any student data point.
Imho, this is where it is broken in addition to what Dan has pointed to.
Reply
BLACK HERETIC
MAY 27, 2023 AT 11:59 PM
Reply
MARVIL
JUNE 27, 2023 AT 11:30 PM
Well done Ben, this is the best normalization tutorial I have ever seen. Thanks so much. Keep up the good work.
Reply
TRINASW
AUGUST 2, 2023 AT 2:18 AM
Thanks for taking the time to create this. This is one of the best breakdowns that I’ve read regarding DB Normalization. However, I can’t really wrap my
mind around 4NF as I would like to know how to use it in a real life scenario. Thanks again as this was extremely helpful to me!
Reply
VIJAY
AUGUST 6, 2023 AT 2:20 PM
Reply
ELODIE
OCTOBER 17, 2023 AT 3:09 AM
If the statement is true that each student can only have 1 course, then the relationship is not shown correctly in the ERD.
Reply
KELAN
DECEMBER 5, 2023 AT 7:12 PM
Hi, I don’t quite understand how your example satisfies third normal form. It seems that street address could depend on address ID, which depends upon
the primary key. In other words, if address ID were to change, I would assume street address would also have to change. So doesn’t that create transitive
dependency?
ii) I also don’t understand why you made two address tables, couldn’t they be combined into one?
Thanks.
Reply
TEJAS
JANUARY 7, 2024 AT 3:33 PM
What I always wonder is how to determine functional dependency. In this example, after 1NF, we know that course is not functionally dependent on
student ID because it is a day-to-day example and it makes sense.
However, while designing a DB for a client whose domain is unknown to us, how can we effectively learn the functional dependencies from the client? What questions
should we ask as laymen?
Reply
TASLIM
JANUARY 14, 2024 AT 4:12 PM
Reply