
- MySQL - Home
- MySQL - Introduction
- MySQL - Features
- MySQL - Versions
- MySQL - Variables
- MySQL - Installation
- MySQL - Administration
- MySQL - PHP Syntax
- MySQL - Node.js Syntax
- MySQL - Java Syntax
- MySQL - Python Syntax
- MySQL - Connection
- MySQL - Workbench
- MySQL Databases
- MySQL - Create Database
- MySQL - Drop Database
- MySQL - Select Database
- MySQL - Show Database
- MySQL - Copy Database
- MySQL - Database Export
- MySQL - Database Import
- MySQL - Database Info
- MySQL Users
- MySQL - Create Users
- MySQL - Drop Users
- MySQL - Show Users
- MySQL - Change Password
- MySQL - Grant Privileges
- MySQL - Show Privileges
- MySQL - Revoke Privileges
- MySQL - Lock User Account
- MySQL - Unlock User Account
- MySQL Tables
- MySQL - Create Tables
- MySQL - Show Tables
- MySQL - Alter Tables
- MySQL - Rename Tables
- MySQL - Clone Tables
- MySQL - Truncate Tables
- MySQL - Temporary Tables
- MySQL - Repair Tables
- MySQL - Describe Tables
- MySQL - Add/Delete Columns
- MySQL - Show Columns
- MySQL - Rename Columns
- MySQL - Table Locking
- MySQL - Drop Tables
- MySQL - Derived Tables
- MySQL Queries
- MySQL - Queries
- MySQL - Constraints
- MySQL - Insert Query
- MySQL - Select Query
- MySQL - Update Query
- MySQL - Delete Query
- MySQL - Replace Query
- MySQL - Insert Ignore
- MySQL - Insert on Duplicate Key Update
- MySQL - Insert Into Select
- MySQL Indexes
- MySQL - Indexes
- MySQL - Create Index
- MySQL - Drop Index
- MySQL - Show Indexes
- MySQL - Unique Index
- MySQL - Clustered Index
- MySQL - Non-Clustered Index
- MySQL Operators and Clauses
- MySQL - Where Clause
- MySQL - Limit Clause
- MySQL - Distinct Clause
- MySQL - Order By Clause
- MySQL - Group By Clause
- MySQL - Having Clause
- MySQL - AND Operator
- MySQL - OR Operator
- MySQL - Like Operator
- MySQL - IN Operator
- MySQL - ANY Operator
- MySQL - EXISTS Operator
- MySQL - NOT Operator
- MySQL - NOT EQUAL Operator
- MySQL - IS NULL Operator
- MySQL - IS NOT NULL Operator
- MySQL - Between Operator
- MySQL - UNION Operator
- MySQL - UNION vs UNION ALL
- MySQL - MINUS Operator
- MySQL - INTERSECT Operator
- MySQL - INTERVAL Operator
- MySQL Joins
- MySQL - Using Joins
- MySQL - Inner Join
- MySQL - Left Join
- MySQL - Right Join
- MySQL - Cross Join
- MySQL - Full Join
- MySQL - Self Join
- MySQL - Delete Join
- MySQL - Update Join
- MySQL - Union vs Join
- MySQL Keys
- MySQL - Unique Key
- MySQL - Primary Key
- MySQL - Foreign Key
- MySQL - Composite Key
- MySQL - Alternate Key
- MySQL Triggers
- MySQL - Triggers
- MySQL - Create Trigger
- MySQL - Show Trigger
- MySQL - Drop Trigger
- MySQL - Before Insert Trigger
- MySQL - After Insert Trigger
- MySQL - Before Update Trigger
- MySQL - After Update Trigger
- MySQL - Before Delete Trigger
- MySQL - After Delete Trigger
- MySQL Data Types
- MySQL - Data Types
- MySQL - VARCHAR
- MySQL - BOOLEAN
- MySQL - ENUM
- MySQL - DECIMAL
- MySQL - INT
- MySQL - FLOAT
- MySQL - BIT
- MySQL - TINYINT
- MySQL - BLOB
- MySQL - SET
- MySQL Regular Expressions
- MySQL - Regular Expressions
- MySQL - RLIKE Operator
- MySQL - NOT LIKE Operator
- MySQL - NOT REGEXP Operator
- MySQL - regexp_instr() Function
- MySQL - regexp_like() Function
- MySQL - regexp_replace() Function
- MySQL - regexp_substr() Function
- MySQL Fulltext Search
- MySQL - Fulltext Search
- MySQL - Natural Language Fulltext Search
- MySQL - Boolean Fulltext Search
- MySQL - Query Expansion Fulltext Search
- MySQL - ngram Fulltext Parser
- MySQL Functions & Operators
- MySQL - Date and Time Functions
- MySQL - Arithmetic Operators
- MySQL - Numeric Functions
- MySQL - String Functions
- MySQL - Aggregate Functions
- MySQL Misc Concepts
- MySQL - NULL Values
- MySQL - Transactions
- MySQL - Using Sequences
- MySQL - Handling Duplicates
- MySQL - SQL Injection
- MySQL - SubQuery
- MySQL - Comments
- MySQL - Check Constraints
- MySQL - Storage Engines
- MySQL - Export Table into CSV File
- MySQL - Import CSV File into Database
- MySQL - UUID
- MySQL - Common Table Expressions
- MySQL - On Delete Cascade
- MySQL - Upsert
- MySQL - Horizontal Partitioning
- MySQL - Vertical Partitioning
- MySQL - Cursor
- MySQL - Stored Functions
- MySQL - Signal
- MySQL - Resignal
- MySQL - Character Set
- MySQL - Collation
- MySQL - Wildcards
- MySQL - Alias
- MySQL - ROLLUP
- MySQL - Today Date
- MySQL - Literals
- MySQL - Stored Procedure
- MySQL - Explain
- MySQL - JSON
- MySQL - Standard Deviation
- MySQL - Find Duplicate Records
- MySQL - Delete Duplicate Records
- MySQL - Select Random Records
- MySQL - Show Processlist
- MySQL - Change Column Type
- MySQL - Reset Auto-Increment
- MySQL - Coalesce() Function
MySQL - Handling Duplicates
Tables or result sets in a database usually contain duplicate records. While duplicates are generally allowed, there are situations where it is necessary to prevent them. In such cases, it becomes essential to identify and remove duplicate records from a database table.
Importance of Handling MySQL Duplicates
There are various reasons why handling duplicates in a database becomes necessary. One of the main reasons is that the existence of duplicates in an organizational database will lead to logical errors. In addition to it, we need to handle redundant data to prevent the following consequences −
- Duplicate data occupies storage space, reducing the efficiency of database usage and increasing storage costs.
- Dealing with duplicate records consumes additional resources, driving up the overall cost of maintaining the database.
- Duplicates in a database can lead to logical errors in data, affecting the integrity and reliability of the information stored.
Preventing Duplicate Entries
You can use a PRIMARY KEY or a UNIQUE Index on a table with the appropriate fields to prevent duplicate record entries into a table.
Example
The following table contains no such index or primary key, so it would allow duplicate records for first_name and last_name.
CREATE TABLE CUSTOMERS ( first_name CHAR(20), last_name CHAR(20), sex CHAR(10) );
To prevent multiple records with the same first and last name values from being created in this table, add a PRIMARY KEY to its definition. When you do this, it is also necessary to declare the indexed columns to be NOT NULL, because a PRIMARY KEY does not allow NULL values −
CREATE TABLE CUSTOMERS ( first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), PRIMARY KEY (last_name, first_name) );
Using INSERT IGNORE Query −
The existence of a unique index in a table normally causes an error when attempting to insert a record that duplicates an existing record in the indexed column(s).
To handle this situation without generating an error, you can use the "INSERT IGNORE" command. When a record is not a duplicate, MySQL inserts it as usual. However, if the record is duplicate, the "IGNORE" keyword instructs MySQL to discard it without producing an error.
The provided example does not result in an error, and it also ensures that duplicate records are not inserted −
INSERT IGNORE INTO CUSTOMERS (LAST_NAME, FIRST_NAME) VALUES ('Jay', 'Thomas'), ('Jay', 'Thomas');
We get the following output −
Query OK, 1 row affected, 1 warning (0.01 sec) Records: 2 Duplicates: 1 Warnings: 1
Using REPLACE Query −
Instead of using the INSERT command, consider using the REPLACE command. When dealing with a new record, it is inserted just as with INSERT. However, if it is a duplicate, the new record replaces the old one.
REPLACE INTO CUSTOMERS (LAST_NAME, FIRST_NAME) VALUES ( 'Ajay', 'Kumar'), ( 'Ajay', 'Kumar');
Following is the output of the above code −
Query OK, 2 rows affected (0.01 sec) Records: 2 Duplicates: 0 Warnings: 0
Your choice between the INSERT IGNORE and REPLACE commands should depend on the specific duplicate-handling behaviour you wish to achieve. The INSERT IGNORE command retains the first set of duplicated records and discards the remaining. On the other hand, the REPLACE command keeps the last set of duplicates and removes any earlier instances.
Using UNIQUE Constraint −
Another way to enforce uniqueness is to add a UNIQUE index rather than a PRIMARY KEY to a table −
CREATE TABLE CUSTOMERS ( first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), UNIQUE (last_name, first_name) );
Counting and Identifying Duplicates
You can use the COUNT function and GROUP BY clause to count and identify duplicate records based on specific columns.
Example
Following is the query to count duplicate records with first_name and last_name in a table −
SELECT COUNT(*) as repetitions, last_name, first_name FROM CUSTOMERS GROUP BY last_name, first_name HAVING repetitions > 1;
This query will return a list of all the duplicate records in the CUSTOMERS table. In general, to identify sets of values that are duplicated, follow the steps given below.
Determine which columns may contain duplicated values.
Include those columns in the column selection list, along with COUNT(*).
List the columns in the GROUP BY clause as well.
Apply a HAVING clause to filter unique values by requiring the group counts to be greater than one.
Eliminating Duplicates from a Query Result
You can use the DISTINCT command along with the SELECT statement to find out unique records available in a table.
SELECT DISTINCT last_name, first_name FROM CUSTOMERS ORDER BY last_name;
An alternative to the DISTINCT command is to add a GROUP BY clause that specifies the columns you are selecting. This approach eliminates duplicates and retrieves only the unique combinations of values from the specified columns.
SELECT last_name, first_name FROM CUSTOMERS GROUP BY (last_name, first_name);
Removing Duplicates Using Table Replacement
If you have duplicate records in a table and you want to remove all the duplicate records from that table, then follow the procedure given below −
CREATE TABLE tmp AS SELECT DISTINCT last_name, first_name, sex FROM CUSTOMERS; DROP TABLE CUSTOMERS; ALTER TABLE tmp RENAME TO CUSTOMERS;
Handling Duplicates Using a Client Program
We can also handle duplicate using the client program.
Syntax
To handle duplicates value through a PHP program, we need to execute the "INSERT IGNORE" statement using the mysqli function query() as follows −
$sql = "INSERT IGNORE INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')"; $mysqli->query($sql);
To handle duplicates value through a JavaScript program, we need to execute the "INSERT IGNORE" statement using the query() function of mysql2 library as follows −
sql = "INSERT IGNORE INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')"; con.query(sql);
To handle duplicates value through a Java program, we need to execute the "INSERT IGNORE" statement using the JDBC function execute() as follows −
String sql = "INSERT IGNORE INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')"; statement.execute(sql);
To handle duplicates value through a Python program, we need to execute the "INSERT IGNORE" statement using the execute() function of the MySQL Connector/Python as follows −
sql = "INSERT IGNORE INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')" cursorObj.execute(sql)
Example
Following are the programs −
$dbhost = 'localhost'; $dbuser = 'root'; $dbpass = 'password'; $db = 'TUTORIALS'; $mysqli = new mysqli($dbhost, $dbuser, $dbpass, $db); if ($mysqli->connect_errno) { printf("Connect failed: %s
", $mysqli->connect_error); exit(); } //printf('Connected successfully.
'); $sql = "CREATE TABLE person_tbl (first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), PRIMARY KEY (last_name, first_name))"; if($mysqli->query($sql)){ printf("Table created successfully...!\n"); } //let's insert some records $sql = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')"; if($mysqli->query($sql)){ printf("First record inserted successfully...!\n"); } $sql = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'John', 'Smith')"; if($mysqli->query($sql)){ printf("Second record inserted successfully...!\n"); } //now lets insert duplicate record with IGNORE keyword $sql = "INSERT IGNORE INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')"; if($mysqli->query($sql)){ printf("Duplicate record inserted successfully using IGNORE keyword...!\n"); } $sql = "SELECT * from person_tbl"; if($result = $mysqli->query($sql)){ printf("Table records: \n"); while($row = mysqli_fetch_array($result)){ printf("First Name: %s, Last name: %s, Sex: %s", $row['first_name'], $row['last_name'], $row['sex']); printf("\n"); } } //lets insert a duplicate record $sql = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'John', 'Smith')"; if(!$mysqli->query($sql)){ printf("You can't insert any duplicate records...!\n"); } if($mysqli->error){ printf("Error message: ", $mysqli->error); } $mysqli->close();
Output
The output obtained is as shown below −
Table created successfully...! First record inserted successfully...! Second record inserted successfully...! Duplicate record inserted successfully using IGNORE keyword...! Table records: First Name: Thomas, Last name: Jay, Sex: First Name: Smith, Last name: John, Sex: PHP Fatal error: Uncaught mysqli_sql_exception: Duplicate entry 'John-Smith' for key 'person_tbl.PRIMARY' in D:\test\handlingduplicates.php:48
var mysql = require('mysql2'); var con = mysql.createConnection({ host:"localhost", user:"root", password:"password" }); //Connecting to MySQL con.connect(function(err) { if (err) throw err; // console.log("Connected successfully...!"); // console.log("--------------------------"); sql = "USE TUTORIALS"; con.query(sql); //create table sql = "CREATE TABLE person_tbl (first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), PRIMARY KEY (last_name, first_name))"; con.query(sql, function(err, result){ if (err) throw err; console.log("Table created successfully....!"); }); //now let's insert some records sql = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')"; con.query(sql, function(err, result){ if (err) throw err; console.log("First record inserted successfully...!"); }); sql = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'John', 'Smith')"; con.query(sql, function(err, result){ if (err) throw err; console.log("Second record inserted successfully...!"); }); //now lets insert duplicate record with IGNORE keyword sql = "INSERT IGNORE INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')"; con.query(sql, function(err, result){ console.log("Insert duplicate record with IGNORE keyword") if (err) throw err; console.log("Duplicate record inserted successfully with the help of IGNORE keyword"); }); //lets insert a duplicate record sql = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'John', 'Smith')"; con.query(sql, function(err, result){ console.log("Insert duplicate record"); if (err) throw err; console.log("You can't insert the duplicate record because columns are primary key"); }); sql = "SELECT * FROM person_tbl"; con.query(sql, function(err, result){ console.log("Table records(with ID auto_increment sequence)."); if (err) throw err; console.log(result); }); });
Output
The output obtained is as shown below −
Table created successfully....! First record inserted successfully...! Second record inserted successfully...! Insert duplicate record with IGNORE keyword Duplicate record inserted successfully with the help of IGNORE keyword Insert duplicate record Insert duplicate record D:\test1\duplicate.js:43 if (err) throw err; ^ Error: Duplicate entry 'John-Smith' for key 'person_tbl.PRIMARY'
import java.sql.Connection; import java.sql.DriverManager; import java.sql.ResultSet; import java.sql.Statement; public class HandleDuplicates { public static void main(String[] args) { String url = "jdbc:mysql://localhost:3306/TUTORIALS"; String user = "root"; String password = "password"; ResultSet rs; try { Class.forName("com.mysql.cj.jdbc.Driver"); Connection con = DriverManager.getConnection(url, user, password); Statement st = con.createStatement(); //System.out.println("Database connected successfully...!"); //create table String sql = "CREATE TABLE person_tbl ( first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), PRIMARY KEY (last_name, first_name))"; st.execute(sql); System.out.println("Table created successfully....!"); //let's insert some records String sql1 = "INSERT IGNORE INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')"; st.execute(sql1); System.out.println("Record inserted successfully...!"); String sql2 = "INSERT IGNORE INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')"; st.execute(sql2); System.out.println("Duplicate record inserted successfully...!"); //lets print the table records String sql3 = "SELECT * FROM PERSON_TBL"; rs = st.executeQuery(sql3); System.out.println("Table records: "); while(rs.next()) { String fname = rs.getString("first_name"); String lname = rs.getString("last_name"); System.out.println("First name: " + fname + ", Last name: " + lname); } }catch(Exception e) { e.printStackTrace(); } } }
Output
The output obtained is as shown below −
Table created successfully....! Record inserted successfully...! Duplicate record inserted successfully...! Table records: First name: Thomas, Last name: Jay
import mysql.connector # Connecting to MySQL con = mysql.connector.connect( host="localhost", user="root", password="password", database="tut" ) # Creating a cursor object cursorObj = con.cursor() # Creating the table create_table_query = """ CREATE TABLE person_tbl ( first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), PRIMARY KEY (last_name, first_name) ) """ cursorObj.execute(create_table_query) print("Table 'person_tbl' is created successfully!") # Inserting some records first_record = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')" print("First record inserted successfully!") cursorObj.execute(first_record) second_record = "INSERT INTO person_tbl (last_name, first_name) VALUES( 'John', 'Smith')" print("Second record inserted successfully!") cursorObj.execute(second_record) # Insert duplicate record with IGNORE keyword sql = "INSERT IGNORE INTO person_tbl (last_name, first_name) VALUES( 'Jay', 'Thomas')" print("Duplicate record inserted successfully with the help of IGNORE keyword") cursorObj.execute(sql) # Insert a duplicate record (this will throw an error) try: cursorObj.execute("INSERT INTO person_tbl (last_name, first_name) VALUES( 'John', 'Smith')") print("Insert duplicate record") except mysql.connector.Error as err: print("Insert duplicate record error:", err) con.commit() # Retrieving records cursorObj.execute("SELECT * FROM person_tbl") records = cursorObj.fetchall() # Printing the records print("Table records.") for record in records: print(record) # Closing the connection cursorObj.close() con.close()
Output
The output obtained is as shown below −
First record inserted successfully! Second record inserted successfully! Duplicate record inserted successfully with the help of IGNORE keyword Insert duplicate record error: 1062 (23000): Duplicate entry 'John-Smith' for key 'person_tbl.PRIMARY' Table records. ('Thomas', 'Jay', None) ('Smith', 'John', None)