
MySQL - Delete Duplicate Records
The MySQL Delete Duplicate Records
Duplicate records are a very common occurrence in databases, including MySQL. A MySQL database stores data in the form of tables consisting of rows and columns. A record is said to be duplicated when two or more rows in a database table have the same values.
This redundancy might occur due to various reasons −
- The same row might be inserted twice.
- Raw data imported from external sources might contain repeated entries.
- There might be a bug in the database application.
Whatever the reason, removing such redundancy is important to improve data accuracy and to make the database perform more efficiently.
Find Duplicate Values
Before removing duplicate records, we must find out whether they exist in a table. This can be done by combining the following −
- GROUP BY Clause
- COUNT() Function
Example
Let us first create a table named "CUSTOMERS" containing duplicate values −
CREATE TABLE CUSTOMERS( ID int, NAME varchar(100) );
Using the following INSERT query, insert a few records into the "CUSTOMERS" table. Here, the name "John" appears three times −
INSERT INTO CUSTOMERS VALUES (1,'John'), (2,'Johnson'), (3,'John'), (4,'John');
The CUSTOMERS table obtained is as follows −
ID | NAME |
---|---|
1 | John |
2 | Johnson |
3 | John |
4 | John |
Now, we retrieve the values that are duplicated in the table using the COUNT() function and the GROUP BY clause. The HAVING clause keeps only the names that occur more than once −
SELECT NAME, COUNT(NAME) FROM CUSTOMERS GROUP BY NAME HAVING COUNT(NAME) > 1;
Output
Following is the output obtained −
NAME | COUNT(NAME) |
---|---|
John | 3 |
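The duplicate check above can also be tried without a MySQL server. The following is a minimal sketch, assuming Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL; the helper name find_duplicate_names is hypothetical, and the only change to the query is an added ORDER BY so the result order is fixed.

```python
import sqlite3

def find_duplicate_names(rows):
    """Return (name, count) pairs for every name that occurs more than once."""
    # Assumption: an in-memory SQLite database stands in for the MySQL server.
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE CUSTOMERS (ID INTEGER, NAME VARCHAR(100))")
        con.executemany("INSERT INTO CUSTOMERS VALUES (?, ?)", rows)
        query = (
            "SELECT NAME, COUNT(NAME) FROM CUSTOMERS "
            "GROUP BY NAME HAVING COUNT(NAME) > 1 ORDER BY NAME"
        )
        return con.execute(query).fetchall()
    finally:
        con.close()

rows = [(1, "John"), (2, "Johnson"), (3, "John"), (4, "John")]
print(find_duplicate_names(rows))  # prints [('John', 3)]
```

A table without repeated names returns an empty list, which signals that there is nothing to delete.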
Delete Duplicate Records
To delete duplicate records from a database table, we can use the DELETE command. This command can be used in two ways to remove duplicates from a table −
- Using DELETE... JOIN
- Using ROW_NUMBER() Function
Using DELETE... JOIN
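To remove duplicate records with the DELETE... JOIN command, we perform an inner join of the table with itself. This approach applies to rows that are not completely identical, that is, rows that repeat the same data but differ in at least one column, such as an ID.
For instance, suppose customer details are repeated in the customer records while the serial number keeps incrementing. Here, the record is duplicated even though the ID is not the same. The join matches every row with the rows that carry the same name and a higher ID, and deletes the row with the lower ID, so only the row with the highest ID in each group remains.
Example
In the following query, we are using the CUSTOMERS table created previously to remove duplicate records using the DELETE... JOIN command −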
DELETE t1 FROM CUSTOMERS t1 INNER JOIN CUSTOMERS t2 WHERE t1.id < t2.id AND t1.name = t2.name;
Output
Following is the output obtained −
Query OK, 2 rows affected (0.01 sec)
Verification
We can verify whether the duplicate records have been removed or not using the following SELECT statement −
SELECT * FROM CUSTOMERS;
We can see in the table obtained that the query removed the duplicates and left only distinct records in the table −
ID | NAME |
---|---|
2 | Johnson |
4 | John |
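To see why the row with ID 4 is the one that survives, the join condition can be traced in plain Python. This is only an illustrative sketch of the condition t1.id < t2.id AND t1.name = t2.name, not how MySQL executes the statement; the function name ids_deleted_by_self_join is hypothetical.

```python
def ids_deleted_by_self_join(rows):
    """Return the IDs of rows t1 for which a row t2 has the same name and a larger ID."""
    deleted = set()
    for t1_id, t1_name in rows:
        for t2_id, t2_name in rows:
            # Same test as the WHERE clause of the DELETE... JOIN statement
            if t1_id < t2_id and t1_name == t2_name:
                deleted.add(t1_id)
    return sorted(deleted)

rows = [(1, "John"), (2, "Johnson"), (3, "John"), (4, "John")]
deleted = ids_deleted_by_self_join(rows)
survivors = [row for row in rows if row[0] not in deleted]
print(deleted)    # prints [1, 3]
print(survivors)  # prints [(2, 'Johnson'), (4, 'John')]
```

Reversing the comparison in the SQL statement to t1.id > t2.id would instead keep the row with the lowest ID in each group.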
Using ROW_NUMBER() Function
The ROW_NUMBER() window function, available in MySQL 8.0 and later, assigns a sequential number, starting from 1, to each row within a partition of the result-set obtained from a query.
When the rows are partitioned by the column that holds the repeated values, every row numbered higher than 1 is a duplicate and can be removed with the DELETE statement.
Example
Here, we are applying the ROW_NUMBER() function to the CUSTOMERS table having duplicate values in the 'NAME' column. We assign row numbers within each partition of the 'NAME' column and order each partition by 'ID', so that the row with the lowest ID always receives the number 1 −
SELECT id, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS row_num FROM CUSTOMERS;
Following is the output obtained −
id | row_num |
---|---|
1 | 1 |
3 | 2 |
4 | 3 |
2 | 1 |
Now, with the following statement, delete the duplicate rows (rows with a row number greater than 1). Ordering each partition by id ensures that the row with the lowest ID is the one that is kept −
DELETE FROM CUSTOMERS
WHERE id IN (
   SELECT id FROM (
      SELECT id, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS row_num
      FROM CUSTOMERS
   ) AS temp_table
   WHERE row_num > 1
);
We get the output as shown below −
Query OK, 2 rows affected (0.00 sec)
To verify whether the duplicate records have been removed or not, use the following SELECT query −
SELECT * FROM CUSTOMERS;
The result produced is as follows −
ID | NAME |
---|---|
1 | John |
2 | Johnson |
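The ROW_NUMBER() approach can be reproduced end to end in a small script. The sketch below assumes Python's built-in sqlite3 module as a stand-in for MySQL (the bundled SQLite must be version 3.25 or later for window functions); the function name delete_duplicates_keep_lowest_id is hypothetical, and each partition is ordered by ID so that the lowest ID is kept.

```python
import sqlite3

def delete_duplicates_keep_lowest_id(rows):
    """Load rows, delete every row numbered above 1 in its NAME partition, return the rest."""
    # Assumption: an in-memory SQLite database stands in for the MySQL server.
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE CUSTOMERS (ID INTEGER, NAME VARCHAR(100))")
        con.executemany("INSERT INTO CUSTOMERS VALUES (?, ?)", rows)
        con.execute(
            """
            DELETE FROM CUSTOMERS
            WHERE ID IN (
                SELECT ID FROM (
                    SELECT ID,
                           ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY ID) AS row_num
                    FROM CUSTOMERS
                ) AS temp_table
                WHERE row_num > 1
            )
            """
        )
        return con.execute("SELECT ID, NAME FROM CUSTOMERS ORDER BY ID").fetchall()
    finally:
        con.close()

rows = [(1, "John"), (2, "Johnson"), (3, "John"), (4, "John")]
print(delete_duplicates_keep_lowest_id(rows))  # prints [(1, 'John'), (2, 'Johnson')]
```

Unlike the DELETE... JOIN statement shown earlier, which keeps the highest ID, this variant keeps the first row of every group.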
Delete Duplicate Records Using Client Program
We can also delete duplicate records using a client program.
Syntax
To delete duplicate records through a PHP program, we execute the DELETE... JOIN statement using the mysqli function query() as follows −
$sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
$mysqli->query($sql);
To delete duplicate records through a JavaScript program, we execute the DELETE... JOIN statement using the query() function of the mysql2 library as follows −
sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
con.query(sql);
To delete duplicate records through a Java program, we execute the DELETE... JOIN statement using the JDBC function execute() as follows −
String sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
statement.execute(sql);
To delete duplicate records through a Python program, we execute the DELETE... JOIN statement using the execute() function of MySQL Connector/Python as follows −
delete_query = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"
cursorObj.execute(delete_query)
Example
Following are the programs −
PHP
<?php
$dbhost = 'localhost';
$dbuser = 'root';
$dbpass = 'password';
$db = 'TUTORIALS';

$mysqli = new mysqli($dbhost, $dbuser, $dbpass, $db);
if ($mysqli->connect_errno) {
    printf("Connect failed: %s\n", $mysqli->connect_error);
    exit();
}
// printf("Connected successfully.\n");

// Create the table
$sql = "CREATE TABLE DuplicateDeleteDemo(ID int, NAME varchar(100))";
if ($mysqli->query($sql)) {
    printf("DuplicateDeleteDemo table created successfully...!\n");
}

// Insert some records, including duplicates
$sql = "INSERT INTO DuplicateDeleteDemo VALUES(1,'John')";
if ($mysqli->query($sql)) {
    printf("First record inserted successfully...!\n");
}
$sql = "INSERT INTO DuplicateDeleteDemo VALUES(2,'Johnson')";
if ($mysqli->query($sql)) {
    printf("Second record inserted successfully...!\n");
}
$sql = "INSERT INTO DuplicateDeleteDemo VALUES(3,'John')";
if ($mysqli->query($sql)) {
    printf("Third record inserted successfully...!\n");
}
$sql = "INSERT INTO DuplicateDeleteDemo VALUES(4,'John')";
if ($mysqli->query($sql)) {
    printf("Fourth record inserted successfully...!\n");
}

// Display the table records
$sql = "SELECT * FROM DuplicateDeleteDemo";
if ($result = $mysqli->query($sql)) {
    printf("Table records(before deleting): \n");
    while ($row = $result->fetch_array()) {
        printf("ID: %d, NAME %s\n", $row['ID'], $row['NAME']);
    }
}

// Find the values that are duplicated
$sql = "SELECT NAME, COUNT(NAME) FROM DuplicateDeleteDemo GROUP BY NAME HAVING COUNT(NAME) > 1";
if ($result = $mysqli->query($sql)) {
    printf("Duplicate records: \n");
    while ($row = $result->fetch_array()) {
        print_r($row);
    }
}

// Delete the duplicate records, keeping the row with the highest ID
$sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
if ($mysqli->query($sql)) {
    printf("Duplicate records deleted successfully...!\n");
}

// Display the remaining records
$sql = "SELECT ID, NAME FROM DuplicateDeleteDemo";
if ($result = $mysqli->query($sql)) {
    printf("Table records after deleting: \n");
    while ($row = $result->fetch_row()) {
        print_r($row);
    }
}

if ($mysqli->error) {
    printf("Error message: %s\n", $mysqli->error);
}
$mysqli->close();
Output
The output obtained is as shown below −
DuplicateDeleteDemo table created successfully...!
First record inserted successfully...!
Second record inserted successfully...!
Third record inserted successfully...!
Fourth record inserted successfully...!
Table records(before deleting): 
ID: 1, NAME John
ID: 2, NAME Johnson
ID: 3, NAME John
ID: 4, NAME John
Duplicate records: 
Array
(
    [0] => John
    [NAME] => John
    [1] => 3
    [COUNT(NAME)] => 3
)
Duplicate records deleted successfully...!
Table records after deleting: 
Array
(
    [0] => 2
    [1] => Johnson
)
Array
(
    [0] => 4
    [1] => John
)
Node.js
const mysql = require('mysql2');

const con = mysql.createConnection({
  host: "localhost",
  user: "root",
  password: "password"
});

// Connecting to MySQL
con.connect(function (err) {
  if (err) throw err;
  console.log("Connected!");
  console.log("--------------------------");

  // Create a new database and select it
  let sql = "CREATE DATABASE TUTORIALS";
  con.query(sql);
  sql = "USE TUTORIALS";
  con.query(sql);

  // Create the table and insert records, including duplicates
  sql = "CREATE TABLE DuplicateDeleteDemo(ID int, NAME varchar(100));";
  con.query(sql);
  sql = "INSERT INTO DuplicateDeleteDemo VALUES(1,'John'),(2,'Johnson'),(3,'John'),(4,'John');";
  con.query(sql);

  sql = "SELECT * FROM DuplicateDeleteDemo;";
  con.query(sql, function (err, result) {
    if (err) throw err;
    console.log("**Records of DuplicateDeleteDemo Table:**");
    console.log(result);
    console.log("--------------------------");
  });

  // Fetching records that are duplicated in the table
  sql = "SELECT NAME, COUNT(NAME) FROM DuplicateDeleteDemo GROUP BY NAME HAVING COUNT(NAME) > 1;";
  con.query(sql, function (err, result) {
    if (err) throw err;
    console.log("**Records that are duplicated in the table:**");
    console.log(result);
    console.log("--------------------------");
  });

  // Delete the duplicate records, keeping the row with the highest ID
  sql = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
  con.query(sql);

  sql = "SELECT * FROM DuplicateDeleteDemo;";
  con.query(sql, function (err, result) {
    if (err) throw err;
    console.log("**Records after deleting Duplicates:**");
    console.log(result);
  });

  // Close the connection once the queued queries have finished
  con.end();
});
Output
The output obtained is as shown below −
Connected!
--------------------------
**Records of DuplicateDeleteDemo Table:**
[
  { ID: 1, NAME: 'John' },
  { ID: 2, NAME: 'Johnson' },
  { ID: 3, NAME: 'John' },
  { ID: 4, NAME: 'John' }
]
--------------------------
**Records that are duplicated in the table:**
[ { NAME: 'John', 'COUNT(NAME)': 3 } ]
--------------------------
**Records after deleting Duplicates:**
[ { ID: 2, NAME: 'Johnson' }, { ID: 4, NAME: 'John' } ]
Java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DeleteDuplicates {
   public static void main(String[] args) {
      String url = "jdbc:mysql://localhost:3306/TUTORIALS";
      String user = "root";
      String password = "password";
      ResultSet rs;
      try {
         Class.forName("com.mysql.cj.jdbc.Driver");
         Connection con = DriverManager.getConnection(url, user, password);
         Statement st = con.createStatement();
         //System.out.println("Database connected successfully...!");

         // Create the table
         String sql = "CREATE TABLE DuplicateDeleteDemo(ID int,NAME varchar(100))";
         st.execute(sql);
         System.out.println("Table DuplicateDeleteDemo created successfully...!");

         // Insert some records, including duplicates
         String sql1 = "INSERT INTO DuplicateDeleteDemo VALUES (1,'John'), (2,'Johnson'), (3,'John'), (4,'John')";
         st.execute(sql1);
         System.out.println("Records inserted successfully....!");

         // Print the table records
         String sql2 = "SELECT * FROM DuplicateDeleteDemo";
         rs = st.executeQuery(sql2);
         System.out.println("Table records(before deleting the duplicate records): ");
         while(rs.next()) {
            String id = rs.getString("id");
            String name = rs.getString("name");
            System.out.println("Id: " + id + ", Name: " + name);
         }

         // Delete the duplicate records using DELETE... JOIN
         String sql3 = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name";
         st.execute(sql3);
         System.out.println("Duplicate records deleted successfully....!");

         // Print the remaining records
         String sql4 = "SELECT * FROM DuplicateDeleteDemo";
         rs = st.executeQuery(sql4);
         System.out.println("Table records(after deleting the duplicate records): ");
         while(rs.next()) {
            String id = rs.getString("id");
            String name = rs.getString("name");
            System.out.println("Id: " + id + ", Name: " + name);
         }

         // Release the JDBC resources
         rs.close();
         st.close();
         con.close();
      } catch(Exception e) {
         e.printStackTrace();
      }
   }
}
Output
The output obtained is as shown below −
Table DuplicateDeleteDemo created successfully...!
Records inserted successfully....!
Table records(before deleting the duplicate records): 
Id: 1, Name: John
Id: 2, Name: Johnson
Id: 3, Name: John
Id: 4, Name: John
Duplicate records deleted successfully....!
Table records(after deleting the duplicate records): 
Id: 2, Name: Johnson
Id: 4, Name: John
Python
import mysql.connector

# Establishing the connection
connection = mysql.connector.connect(
    host='localhost',
    user='root',
    password='password',
    database='TUTORIALS'
)

# Creating a cursor object
cursorObj = connection.cursor()

# Creating the table 'DuplicateDeleteDemo'
create_table_query = '''CREATE TABLE DuplicateDeleteDemo(ID int, NAME varchar(100))'''
cursorObj.execute(create_table_query)
print("Table 'DuplicateDeleteDemo' is created successfully!")

# Inserting records into 'DuplicateDeleteDemo' table
sql = "INSERT INTO DuplicateDeleteDemo (ID, NAME) VALUES (%s, %s);"
values = [(1, 'John'), (2, 'Johnson'), (3, 'John'), (4, 'John')]
cursorObj.executemany(sql, values)
connection.commit()  # autocommit is off by default in Connector/Python
print("Values inserted successfully")

# Display table
display_table = "SELECT * FROM DuplicateDeleteDemo;"
cursorObj.execute(display_table)

# Printing the table 'DuplicateDeleteDemo'
results = cursorObj.fetchall()
print("\nDuplicateDeleteDemo Table:")
for result in results:
    print(result)

# Retrieve the duplicate records
duplicate_records_query = """
SELECT NAME, COUNT(NAME)
FROM DuplicateDeleteDemo
GROUP BY NAME
HAVING COUNT(NAME) > 1;
"""
cursorObj.execute(duplicate_records_query)
dup_rec = cursorObj.fetchall()
print("\nDuplicate records:")
for record in dup_rec:
    print(record)

# Delete duplicate records, keeping the row with the highest ID
delete_query = "DELETE t1 FROM DuplicateDeleteDemo t1 INNER JOIN DuplicateDeleteDemo t2 WHERE t1.id < t2.id AND t1.name = t2.name"
cursorObj.execute(delete_query)
connection.commit()  # persist the deletion
print("Duplicate records deleted successfully")

# Verification
display_table_after_delete = "SELECT * FROM DuplicateDeleteDemo;"
cursorObj.execute(display_table_after_delete)
results_after_delete = cursorObj.fetchall()
print("\nDuplicateDeleteDemo Table (After Delete):")
for result in results_after_delete:
    print(result)

# Closing the cursor and connection
cursorObj.close()
connection.close()
Output
The output obtained is as shown below −
Table 'DuplicateDeleteDemo' is created successfully!
Values inserted successfully

DuplicateDeleteDemo Table:
(1, 'John')
(2, 'Johnson')
(3, 'John')
(4, 'John')

Duplicate records:
('John', 3)
Duplicate records deleted successfully

DuplicateDeleteDemo Table (After Delete):
(2, 'Johnson')
(4, 'John')