How to Find and Remove Duplicates in Excel
Last Updated: 16 Dec, 2024
Removing duplicates in Excel is essential when cleaning up data to ensure accuracy and avoid redundancy. Whether you’re working with small datasets or large spreadsheets, Excel provides built-in tools and methods to help you identify and remove duplicates effectively. This guide will walk you through all the methods, with tips for beginners and advanced users.
1. Using the Built-In Remove Duplicates Tool
The Remove Duplicates tool in Excel is one of the simplest and most efficient methods for cleaning up your data. Follow the steps below to remove duplicates in Excel using the built-in Remove Duplicates tool.
Step 1: Open Excel Spreadsheet and Select Your Data
- Highlight the range of data where you want to find duplicates.
- Include column headers if applicable.
Image: Select the Data
Step 2: Go to the Ribbon
Navigate to Data > Remove Duplicates on the ribbon.
Image: Select Data >> Click on Remove Duplicates
Step 3: Choose Columns to Check Duplicates
- In the dialog box, select the columns you want to check for duplicates.
- If your data has headers, check the "My data has headers" box.
- Here, we have selected columns ID, Name and Age.
Image: Select Columns to Check for Duplicates
Step 4: Remove Duplicates
- Click OK.
- Excel will remove duplicates and display a message showing how many duplicates were removed and how many unique values remain.
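As a rough illustration of what the tool does, the following Python sketch keeps the first occurrence of each row and reports how many rows were dropped. The `remove_duplicates` helper and the sample rows are assumptions made for this example; they are not part of Excel.

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row seen for each combination of key column values."""
    seen = set()
    unique_rows = []
    for row in rows:
        # Build a key from only the columns selected for comparison.
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical sample data mirroring the ID, Name and Age columns.
rows = [
    {"ID": 1, "Name": "Asha", "Age": 30},
    {"ID": 2, "Name": "Ben", "Age": 25},
    {"ID": 1, "Name": "Asha", "Age": 30},
    {"ID": 3, "Name": "Ben", "Age": 25},
]

unique_rows = remove_duplicates(rows, ["ID", "Name", "Age"])
removed = len(rows) - len(unique_rows)
print(f"{removed} duplicate rows removed; {len(unique_rows)} unique rows remain.")
```

Passing only `["Name", "Age"]` as the key columns would also treat the second and fourth rows as duplicates, which mirrors ticking fewer columns in the dialog box.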
Image: Preview Results
2. Remove Duplicates Using the Advanced Filter Option
The Advanced Filter option in Excel is a versatile tool that allows you to remove duplicates while preserving unique records. This method is particularly useful when you want to filter duplicates and extract unique data into a separate location.
Step 1: Prepare your Data Set
Open Microsoft Excel and input your data into the sheet. Ensure your dataset is clean and includes clear headers. In the given example we have ID, Name and Age.
Image: Prepare your Data
Step 2: Go to the "Data" Tab and Select the "Advanced" Option
Navigate to the Data tab on the Excel ribbon and click the Advanced option in the Sort & Filter group.
Image: Go to Data Tab >> Select "Advanced" Option
Step 3: Configure the Advanced Filter
Once you click Advanced, a dialog box will appear with the following options:
Choose the Action
- Filter the List, In-Place: Use this option if you want to display the filtered results directly within your current dataset.
- Copy to Another Location: Choose this option to extract the filtered data to a separate location (preferred when you want to extract the unique records and leave the original data untouched).
Specify the List Range
- In the List Range field, highlight the column or dataset you want to filter for duplicates.
- Ensure you include headers in the selection.
Set Criteria and Click OK
Check the box labeled Unique Records Only. This will ensure only unique values are extracted or displayed.
Image: Set the Criteria
Step 4: Preview the Results
Excel will extract the unique records to the specified location.
Note: This method is useful only when you want to keep a single copy of each record.
Image: Duplicates Removed
3. Identify Duplicates Using the COUNTIF Function
For more control or advanced data handling, use the COUNTIF function to identify duplicates manually.
Step 1: Combine Data Using the Concatenate Operator
- Create a new column and name it Combined.
- Use the & operator to combine data from all relevant columns into a single string.
- In the below example, we are combining A2, B2, and C2.
=A2 & B2 & C2
- Drag the formula down to combine all rows.
- This combined column acts as a single key for each row, so identical rows produce identical combined values.
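One caveat with plain concatenation is that two different rows can produce the same combined string. The Python sketch below illustrates the problem and one possible workaround, a delimiter between the values. The helper names and sample rows are hypothetical, and the workaround assumes the delimiter never appears in the data itself.

```python
def combine_plain(values):
    # Similar in spirit to =A2 & B2 & C2
    return "".join(str(v) for v in values)


def combine_delimited(values, delimiter="|"):
    # Similar in spirit to =A2 & "|" & B2 & "|" & C2
    # Assumption: the delimiter does not occur in any cell value.
    return delimiter.join(str(v) for v in values)


# Two different hypothetical rows that collide when joined without a delimiter.
row_a = ("AB", "C", 5)
row_b = ("A", "BC", 5)

print(combine_plain(row_a), combine_plain(row_b))          # ABC5 ABC5
print(combine_delimited(row_a), combine_delimited(row_b))  # AB|C|5 A|BC|5
```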
Image: Combine the Data
Step 2: Count Duplicates Using the COUNTIF Function
- Create another column and name it Count.
- Use the COUNTIF function to count how many times each combined entry appears in the dataset.
Formula:
=COUNTIF(D$2:D$11, D2)
Here:
- D$2:D$11 is the range of the combined column.
- D2 is the cell being checked for duplicates.
- Drag the formula down to populate the count for all rows.
- Any row with a count greater than 1 is a duplicate.
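The counting step can be sketched outside Excel as well. The Python example below computes, for each combined value, how many times it appears in the whole list, which is roughly what the COUNTIF formula returns for each row. The sample values are hypothetical.

```python
from collections import Counter

# Hypothetical contents of the Combined column.
combined = ["1Asha30", "2Ben25", "1Asha30", "3Cara41", "2Ben25", "1Asha30"]

# Count every value once, then look up the total for each row.
totals = Counter(combined)
counts = [totals[key] for key in combined]

for key, count in zip(combined, counts):
    label = "duplicate" if count > 1 else "unique"
    print(f"{key:<10}{count}  {label}")
```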
Image: Enter the COUNTIF Formula
Step 3: Apply the Filter
- Select Column E by clicking on the column header or highlighting the data range.
- Navigate to the Data Tab on the Excel ribbon.
- Click on the Filter Icon in the Sort & Filter group to apply a filter to the selected column.
Image: Select the Column >> Go to Data Tab >> Click on the Filter Icon
Alternate Method:
After identifying duplicates in your dataset, you can use Excel’s built-in Remove Duplicates tool (explained in Method 1) to quickly eliminate the duplicates. This approach ensures efficient removal of repeated entries while retaining unique values.
Step 4: Click on the Filter Dropdown Icon and Check the Box Next to 1
Click on the small filter dropdown icon in the header of Column E.
In the filter menu:
- Uncheck Select All to clear all selections.
- Check the box next to 1 to filter only for rows containing the value 1.
- Click OK to apply the filter and view the filtered results.
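Image: Filter Drop-Down Icon >> Tick the Check Box 1 >> Press OK
Step 5: Preview Results
Now only the rows whose combined value occurs exactly once are visible. Every copy of a duplicated row is hidden, including its first occurrence.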
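Filtering on a total count of 1 is not the same as keeping one copy of each row. The Python sketch below contrasts the two on hypothetical data. The second approach uses a running count, which corresponds to an expanding-range formula such as =COUNTIF(D$2:D2, D2); that formula is offered here as a suggestion, not as part of the steps above.

```python
from collections import Counter

# Hypothetical contents of the Combined column.
combined = ["1Asha30", "2Ben25", "1Asha30", "3Cara41"]

# Approach 1: keep rows whose total count is 1 (what the filter above does).
totals = Counter(combined)
only_once = [key for key in combined if totals[key] == 1]

# Approach 2: keep the first occurrence of every value (running count == 1).
seen = Counter()
first_occurrences = []
for key in combined:
    seen[key] += 1
    if seen[key] == 1:
        first_occurrences.append(key)

print(only_once)          # ['2Ben25', '3Cara41']
print(first_occurrences)  # ['1Asha30', '2Ben25', '3Cara41']
```

With the running count, filtering for 1 keeps one copy of "1Asha30" instead of hiding it entirely.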
Image: Preview Results
4. Remove Duplicates Using Power Query
For large datasets or complex scenarios, Power Query provides a powerful way to clean and transform data.
Step 1: Go to the Data Tab and Select From Table
Go to the Data tab and, in the Get & Transform section, select the From Table option.
Image: Go to Data Tab >> Click "From Table"
Step 2: Confirm the Table Range and Headers
A pop-up window will appear on the screen. In the window, tick the My table has headers check box and click the OK button.
Image: Check "My table has headers" >> Click "OK"
Step 3: Power Query Editor Displayed
Now, the Power Query Editor opens on the screen.
Image: Power Query Editor Appears
Step 4: Remove Duplicates
- In the Power Query Editor, select the columns you want to check for duplicates.
- Go to Home > Remove Rows > Remove Duplicates.
Image: Select Remove Duplicates
Step 5: Load Cleaned Data Back to Excel
Click Close & Load to insert the cleaned data into a new worksheet.
Image: Preview Changes
5. Highlight Duplicates Using Conditional Formatting
If you want to identify duplicates without removing them, use Conditional Formatting. It doesn’t remove duplicates automatically, but it highlights them so you can review and delete them manually. Follow these steps:
Step 1: Select the Data Range
Highlight the range of cells to check for duplicates.
Step 2: Apply Conditional Formatting
Go to Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values.
Image: Go to Home Tab >> Select Conditional Formatting >> Click on Highlight Cells Rules >> Choose Duplicate Values
Step 3: Customize Formatting > Click OK
- Choose how you want duplicates to be highlighted (e.g., red fill, bold text).
- Excel will automatically highlight all duplicate values in your selection.
Image: Preview the Highlighted Values
Step 4: Manually Review and Remove Duplicates
- Review the highlighted cells to identify duplicate entries.
- Manually delete duplicate rows as needed:
- Right-click on a row with duplicate data.
- Select Delete Row to remove it.
- Repeat the process until all duplicates are removed.
Image: Right-Click on the Duplicate Row and Select Delete
Step 5: Preview Results
Now, all the duplicate values have been removed manually.
Image: Preview Results
6. Removing Duplicate Rows with Pivot Tables
Pivot Tables in Excel are a powerful tool to identify and count duplicates easily. They help organize and summarize large datasets into a clear format, making it simple to spot repeated values. Here’s how to use Pivot Tables to remove duplicates:
Step 1: Prepare Your Dataset
Before creating a Pivot Table, ensure your dataset is clean.
Image: Prepare your Data
Step 2: Select the Data, Go to Insert Tab and Select Pivot Table
Highlight the column containing the data you want to analyze for duplicates. If your dataset contains multiple columns, select the entire dataset to maintain context. Go to Insert Tab and Select Pivot Table:
Image: Select the Data >> Go to Insert Tab >> Select Pivot Table
Step 3: Configure the Pivot Table Fields
Once the Pivot Table is inserted, a blank table will appear along with the Pivot Table Fields pane.
Drag the Column into Rows
- Drag the column containing duplicate-prone data (e.g., "Data") into the Rows area.
- This will create a list of unique values from that column.
Drag the Same Column into Values
- Drag the same column into the Values area.
- If the Values area aggregates the data using "Sum" (the default for numeric columns), change this to "Count".
Image: Drag the Column to Rows and Values Fields
Step 4: Identify and Remove Duplicates
- Unique values will be listed in the Rows area.
- The Values area will display the count of each row’s occurrences.
Identify duplicates:
- Any value with a count greater than 1 is a duplicate.
- Manually remove duplicates from the original dataset by filtering or sorting based on the duplicate criteria identified in the Pivot Table.
Image: Preview the Results
Using Pivot Tables to find duplicates is a quick and efficient method for organizing and analyzing data. It not only identifies duplicates but also provides a count of how many times each value occurs, making it a versatile solution for data cleanup and reporting tasks.
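The summary a Pivot Table produces here, one row per distinct value with its count, can be sketched in Python as follows. The sample column is hypothetical.

```python
from collections import Counter

# Hypothetical contents of the column dragged into Rows and Values.
data = ["apple", "pear", "apple", "fig", "pear", "apple"]

# One entry per distinct value, with the number of times it occurs.
summary = Counter(data)
for value, count in sorted(summary.items()):
    print(f"{value:<8}{count}")

# Values with a count greater than 1 are the duplicated ones.
duplicated_values = sorted(v for v, c in summary.items() if c > 1)
print(duplicated_values)  # ['apple', 'pear']
```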
7. Using Third-Party Tools
Third-party tools designed for Excel provide advanced features to efficiently manage duplicates. These tools often come with specialized options such as case-sensitive duplicate checks, advanced filtering, and automated duplicate removal across multiple sheets. They are especially useful for handling large datasets and offer additional customization beyond Excel’s built-in functionalities, saving time and effort in data cleaning and analysis.
8. Manually Identifying and Removing Duplicates
For smaller datasets, you can manually find and delete duplicates.
Step 1: Sort Data
Use Data > Sort to organize data alphabetically or numerically.
Step 2: Manually Compare Rows
- Scroll through the sorted data to find duplicates.
- Delete rows that are duplicates.
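Sorting helps because identical values end up next to each other, so each row only needs to be compared with the one before it. A minimal Python sketch of that idea, using hypothetical values:

```python
# Hypothetical unsorted column values.
values = ["pear", "apple", "fig", "apple", "pear", "apple"]

# Step 1: sort so that equal values become adjacent.
sorted_values = sorted(values)

# Step 2: walk the sorted list and keep a value only if it differs
# from the last one kept.
deduplicated = []
for value in sorted_values:
    if not deduplicated or deduplicated[-1] != value:
        deduplicated.append(value)

print(deduplicated)  # ['apple', 'fig', 'pear']
```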
Tips and Best Practices
- Backup Your Data: Always create a copy of your data before removing duplicates, as changes cannot always be undone.
- Use Filters: Apply filters to narrow down your data and locate duplicates easily.
- Check for Hidden Characters: Duplicates may not match exactly due to extra spaces or hidden characters. Use the TRIM function to remove extra spaces, =TRIM(A2), and the CLEAN function to strip non-printing characters, =CLEAN(A2).
- Case Sensitivity: Excel’s built-in tools are not case-sensitive. Use Power Query for case-sensitive duplicate removal.
- Combine Methods: Use conditional formatting to highlight duplicates before removing them, ensuring no critical data is lost.
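The effect of stray spaces and letter case on duplicate detection can be sketched in Python. The `normalize` and `unique_values` helpers and the sample names are hypothetical. Note that `str.split()` treats every kind of whitespace as a separator, so this cleanup is broader than Excel's TRIM.

```python
def normalize(text, case_sensitive=False):
    # Drop leading/trailing whitespace and collapse inner runs to one space.
    cleaned = " ".join(text.split())
    return cleaned if case_sensitive else cleaned.casefold()


def unique_values(values, case_sensitive=False):
    """Keep the first original value for each normalized form."""
    seen = set()
    result = []
    for value in values:
        key = normalize(value, case_sensitive)
        if key not in seen:
            seen.add(key)
            result.append(value)
    return result


# Hypothetical names with stray spaces and mixed case.
names = ["Asha ", "asha", "  Asha", "Ben", "ben  smith", "Ben Smith"]

print(len(unique_values(names)))                       # 3
print(len(unique_values(names, case_sensitive=True)))  # 5
```

The case-insensitive run treats "Asha" and "asha" as the same entry, while the case-sensitive run keeps both.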
Conclusion
Finding and removing duplicates in Excel is a vital skill for maintaining clean and accurate datasets. With options ranging from simple built-in tools to advanced methods like Power Query, Excel provides solutions for every user. Use this guide to confidently identify and manage duplicates in your spreadsheets while preserving data integrity.