Remove Duplicates in Excel
In excel, a large amount of data may contain duplicate values and is not identifiable easily. Therefore, we need to be sure of what combination of columns to be used to find the duplicate values. Most common method to remove duplicates in excel includes using the in-built tools, advanced filter options, using excel formulas or make use of power query.
To make sure our data has been duplicated, it is necessary to find those duplicates before removing them. Without removing duplicates, we will get the wrong summary reports, and numbers will go wrong.
For instance, look at the following data in Excel.
We have 8 values, but the green-colored cells have duplicate values. There is no logic in keeping the duplicate values and going ahead with the calculations. Hence, we need to remove these duplicate values to make the data accurate.
Table of contents
- Remove Duplicates in excel feature is a built-in function. This function will look for duplicates in multiple columns.
- The UNIQUE function will extract only the unique records. However, if we want to filter non-duplicate values, we need to set the last argument to TRUE.
- The advanced file can remove filters’ duplicate values and copy the unique values to the new location if we check the box ‘Unique records only’.
- Power Query tool can remove the duplicates and load only the unique records
How to Remove Duplicates in Excel?
There are various ways we can remove duplicates in Excel. Let us understand each of those methods in detail.
#1 – Using Inbuilt Tool
To deal with duplicate data, Excel has built-in tools called Remove Duplicates. This feature is available under the Data tab and under the Data Tools group in Excel.
This option will remove all the duplicate records and retains only a unique set of records. For instance, we have the following data in Excel.
To remove duplicate records, follow the steps listed here:
- Select the data from A1:A9.
- Go to the Data tab and under the Data Tools group in Excel.
- The Remove Duplicate window opens up. Since we have selected the data, including the header, make sure the My data has headers check box is ticked.
Click on OK, and it will remove all the duplicate values. The popup message will display as shown in the following image.
It indicates that the 2 duplicate values are found, and in the end, we get 6 unique values list.
We will now have a unique list.
Previously there were 3 ‘B’ items, but after removing duplicates, we have only one ‘B’ item.
- To understand how the remove duplicates in excel feature works, let us modify the old data with an extra column.
With the existing column, we have added one more column called Value and filled it with some values.
- Now let us select the data and hit on remove duplicates in excel.
- Click on OK.
We will be able to see the pop-up that says, No duplicate values found.
The reason for this is Excel looks for duplicate values on row-by-row basis. So, we have selected two columns to check duplicate values.
For instance, let us filter out only item B values.
Even though the item names are duplicate, the values are different, so Excel treats them as unique values based on two-column selections. If two rows contain the same values, then Excel treats them as duplicate values.
For instance, look at the following image.
We have changed the value of the third line item to 279; now, rows 3 and 9 will be treated as duplicates, and the last record will be removed if duplicates are removed in the excel feature.
#2 – Using Advanced Filter Option
The advanced filter option is there to filter values based on the criteria given by the users. However, we can also use the advanced filter as the option to filter unique records.
The advanced Filter option is available under the Data tab.
Let us use the following data to apply the Advanced filter feature.
We have a few duplicate records in the above sales table. Therefore, let us apply the Advanced filter option and filter unique records from the list.
- Step 1: Go to the Data tab; under the Sort & Filter group, click on the Advanced Filter option.
- Step 2: This will bring the following Advanced Filter window. Choose the Copy to another location option under action.
- Step 3: In the List Range box, choose the data range from A1:C9.
- Step 4: In the Copy to option, choose the empty cell where we need to copy the filtering data.
- Step 5: Next, the important thing that we need to do it we need to check the box Unique records only. It will help us to filter unique records and ignore duplicates.
- Step 6: Click on OK, which will copy the unique values to the chosen location, i.e., cell E1.
It has copied only the unique records from the range A1:C9.
#3 – Use Formulas to Remove Duplicates in Excel
We can also remove duplicates by using Excel formulas. There are multiple ways we can use the formulas in Excel. First, take a look at the UNIQUE function in Excel.
Array: The range or array from which we are trying to extract unique values.
[by col]: This is an optional argument. If we want to find a unique value from the column, we need to give the input as TRUE, and if we want to find duplicates from rows, then we need to give FALSE. Remember, FALSE is the default value if this argument is omitted.
[Exactly Once]: If we want to extract values appearing only once, we can use this argument by providing TRUE. By default, it will take FALSE as the input argument and extract distinct values.
Note: The UNIQUE function is available for Office 365 and Excel 2021 versions.
Let us use the same data from the advanced filter example.
- Enter the UNIQUE function in cell E1.
- Choose the array from A1:C9.
- Ignore the last 2 arguments as of now. Then, close the bracket and hit the Enter key to get the result.
There we go, the UNIQUE function has given us all the unique records by removing duplicate records.
- If we want to extract records appearing only once, we can use exactly one argument as TRUE.
As we can see, if any duplicates are there, it has ignored those entries and returned only records that are there only once.
#4 – Using the Power Query Tool
Using Excel Power Query, duplicates can be removed dynamically from the data source.
Let us use the previous data table for this example as well.
- Step 2: After converting the data to Excel Table format, select any of the cells in the table. Go to the Data tab, and click on From Table/Range under the Get and Transform Data group.
- Step 3: It will launch the Power Query Editor window, as shown in the following image.
- Step 4: Select all the columns by holding the Ctrl key.
- Step 5: Now, under the Home tab, click on the Remove Rows drop-down option and click on the Remove Duplicates in excel option.
- Step 6: As soon as we click on the Remove Duplicates option, it will remove all the duplicates and get the following list.
- Step 7: Click on the Close and Load option under the Home tab.
It will load the data to Excel by inserting a new worksheet.
The dynamic nature of this method is whenever new duplicate data is added to the source list; we have to click on the Refresh button of the query created while loading the data from the power query.
Important Things to Note
- ALT + A + M is the shortcut to bring the ‘remove duplicates’ window.
- While dealing with multiple columns to remove duplicates, Excel will look for duplicates in each column of all rows.
- The UNIQUE function will return the #REF error if we do not have enough space to retrieve the values.
- The UNIQUE function is available for Office 365 and Excel 2021 versions.
- Power Query loads the data to the new worksheet and maintains the live connection to the Excel Table format.
- The Advanced Filter option will not remove duplicates in excel if we ignore the ‘Unique records only’ option.
- In Power Query, all the columns must be selected before we click on remove duplicates in excel.
Frequently Asked Questions (FAQs)
To keep only one unique record, we need to use remove duplicate in excel. For example, we have the following records in Excel.
Select the entire data range.
Go to the Data tab and click on remove duplicates in excel.
This will bring the remove duplicates window. Click on OK in this window.
This will remove all the duplicate records.
Remove Duplicates in Excel is available under the Data tab.
ALT + A + M is the shortcut to remove duplicates in Excel.
The formula to remove duplicates in Excel is UNIQUE function. It helps remove all the duplicate records and gets only unique ones. This function is available only for Office 365 versions and Excel 2021 versions.
This article must help understand Remove Duplicates in Excel with its formula and examples. You can download the template here to use it instantly.
This has been a guide to Remove Duplicates in Excel. Here we learn different methods to remove duplicate data with examples and downloadable excel template. You can learn more from the following articles –