Monday, February 7, 2022

Excel Data Cleaning Techniques: Separate First, Middle, and Last Names Using Formulas; Duplicate Detection and Removal


Separate First, Middle, and Last Names Using Formulas 

Suppose you have the dataset as shown below and you want to quickly separate the first, last, and middle names in one cell.


The below formula will give you the first name:

=LEFT(A2,SEARCH(" ",A2)-1)


The above formula uses the SEARCH function to get the position of the space character in between the first and last name. The LEFT function then uses this space position number to extract all the text before it.

2. Get the Last Name by going to cell B2 and typing = LEFT(A2,SEARCH(" ",A2)-1)


3. Get the First Name by typing in cell C2 =RIGHT(A2,LEN(A2)-SEARCH(" ",A2))

4. Get the Middle Name by typing in cell D2 =RIGHT(A2,LEN(A2)-SEARCH("@",SUBSTITUTE(A2," ","@",LEN(A2)-LEN(SUBSTITUTE(A2," ","")))))



Finding Duplicates

There can be 2 things you can do with duplicate data – Highlight It or Delete It. It is possible to find duplicate data in Excel and to highlight them. The following example demonstrates the process of finding duplicate data in Excel.

1. Select the range of the data in which duplicates are to be searched for. In this case, cells A1:B38 have been selected.


2. Go to Home > Styles > Conditional Formatting.


3. Choose the option Highlight Cells Rules. Then click on Duplicate Values.

4. Select a formatting style. Click OK.


Result: The duplicate names are highlighted.

Removing Duplicates

Upon finding the duplicate data, Excel can also remove them. The following example illustrates the process.

1. Select any specific cell within the data set selection.

2. Go to Data tab > Data Tools. Then click on Remove Duplicates.


The dialog box appears.

3. Check on all the checkboxes and then click OK.




Result: Excel omits all duplicate rows whilst keeping the first one intact.











No comments:

Post a Comment