Today we are going to show you how to clean data in Excel. Data forms are the backbone of any analysis you do on Excel. And when it comes to data, there are many things that can go wrong. In these cases, you might need some simple ways to clean up data in Excel, we are going to show you several options in this tutorial.
How to clean data in Excel?
Getting rid of extra spaces
Extra spaces are painfully hard to detect. While you can somehow detect extra spaces between words or numbers, some spaces won’t even be visible. Here is an easy way to get rid of these extra spaces: use the TRIM function.
Syntax: TRIM (text)
The Excel TRIM function takes the cell reference (or text) as input. It removes leading and trailing spaces, as well as extra spaces between words (except single spaces).
Select and edit all blank cells
Blank cells can cause some problems. You might want to fill all blank cells with ‘0’ or you simply can highlight them. If there is a large data set, doing this manually can take hours. Fortunately, there is a way to select all blank cells at once.
- Select the entire data set
- Press F5 (opens the Go To dialog box)
- Click the Special… button (at the bottom left). The Go to Special dialog box.
- Select Blank and click OK
This selects all blank cells in your data set. If you want to enter 0 in all these cells, just type it and press Control + Enter (Remember if you just press Enter, the value is only inserted in the active cell).
Convert numbers stored as text to numbers
Sometimes, when importing data from text files or external databases, numbers are stored as text. Also, some people often use an apostrophe (‘) before a number to make it text. This can create serious problems if you use these cells in calculations. Here is a foolproof way to convert these numbers stored as text into numbers.
- To any blank cell, type 1
- Select the cell where you have typed 1 and press Control + C
- Select the cell / range you want to convert to numbers
- Select Paste -> Paste Special (keyboard shortcut – Alt + E + S)
- In the Paste Special dialog box, select Multiply (in the operations category)
- Click OK. This converts all numbers in text format to numbers.
You can do two things with duplicate data, highlight or delete.
Highlight the duplicate data:
- Select the data and go to Home -> Conditional Formatting -> Highlight Cell Rules -> Duplicate Values.
- Specify the formatting and all duplicate values will be highlighted.
Remove duplicates from the data:
- Select the data and go to Data -> Remove duplicates.
- If your data has headers, make sure that the check box at the top right is checked.
- Select the columns you want to remove duplicates from and click OK.
This removes the duplicate values from the list.
There are two ways to highlight errors in Excel:
Using conditional formatting:
- Select the complete data set
- Go to Home -> Conditional Formatting -> New Rule
- In the New Formatting Rule dialog box, select “Format only cells that contain”.
- In the rule description, select Errors from the drop-down menu
- Set the formatting and click OK. This highlights any error values to the selected data set.
Using Go to Special:
- Select the complete data set
- Press F5 (the Go to dialog opens)
- Click the special button in the lower-left corner
- Select Formulas and uncheck all options except Errors
Select all cells that contain an error. Now you can manually highlight, delete them or type anything.
Change text to proper lowercase/uppercase/lowercase
When you inherit a workbook or import data from text files, often the names or titles are not consistent. Sometimes all text may be lowercase or uppercase, or it may be a combination of both. You can make everything consistent by using these three functions:
- LOWER (): converts all text to lowercase.
- CAPS (): converts all text to uppercase.
- CLOSE (): converts all text to upper and lower case.
Analyze the data using the text in the column.
When you get data from a database or import it from a text file, it may happen that all text is restricted in one cell. You can analyze this text in several cells using the Text to column functionality in Excel.
- Select the data/text you want to analyze
- Go to Data -> Text to column (opens the Text to column wizard)
- Select the data type (select Delimited if the data are not equally spaced and are separated by characters such as comma, hyphen, period…). Click on Next.
- Select Delimiter (the character that separates your data). You can select a predefined delimiter or anything else via the Other option
- Select the data format. You can select the target cell. If the target cell is not selected, the current cell is overwritten.
Nothing lowers the credibility of your work like a spelling mistake. Use the F7 shortcut to run a spell check on your dataset.
Remove all formatting
You can use several databases to get the data into Excel. Each database has its own data formatting. When you have all the data in place, here is how you can remove all the formatting at the same time:
- Select the data set.
- Go to Start -> Delete -> Delete formats
- Similarly, delete only comments, links, or content.
Use Search and replace to clean data to Excel
Search is indispensable when cleaning data. For example, you can select and remove all zeros, change references in formulas, search and change formatting, etc.