Data Cleanup Tips for Microsoft Excel


Excel

I have been cleaning up data files in Microsoft Excel for a very long time. This article will likely always be a work in process, so check back any time you have a data cleanup job to do. Or email us to let us know a task you'd like us to include here.

Data Isn't Recognized Properly

You add up a column of data, and you get zero, or anything but the right result. You try to do a calculation using a cell that contains a date, but you can't get it to work. Sometimes this is because your data came from somewhere else, and Excel simply refuses to see your data as values, and insists on seeing it as text.

To force Excel to see your numeric values properly, just perform some arithmetic on them. Just copy any blank cell. Select your "bad" cells. Hit Edit Paste Special and choose Add. You'll just have to reformat your cells again. Careful! You should always backup your files before you perform a "fix". You never know if it will work or not.

Having hidden rows and columns can also provide unexpected results. Make sure none of your rows or columns are hidden, providing you with incorrect "manual" results when your Excel results may be just fine after all.

Clean Up Extra Spaces or Characters Using Find and Replace

This is simple; it just takes a series of tasks. If you have what you believe is multiple spaces between words or numbers within the cells, or perhaps you have 8 different formats for phone numbers, use Find and Replace on the column.

To get rid of spaces within cells. Maybe there's 2 spaces between some of your words and you have multiple words in the cells.

  • Find what:  Type 2 spaces using the spacebar

  • Replace with:  Type 1 space using the spacebar

  • Replace all

To cleanup phone lists when 8 people have submitted them all differently. Once you've deleted all the extra characters, and you have just 9 digits, you can then format the cells as a phone number, and it works just right.

  • Find what:  Type a left parenthesis (or a right parenthesis or a hyphen or a period)

  • Replace with:  Leave this empty!

  • Replace all

Use the TRIM Function to Remove Spaces

The TRIM function removes spaces to the left and right of your data, but not spaces between data within a single cell.

Insert a blank column to the right of the column you want to trim. For our purposes, we'll use column A as the column we want to trim, and B2 is where we'll put our first formula.

=TRIM(A2)

Copy the formula down using the fill handle. In order to work with the trimmed text, highlight the cells, hit Edit Copy, then hit Edit Paste Special, Values, OK. That removes the underlying formulas. You can now move your column heading over to your new column of trimmed data, and delete the original column.

Use the CLEAN Function to Remove Non-Printing Characters

Thanks to reader "Andy" for pointing out to me that the CLEAN function belonged here!

Insert a blank column to the right of the column you want to clean. For our purposes, we'll use column A as the column we want to clean, and B2 is where we'll put our first formula.

=CLEAN(A2)

Copy the formula down using the fill handle. In order to work with the cleaned text, highlight the cells, hit Edit Copy, then hit Edit Paste Special, Values, OK. That removes the underlying formulas. You can now move your column heading over to your new column of cleaned data, and delete the original column.

Vertical to Horizontal Addresses

Pretty much everyone can figure out that, in the demonstration below, we need to use formulas. But what they can't figure out is how to copy them properly.

Create three "sets". A set is the formula, and the corresponding blank rows that follow it. In the case below, we make 3 sets, and select them. From here, we copy down. DO NOT STOP dragging until you've copied all the way down. If you do, your "set" becomes however many cells are selected at the time you begin dragging again.

If you have addresses that have two lines and four lines, then you'll be better off using simple COUNTs to determine how many lines there are in the address, then cut those addresses out and work on those on a different sheet. Here, I see if the 3rd cell below the name is blank. If it is, I do nothing. If there's something in it, we're alerted that there's a problem with this record.

Sorting as a Data Cleanup Tool

In the scenario above, however, we could still easily use formulas to get our data separated into columns. So what if we have many that have city addresses in Columns E and F? We can sort them once we've converted them to values. Sorting makes data cleanup a breeze. It helps to find wrongly spelled city names, bad zip codes, and other data that's been fouled by the scanning process.