The easiest way to split cells and columns, combine cells and columns, and correct other common OCR transcription and data entry errors in Google Sheets is by using the free add-on, Paperpusher. This blog post shows how to split cells and columns if you don’t want to download the add-on.
When I copy-paste tables from one format to a spreadsheet, I often find myself with cells and columns that I need to split together. Usually, Google Sheets is pretty good about handling this. For instance, if I have a bunch of comma separated values, as shown below:
|5, 6, 8, 12|
I can simply go to Data and click on “Split text to columns”, as shown in the screenshot below. I can let Google figure it out automatically, or choose what separator I need. Easy as that.
But what if your separators aren’t consistent? Take the following 9 cells as an example:
|Original||Desired split 1||Desired split 2|
In the first, I want to split 56 into two cells, 5 and 6. In the second, I want to split 56A into numbers and a letter. In the third, I want to split between the number and the word.
To split between these, I need to use regular expressions. This allows me to look into “strings”, or pieces of text, and manipulate them based on rules. The more general the rules, the less I’ll have to write. Also, before I start using regular expressions, I need to force the cells to be text, which I can either do using the formula command =TEXT, or I can do with the following menu option.
My plan is to create rules to put the correct string in “Desired split 1”, then I’ll just tell Google to put what’s left in “Desired split 2”. Let’s go for it.
In order to select the “5” out of “56”, I tell Google to select the first number in the string. I do this by using this formula in B2:
This tells Google to extract from cell A2 (“56″) any single digit from 0-9. Once I’ve done that, I can tell Google to put in cell C2 whatever is left from cell A2. My specific formula for cell C2 is:
This tells Google to look at A2, and replace the “5” in A2 with nothing “”. I’ll use this same formula for cells C3 and C4.
For cell B3, though, I need to find out a way to tell Google to extract only the numbers from cell A3. I do this with the formula
This tells Google to extract from cell A3 any digit [0-9] repeated any number of times *. So, I get 56. I use the same formula from cell C2 to put the rest into cell C3.
Finally, for cell A4, I need to find a way to tell Google to extract only the numbers. But, there’s a twist: the comma is included in the number. If I extract the numbers without the comma, I’m going to be left with an awkward dangling comma, which won’t make sense. So I use the formula:
This tells Google to extract from A4 any number of digits [0-9]*, followed by a comma [,], followed by any number of digits [0-9]*. It’ll extract that and only that. I can then repeat my formula for cell C4 that I used for C3.
Obviously, it’s impossible to combine my formulae for cells B2, B3, and B4, because in B2 I wanted to separate the numbers, but in B3 and B4 I wanted to keep them together. But, I can combine my formula for B3 and B4, and tell Google that I always want to extract the number, regardless of whether there’s a comma or not, and regardless of whether there’s a space between the last number and the first letter or not.
So, I use the formula:
This uses the OR operator |, and it tells Google to extract from A3 either a number formatted with a comma, like “180,000”, or | a number formatted without a comma. Notice that the order here is important: I tell Google first to check if there’s a comma, then, if there’s not, to extract the number without a comma. If it was in the reverse order, Google wouldn’t check if there’s a comma, and would just extract “180”, which are the digits before the comma.
It’s easiest to split cells and columns, combine cells and columns, and correct other common OCR transcription and data entry errors in Google Sheets by using the free add-on, Paperpusher. Check it out!