FAQ

What do I need before I start using Paperpusher?
You need two things: a scanner or scanning app to get a picture of the paper receipt or invoice onto your computer, and OCR software to get the text from the picture into a spreadsheet.

I recommend Adobe Scan as your free scanning app. It works on both iPhone and Android, and provides accurate, high-quality scans that you can save directly to Google Drive.

If you’re looking for free OCR software, I recommend the Copyfish extension for Google Chrome and Firefox. It does an excellent job of turning the numbers, letters, and symbols in a picture of a receipt or invoice into text that you can copy and paste into a spreadsheet. It cannot, however, preserve the layout of the original.

If you’re willing to pay $30 for OCR software, I recommend the ABBYY Screenshot Reader. Not only does it do an excellent job of turning pictures into text, but it can also preserve the layout of a receipt or invoice, which leaves you less to clean up later.

How do I get my paper receipt or invoice into a spreadsheet?

Get the Adobe Scan app for your phone, and Copyfish for your computer. Sign in to the Adobe Scan app with your Google account, then use the app to take a pic. Once it’s done, crop the pic (the square icon at the bottom, third from the left) so you only see the necessary parts of the receipt.

Click Save PDF, then click Share on the pic. Click Share File and save it to Google Drive. On your computer, open the PDF in Google Drive and maximize it so it takes up your whole screen (bigger is better). Then use ABBYY Screenshot Reader’s “Table to Clipboard” function, or just click on the Copyfish Chrome extension. Paste the data into your spreadsheet.

This whole process should take you about 30 seconds, once it’s all set up.

By the way, if you’re willing to spend $10, you should buy the ABBYY FineScanner app for your phone. It does the job of Adobe Scan and Copyfish combined, and it does it much better.

Why is my OCR software performing poorly?

Here’s how you make your OCR software perform at its best:

  1. Make sure the ink on the receipt is dark, and the receipt is as smooth as possible.
  2. Take pictures with good lighting and no shadows.
  3. Crop PDFs in the Adobe Scan app after you scan them, so that only the important info is left.
  4. When you use the OCR software, make all the text as big as possible on your screen (while still making sure it all fits). Bigger text is always better.
  5. Only OCR important information from a receipt or invoice. If you don’t need the information, don’t OCR it.
  6. For a long or complex receipt or invoice, OCR one column at a time. So, use OCR on the units first, then the product names, and lastly the prices.

Lastly, if you’re willing to spend $10, you should buy the ABBYY FineScanner app for your phone. It does a much better job at OCR than Copyfish.

How do I use the add-on?
Once your data is in your spreadsheet (see steps 1, 2, 3), click on “Process Receipt/Invoice”.

Once you open the add-on, you’ll see a sidebar like this:

[Image: Paperpusher sidebar]

Click on any of those blanks to start working on that issue. I advise that you work from top to bottom. It’ll make your life easier.

  1. “Fix typos” fixes characters that the OCR got wrong, specifically when it puts numbers in place of letters or vice versa (like “O” instead of “0”). It also replaces commas with periods, since OCR software often misreads a decimal point as a comma.
  2. “Combine rows or columns” does what you’d expect. It combines rows or columns that should be put together in whatever way you choose (separated with a comma, space, semicolon, or not at all).
  3. “Split rows or columns” splits rows or columns at the marker you choose. You can split on the first comma, letter, number, or special character. This is similar to the “text to columns” function, but designed specifically for OCR correction.
  4. “Remove spaces” does what you’d expect. It removes stray spaces inside cells, which OCR software often inserts by mistake.
  5. “Remove stray characters” removes all characters in the cell that don’t belong there. You can choose to remove all non-alphanumeric characters (i.e. everything that’s not A-Z, 0-9, or a decimal point) or all non-numeric characters (i.e. everything that’s not 0-9 or a decimal point).
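If you’re curious what these cleanup steps boil down to, here’s a rough Python sketch of three of them. It is not the add-on’s actual code. The look-alike character tables, the function names, the “majority wins” rule for deciding whether a word is a number, and the choice to drop the comma but keep the letter or number when splitting are all assumptions made for illustration.

```python
import re

# Hypothetical look-alike tables; the add-on's real rules may differ.
TO_DIGIT = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"})
TO_LETTER = str.maketrans({"0": "O", "1": "l", "5": "S", "8": "B"})


def _fix_token(match: re.Match) -> str:
    """Fix one whitespace-separated word, guessing whether it is a number or text."""
    token = match.group(0)
    digits = sum(ch.isdigit() for ch in token)
    letters = sum(ch.isalpha() for ch in token)
    if digits > letters:
        # Mostly digits: letters are probably misread digits, commas misread decimal points.
        return token.translate(TO_DIGIT).replace(",", ".")
    if letters > digits:
        # Mostly letters: digits are probably misread letters.
        return token.translate(TO_LETTER)
    return token  # A tie is too ambiguous to touch.


def fix_typos(cell: str) -> str:
    """Apply the look-alike fix to every word in a cell."""
    return re.sub(r"\S+", _fix_token, cell)


def remove_stray(cell: str, numeric_only: bool = False) -> str:
    """Keep only A-Z, 0-9 and '.', or only 0-9 and '.' when numeric_only is set."""
    pattern = r"[^0-9.]" if numeric_only else r"[^A-Za-z0-9.]"
    return re.sub(pattern, "", cell)


FIRST_MARKER = {"comma": r",", "letter": r"[A-Za-z]", "number": r"[0-9]"}


def split_on_first(cell: str, marker: str) -> list[str]:
    """Split a cell in two at the first comma, letter, or number."""
    match = re.search(FIRST_MARKER[marker], cell)
    if match is None:
        return [cell]
    # Assumption: a comma is a separator and is dropped, while a letter or
    # number belongs to the second piece and is kept.
    right_start = match.end() if marker == "comma" else match.start()
    return [cell[:match.start()].strip(), cell[right_start:].strip()]
```

For example, `fix_typos("1O,99")` gives `"10.99"`, `remove_stray("$4.99*", numeric_only=True)` gives `"4.99"`, and `split_on_first("2 Soda Can", "letter")` gives `["2", "Soda Can"]`.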

When you click on a blank, you’ll see a drop-down like this:

[Image: Paperpusher dropdown]

You can type an individual cell into the Range box, like “A10”. You can type a range, like “A12:A30”. Or you can type a whole column or row, like “A:A” or “3:3”. Once you’ve typed in your ranges, click the “Fix typos” button and the typos will be fixed.
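If it helps to see how those three kinds of Range entries differ, here’s a small Python sketch that reads each one. It’s only an illustration of the notation, not how the add-on itself handles ranges, and the names `CellRange`, `col_to_number`, and `parse_range` are made up for this example.

```python
import re
from typing import NamedTuple, Optional


class CellRange(NamedTuple):
    """1-based bounds; None means 'the whole column' or 'the whole row'."""
    first_col: Optional[int]
    first_row: Optional[int]
    last_col: Optional[int]
    last_row: Optional[int]


_REF = re.compile(r"^([A-Za-z]*)([0-9]*)$")


def col_to_number(letters: str) -> int:
    """Convert column letters to a number: A -> 1, Z -> 26, AA -> 27."""
    number = 0
    for ch in letters.upper():
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number


def _parse_ref(ref: str):
    """Read one side of a range, such as 'A12', 'A', or '3'."""
    match = _REF.match(ref.strip())
    if match is None or not (match.group(1) or match.group(2)):
        raise ValueError(f"Not a valid cell reference: {ref!r}")
    col = col_to_number(match.group(1)) if match.group(1) else None
    row = int(match.group(2)) if match.group(2) else None
    return col, row


def parse_range(text: str) -> CellRange:
    """Parse 'A10', 'A12:A30', 'A:A', or '3:3' into numeric bounds."""
    start, _, end = text.partition(":")
    first_col, first_row = _parse_ref(start)
    # A single cell is a range that starts and ends on itself.
    last_col, last_row = _parse_ref(end) if end else (first_col, first_row)
    return CellRange(first_col, first_row, last_col, last_row)
```

So `parse_range("A12:A30")` covers column 1, rows 12 through 30, while `parse_range("3:3")` has no column bounds at all, which is how “the whole row” is represented here.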

How do I use the categorization part of the add-on?
Once your OCR data is cleaned up, you’ll need to categorize your data using the same range notation you used in the “Fix OCR” part of the add-on (for example A1, A:A, or A1:F1). Depending on whether it’s a receipt or an invoice, you’ll need to enter the cell (or cells) that contain each piece of important information.

If one cell contains more than one kind of information (like one cell has both the Vendor Name and the Date), you’re going to have to go back and split the cells with the “Split rows or columns” tool in the “Fix OCR” section.

How do I use the Naming part of the add-on?
In this step, the add-on guesses general category names for your products. It does this based on how similar the names in your Dictionary are to the Product Names in your spreadsheet. For example, you can see in the example below that the add-on recognized that “Drink – Soda Can” is close to “Can of Soda”.

[Image: How the dictionary works]

Unfortunately, it didn’t guess the other general product names. The add-on is pretty good at guessing, but it is not perfect. When it gets it wrong, you have to fill in the rest of the general product names yourself. The add-on will help by auto-suggesting general product names for you.

You can only type in general product names that are already in your Dictionary. If you need to use a general product name that’s not in your Dictionary, add it to the Dictionary first, then you can use it in your spreadsheet.
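To give a feel for how name guessing like this can work, here’s a minimal Python sketch that scores names by how many words they share. This FAQ doesn’t spell out how the add-on measures similarity, so treat the word-overlap score, the 0.4 threshold, and the function names as assumptions, not as a description of what the add-on does internally.

```python
import re
from typing import Iterable, Optional


def words(name: str) -> set:
    """Lowercase a name and break it into a set of words, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def similarity(a: str, b: str) -> float:
    """Share of words the two names have in common (0.0 = none, 1.0 = identical sets)."""
    words_a, words_b = words(a), words(b)
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def guess_general_name(product_name: str,
                       dictionary: Iterable[str],
                       threshold: float = 0.4) -> Optional[str]:
    """Return the closest Dictionary entry, or None if nothing is close enough."""
    best = max(dictionary, key=lambda entry: similarity(product_name, entry), default=None)
    if best is None or similarity(product_name, best) < threshold:
        return None  # No confident guess: the name has to be filled in by hand.
    return best
```

With a Dictionary of `["Can of Soda", "Apple", "Whole Milk"]`, the name “Drink – Soda Can” shares two of four distinct words with “Can of Soda” (a score of 0.5) and is matched to it, while “Paper Towels” shares no words with any entry and comes back as `None`.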

How do I make a dictionary?
A Dictionary is just a spreadsheet full of general category names. I’ve included one dictionary for you to use, the “Food Dictionary”, which is linked on the “Dictionary” step in the sidebar. It contains pretty much every food item you might want to buy at a grocery store or for a restaurant. Here’s a sample of it:

[Image: Dictionary sample]

Column A is just for my information, when I want to quickly scroll through my dictionary and find a certain category of names. Column B is what the add-on uses to guess general product names.

You can add new items to this dictionary by just adding them to the end of column B. Make sure that you don’t create multiple general product names for the same general product (e.g. creating both “apple” and “apples” in your Dictionary). That’s going to make your analysis of your purchases way more difficult.
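If your Dictionary gets long, a quick check for near-duplicates can save you trouble. Here’s a small Python sketch, assuming you’ve copied the names from column B into a list. The plural rule is deliberately naive (it only strips a trailing “s”), and both function names are made up for this example.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Lowercase, collapse extra spaces, and strip a simple trailing 's'."""
    key = " ".join(name.lower().split())
    # Naive plural handling: "apples" -> "apple". Irregular plurals are missed.
    if key.endswith("s") and not key.endswith("ss"):
        key = key[:-1]
    return key


def find_duplicates(names: list) -> list:
    """Group names that normalize to the same key and return groups of two or more."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

For example, `find_duplicates(["apple", "Apples", "banana", "glass"])` returns `[["apple", "Apples"]]`, which tells you to keep just one of the two.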

If you want to create your own dictionary full of general product names, feel free to! Just create a Google Sheet with the product names that the dictionary should know in column B, and category names for your own use in column A. Name the sheet something informative, like “Fashion Dictionary” or “Makeup Dictionary”. Then just select it during the “Dictionary” step.

What are vendor and product reviews for?
This is for your own information. If you’re looking back at your purchases in a year’s time, you’ll be able to see how much they cost from the invoice and receipt data you’ve put in your database. But you won’t remember if you liked the vendor or the product. Reviews are an easy way of keeping track of that.

Also, if you do want to share your purchasing history with your friends or family, they’ll want to see your thoughts on what you’ve bought. After all, while cheap is always nice, it’s also true that you get what you pay for. This is your chance to tell people which retailers and items are really worth your hard-earned dollars.

Something isn’t quite right in the final product. Can I edit my receipt or invoice information directly, or do I have to go through the process again?
Edit it directly! That’s the nice thing about a spreadsheet: the info’s right there. In fact, if the dictionary function doesn’t correctly guess names, or fails to guess some names, then I encourage you to type in the general product names directly.

What happens when I move my data to a database?
A database is just a sheet on your Google Drive that holds all of the info from your receipts and invoices. You can access it directly from Google Drive, or from the “View Database” button on the Paperpusher add-on.

The first time you move data, a database will be created. Every time after that, your info will be added to that same database.

What is “Filter Database”?
Filter Database allows you to slice and dice the information from your database with an easy-to-use interface. It only works with the information in your database, and displays your results on a separate sheet in your database.

Here’s the interface:

[Image: Filter Database interface]

And here’s the result. Notice how the result shows up next to the Database main sheet.

[Image: Filter Database result]

How do I publish my database to Reddit?
Well, you’ll need a Reddit account first. Beyond that, copy and paste the entirety of your database into TableIt, and it’ll create the Reddit formatting code for you.
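In case you’re wondering what that formatting code looks like, Reddit tables are plain text with pipes between the cells and an alignment row under the header. Here’s a small Python sketch that builds one from rows of data. It’s an illustration of the table format, not TableIt’s code. The function name is made up, and it assumes your cells don’t contain pipe characters.

```python
def to_reddit_table(rows: list) -> str:
    """Turn rows (first row = header) into Reddit's pipe-separated table text."""
    if not rows:
        return ""
    header, *body = rows

    def line(cells) -> str:
        # One table row: cells separated by pipes, with a pipe at each end.
        return "| " + " | ".join(str(cell) for cell in cells) + " |"

    # The second line tells Reddit how to align each column (":--" = left).
    separator = "|" + "|".join(":--" for _ in header) + "|"
    return "\n".join([line(header), separator] + [line(row) for row in body])
```

Calling `to_reddit_table([["Vendor", "Total"], ["Acme", "4.99"]])` produces three lines of text: the header row, the alignment row `|:--|:--|`, and one data row. You can paste the result straight into a Reddit post.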

Something strange is happening in the add-on that’s not covered here. What gives?

My first guess would be that you’re signed in to two Google accounts at once (so Gmail allows you to switch between the two). For some reason, Google add-ons do not play well with two Google accounts. Try signing out of all but one account, then open the add-on again.

If that doesn’t work, ask me a question on the subreddit: https://www.reddit.com/r/Paperpusher/