What is OCR & why it makes your life easier

Optical character recognition, or OCR, is the process of mechanically or electronically converting scanned images of handwritten, typed or printed text into machine-encoded text.

In this blog article, you’ll learn about:

  • What the heck is OCR
  • How optical character recognition works – explained for non-techies
  • Why OCR is the new marketing gadget

Just keep on reading and you will get the answers you’re looking for and not end up confused.

Explaining a complex technology can result in a text that is horrible to read: full of technical vocabulary, confusing explanations and badly chosen examples. We cannot describe OCR without using any terminology, but we will try to keep it to a minimum. So the good news is that you do not need to be a hardcore techie to learn what OCR is and how it works.

If you already know what OCR is, please just skip the introductory part. Take a look at how it works or at examples of what you can do with the technology.

What is OCR?

As already mentioned, OCR stands for optical character recognition. The technology deals with the problem of recognizing all kinds of characters. Both handwritten and printed characters can be recognized and converted into machine-readable text.

Think of any kind of serial number or code consisting of numbers and letters that you need digitized. By using OCR you can transform those codes into digital output. The technology makes use of several techniques. Put very simply, the captured image is preprocessed, and the characters are then extracted and recognized. We will get to these techniques a little later, but you can also jump right to them.

What OCR does not take into account is the actual nature of the object that you want to scan. It simply “takes a look” at the text that you aim to transform. If you want the device to recognize both the nature of the object and the text on it, you need to combine different technologies. Take a look at what you can do by combining OCR and augmented reality, for example.

Different techniques of OCR

Let’s have a look at three steps of optical character recognition: image preprocessing, character recognition itself and the post-processing of the output.

Preprocessing

OCR software often preprocesses images to improve the chances of successful recognition. The aim of image preprocessing is to improve the image data: unwanted distortions are suppressed and specific image features are enhanced, both of which are important for further processing.
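
To make this a little more concrete, here is a minimal sketch of one common preprocessing step, binarization, where every pixel is turned into either "ink" or "background". The function name `binarize`, the tiny sample image and the choice of the mean brightness as threshold are our own assumptions for illustration; real OCR software typically uses more robust methods.

```python
def binarize(gray, threshold=None):
    """Turn a grayscale image (rows of 0-255 values) into 0/1 pixels, 1 = dark ink."""
    pixels = [p for row in gray for p in row]
    if threshold is None:
        # Assumption: the mean brightness is a good enough cut-off for this sketch.
        threshold = sum(pixels) / len(pixels)
    return [[1 if p < threshold else 0 for p in row] for row in gray]


# A hypothetical 4x4 scan: light, slightly uneven paper with three dark pixels.
noisy_scan = [
    [250, 240, 245, 235],
    [245,  30,  40, 240],
    [238,  35, 228, 242],
    [244, 236, 239, 247],
]

binary = binarize(noisy_scan)
for row in binary:
    print(row)
```

After this step the uneven paper brightness is gone and only the dark pixels remain, which is the kind of "suppressed distortion" that makes the following steps easier.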

Character recognition

Character recognition of a license plate

For the actual character recognition part it is important to understand what feature extraction is. When the input data to an algorithm is too large to be processed efficiently, it is reduced to a smaller set of features. The selected features are expected to carry the important information, while those suspected to be redundant are discarded. Working with the reduced set instead of the full input makes the algorithm faster.
For OCR this is important because the algorithm has to detect specific portions or shapes of a digitized image or video stream.
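
As a rough illustration of feature extraction, the sketch below uses "zoning": a binarized character is split into a few zones, and only the share of dark pixels per zone is kept. This is just one simple technique we picked for the example, not necessarily what any particular OCR engine does, and the function name `zone_features` and the sample letter are made up.

```python
def zone_features(glyph, zones_per_side=2):
    """Reduce a square 0/1 glyph to one ink-density value per zone."""
    size = len(glyph)
    step = size // zones_per_side  # assumes size is divisible by zones_per_side
    features = []
    for zone_row in range(zones_per_side):
        for zone_col in range(zones_per_side):
            ink = sum(
                glyph[r][c]
                for r in range(zone_row * step, (zone_row + 1) * step)
                for c in range(zone_col * step, (zone_col + 1) * step)
            )
            features.append(ink / (step * step))
    return features


# A hypothetical 4x4 bitmap of the letter "L" (1 = ink).
LETTER_L = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]

features = zone_features(LETTER_L)
print(features)  # 16 pixels reduced to 4 numbers
```

A classifier would then compare these few numbers with the values it has learned for each character instead of comparing every single pixel.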

Post-processing

Post-processing is another error-correction step that helps improve the accuracy of OCR. The accuracy can be further improved if the output is restricted by a lexicon. That way the algorithm can fall back on a list of words that are allowed to occur in the scanned document, for example.
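
Here is a small sketch of how a lexicon fallback could look, using `difflib.get_close_matches` from the Python standard library. The lexicon, the misread words and the helper name `correct_with_lexicon` are invented for this example, and the similarity cut-off of 0.6 is simply the library default rather than a tuned value.

```python
import difflib


def correct_with_lexicon(word, lexicon, cutoff=0.6):
    """Return the closest lexicon entry, or the word unchanged if none is close."""
    matches = difflib.get_close_matches(word.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# Hypothetical lexicon of words allowed on a scanned invoice.
LEXICON = ["invoice", "total", "amount", "date"]

# "lnvoice" and "tota1" are typical misreads (l for i, 1 for l).
raw_words = ["lnvoice", "tota1", "xyz"]
corrected = [correct_with_lexicon(w, LEXICON) for w in raw_words]
print(corrected)
```

Words that are close to a lexicon entry get corrected, while a word with no plausible match is passed through unchanged so that nothing is silently invented.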

Depending on the application, OCR is used not only for proper words but also for numbers and codes. To better deal with different types of input, OCR providers started to develop specialized OCR systems that can handle these particular kinds of images. To further improve the recognition accuracy, they combined various optimization techniques, for example business rules, standard expressions or the rich information contained in color images. This strategy of merging various optimization techniques is called “application-oriented OCR” or “customized OCR”. It is used in fields like business card OCR, invoice OCR or ID card OCR.
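
To give an idea of what such a business rule might look like, the sketch below assumes an invented invoice number format ("INV-" followed by six digits). Because the rule says that only digits may follow the prefix, letters that look like digits can be mapped back safely. The format, the substitution table and the function name are all assumptions made for this example.

```python
import re

# Letters that are easily confused with digits (illustrative, not exhaustive).
CONFUSABLE_AS_DIGIT = str.maketrans({"O": "0", "I": "1", "l": "1", "S": "5", "B": "8"})

# Hypothetical business rule: an invoice number is "INV-" followed by six digits.
INVOICE_PATTERN = re.compile(r"INV-\d{6}")


def clean_invoice_number(raw):
    """Apply the digit-only rule after the prefix; return None if it still fails."""
    prefix, sep, digits = raw.strip().partition("-")
    candidate = prefix + sep + digits.translate(CONFUSABLE_AS_DIGIT)
    return candidate if INVOICE_PATTERN.fullmatch(candidate) else None


fixed = clean_invoice_number("INV-0O4S1B")  # O, S and B misread in the digit part
rejected = clean_invoice_number("INV-12")   # too short, cannot be repaired
print(fixed, rejected)
```

The same substitution would be harmful on free text, which is exactly why this kind of optimization is tied to one specific application.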

Possibilities using OCR

The possible uses of optical character recognition software are wide-ranging. As already mentioned, OCR can be combined with technologies like augmented reality. But the technology is already very powerful on its own.

Here are a few examples of possible use cases for OCR software:

Identification processes

Machine readable zone in a passport

Passports and IDs have a machine-readable zone (MRZ) that can be scanned. OCR can speed up the process of identifying and registering people at borders or other checkpoints. It is therefore useful for immigration officers and other security personnel.
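
One reason the MRZ is well suited to OCR is that several of its fields carry a check digit, so a misread character can often be detected. The sketch below follows the check digit scheme described in ICAO Doc 9303 as we understand it (weights 7, 3, 1; letters count as 10 to 35; the filler "<" counts as 0). The function name is our own, and the sample values are the specimen document number and birth date commonly used in ICAO examples.

```python
def mrz_check_digit(field):
    """Compute the check digit for one MRZ field (digits, A-Z and '<')."""
    weights = (7, 3, 1)
    total = 0
    for position, char in enumerate(field):
        if "0" <= char <= "9":
            value = int(char)
        elif char == "<":
            value = 0  # filler character
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10
        else:
            raise ValueError(f"unexpected MRZ character: {char!r}")
        total += value * weights[position % 3]
    return total % 10


document_number_digit = mrz_check_digit("L898902C3")
birth_date_digit = mrz_check_digit("740812")
print(document_number_digit, birth_date_digit)
```

If the digit computed from the scanned field does not match the check digit printed in the MRZ, the app can simply scan again instead of passing on a wrong value.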

Marketing campaigns

There are a lot of innovative mobile marketing campaigns out there. Many companies make use of codes to engage their customers in a little competition. Think of all the voucher codes that customers can redeem by typing them in, or numbers printed on the inside of a bottle cap that you need to collect. All those campaigns can make use of OCR by integrating the software into an app that often already exists. That way they lower the hurdle of online registration and spare customers the process of typing in a series of numbers and letters.

Karlsberg, for example, used OCR in one of its marketing campaigns.

Payment processes

IBAN Scanning with OCR

The International Bank Account Number (IBAN) serves to identify bank accounts across borders. IBANs come in different lengths and can consist of numbers as well as letters. To ease cross-border transactions, banking apps can integrate OCR software. That way their customers can scan an IBAN instead of tediously typing it in.
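
A scanned IBAN can also be verified right away, because every IBAN contains two check digits. Below is a minimal sketch of the standard mod-97 check: move the first four characters to the end, replace letters with numbers (A = 10 up to Z = 35) and test whether the result leaves a remainder of 1 when divided by 97. The function name is our own, the length limits of 15 to 34 characters are a simplifying assumption (real validation would check the length per country), and the sample number is a widely used textbook example rather than a real account.

```python
def is_valid_iban(text):
    """Check the IBAN checksum (mod 97); does not verify country-specific formats."""
    iban = text.replace(" ", "").upper()
    if not (iban.isascii() and iban.isalnum() and 15 <= len(iban) <= 34):
        return False
    rearranged = iban[4:] + iban[:4]
    # int(ch, 36) maps 0-9 to 0-9 and A-Z to 10-35.
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


scanned_ok = is_valid_iban("GB82 WEST 1234 5698 7654 32")
scanned_misread = is_valid_iban("GB82 WEST 1234 5698 7654 33")  # last digit misread
print(scanned_ok, scanned_misread)
```

A banking app could run a check like this immediately after scanning and ask the customer to scan again if it fails.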

OCR Tools

There are a lot of optical character recognition tools that specialize in one specific use case, for example credit card scanning or document scanning.
But OCR can be useful in many different parts of our lives, so it is annoying to need a different tool for every use case.
Tesseract is an open source OCR engine that has gained popularity among OCR developers. Even though it can be painful to implement and modify, for a long time there were few free and powerful OCR alternatives on the market.
Anyline offers an OCR SDK that you can also download for free and that, in contrast to Tesseract, is built to work on mobile.

What’s your take on optical character recognition and its use in different industries? Think we missed something? Have an idea to improve the article? Don’t hesitate to reach out to us or leave a comment, via Facebook, Twitter or simply via [email protected]!

Online Marketing Manager at anyline.io
Besides working in Online Marketing and Graphic Design at Anyline and engaging with our awesome community, she’s a music enthusiast and hardcore foodie who’s always trying to make the perfect espresso.