# Thresholding for Mobile OCR: An Introduction – Part 2

Last week we gave you an introduction to Binary, Truncate & To Zero Thresholding, which we hope you found useful! This blog post dives a little deeper into the thresholding topic with Otsu Thresholding and Adaptive Thresholding. So let’s get started!

## OTSU THRESHOLDING

Otsu’s method of thresholding, named after Nobuyuki Otsu who first published this thresholding method in 1979, is used to automatically perform clustering-based image thresholding.
But what does that mean?

In global thresholding, one arbitrarily chosen value is used as the threshold. To get a good result image, we need to find the right threshold value, which is basically a trial-and-error process. Since we want an automated thresholding algorithm, we need a better method to find the right threshold.
Consider a bi-modal image, i.e. an image whose histogram has two peaks (aka clusters). A good threshold value for such an image lies between those two peaks, and that is what the Otsu method looks for: it automatically calculates a threshold value from the histogram of a bi-modal image.
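Since everything that follows works on the image histogram, here is a minimal sketch of how such a histogram can be computed in plain C++ (no OpenCV). The function name `computeHistogram` and the flat pixel buffer are assumptions made for illustration only.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count how often each 8-bit gray value (0..255) occurs in the image.
// `pixels` is a hypothetical flat buffer of gray values, not an OpenCV type.
std::array<std::size_t, 256> computeHistogram(const std::vector<std::uint8_t>& pixels) {
    std::array<std::size_t, 256> hist{};  // value-initialized: all bins start at 0
    for (std::uint8_t p : pixels) {
        ++hist[p];
    }
    return hist;
}
```

For a bi-modal image, the counts in this array cluster around two gray values, for example dark text and a bright background.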

Figure: input image, histogram (red line = threshold), Otsu threshold image.

In case you are interested in more detailed information on how Otsu thresholding works, continue reading. Otherwise, skip this section and continue directly with the code examples.

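Variance is a measure of region homogeneity: regions with high homogeneity have a low variance. Otsu’s algorithm searches for the threshold $t$ that minimizes the intra-class variance, defined as the weighted sum of the variances of the two classes of pixels (i.e., the class at or below the threshold and the class above it):

$$\sigma_w^2(t) = \omega_0(t)\,\sigma_0^2(t) + \omega_1(t)\,\sigma_1^2(t)$$

where the weights $\omega_0(t)$ and $\omega_1(t)$ are the probabilities of the two classes, given by the relative number of pixels in each class separated by the threshold $t$, and $\sigma_0^2(t)$ and $\sigma_1^2(t)$ are the variances of each class. To find the minimum, one has to consider all possible thresholds and compute this value for each of them.

Computing the intra-class variance for each possible threshold involves a lot of computation, but luckily there is a much faster way. If the intra-class variance is subtracted from the total variance $\sigma^2$ of the combined distribution, the so-called inter-class variance is the result:

$$\sigma_b^2(t) = \sigma^2 - \sigma_w^2(t) = \omega_0(t)\,\omega_1(t)\,\bigl[\mu_0(t) - \mu_1(t)\bigr]^2$$

Since the total variance does not depend on the threshold, minimizing the intra-class variance is the same as maximizing the inter-class variance. The class probabilities are computed from the histogram as:

$$\omega_0(t) = \sum_{i=0}^{t} p(i) \quad\text{and}\quad \omega_1(t) = \sum_{i=t+1}^{L-1} p(i)$$

while the class means are computed as:

$$\mu_0(t) = \frac{1}{\omega_0(t)} \sum_{i=0}^{t} p(i)\,x(i) \quad\text{and}\quad \mu_1(t) = \frac{1}{\omega_1(t)} \sum_{i=t+1}^{L-1} p(i)\,x(i)$$

where $p(i)$ is the relative frequency of the $i$-th histogram bin, $L$ is the number of bins and $x(i)$ is the value at the center of the $i$-th histogram bin.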
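The faster search via the inter-class variance can be sketched in plain C++ without OpenCV. This is a minimal illustration and not OpenCV's implementation; the function name `otsuThreshold`, the 256-bin histogram layout and the tie-breaking rule (the smallest threshold wins) are assumptions.

```cpp
#include <array>
#include <cstddef>

// Returns the threshold t in [0, 255] that maximizes the inter-class variance
// omega0 * omega1 * (mu0 - mu1)^2, where class 0 holds the gray values <= t
// and class 1 holds the gray values > t. Returns 0 if no split is possible.
int otsuThreshold(const std::array<std::size_t, 256>& hist) {
    double total = 0.0;   // number of pixels
    double sumAll = 0.0;  // sum of all gray values
    for (int i = 0; i < 256; ++i) {
        total += static_cast<double>(hist[i]);
        sumAll += static_cast<double>(i) * static_cast<double>(hist[i]);
    }

    double count0 = 0.0;  // number of pixels in class 0 so far
    double sum0 = 0.0;    // sum of gray values in class 0 so far
    double bestVariance = -1.0;
    int bestT = 0;

    for (int t = 0; t < 256; ++t) {
        count0 += static_cast<double>(hist[t]);
        sum0 += static_cast<double>(t) * static_cast<double>(hist[t]);
        const double count1 = total - count0;
        if (count0 == 0.0 || count1 == 0.0) {
            continue;  // one class is empty, the variance is undefined
        }
        const double omega0 = count0 / total;
        const double omega1 = count1 / total;
        const double mu0 = sum0 / count0;
        const double mu1 = (sumAll - sum0) / count1;
        const double diff = mu0 - mu1;
        const double variance = omega0 * omega1 * diff * diff;
        if (variance > bestVariance) {  // strict comparison keeps the first maximum
            bestVariance = variance;
            bestT = t;
        }
    }
    return bestT;
}
```

Because the class sums are updated incrementally, the whole search needs only one pass over the 256 bins instead of recomputing two variances for every candidate threshold.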

Drawbacks

C++ Code

To execute Otsu thresholding with OpenCV, pass the additional flag THRESH_OTSU to the threshold() function, combined with one of the five threshold types explained in last week’s post. Simply pass 0 as the threshold value; it is ignored anyway. The algorithm then finds the optimal threshold value, which is returned as a value of type double. For maxValue it is possible to pass any non-zero value. This value will be assigned to every pixel greater than the threshold value. In this example we used 255 to get a black and white binary image.

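```cpp
#include <opencv2/imgproc.hpp>
using namespace cv;

// src is assumed to be an 8-bit, single-channel (grayscale) Mat
Mat dst;
// Otsu thresholding: the threshold argument (0) is ignored, the computed one is returned
double thresh = threshold(src, dst, 0, 255, THRESH_BINARY | THRESH_OTSU);
```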
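To make the role of maxValue concrete, here is a minimal sketch, in plain C++ without OpenCV, of the binary step that is applied once a threshold is known. The function name `binarize` and the flat pixel buffer are assumptions for illustration only.

```cpp
#include <cstdint>
#include <vector>

// Pixels strictly greater than `thresh` become `maxValue`, all others become 0.
std::vector<std::uint8_t> binarize(const std::vector<std::uint8_t>& src,
                                   std::uint8_t thresh,
                                   std::uint8_t maxValue) {
    std::vector<std::uint8_t> dst;
    dst.reserve(src.size());
    for (std::uint8_t p : src) {
        dst.push_back(p > thresh ? maxValue : static_cast<std::uint8_t>(0));
    }
    return dst;
}
```

Note that a pixel exactly equal to the threshold ends up black, since only values greater than the threshold receive maxValue.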

Results

Figure: input image, histogram (red line = threshold), Otsu threshold image.

## ADAPTIVE THRESHOLDING

In the previous algorithms we used one global threshold to binarize the image, which works fine if you have a relatively uniform background. However, a single threshold will not work well if there is a large variation in the background intensity due to shadows or the direction of illumination.
In that case it is better to use Adaptive Thresholding (aka local, dynamic or areal thresholding).

Figure: input image, binary thresholding (thresh = 100), adaptive threshold.

The idea of this algorithm is to partition the image into smaller sub-images and then calculate a different threshold for each sub-image. This approach might lead to sub-images having simpler histograms which will usually generate better results for images with uneven illumination. OpenCV provides a function to perform adaptive thresholding:

```cpp
void cv::adaptiveThreshold(
    cv::InputArray src,   // input image (8-bit, single channel)
    cv::OutputArray dst,  // result image
    double maxValue,      // the maximal (non-zero) value that can be assigned to the output
    int adaptiveMethod,   // ADAPTIVE_THRESH_MEAN_C or ADAPTIVE_THRESH_GAUSSIAN_C
    int thresholdType,    // use THRESH_BINARY or THRESH_BINARY_INV only
    int blockSize,        // size of the pixel neighborhood, e.g. 3, 5, 7, 9, etc.
    double C              // constant subtracted from the mean or weighted mean; usually positive, but may be 0 or negative as well
);
```

There are two methods to calculate the threshold for the blockSize * blockSize neighborhood:

ADAPTIVE_THRESH_MEAN_C: The threshold value T(x,y) is the mean of the blockSize * blockSize neighborhood of pixel (x,y) minus a constant value C.

ADAPTIVE_THRESH_GAUSSIAN_C: The threshold value T(x,y) is a weighted mean of the blockSize * blockSize neighborhood of pixel (x,y) minus a constant value C. The pixel values closer to the center of the neighborhood have a higher weight when calculating the mean value.
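The plain (unweighted) mean variant can be sketched without OpenCV as follows. This is an illustrative, unoptimized version and not OpenCV's implementation; the function name `adaptiveMeanThreshold`, the row-major pixel buffer and the edge handling (neighbors outside the image are clamped to the nearest border pixel) are assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Adaptive mean thresholding on a row-major 8-bit grayscale buffer.
// For every pixel, T(x, y) = mean of its blockSize x blockSize neighborhood - c.
// The pixel becomes maxValue if it is greater than T(x, y), otherwise 0.
// blockSize is expected to be odd; out-of-image neighbors are clamped to the border.
std::vector<std::uint8_t> adaptiveMeanThreshold(const std::vector<std::uint8_t>& src,
                                                int width, int height,
                                                int blockSize, double c,
                                                std::uint8_t maxValue) {
    std::vector<std::uint8_t> dst(src.size(), 0);
    const int r = blockSize / 2;  // neighborhood radius
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            double sum = 0.0;
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    const int ny = std::max(0, std::min(y + dy, height - 1));
                    const int nx = std::max(0, std::min(x + dx, width - 1));
                    sum += src[static_cast<std::size_t>(ny * width + nx)];
                }
            }
            const double T = sum / (blockSize * blockSize) - c;
            const std::size_t idx = static_cast<std::size_t>(y * width + x);
            if (src[idx] > T) {
                dst[idx] = maxValue;
            }
        }
    }
    return dst;
}
```

For a 3 x 3 image filled with 200 and a single dark center pixel of 50, calling the function with blockSize = 3 and c = 10 gives a local threshold of about 173 everywhere, so the center pixel turns black and all other pixels turn white.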

C++ Code

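```cpp
#include <opencv2/imgproc.hpp>
using namespace cv;

// src is assumed to be an 8-bit, single-channel (grayscale) Mat
Mat dst;

// Set maxValue, blockSize and c (constant value)
double maxValue = 255;
int blockSize = 9;  // must be odd and greater than 1
double c = 41;

// Adaptive thresholding; ADAPTIVE_THRESH_MEAN_C is an assumption here,
// ADAPTIVE_THRESH_GAUSSIAN_C is passed the same way
adaptiveThreshold(src, dst, maxValue, ADAPTIVE_THRESH_MEAN_C, THRESH_BINARY, blockSize, c);
```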

Results

The following table shows the results of applying adaptive thresholding to the input image with different values.

Figure: blockSize = 5, c = 41; blockSize = 7, c = 41; blockSize = 9, c = 41.

## Multilevel Thresholding

So far we have only discussed thresholding based on grayscale images. However, it is also possible to threshold color images. This approach is called multilevel, multiband or simply multi thresholding, and it is gradually gaining relevance with the increasing number of color documents. One approach is to designate a separate threshold for each of the RGB channels and then combine the results with an AND operation.
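As a rough sketch of this per-channel idea in plain C++ (no OpenCV): each channel is compared against its own threshold, and the three results are combined with a logical AND. The `Rgb` struct and the function name `thresholdRgbAnd` are hypothetical names chosen for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical pixel type: one 8-bit value per RGB channel.
struct Rgb {
    std::uint8_t r, g, b;
};

// A pixel becomes 255 only if every channel is greater than its own threshold
// (logical AND of the three per-channel results); otherwise it becomes 0.
std::vector<std::uint8_t> thresholdRgbAnd(const std::vector<Rgb>& src, Rgb thresh) {
    std::vector<std::uint8_t> mask;
    mask.reserve(src.size());
    for (const Rgb& p : src) {
        const bool above = p.r > thresh.r && p.g > thresh.g && p.b > thresh.b;
        mask.push_back(static_cast<std::uint8_t>(above ? 255 : 0));
    }
    return mask;
}
```

With thresholds of 128 on all channels, a near-white pixel such as (250, 250, 250) passes, while a magenta pixel such as (250, 10, 250) fails because its green channel is too low.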

The RGB model reflects the way the camera works and how the data is stored, but it does not correspond to the way people perceive color. Therefore, the HSL, HSV or CMYK color models are used more often; these mostly require more sophisticated thresholding algorithms, resulting in higher computational complexity.
These approaches are rather complicated and would go beyond the scope of this blog post, but don’t hesitate to contact us if you have any questions!

This was our introduction to thresholding for mobile OCR. We hope we could give you a good and concise overview of this topic, and that you stay tuned for more!

### QUESTIONS? LET US KNOW!

If you have questions, suggestions or feedback on this, please don’t hesitate to reach out to us via Facebook, Twitter or simply via [email protected]! Cheers!