Thresholding for Mobile OCR: An Introduction – Part 2
Last week we gave you An Introduction to Binary, Truncate & To Zero Thresholding, which we hope you found useful! This blog post dives a little deeper into the topic with Otsu thresholding and adaptive thresholding. So let’s get started!
OTSU THRESHOLDING
Otsu’s method, named after Nobuyuki Otsu, who first published it in 1979, automatically performs clustering-based image thresholding.
But what does that mean?
In global thresholding, a single, arbitrarily chosen value is used as the threshold for the whole image. To get a good result, we need to find the right threshold value, which is basically a trial-and-error process. Since we want an automated thresholding algorithm, we need a better way to find it.
Consider a bimodal image, i.e. an image whose histogram has two peaks (aka clusters). A good threshold value for such an image lies between those peaks, and this is what Otsu’s method aims to find. It automatically calculates the threshold value from the histogram of a bimodal image.
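To make the starting point concrete, here is a minimal sketch of global binary thresholding in plain C++, without OpenCV. The function name `binaryThreshold` and the flat pixel buffer are assumptions made for this illustration only; the rule itself (pixels greater than the threshold become `maxValue`, all others 0) is the binary thresholding discussed in this series.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Global binary thresholding on a flat 8-bit grayscale buffer.
// Pixels strictly greater than `thresh` become `maxValue`, all others 0.
// (Hypothetical helper for illustration, not an OpenCV function.)
std::vector<std::uint8_t> binaryThreshold(const std::vector<std::uint8_t>& src,
                                          std::uint8_t thresh,
                                          std::uint8_t maxValue = 255) {
    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = src[i] > thresh ? maxValue : 0;
    }
    return dst;
}
```

The value of `thresh` has to be picked by hand here, which is the trial-and-error step that an automatic method is meant to replace.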
[Figure: input image; histogram with the threshold marked as a red line; Otsu threshold image]
In case you are interested in more detailed information on how Otsu thresholding works, continue reading. Otherwise, skip this section and continue directly with the code examples.
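Variance is a measure of region homogeneity: regions with high homogeneity have a low variance. Otsu’s algorithm searches for the threshold $t$ that minimizes the intra-class variance, defined as the weighted sum of the variances of the two classes of pixels (i.e., the class below and the class above the threshold). In principle, one has to consider all possible thresholds and compute this quantity for each of them:
$$\sigma_w^2(t) = \omega_1(t)\,\sigma_1^2(t) + \omega_2(t)\,\sigma_2^2(t)$$
where the weights $\omega_1(t)$ and $\omega_2(t)$ are the probabilities of the two classes, given by the relative number of pixels in each class separated by the threshold $t$, and $\sigma_1^2(t)$ and $\sigma_2^2(t)$ are the variances of each class.
Computing the variances of both classes for each possible threshold involves a lot of computation, but luckily there is a much faster way. If the intra-class variance is subtracted from the total variance $\sigma^2$ of the combined distribution, the result is the so-called inter-class variance:
$$\sigma_b^2(t) = \sigma^2 - \sigma_w^2(t) = \omega_1(t)\,\omega_2(t)\,[\mu_1(t) - \mu_2(t)]^2$$
Since the total variance does not depend on the threshold, minimizing the intra-class variance is equivalent to maximizing the inter-class variance. The class probabilities are computed from the histogram as:
$$\omega_1(t) = \sum_{i=0}^{t} p(i)$$
and
$$\omega_2(t) = \sum_{i=t+1}^{L-1} p(i)$$
while the class means are computed as:
$$\mu_1(t) = \frac{1}{\omega_1(t)} \sum_{i=0}^{t} p(i)\,x(i)$$
and
$$\mu_2(t) = \frac{1}{\omega_2(t)} \sum_{i=t+1}^{L-1} p(i)\,x(i)$$
where $p(i)$ is the fraction of pixels in the $i$-th histogram bin, $L$ is the number of bins, and $x(i)$ is the value at the center of the $i$-th histogram bin.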
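The search for the threshold that maximizes the inter-class variance can be sketched in plain C++ as follows. This is an illustration under a few assumptions, not OpenCV's implementation: the function name `otsuThreshold` is made up for this post, the histogram has one bin per gray level (so the bin center of bin i is simply i), and the class below the threshold contains the gray levels 0 to t inclusive. When several thresholds reach the same maximum, the sketch returns the smallest one, so its result may differ slightly from what a library reports.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the threshold t in [0, 255] that maximizes the inter-class variance
// w1(t) * w2(t) * (mu1(t) - mu2(t))^2, where class 1 holds the gray levels
// 0..t and class 2 holds the gray levels t+1..255.
// Returns 0 for an empty or single-valued input.
int otsuThreshold(const std::vector<std::uint8_t>& pixels) {
    if (pixels.empty()) return 0;

    // Histogram with one bin per gray level.
    std::array<double, 256> hist{};
    for (std::uint8_t p : pixels) hist[p] += 1.0;

    const double total = static_cast<double>(pixels.size());
    double sumAll = 0.0;  // sum of all gray values
    for (int i = 0; i < 256; ++i) sumAll += i * hist[i];

    double count1 = 0.0;  // number of pixels in class 1
    double sum1 = 0.0;    // sum of gray values in class 1
    double bestVar = -1.0;
    int bestT = 0;

    for (int t = 0; t < 256; ++t) {
        // Update the running sums instead of recomputing them for every t.
        count1 += hist[t];
        sum1 += t * hist[t];
        const double count2 = total - count1;
        if (count1 == 0.0) continue;  // class 1 still empty
        if (count2 == 0.0) break;     // class 2 empty from here on

        const double w1 = count1 / total;
        const double w2 = count2 / total;
        const double mu1 = sum1 / count1;
        const double mu2 = (sumAll - sum1) / count2;
        const double betweenVar = w1 * w2 * (mu1 - mu2) * (mu1 - mu2);

        if (betweenVar > bestVar) {
            bestVar = betweenVar;
            bestT = t;
        }
    }
    return bestT;
}
```

Because the class sums are updated incrementally, the whole search takes one pass over the pixels and one pass over the 256 histogram bins.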
Drawbacks
 The method assumes that the histogram of the image is bimodal
 It breaks down when the two classes are very unequal (i.e. they differ greatly in size), which can result in two maxima of the inter-class variance.
 In that case, the correct maximum is not necessarily the global one.
 The selected threshold should correspond to a valley of the histogram.
 The method does not work well with variable illumination.
C++ Code
To run Otsu thresholding with OpenCV, pass the additional flag THRESH_OTSU to the threshold() function, combined via bitwise OR with one of the five threshold types explained in the previous part. Simply pass 0 as the threshold value; it is ignored anyway. The algorithm then finds the optimal threshold value, which is returned as a value of type double. For maxValue you can pass any nonzero value. With THRESH_BINARY, this value is assigned to every pixel greater than the threshold value. In this example we use 255 to get a black-and-white binary image.
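#include <opencv2/opencv.hpp>
using namespace cv;

// Read the image as 8-bit grayscale
Mat src = imread("threshold.png", IMREAD_GRAYSCALE);
Mat dst;

// Otsu thresholding: the threshold argument (0) is ignored,
// the threshold computed by Otsu's method is returned
double thresh = threshold(src, dst, 0, 255, THRESH_BINARY | THRESH_OTSU);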
Results
[Figure: input image; histogram with the threshold marked as a red line; Otsu threshold image]
ADAPTIVE THRESHOLDING
In the previous algorithms we used one global threshold to binarize the image, which works fine if you have a relatively uniform background. However, a single threshold will not work well if there is a large variation in the background intensity due to shadows or the direction of illumination.
In that case it is better to use Adaptive Thresholding (aka local, dynamic or areal thresholding).
[Figure: input image; binary thresholding with thresh = 100; adaptive threshold]
The idea of this algorithm is to partition the image into smaller subimages and then calculate a different threshold for each subimage. This approach might lead to subimages having simpler histograms which will usually generate better results for images with uneven illumination.
OpenCV provides a function to perform adaptive thresholding:
void cv::adaptiveThreshold(
    cv::InputArray src,    // input image (8 bit, single channel)
    cv::OutputArray dst,   // result image
    double maxValue,       // the maximal (nonzero) value that can be assigned to output
    int adaptiveMethod,    // adaptive thresholding algorithm (see Table 2)
    int thresholdType,     // use THRESH_BINARY or THRESH_BINARY_INV only
    int blockSize,         // size of pixel neighborhood, e.g. 3, 5, 7, 9, etc.
    double C               // constant subtracted from the mean or weighted mean;
                           // usually positive but may be 0 or negative as well
);
There are two methods to calculate the threshold from the blockSize * blockSize neighborhood:
1. ADAPTIVE_THRESH_MEAN_C
The threshold value T(x,y) is the mean of the blockSize * blockSize neighborhood of pixel (x,y) minus a constant value C.
2. ADAPTIVE_THRESH_GAUSSIAN_C
The threshold value T(x,y) is a Gaussian-weighted mean of the blockSize * blockSize neighborhood of pixel (x,y) minus a constant value C. Pixel values closer to the center of the neighborhood have a higher weight when calculating the mean value.
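The mean variant can be sketched in a few lines of plain C++. This is an illustration, not OpenCV's implementation: the `GrayImage` struct and the function `adaptiveMeanThreshold` are hypothetical names introduced here, the sketch assumes an odd blockSize of at least 3, and it handles the image border by repeating the nearest edge pixel, which is an assumption of this sketch. The Gaussian variant would follow the same structure with a weighted sum in place of the plain sum.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type for this sketch: 8-bit grayscale, row-major.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> data;  // size == width * height

    // Pixel access with clamped coordinates, so border pixels are repeated.
    std::uint8_t at(int x, int y) const {
        x = std::max(0, std::min(x, width - 1));
        y = std::max(0, std::min(y, height - 1));
        return data[static_cast<std::size_t>(y) * width + x];
    }
};

// Mean adaptive thresholding with THRESH_BINARY semantics:
//   T(x,y)   = mean of the blockSize x blockSize neighborhood of (x,y) - c
//   dst(x,y) = maxValue if src(x,y) > T(x,y), otherwise 0
GrayImage adaptiveMeanThreshold(const GrayImage& src, std::uint8_t maxValue,
                                int blockSize, double c) {
    const int radius = blockSize / 2;
    GrayImage dst;
    dst.width = src.width;
    dst.height = src.height;
    dst.data.resize(src.data.size());

    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            // Sum the neighborhood centered on (x, y).
            double sum = 0.0;
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    sum += src.at(x + dx, y + dy);
                }
            }
            const double threshold = sum / (blockSize * blockSize) - c;
            dst.data[static_cast<std::size_t>(y) * src.width + x] =
                src.at(x, y) > threshold ? maxValue : 0;
        }
    }
    return dst;
}
```

The nested loops keep the sketch easy to follow but cost blockSize * blockSize operations per pixel; a production implementation would typically use a box filter or an integral image instead.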
C++ Code
#include <opencv2/opencv.hpp>
using namespace cv;

// Read the image as 8-bit grayscale
Mat src = imread("threshold.png", IMREAD_GRAYSCALE);
Mat dst;

// Set maxValue, blockSize and c (constant value)
double maxValue = 255;
int blockSize = 9;
double c = 41;

// Adaptive threshold
adaptiveThreshold(src, dst, maxValue, ADAPTIVE_THRESH_GAUSSIAN_C, THRESH_BINARY, blockSize, c);
Results
The following table shows the results of applying adaptive thresholding on the input image with different values.
blockSize = 5, c = 41
blockSize = 7, c = 41
blockSize = 9, c = 41
Multilevel Thresholding
So far we have only discussed thresholding of grayscale images. However, it is also possible to threshold color images. This approach is called multilevel, multiband or simply multi-thresholding, and it is gradually gaining relevance as the number of color documents increases. One approach is to designate a separate threshold for each of the RGB channels and then combine the results with an AND operation.
This reflects the way the camera works and how the data is stored, but it does not correspond to the way people perceive color. Therefore, the HSL, HSV or CMYK color models are used more often. These mostly require more sophisticated thresholding algorithms, which results in higher computational complexity.
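The simple per-channel RGB approach mentioned above can be sketched in plain C++ as follows. The `Rgb` struct and the function `thresholdRgbAnd` are hypothetical names for this illustration, and the three thresholds are assumed to be chosen by hand.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel type for this sketch.
struct Rgb {
    std::uint8_t r;
    std::uint8_t g;
    std::uint8_t b;
};

// Thresholds each channel separately and combines the three binary results
// with a logical AND: a pixel becomes `maxValue` only if all three channels
// exceed their own threshold, otherwise it becomes 0.
std::vector<std::uint8_t> thresholdRgbAnd(const std::vector<Rgb>& src,
                                          Rgb thresh,
                                          std::uint8_t maxValue = 255) {
    std::vector<std::uint8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        const bool above = src[i].r > thresh.r &&
                           src[i].g > thresh.g &&
                           src[i].b > thresh.b;
        dst[i] = above ? maxValue : 0;
    }
    return dst;
}
```

With thresholds of 100 on every channel, a light gray pixel (200, 200, 200) maps to 255, while a magenta pixel (200, 50, 200) maps to 0 because its green channel fails the test.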
These approaches are rather complicated and would be too extensive for this blog post but don’t hesitate to contact us if you have any questions!
This was our introduction to thresholding for mobile OCR. We hope we gave you a good and concise overview of the topic, and that you stay tuned for more!
QUESTIONS? LET US KNOW!
If you have questions, suggestions or feedback on this, please don’t hesitate to reach out to us via Facebook, Twitter or simply via [email protected]! Cheers!