Tips for Improving OCR Results
Tesseract is a library for optical character recognition (OCR). It performs best when given a preprocessed image, ideally crisp black text on a pure white background.
The following sections provide some tips on how to preprocess images before running them through Tesseract, to improve both the accuracy and the speed of OCR.
The upstream Tesseract library has a Wiki page on how to improve the quality of OCR results here: https://code.google.com/p/tesseract-ocr/wiki/ImproveQuality
It's worth reading because it explains the kinds of processing Tesseract does and does not do, which is useful in determining what preprocessing to perform on an image.
GPUImage is a fantastic image processing library for iOS that filters images on the GPU, so it's really fast. It even comes with a photo camera and a live video camera that you can use to create a pipeline of one or more filters.
You can preprocess an image for OCR with GPUImage's GPUImageAdaptiveThresholdFilter, which "determines the local luminance around a pixel, then turns the pixel black if it is below that local luminance and white if above. This can be useful for picking out text under varying lighting conditions."
Here's some sample code to get you started:
// Grab the image you want to preprocess
UIImage *inputImage = [UIImage imageNamed:@"my_test_image.jpg"];
// Initialize our adaptive threshold filter
GPUImageAdaptiveThresholdFilter *stillImageFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
stillImageFilter.blurRadiusInPixels = 4.0; // adjust this to tweak the blur radius of the filter, defaults to 4.0
// Retrieve the filtered image from the filter
UIImage *filteredImage = [stillImageFilter imageByFilteringImage:inputImage];
// Give Tesseract the filtered image
tesseract.image = filteredImage;
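To make the quoted description of adaptive thresholding more concrete, here is a minimal CPU sketch in plain C (which also compiles inside an Objective-C file). It is not GPUImage's implementation, which runs on the GPU. The function name `adaptive_threshold`, the row-major 8-bit grayscale buffer layout, and the square averaging window are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch only, not GPUImage's code.
 * Binarize an 8-bit grayscale image (row-major, width * height bytes).
 * Each pixel is compared with the mean luminance of the square window of
 * the given radius around it (clamped at the image borders): pixels darker
 * than their local mean become 0 (black), all others become 255 (white). */
void adaptive_threshold(const uint8_t *src, uint8_t *dst,
                        int width, int height, int radius)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            /* Clamp the window to the image bounds. */
            int y0 = (y - radius < 0) ? 0 : y - radius;
            int y1 = (y + radius >= height) ? height - 1 : y + radius;
            int x0 = (x - radius < 0) ? 0 : x - radius;
            int x1 = (x + radius >= width) ? width - 1 : x + radius;

            /* Sum the luminance of every pixel in the window. */
            long sum = 0;
            long count = 0;
            for (int wy = y0; wy <= y1; wy++) {
                for (int wx = x0; wx <= x1; wx++) {
                    sum += src[(size_t)wy * (size_t)width + (size_t)wx];
                    count++;
                }
            }

            /* pixel < sum / count, rewritten without the division. */
            size_t index = (size_t)y * (size_t)width + (size_t)x;
            long pixel = src[index];
            dst[index] = (pixel * count < sum) ? 0 : 255;
        }
    }
}
```

Because each pixel is judged against its own neighborhood rather than one global cutoff, dark text stays black even where a shadow or uneven lighting darkens part of the page. A larger radius averages over a wider area, which is roughly the role the blur radius plays in the filter above.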
By default, Tesseract applies Otsu's thresholding method to every image as a pre-processing step of the recognition process.
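For readers unfamiliar with Otsu's method, the following is a textbook sketch in plain C, not Tesseract's actual implementation. It picks the single global threshold that maximizes the between-class variance of the "dark" and "bright" pixel groups. The function name `otsu_threshold` and the flat 8-bit grayscale buffer are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch only, not Tesseract's code.
 * Return a global threshold t for an 8-bit grayscale image such that
 * pixels with value <= t form the dark class and the rest the bright class.
 * t is chosen to maximize the between-class variance (Otsu's criterion). */
int otsu_threshold(const uint8_t *pixels, size_t count)
{
    /* Build the luminance histogram. */
    size_t hist[256] = {0};
    for (size_t i = 0; i < count; i++) {
        hist[pixels[i]]++;
    }

    /* Sum of all pixel values, used to derive the bright-class mean. */
    double total_sum = 0.0;
    for (int v = 0; v < 256; v++) {
        total_sum += (double)v * (double)hist[v];
    }

    double dark_sum = 0.0;
    double best_variance = -1.0;
    size_t dark_count = 0;
    int best_t = 0;

    for (int t = 0; t < 256; t++) {
        dark_count += hist[t];
        dark_sum += (double)t * (double)hist[t];
        if (dark_count == 0) {
            continue;               /* no dark pixels yet */
        }
        size_t bright_count = count - dark_count;
        if (bright_count == 0) {
            break;                  /* every pixel is already in the dark class */
        }

        double dark_mean = dark_sum / (double)dark_count;
        double bright_mean = (total_sum - dark_sum) / (double)bright_count;
        double diff = dark_mean - bright_mean;

        /* Between-class variance, up to a constant factor of 1 / count^2. */
        double variance = (double)dark_count * (double)bright_count * diff * diff;
        if (variance > best_variance) {
            best_variance = variance;
            best_t = t;
        }
    }
    return best_t;
}
```

Because one threshold is applied to the whole image, this approach tends to work well on evenly lit scans and less well on photos with shadows or gradients, which is where a local (adaptive) threshold can help.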
If you've already performed your own preprocessing and thresholding (as with the GPUImage code above), you will probably want to bypass Tesseract's internal thresholding step. You can do that with the preprocessedImageForTesseract:sourceImage: Tesseract delegate method. If implemented, that method is called before the internal thresholder, and as long as it returns an image, the internal thresholder does not run.
To skip the internal thresholding step, change the GPUImage code above as follows:
// Somewhere in a method of your class (the class should conform to G8TesseractDelegate)
// Set the delegate
tesseract.delegate = self;
// give the original, non-processed image to Tesseract
tesseract.image = [UIImage imageNamed:@"my_test_image.jpg"];
// Tesseract delegate method inside of your class
- (UIImage *)preprocessedImageForTesseract:(G8Tesseract *)tesseract sourceImage:(UIImage *)sourceImage {
    // sourceImage is the same image you sent to Tesseract above
    UIImage *inputImage = sourceImage;
    // Initialize our adaptive threshold filter
    GPUImageAdaptiveThresholdFilter *stillImageFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
    stillImageFilter.blurRadiusInPixels = 4.0; // adjust this to tweak the blur radius of the filter, defaults to 4.0
    // Retrieve the filtered image from the filter
    UIImage *filteredImage = [stillImageFilter imageByFilteringImage:inputImage];
    // Give the filteredImage to Tesseract instead of the original one,
    // allowing us to bypass the internal thresholding step.
    // filteredImage will be sent immediately to the recognition step
    return filteredImage;
}