Over the past week I have explored the topic of automatic license plate recognition (LPR). While it is basically a solved problem, the area of image processing has always intrigued me, and the plethora of literature made LPR seem a good place to start. Take note: I don't know anything about image recognition! What is an integral image? What is a white top-hat transformation? Until recently I had no idea.
OpenCV is clearly the library of choice for image recognition. While it is battle-tested and has bindings in many major languages, the two available Haskell bindings (Haskell is my language of choice) are under-resourced and incomplete. By selecting Haskell I knowingly set myself up to do more basic library work and potentially make less progress on the task at hand. Ultimately, I selected the CV library, grabbed a few papers, and got to work.
My pipeline is informed by the papers I perused. Plate localization applies an Otsu threshold, removes blobs with very large areas, applies a white top-hat, removes blobs whose area falls outside a given range, dilates, and then filters by shape (keeping rectangular blobs). Potential plates are then extracted and cleaned by filtering on area, on width and height relative to nearby objects, and on the number of blobs. Finally, I segment the blobs (sorted left to right) and send each one individually to the GNU optical character recognizer (GOCR).
An example of a successful application starts with a typical image of a car in traffic:
We then see how thresholding helps (while assuming the plate is dark numbers on a light background):
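For readers who, like me, had not met Otsu's method before, here is a minimal sketch of how the threshold can be chosen. It is plain Python over a flat list of 8-bit grayscale values, not the Haskell CV library call used in the pipeline; the function name `otsu_threshold` and the `levels` parameter are hypothetical names chosen for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels <= t form the dark class, pixels > t the light class.
    `pixels` is assumed to be a flat list of ints in range(levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * n for i, n in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of intensities in the dark class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var = w0 * w1 * (mean0 - mean1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


# Usage: mark light pixels, matching the dark-on-light assumption.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
light = [p > t for p in pixels]
```

The appeal of the method is that it needs no tuning: the threshold comes from the histogram alone, which is why it tends to be the first stage of a pipeline like this one.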
The white top-hat reduces many of the objects to skeletons, breaking some apart so we can consider them as separate blobs. This is important for some images that, when thresholded, connect a letter or two from the plate to the body of the car.
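The white top-hat is the image minus its morphological opening (an erosion followed by a dilation), so only structures too thin to survive the opening remain. The sketch below illustrates this in plain Python on a 2D list of gray values. It assumes a square structuring element of radius `r` and clips the window at the image border; both are my assumptions for illustration, not necessarily what the CV library does.

```python
def _window_op(img, r, op):
    """Apply op (min or max) over a (2r+1)x(2r+1) window, clipped at borders."""
    h, w = len(img), len(img[0])
    return [[op(img[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1)))
             for x in range(w)]
            for y in range(h)]


def erode(img, r):
    return _window_op(img, r, min)


def dilate(img, r):
    return _window_op(img, r, max)


def white_top_hat(img, r):
    """Original minus opening: keeps bright details smaller than the element."""
    opened = dilate(erode(img, r), r)
    return [[p - o for p, o in zip(row, opened_row)]
            for row, opened_row in zip(img, opened)]


# Usage: a 3x3 bright block and an isolated bright pixel on a dark background.
img = [[0] * 8 for _ in range(7)]
for y in range(2, 5):
    for x in range(1, 4):
        img[y][x] = 9
img[3][6] = 9
result = white_top_hat(img, 1)
```

With a radius of 1, the block survives the opening and is subtracted away, while the isolated pixel does not survive and is therefore kept.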
The size-filter gives us:
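A size filter of this kind can be sketched as connected-component labeling followed by an area test. The code below is an illustrative plain-Python version on a 2D list of 0/1 values; it assumes 4-connectivity, and the names `find_blobs` and `filter_by_area` as well as the area bounds are hypothetical.

```python
from collections import deque


def find_blobs(binary):
    """Return a list of blobs, each a list of (row, col) pixels (4-connected)."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, blob = deque([(y, x)]), []
            while queue:  # breadth-first flood fill from the seed pixel
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs


def filter_by_area(binary, min_area, max_area):
    """Keep only blobs whose pixel count lies in [min_area, max_area]."""
    out = [[0] * len(binary[0]) for _ in binary]
    for blob in find_blobs(binary):
        if min_area <= len(blob) <= max_area:
            for y, x in blob:
                out[y][x] = 1
    return out
```

Calling `filter_by_area(image, 2, 6)`, for example, would drop both single-pixel noise and large regions such as the body of the car, leaving letter-sized blobs.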
Now dilation allows us to connect the individual letters into a single blob. Each area that shows up as white in the dilated image is checked to see whether it contains between 3 and 30 separate blobs (in the thresholded image). If so, I consider that portion of the image a potential plate.
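The dilate-then-count step can be sketched as follows: label the blobs in the thresholded image, label the regions in the dilated image, and count how many distinct blobs fall inside each region. This is an illustrative plain-Python version under my own assumptions (4-connectivity, a square structuring element of radius `r`, hypothetical function names); only the 3 to 30 range comes from the pipeline itself.

```python
def label(binary):
    """Return (labels, count): 4-connected labels from 1, with 0 as background."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or labels[y][x]:
                continue
            count += 1
            labels[y][x] = count
            stack = [(y, x)]
            while stack:  # depth-first flood fill
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        stack.append((ny, nx))
    return labels, count


def dilate(binary, r):
    """Binary dilation with a (2r+1)x(2r+1) square, clipped at the borders."""
    h, w = len(binary), len(binary[0])
    return [[int(any(binary[yy][xx]
                     for yy in range(max(0, y - r), min(h, y + r + 1))
                     for xx in range(max(0, x - r), min(w, x + r + 1))))
             for x in range(w)]
            for y in range(h)]


def candidate_regions(binary, r, lo=3, hi=30):
    """Map each dilated region holding lo..hi blobs to its blob count."""
    blob_labels, _ = label(binary)
    region_labels, _ = label(dilate(binary, r))
    blobs_in_region = {}
    for y, row in enumerate(blob_labels):
        for x, blob in enumerate(row):
            if blob:
                region = region_labels[y][x]
                blobs_in_region.setdefault(region, set()).add(blob)
    return {region: len(blobs)
            for region, blobs in blobs_in_region.items()
            if lo <= len(blobs) <= hi}
```

Because dilation only ever adds pixels, every original blob lies entirely inside exactly one dilated region, so counting distinct blob labels per region is enough.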
With just a little more processing, such as comparing the sizes of the blobs to make sure they are approximately the same height, we end up with two blobs. After cleaning, here they are:
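One way to express the "approximately the same height" check is to compare each blob's bounding-box height against the median height. The sketch below is plain Python and only illustrative: the `(x, y, width, height)` tuple layout, the use of the median, and the 20% tolerance are all assumptions of mine rather than values taken from the pipeline.

```python
from statistics import median


def similar_height(boxes, tolerance=0.2):
    """Keep boxes whose height is within `tolerance` of the median height.

    `boxes` is a list of (x, y, width, height) bounding boxes. The result
    is sorted left to right, ready to be passed to the OCR one at a time.
    """
    if not boxes:
        return []
    typical = median(box[3] for box in boxes)
    kept = [box for box in boxes
            if abs(box[3] - typical) <= tolerance * typical]
    return sorted(kept, key=lambda box: box[0])


# Usage: three letter-sized boxes, one oversized box, and one speck.
boxes = [(40, 10, 8, 20), (10, 11, 8, 21), (25, 10, 8, 19),
         (60, 5, 30, 60), (5, 30, 3, 4)]
letters = similar_height(boxes)
```

Using the median rather than the mean keeps a single oversized blob from dragging the reference height away from the letters.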
This license plate is properly segmented and GOCR tells me the plate is "F669'", or "fY_ 68_ J" if given the letters all at once.