Extract plate number #1

robmarkcole · 2021-01-08T05:27:32Z

Hi, great work on the model! The next step is to extract the plate number. I tried AWS Textract, which works well.

I also tried Tesseract OCR, which is open source and can run locally, but it failed to read the plate. You can try both using https://github.com/robmarkcole/text-insights-app

Possibly something as simple as OpenCV is worth a try too.
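
For anyone who wants to try the Textract route mentioned above, here is a minimal sketch, not the code used in this thread. It assumes `boto3` is installed and AWS credentials are configured. The helper names and the `min_confidence` cut-off of 80 are made up for illustration.

```python
def plate_candidates(response, min_confidence=80.0):
    """Collect the text of LINE blocks from a Textract DetectDocumentText response.

    min_confidence is an arbitrary illustrative cut-off, not a recommended value.
    """
    lines = []
    for block in response.get("Blocks", []):
        # Textract also returns PAGE and WORD blocks; only whole lines are kept here
        if block.get("BlockType") != "LINE":
            continue
        if block.get("Confidence", 0.0) < min_confidence:
            continue
        lines.append(block["Text"])
    return lines


def read_text_with_textract(image_bytes):
    """Send raw image bytes (PNG or JPEG) to Textract and return the detected lines."""
    import boto3  # imported lazily so plate_candidates() works without boto3

    client = boto3.client("textract")
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    return plate_candidates(response)
```

Passing the cropped plate rather than the whole photo should leave fewer unrelated lines to filter out.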

odd86 · 2021-01-08T09:11:10Z

Hello, let me give you some nice Python code ;)

You should also check the font of your licence plates.
Here in Norway they use Myriad Pro, so I use the Language Independent Training pack.
Once you know the font, download the right Tesseract model here: Tesseract Language Packs

Take your language pack and copy it to C:\Program Files\Tesseract-OCR\tessdata
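
A quick way to confirm the pack landed where Tesseract will look for it: the value passed as `lang` must match a `<lang>.traineddata` file in the tessdata folder. This is only a sketch using the standard library; the function names are made up for illustration.

```python
from pathlib import Path


def installed_language_packs(tessdata_dir):
    """List the language names available in a tessdata folder."""
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))


def find_language_pack(lang, tessdata_dir):
    """Return the path of <lang>.traineddata in tessdata_dir, or None if it is missing."""
    candidate = Path(tessdata_dir) / f"{lang}.traineddata"
    return candidate if candidate.is_file() else None
```

For example, `find_language_pack("ENG6", r"C:\Program Files\Tesseract-OCR\tessdata")` returns `None` if the file was copied to the wrong place or saved under a different name.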

First I crop the image to the coordinates of the licence plate, then I pass that cropped image to Tesseract.

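    from io import BytesIO
    from PIL import Image

    def _get_cropped_image(coordinates, image, image_name=""):
        # `image` holds the encoded image as raw bytes;
        # `coordinates` is the bounding box of the detected plate
        image = Image.open(BytesIO(image))
        y_max = coordinates["y_max"]
        y_min = coordinates["y_min"]
        x_max = coordinates["x_max"]
        x_min = coordinates["x_min"]
        # PIL expects the crop box as (left, upper, right, lower)
        cropped = image.crop((x_min, y_min, x_max, y_max))
        byte_image = BytesIO()
        cropped.save(byte_image, "PNG")
        if image_name:
            # Keep a copy on disk so the OCR step can read it back
            cropped.save(f"{image_name}.png")
        # Return the PNG bytes so the caller can use the crop directly
        return byte_image.getvalue()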

    import re

    import cv2
    import pytesseract

    pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe" # Your path to tesseract.exe

    def _read_licence_plate():
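        # Load as grayscale (flag 0) and upscale 3x so the characters are large enough for OCR
        img = cv2.imread("licence-plate.png", 0)
        gray = cv2.resize(img, None, fx=3, fy=3, interpolation=cv2.INTER_CUBIC)
        blur = cv2.GaussianBlur(gray, (5, 5), 0)
        gray = cv2.medianBlur(gray, 3)
        # Otsu picks the threshold automatically; inverting turns dark characters into white foreground
        ret, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)
        # Dilate slightly so each character forms one connected blob
        rect_kern = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
        dilation = cv2.dilate(thresh, rect_kern, iterations=1)
        contours, hierarchy = cv2.findContours(dilation, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
        # Sort the contours left to right by the x coordinate of their bounding box
        sorted_contours = sorted(contours, key=lambda ctr: cv2.boundingRect(ctr)[0])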
        im2 = gray.copy()
        plate_num = ""
        i = 0
        for cnt in sorted_contours:
            try:
                x, y, w, h = cv2.boundingRect(cnt)
                height, width = im2.shape
                if height / float(h) > 7:
                    continue

                ratio = h / float(w)
                if ratio < 1.2 or ratio > 3.77:
                    continue

                area = h * w
                if width / float(w) > 22:
                    continue
                    
                if area < 70:
                    continue
                    
                # Pad the box by 5 px, clamping at 0 so a negative index cannot wrap around
                roi1 = thresh[max(y - 5, 0):y + h + 5, max(x - 5, 0):x + w + 5]
                roi2 = cv2.bitwise_not(roi1)
                roi3 = cv2.medianBlur(roi2, 5)
                
                whitelist = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" if i < 3 else "0123456789" # <- Make a smart whitelist
                text = pytesseract.image_to_string(
                    roi3,
                    lang="ENG6", # Insert the name of your language pack
                    config=f'-c tessedit_char_whitelist={whitelist} --psm 10 --oem 3'
                )
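                # Tesseract ends its output with a form feed ('\f'); strip it along with newlines
                plate_num += text.replace('\n', '').replace('\f', '')
                i += 1
            except Exception as e:
                # Log the error and move on to the next contour
                print(e)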
        is_reg = re.findall(r'[a-zA-Z]{2}[0-9]{5}', plate_num) # <- Make the regex pick out licenceplate numbers based on your country
        if len(is_reg) > 0:
            print(is_reg[0])
            return is_reg[0]
        return False
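
The size checks inside the loop above are the part most likely to need tuning for other plate formats. One way to make that easier is to pull them into a standalone function, sketched below with the same thresholds as the code above. The function and parameter names are made up for illustration; the values are the ones from the snippet, not general recommendations.

```python
def looks_like_character(box, image_shape, *, max_height_ratio=7,
                         min_aspect=1.2, max_aspect=3.77,
                         max_width_ratio=22, min_area=70):
    """Decide whether a bounding box could be a single plate character.

    box is (x, y, w, h) as returned by cv2.boundingRect;
    image_shape is (height, width) of the grayscale plate image.
    """
    x, y, w, h = box
    height, width = image_shape
    if w <= 0 or h <= 0:
        return False
    # Too short compared with the plate height: probably noise or a bolt
    if height / h > max_height_ratio:
        return False
    # Characters are taller than they are wide, within limits
    aspect = h / w
    if aspect < min_aspect or aspect > max_aspect:
        return False
    # Too narrow compared with the plate width
    if width / w > max_width_ratio:
        return False
    # Too small overall
    return h * w >= min_area
```

Inside the loop this would replace the four `continue` checks with `if not looks_like_character(cv2.boundingRect(cnt), im2.shape): continue`, and the thresholds can then be adjusted per country without touching the OCR code.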

robmarkcole · 2021-01-10T08:18:12Z

@odd86 I spent a bit of time experimenting with OpenCV and Tesseract and concluded that this solution requires a lot of fine tuning, e.g. of the blur parameters. I am interested in a neural net approach that is robust and doesn't require fine tuning. Are you aware of any?

odd86 · 2021-01-10T10:31:00Z

Yeah, I ended up making my own neural net trained model for Tesseract 4.
Download the model

This model was trained for 1 million iterations over a dataset of 800 images of Norwegian licence plates.
If you make a dataset for other plates it would be fun to include them!

I just made a crawler that got all images of cars for sale on our main Norwegian car sales site and checked them for plates.
After the plates were cropped out, I ran them through some of the previous Tesseract models, and last I looked over the dataset to fix errors.
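
For anyone wanting to build a similar dataset, here is a sketch of the labelling step. It assumes the tesstrain convention, where each line image sits next to a plain-text transcription with the same name and a `.gt.txt` extension. Whether that tool was used for the model above is not stated here, and the function name and file names are made up for illustration.

```python
from pathlib import Path


def write_ground_truth(labels, ground_truth_dir):
    """Write one .gt.txt transcription next to each cropped plate image.

    labels maps an image file name (e.g. "plate_0001.png") to the plate text.
    Returns the names of images listed in labels but not found on disk.
    """
    out = Path(ground_truth_dir)
    missing = []
    for image_name, text in sorted(labels.items()):
        image_path = out / image_name
        if not image_path.is_file():
            missing.append(image_name)
            continue
        # plate_0001.png -> plate_0001.gt.txt, a single line of text
        gt_path = image_path.with_suffix(".gt.txt")
        gt_path.write_text(text.strip() + "\n", encoding="utf-8")
    return missing
```

The labels can start out as the output of an earlier model and then be corrected by hand, as described above.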

So now I just send the image to Tesseract and don't need to do all the tweaking.
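
With a dedicated model the whole pipeline can shrink to one OCR call plus a format check. The sketch below is an assumption about how that might look, not the exact code in use. The `--psm 7` (single text line) and `--oem 1` (LSTM engine) settings are guesses, and `lang` must be the name of your own `.traineddata` file. The OCR function is passed in as a parameter so the clean-up can be tried without Tesseract installed.

```python
import re

# Two letters followed by five digits, the same shape as the regex earlier in the thread
PLATE_PATTERN = re.compile(r"[A-Z]{2}[0-9]{5}")


def _tesseract_ocr(image, lang):
    """Default OCR backend; needs pytesseract and a Tesseract install."""
    import pytesseract  # imported lazily so read_plate() can be tested without it

    return pytesseract.image_to_string(image, lang=lang, config="--psm 7 --oem 1")


def read_plate(image, lang, ocr=_tesseract_ocr):
    """OCR a cropped plate image and return the plate number, or None if no match."""
    raw = ocr(image, lang)
    # Drop spaces, newlines, the trailing form feed and any other stray symbols
    cleaned = re.sub(r"[^A-Z0-9]", "", raw.upper())
    match = PLATE_PATTERN.search(cleaned)
    return match.group(0) if match else None
```

Adapting this to another country should only require changing `PLATE_PATTERN`.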

robmarkcole · 2021-01-12T05:57:03Z

Very nice! Re the custom model for Tesseract 4: is there a nice article I can follow to reproduce this with UK number plates? It would be good to document this somewhere so people can make dedicated models for their own country.

Themrpie · 2021-01-23T21:56:16Z

Hello! I'm thinking about building a licence plate detector to use on DeepStack, so it was nice to find your model.
But I also need to be able to read the licence plate. Can you comment on the accuracy of Tesseract 4?
I'm not sure whether to follow that path or to train each letter as a category, which would be a lot more work, but I'm looking for a reliable solution.
Thanks

odd86 · 2021-02-01T09:17:53Z

Hello @Themrpie.
Read my answer about that here:
https://forum.deepstack.cc/t/licence-plate-reader/687/10

odd86 added the documentation (Improvements or additions to documentation) label Jan 8, 2021

odd86 closed this as completed Aug 24, 2021
