OCR for HKID using Python 3 and OpenCV 3 / 基於Python3及OpenCV3 - 香港身份證OCR

What is it?

This project mainly demonstrates my approach to running OCR on a Hong Kong Identity Card (HKID).

I am not perfect, and neither is my code.
Use this code at your own risk. There is NO guarantee that this script can scan everything you send to it.
In other words, it only works on reasonable input. For example, if you pass in an image that is mainly the HKID with a little extra space around the border, that's OK. However, if your HKID sits in a corner of the image while 80% (or more!) of the image contains other stuff, this script will not work.
The key to successful text recognition is a clear image. If your image is NOT clear, for example if the text is blurred or cannot be read easily, this script may not give an accurate result.
The Google API key in the code has been deactivated. Please replace it with your own key.
You are not limited to Google OCR. You can also use Tesseract, which I highly recommend. If you look at my code, you will see that I include a pytesseract call, commented out.
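
The point above about image clarity can be checked before any OCR call. The sketch below is not part of hkid.py. It is a hypothetical pre-check that scores sharpness with the variance of a 4-neighbour Laplacian, written in plain Python over a grayscale image given as a list of pixel rows. The function names and the threshold of 100 are assumptions and would need tuning on real card photos.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Return the variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a list of rows, each a list of intensities (0-255).
    Low values suggest few sharp edges, i.e. a blurry or flat image.
    """
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3 pixels")
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Sum of the four neighbours minus four times the centre pixel.
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    return pvariance(responses)


def looks_sharp(gray, threshold=100.0):
    """Hypothetical gate: True if the image seems clear enough for OCR.

    The threshold is an assumed starting value, not one taken from hkid.py.
    """
    return laplacian_variance(gray) >= threshold
```

With OpenCV, a comparable score is commonly computed as `cv2.Laplacian(gray, cv2.CV_64F).var()` on a grayscale image. Images that fall below the chosen threshold could be rejected with a message asking for a clearer photo.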

python hkid.py -i <image_path> [-d/--debug]

e.g. python hkid.py -i hkid_sample-no-sample.jpg
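
For readers who want to build a similar command-line interface, here is a minimal sketch of how the two documented options could be parsed with argparse. It is an illustration only, not the actual parsing code in hkid.py. The `dest` name `image`, the help texts, and the `build_parser` function are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the documented options: -i <image_path> and -d/--debug."""
    parser = argparse.ArgumentParser(
        prog="hkid.py",
        description="Run OCR on an HKID image (illustrative sketch).",
    )
    # -i is required in the documented usage; the attribute name is an assumption.
    parser.add_argument("-i", dest="image", required=True,
                        metavar="image_path", help="path to the HKID image")
    # -d/--debug is optional; what it enables depends on the real script.
    parser.add_argument("-d", "--debug", action="store_true",
                        help="enable debug mode")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-i", "hkid_sample-no-sample.jpg", "--debug"])
print(args.image, args.debug)
```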

The script returns the recognized text as a JSON-style string, like the one below.

{'result': ['李智能', 'LEE, Chi Nan', '2621 2535 5174', '出生日期Date of Birth', '女F', '01-01-1968', 'k AZ', '簽發日期Date of Issue', '(01-79)']}
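
The sample above uses single quotes, so a strict JSON parser would reject it as printed. The sketch below shows one way a caller might read such output: try `json.loads` first and fall back to `ast.literal_eval`. The helper names and the date pattern (DD-MM-YYYY, inferred from the sample only) are assumptions. The order and content of the list are not guaranteed, so the code searches by pattern instead of by position.

```python
import ast
import json
import re

SAMPLE_OUTPUT = (
    "{'result': ['李智能', 'LEE, Chi Nan', '2621 2535 5174', "
    "'出生日期Date of Birth', '女F', '01-01-1968', 'k AZ', "
    "'簽發日期Date of Issue', '(01-79)']}"
)

# Assumed date format, based only on the sample output.
DATE_RE = re.compile(r"^\d{2}-\d{2}-\d{4}$")


def parse_output(text):
    """Parse the script's printed result into a dict.

    Strict JSON is tried first; a Python-literal parse is the fallback
    for single-quoted output like the sample.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return ast.literal_eval(text)


def find_date_of_birth(lines):
    """Return the first entry that looks like DD-MM-YYYY, or None."""
    for line in lines:
        if DATE_RE.match(line):
            return line
    return None


data = parse_output(SAMPLE_OUTPUT)
print(find_date_of_birth(data["result"]))
```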
