Identify a dewarping algorithm / library #2
Comments
There does not appear to be a perfect solution for de-warping, and that will be a risk to the project overall (but one we will better understand once the MVP is complete). The two risks are:
Here is a short thread on the DIY Bookscanner forums from someone who appears to have tried to make exactly what we're talking about making here. They are pointed to a few resources, though I am a bit wary of going down the path of completely implementing something from scratch based on academic papers. That thread does note that

Which does raise my concerns: we may find that some pages simply cannot be reliably scanned / dewarped. Again, the MVP will expose that challenge.

Approach A: Modify an existing algorithm
This guide from 2016 and the accompanying code offer what appears to be a fairly compelling de-warping algorithm, though it is in Python. This algorithm takes around 30 seconds to de-warp a page on a 2012 MacBook Pro.

Approach B: OpenCV
There are a few projects (such as OpenNoteScanner) which appear to use OpenCV to handle de-warping. This article from 2014 shows an example in Python which could be modified to Java.

Approach C: TensorFlow
This blog post from 2019 talks about the use of TensorFlow / Machine Learning. Unfortunately they note that
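To make Approach B a little more concrete, here is a minimal sketch in plain Java of the simplest case only: a flat page photographed at an angle, corrected by a four-corner perspective transform. It does not handle page curvature, and it is not taken from any of the projects or articles mentioned above. The class and method names (`PerspectiveSketch`, `fromCorners`, `apply`) are hypothetical, and the corner coordinates are assumed to come from some separate page-detection step. In a real app, OpenCV's `getPerspectiveTransform` and `warpPerspective` would likely replace this hand-rolled solver.

```java
/** Sketch: solve for the perspective transform that maps four page corners to a rectangle. */
final class PerspectiveSketch {

    /**
     * Returns coefficients h[0..7] such that each src[i] = {x, y} maps to dst[i] = {u, v} via
     *   u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1)
     *   v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
     */
    static double[] fromCorners(double[][] src, double[][] dst) {
        // Build the augmented 8x9 system [A | b]: two equations per corner pair.
        double[][] a = new double[8][];
        for (int i = 0; i < 4; i++) {
            double x = src[i][0], y = src[i][1];
            double u = dst[i][0], v = dst[i][1];
            a[2 * i]     = new double[] {x, y, 1, 0, 0, 0, -u * x, -u * y, u};
            a[2 * i + 1] = new double[] {0, 0, 0, x, y, 1, -v * x, -v * y, v};
        }

        // Gauss-Jordan elimination with partial pivoting.
        for (int col = 0; col < 8; col++) {
            int pivot = col;
            for (int r = col + 1; r < 8; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) {
                    pivot = r;
                }
            }
            if (Math.abs(a[pivot][col]) < 1e-12) {
                throw new IllegalArgumentException("Degenerate corners (e.g. three on one line)");
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;

            for (int r = 0; r < 8; r++) {
                if (r == col) {
                    continue;
                }
                double factor = a[r][col] / a[col][col];
                for (int c = col; c < 9; c++) {
                    a[r][c] -= factor * a[col][c];
                }
            }
        }

        // The matrix is now diagonal, so each unknown is b[i] / A[i][i].
        double[] h = new double[8];
        for (int i = 0; i < 8; i++) {
            h[i] = a[i][8] / a[i][i];
        }
        return h;
    }

    /** Maps one point (x, y) through the transform described by h. */
    static double[] apply(double[] h, double x, double y) {
        double w = h[6] * x + h[7] * y + 1.0;
        return new double[] {
            (h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w
        };
    }
}
```

To produce an output image with this, one would typically compute the transform from the rectangle back to the photo (swap `src` and `dst`) and sample the photo once per output pixel. Curved pages need a richer model than a single perspective transform, which is where Approaches A and C come in.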
I spoke with @kfogel on this item and it is understood that (1) dewarping is a preprocessing step that is going to improve the outcome of OCR and (2) neither dewarping nor OCR is a perfectly solved problem. To that end, we are going to follow the 80/20 rule and see what comes from an initial implementation with the understanding that there will be room for significant improvement, but that improvement should be explored after that first iteration.
Just heard about another project that might have some useful references or code: https://gitlab.com/rstocker/scanner

@slifty, if there's some place (other than this issue) where you'd like me to put information about related projects, please let me know. We could create a separate document in the tree for that, or make a section in an existing document later, or whatever. I don't want these notes to be distracting, I just want to have a place to keep possibly-useful references. Even after we evaluate them, it's good to keep a record of what we evaluated, so that neither we nor others need to retrace those steps later.
The "Ask HN: OCR framework for extracting formatted text" thread has a lot of links too.
Awesome, thank you for these, @kfogel. This is a fine place for them for now, and we can make another place for related projects later.
One more: https://github.com/Ethereal-Developers-Inc/OpenScan: "An open source app that enables users to scan hardcopies of documents or notes and convert it to a PDF file. No ads. No data collection. We respect your privacy." (They don't say anything about OCR; not sure if that's included, or planned for the roadmap, or just not something they're doing.)
One more: https://wiki.gnome.org/Apps/OCRFeeder
One of the exports of this project is a de-warped version of each page photographed. This will be used to improve the OCR as well.
This is not out-of-the-box functionality! Let's find some existing libraries and algorithms that do this (ideally in Java, but if there is no other choice it might be OK to have it in another language and find a way to run it from within the app).
This issue is ultimately a research issue, to capture and log resources as I find them.
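If the best available algorithm turns out to be in another language, one option for running it from within the app is to launch it as an external process. The following is a rough sketch only. The class name `ExternalDewarpSketch`, the script name `dewarp.py`, and its "input path, output path" argument convention are all hypothetical. It also assumes a platform where the app is allowed to spawn processes and an interpreter is installed, which may not hold on mobile.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Sketch: run a dewarping script written in another language as a child process. */
final class ExternalDewarpSketch {

    /** Builds the command line: interpreter, script, input image, output image. */
    static List<String> buildCommand(String interpreter, Path script, Path input, Path output) {
        return List.of(interpreter, script.toString(), input.toString(), output.toString());
    }

    /**
     * Runs the command and waits up to timeoutSeconds for it to finish.
     * Returns true only if the process exited with status 0 in time.
     */
    static boolean run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .inheritIO() // forward the script's stdout/stderr to the app's own streams
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly(); // give up on a page that takes too long
            return false;
        }
        return process.exitValue() == 0;
    }
}
```

A call might look like `ExternalDewarpSketch.run(ExternalDewarpSketch.buildCommand("python3", Path.of("dewarp.py"), Path.of("page.jpg"), Path.of("page_dewarped.png")), 120)`. The generous timeout reflects the roughly 30 seconds per page reported above for the Python algorithm.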