Identify a dewarping algorithm / library #2

slifty · 2020-02-07T20:26:51Z

One of the exports of this project is a de-warped version of each page photographed. This will be used to improve the OCR as well.

This is not out-of-the-box functionality! Let's find some existing libraries and algorithms that do this (ideally in Java, but if there is no other choice it might be OK to use another language and find a way to run it from within the app).

This issue is ultimately a research issue, to capture and log resources as I find them.

slifty · 2020-02-07T21:23:31Z

There does not appear to be a perfect solution for de-warping, and that will be a risk to the project overall (but one we will better understand once the MVP is complete). The two risks are:

1. Computational intensity (since this is for a mobile device).
2. Quality of the final result.

Here is a short thread on the DIY Bookscanner forums from someone who appears to have tried to make exactly what we're talking about making here. The replies point them to a few resources, though I am a bit wary of implementing something completely from scratch based on academic papers.

That thread does note that "there is no single way to do it because it is really just a heuristic. Even the best algorithms produce odd or bogus results with a fair bit of frequency. Especially on pages where the typology does not match the assumptions above. Say, on a map or title page."

Which does raise my concerns -- we may find that some pages simply cannot be reliably scanned / dewarped. Again, MVP will expose that challenge.

Approach A: Modify an existing algorithm

This guide from 2016 and the accompanying code offer what appears to be a fairly compelling de-warping algorithm, though it is in Python. This algorithm takes around 30 seconds to de-warp a page on a 2012 MacBook Pro.
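Since computational intensity is one of the two risks, it may help to time every candidate the same way so the numbers are comparable to the 30-second figure above. Here is a minimal sketch of such a harness. `time_dewarp` and `placeholder_dewarp` are hypothetical names made up for illustration, and the placeholder does no real de-warping; a real candidate would be swapped in for it.

```python
import statistics
import time


def time_dewarp(dewarp_fn, page, repeats=3):
    """Return the median wall-clock seconds that dewarp_fn(page) takes.

    dewarp_fn is any callable that accepts a page image and returns a
    de-warped one. The median keeps one slow outlier run from skewing
    the result.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        dewarp_fn(page)
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


def placeholder_dewarp(page):
    # Hypothetical stand-in for a real algorithm: returns the page unchanged.
    return page


if __name__ == "__main__":
    tiny_page = [[0, 0], [0, 0]]  # stand-in for real pixel data
    seconds = time_dewarp(placeholder_dewarp, tiny_page)
    print(f"median: {seconds:.6f} s")
```

Desktop timings will only be a rough proxy for a phone, so numbers from a harness like this are better read as a relative ranking than as a prediction of on-device speed.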

Approach B: OpenCV

There are a few projects (such as OpenNoteScanner) which appear to use OpenCV to handle de-warping.

This article from 2014 shows an example in Python which could be ported to Java.
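For a sense of how small the non-OpenCV part of this approach is, here is a rough sketch of the bookkeeping a four-point perspective correction usually needs: putting the detected page corners in a consistent order and choosing the output size. This is an assumption about the general technique, not code taken from the article or from OpenNoteScanner, and `order_corners` and `output_size` are hypothetical helper names. It is pure Python so it runs without OpenCV installed; in a real pipeline the ordered corners and size would typically be handed to OpenCV's `getPerspectiveTransform` and `warpPerspective`.

```python
import math


def order_corners(points):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left.

    Assumes image coordinates (y grows downward) and a page that is roughly
    upright; the sum/difference trick breaks down for pages rotated near
    45 degrees.
    """
    if len(points) != 4:
        raise ValueError("expected exactly four corner points")
    top_left = min(points, key=lambda p: p[0] + p[1])      # smallest x + y
    bottom_right = max(points, key=lambda p: p[0] + p[1])  # largest x + y
    top_right = min(points, key=lambda p: p[1] - p[0])     # smallest y - x
    bottom_left = max(points, key=lambda p: p[1] - p[0])   # largest y - x
    return [top_left, top_right, bottom_right, bottom_left]


def output_size(ordered):
    """Pick the output width and height from the longer of each pair of opposite edges."""
    tl, tr, br, bl = ordered
    width = max(math.dist(tl, tr), math.dist(bl, br))
    height = max(math.dist(tl, bl), math.dist(tr, br))
    return round(width), round(height)


if __name__ == "__main__":
    detected = [(100, 0), (0, 50), (0, 0), (100, 50)]  # made-up corner data
    ordered = order_corners(detected)
    print(ordered, output_size(ordered))
```

Note that a four-point transform only corrects perspective on a flat page. It does not remove the curvature near a book's spine, so on its own it is closer to "straightening" than to true de-warping.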

Approach C: TensorFlow

This blog post from 2019 talks about the use of TensorFlow / Machine Learning. Unfortunately they note that "Geometric correction in the second step requires massive computational power, and it is not feasible to conduct it solely on-device at the moment."

slifty · 2020-02-25T21:46:38Z

I spoke with @kfogel on this item and it is understood that (1) dewarping is a preprocessing step that is going to improve the outcome of OCR and (2) neither dewarping nor OCR is a perfectly solved problem.

To that end, we are going to follow the 80/20 rule and see what comes from an initial implementation with the understanding that there will be room for significant improvement, but that improvement should be explored after that first iteration.

kfogel · 2020-04-10T20:52:47Z

Just heard about another project that might have some useful references or code: https://gitlab.com/rstocker/scanner

@slifty, if there's some place (other than this issue) where you'd like me to put information about related projects, please let me know. We could create a separate document in the tree for that, or make a section in an existing document later, or whatever. I don't want these notes to be distracting, I just want to have a place to keep possibly-useful references. Even after we evaluate them, it's good to keep a record of what we evaluated, so that neither we nor others need to retrace those steps later.

kfogel · 2020-05-22T14:55:15Z

"Ask HN: OCR framework for extracting formatted text" has a lot of links too.

slifty · 2020-05-24T17:11:10Z

Awesome, thank you for these @kfogel -- this is a fine place for them for now, and we can make another place for related projects later.

kfogel · 2021-01-03T19:29:59Z

One more: https://github.com/Ethereal-Developers-Inc/OpenScan:

"An open source app that enables users to scan hardcopies of documents or notes and convert it to a PDF file. No ads. No data collection. We respect your privacy."

(They don't say anything about OCR; not sure if that's included, or planned for the roadmap, or just not something they're doing.)

kfogel · 2021-01-19T16:30:12Z

One more: https://wiki.gnome.org/Apps/OCRFeeder

slifty self-assigned this Feb 10, 2020

slifty added the "discussion" label ("The conversation is the point") Feb 25, 2020

slifty mentioned this issue Feb 25, 2020

Populate two initial milestones #5

Open
