Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set DPI #54

Merged
merged 5 commits into from
Jul 11, 2019
Merged

Set DPI #54

merged 5 commits into from
Jul 11, 2019

Conversation

bertsky
Copy link
Collaborator

@bertsky bertsky commented Jul 9, 2019

No description provided.

@bertsky bertsky requested a review from wrznr July 9, 2019 13:16
@bertsky
Copy link
Collaborator Author

bertsky commented Jul 9, 2019

I added the fix in crop which prevents negative coordinates when AlternativeImage was already present. Now also fixes #55.

@bertsky bertsky self-assigned this Jul 9, 2019
@bertsky
Copy link
Collaborator Author

bertsky commented Jul 9, 2019

Oh, @kba, am I supposed to use OcrdExif instead of just PIL.Image.info?

@wrznr wrznr self-requested a review July 10, 2019 07:46
@wrznr
Copy link
Contributor

wrznr commented Jul 10, 2019

@bertsky We should choose a definite option for accessing image information in the long run. This could be either exif or PIL.

@wrznr
Copy link
Contributor

wrznr commented Jul 10, 2019

After a little research, I think that the image resolution is stored in JFIF meta data section which both exiftool and PIL.Image.info read:

$ exiftool IMG/FILE_0033_ORIGINAL.jpg 
ExifTool Version Number         : 10.80
File Name                       : FILE_0033_ORIGINAL.jpg
Directory                       : IMG
File Size                       : 329 kB
File Modification Date/Time     : 2019:07:04 17:07:41+02:00
File Access Date/Time           : 2019:07:09 14:40:50+02:00
File Inode Change Date/Time     : 2019:07:04 17:07:46+02:00
File Permissions                : rw-r--r--
File Type                       : JPEG
File Type Extension             : jpg
MIME Type                       : image/jpeg
JFIF Version                    : 1.01
Resolution Unit                 : inches
X Resolution                    : 300
Y Resolution                    : 300
Image Width                     : 1187
Image Height                    : 1687
Encoding Process                : Baseline DCT, Huffman coding
Bits Per Sample                 : 8
Color Components                : 3
Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
Image Size                      : 1187x1687
Megapixels                      : 2.0

vs.

>>> from PIL import Image
>>> Image.open("IMG/FILE_0033_ORIGINAL.jpg")
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1187x1687 at 0x7F7958245C18>
>>> img = Image.open("IMG/FILE_0033_ORIGINAL.jpg")
>>> img.info
{'jfif': 257, 'jfif_version': (1, 1), 'dpi': (300, 300), 'jfif_unit': 1, 'jfif_density': (300, 300)}

@bertsky
Copy link
Collaborator Author

bertsky commented Jul 10, 2019

That's right, but there are more options than JPG here. And it can always be empty. And core's OcrdExif looks like a good abstraction (which does use PIL already, btw). That's why I was asking whether this is recommended and stable (especially the fallback defaults xResolution=1 etc). In that case it should probably be in the spec as well (with a hint that ocrd.OcrdExif already implements it).

@kba
Copy link
Member

kba commented Jul 10, 2019

That's why I was asking whether this is recommended and stable (especially the fallback defaults xResolution=1 etc)

This is the recommended way, though not as stable as I'd wish. The variability of image metadata is immense and PIL.Image.Info exposes the metadata on a relatively low level, so IMHO an abstraction is necessary to handle all the formats, units and bogus metadata.

The xResolution = 1 hacks are in place not too cause division-by-zero errors and be so obviously low to prevent processors from assuming it to be true.

In that case it should probably be in the spec as well (with a hint that ocrd.OcrdExif already implements it).

True, that should be made explicit. We have some information at https://ocr-d.github.io/gt//trans_documentation/exif_pagexml_konkordanz.html from early 2018 when we considered encoding image metadata with MIX.

ocrd_tesserocr/crop.py Outdated Show resolved Hide resolved
ocrd_tesserocr/crop.py Show resolved Hide resolved
ocrd_tesserocr/deskew.py Outdated Show resolved Hide resolved
ocrd_tesserocr/recognize.py Outdated Show resolved Hide resolved
ocrd_tesserocr/segment_line.py Outdated Show resolved Hide resolved
ocrd_tesserocr/segment_word.py Outdated Show resolved Hide resolved
ocrd_tesserocr/segment_region.py Outdated Show resolved Hide resolved
@bertsky
Copy link
Collaborator Author

bertsky commented Jul 10, 2019

In that case it should probably be in the spec as well (with a hint that ocrd.OcrdExif already implements it).

True, that should be made explicit.

But on the other hand, how should this possibly be related to the spec? We cannot teach the implementors how to talk to their underlying processors (in how they should determine and pass down DPI values). We can only make the annotation in PAGE mandatory (i.e. imageXResolution / imageYResolution / imageResolutionUnit).

Then we would of course generously provide a specialized processor that just annotates that (using OcrdExif). And we would of course build it into the basic OcrdPage functions as well, to add those attributes if they are not already there (for everyone who uses core to write their processors, so they do not need the special processor). Thus, any careful and wise processor author can find joy in just relying on the attributes of PageType for her DPI adjustments. She will also have to deal with the question of fallback/default values now: if nothing was annotated, then there is nothing to know, so she decides what to do about it herself.

@kba
Copy link
Member

kba commented Jul 11, 2019

But on the other hand, how should this possibly be related to the spec? We cannot teach the implementors how to talk to their underlying processors (in how they should determine and pass down DPI values). We can only make the annotation in PAGE mandatory (i.e. imageXResolution / imageYResolution / imageResolutionUnit).

We had long discussions on this last year and decided to use the image itself as the single source of truth for all technical image metadata. Duplicating the information from the image in PAGE could lead to metadata drift (e.g. the image changes but the PAGE doesn't reflect that or the annotations are missing in PAGE) whereas the image is always available. In addition, the 2017 PAGE version didn't have the attributes yet.

How this is related to the spec: Different image file formats support different metadata schemes, like EXIF, JFIF or PNG tags. While an abstraction like PIL.Image.Info, it leaks pretty quickly. Implementing this to support the reality of images used in libraries, archives etc. is not only tedious for extracting metadata like dimension or pixel density but also for colorspace conversions or image manipulation.

Originally, the plan was to have an image characterization step that does all the metadata extraction, like you propose, and refuse processing bad images if pixel density too low etc. In reality, it is next to impossible to guess pixel density and the image metadata often cannot be trusted.

That being said, I'm open to encoding this in PAGE instead if that makes life easier for developers and users. This would require a dedicated processor as the first step in any pipeline though, since the input file group will be images.

@cneud @tboenig Any thoughts on this?

@bertsky
Copy link
Collaborator Author

bertsky commented Jul 11, 2019

Related: OCR-D/core#37 (re-open there?)

…t annotation in case AlternativeImage was already present)
@bertsky bertsky merged commit 4a69ba1 into OCR-D:master Jul 11, 2019
@bertsky bertsky deleted the set-dpi branch July 11, 2019 12:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants