Set DPI #54

bertsky · 2019-07-09T13:14:20Z

No description provided.

…cropped

bertsky · 2019-07-09T15:48:11Z

I added the fix in crop which prevents negative coordinates when AlternativeImage was already present. Now also fixes #55.

bertsky · 2019-07-09T20:16:46Z

Oh, @kba, am I supposed to use OcrdExif instead of just PIL.Image.info?

wrznr · 2019-07-10T07:58:40Z

@bertsky We should choose a definite option for accessing image information in the long run. This could be either exif or PIL.

wrznr · 2019-07-10T09:57:06Z

After a little research, I think that the image resolution is stored in JFIF meta data section which both exiftool and PIL.Image.info read:

$ exiftool IMG/FILE_0033_ORIGINAL.jpg 
ExifTool Version Number         : 10.80
File Name                       : FILE_0033_ORIGINAL.jpg
Directory                       : IMG
File Size                       : 329 kB
File Modification Date/Time     : 2019:07:04 17:07:41+02:00
File Access Date/Time           : 2019:07:09 14:40:50+02:00
File Inode Change Date/Time     : 2019:07:04 17:07:46+02:00
File Permissions                : rw-r--r--
File Type                       : JPEG
File Type Extension             : jpg
MIME Type                       : image/jpeg
JFIF Version                    : 1.01
Resolution Unit                 : inches
X Resolution                    : 300
Y Resolution                    : 300
Image Width                     : 1187
Image Height                    : 1687
Encoding Process                : Baseline DCT, Huffman coding
Bits Per Sample                 : 8
Color Components                : 3
Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
Image Size                      : 1187x1687
Megapixels                      : 2.0

vs.

>>> from PIL import Image
>>> Image.open("IMG/FILE_0033_ORIGINAL.jpg")
<PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1187x1687 at 0x7F7958245C18>
>>> img = Image.open("IMG/FILE_0033_ORIGINAL.jpg")
>>> img.info
{'jfif': 257, 'jfif_version': (1, 1), 'dpi': (300, 300), 'jfif_unit': 1, 'jfif_density': (300, 300)}

bertsky · 2019-07-10T10:23:25Z

That's right, but there are more options than JPG here. And it can always be empty. And core's OcrdExif looks like a good abstraction (which does use PIL already, btw). That's why I was asking whether this is recommended and stable (especially the fallback defaults xResolution=1 etc). In that case it should probably be in the spec as well (with a hint that ocrd.OcrdExif already implements it).

kba · 2019-07-10T11:53:56Z

That's why I was asking whether this is recommended and stable (especially the fallback defaults xResolution=1 etc)

This is the recommended way, though not as stable as I'd wish. The variability of image metadata is immense and PIL.Image.Info exposes the metadata on a relatively low level, so IMHO an abstraction is necessary to handle all the formats, units and bogus metadata.

The xResolution = 1 hacks are in place not too cause division-by-zero errors and be so obviously low to prevent processors from assuming it to be true.

In that case it should probably be in the spec as well (with a hint that ocrd.OcrdExif already implements it).

True, that should be made explicit. We have some information at https://ocr-d.github.io/gt//trans_documentation/exif_pagexml_konkordanz.html from early 2018 when we considered encoding image metadata with MIX.

ocrd_tesserocr/crop.py

ocrd_tesserocr/deskew.py

ocrd_tesserocr/recognize.py

ocrd_tesserocr/segment_line.py

ocrd_tesserocr/segment_word.py

ocrd_tesserocr/segment_region.py

bertsky · 2019-07-10T21:03:16Z

In that case it should probably be in the spec as well (with a hint that ocrd.OcrdExif already implements it).

True, that should be made explicit.

But on the other hand, how should this possibly be related to the spec? We cannot teach the implementors how to talk to their underlying processors (in how they should determine and pass down DPI values). We can only make the annotation in PAGE mandatory (i.e. imageXResolution / imageYResolution / imageResolutionUnit).

Then we would of course generously provide a specialized processor that just annotates that (using OcrdExif). And we would of course build it into the basic OcrdPage functions as well, to add those attributes if they are not already there (for everyone who uses core to write their processors, so they do not need the special processor). Thus, any careful and wise processor author can find joy in just relying on the attributes of PageType for her DPI adjustments. She will also have to deal with the question of fallback/default values now: if nothing was annotated, then there is nothing to know, so she decides what to do about it herself.

kba · 2019-07-11T10:29:48Z

But on the other hand, how should this possibly be related to the spec? We cannot teach the implementors how to talk to their underlying processors (in how they should determine and pass down DPI values). We can only make the annotation in PAGE mandatory (i.e. imageXResolution / imageYResolution / imageResolutionUnit).

We had long discussions on this last year and decided to use the image itself as the single source of truth for all technical image metadata. Duplicating the information from the image in PAGE could lead to metadata drift (e.g. the image changes but the PAGE doesn't reflect that or the annotations are missing in PAGE) whereas the image is always available. In addition, the 2017 PAGE version didn't have the attributes yet.

How this is related to the spec: Different image file formats support different metadata schemes, like EXIF, JFIF or PNG tags. While an abstraction like PIL.Image.Info, it leaks pretty quickly. Implementing this to support the reality of images used in libraries, archives etc. is not only tedious for extracting metadata like dimension or pixel density but also for colorspace conversions or image manipulation.

Originally, the plan was to have an image characterization step that does all the metadata extraction, like you propose, and refuse processing bad images if pixel density too low etc. In reality, it is next to impossible to guess pixel density and the image metadata often cannot be trusted.

That being said, I'm open to encoding this in PAGE instead if that makes life easier for developers and users. This would require a dedicated processor as the first step in any pipeline though, since the input file group will be images.

@cneud @tboenig Any thoughts on this?

bertsky · 2019-07-11T11:06:07Z

Related: OCR-D/core#37 (re-open there?)

…t annotation in case AlternativeImage was already present)

…crd_tesserocr#54

bertsky added 2 commits July 9, 2019 15:09

deskew: relax assertion

41cfa12

set parameter user_defined_dpi to DPI value from source image tags

6b5c1ff

bertsky requested a review from wrznr July 9, 2019 13:16

wrznr approved these changes Jul 9, 2019

View reviewed changes

deskew: omit 'cropped' from comments unless the segment was actually …

01c0208

…cropped

bertsky self-assigned this Jul 9, 2019

wrznr self-requested a review July 10, 2019 07:46

wrznr approved these changes Jul 10, 2019

View reviewed changes

kba requested changes Jul 10, 2019

View reviewed changes

bertsky force-pushed the set-dpi branch from 2896d24 to ba7eb5d Compare July 11, 2019 08:26

bertsky requested a review from kba July 11, 2019 08:28

crop: create AlternativeImage when successful (to prevent inconsisten…

433f326

…t annotation in case AlternativeImage was already present)

bertsky force-pushed the set-dpi branch from ba7eb5d to 433f326 Compare July 11, 2019 11:22

make DPI calculation based on OcrdExif abstraction instead of PIL

e9c95ba

kba added a commit to kba/ocrd-core that referenced this pull request Jul 11, 2019

Improve pixel density logic in OcrdExif, OCR-D#256, OCR-D#37, OCR-D/o…

76181e7

…crd_tesserocr#54

bertsky added a commit to bertsky/core that referenced this pull request Jul 11, 2019

Improve pixel density logic in OcrdExif, OCR-D#256, OCR-D#37, OCR-D/o…

ef37dc4

…crd_tesserocr#54

bertsky mentioned this pull request Jul 11, 2019

Improve pixel density logic in OcrdExif OCR-D/core#257

Merged

kba approved these changes Jul 11, 2019

View reviewed changes

bertsky merged commit 4a69ba1 into OCR-D:master Jul 11, 2019

bertsky deleted the set-dpi branch July 11, 2019 12:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Set DPI #54

Set DPI #54

bertsky commented Jul 9, 2019

bertsky commented Jul 9, 2019

bertsky commented Jul 9, 2019

wrznr commented Jul 10, 2019

wrznr commented Jul 10, 2019

bertsky commented Jul 10, 2019

kba commented Jul 10, 2019

bertsky commented Jul 10, 2019

kba commented Jul 11, 2019

bertsky commented Jul 11, 2019

Set DPI #54

Set DPI #54

Conversation

bertsky commented Jul 9, 2019

bertsky commented Jul 9, 2019

bertsky commented Jul 9, 2019

wrznr commented Jul 10, 2019

wrznr commented Jul 10, 2019

bertsky commented Jul 10, 2019

kba commented Jul 10, 2019

bertsky commented Jul 10, 2019

kba commented Jul 11, 2019

bertsky commented Jul 11, 2019