
Tone-mapping should be stable when mDCv and cLLi are static #319

Open
palemieux opened this issue Jun 12, 2023 · 25 comments

@palemieux
Contributor

If present, mDCv and cLLi SHALL completely define the tone mapping algorithm used by the decoder when rendering the image to a display.

mDCv and cLLi SHOULD be set. This is particularly important when drawing a temporal sequence of images. If mDCv and cLLi are not set, the tone mapping algorithm can vary over the sequence, resulting in temporal artifacts.
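The stability property being asked for can be illustrated with a sketch (hypothetical API, not from the spec): the decoder derives its tone-curve parameters from the mDCv/cLLi metadata alone, never from per-frame pixel statistics, so every frame sharing the same metadata gets the same mapping.

```python
# Hypothetical sketch (not from the PNG spec): tone-mapping parameters as a
# pure function of the mDCv/cLLi metadata, independent of pixel contents.
from dataclasses import dataclass


@dataclass(frozen=True)
class HdrMetadata:
    max_display_luminance: float  # from mDCv, cd/m^2
    min_display_luminance: float  # from mDCv, cd/m^2
    max_cll: float                # from cLLi, cd/m^2
    max_fall: float               # from cLLi, cd/m^2


def tone_map_params(meta: HdrMetadata, target_peak: float) -> dict:
    """Derive tone-curve parameters from metadata alone.

    Because the result depends only on (meta, target_peak), every frame of
    a sequence carrying the same mDCv/cLLi is mapped identically -- the
    temporal stability this issue asks for.
    """
    source_peak = min(meta.max_display_luminance, meta.max_cll)
    return {"source_peak": source_peak, "target_peak": target_peak}


def render_frame(pixels, meta: HdrMetadata, target_peak: float):
    params = tone_map_params(meta, target_peak)  # NOT a function of pixels
    # ... apply the tone curve to pixels using params ...
    return pixels, params
```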

@palemieux palemieux self-assigned this Jun 12, 2023
@ProgramMax
Collaborator

Just to be clear, I think "...completely define the tone mapping algorithm..." might just mean the decoder's choice of tone mapping is now set, right? I.e. we aren't defining what tone mapping algorithm the decoder should use, just that it should be consistent with itself.

(The intention being that the decoder doesn't change which tone mapping algorithm it uses based on the image, for example...just the metadata.)

@fintelia

Does "SHALL completely define" contradict "Ancillary chunks may be ignored by a decoder."?

@ProgramMax
Collaborator

It shouldn't.
It's more like: when mDCv and cLLi are present, the decoder has everything it needs to make a consistent decision about which tone mapping to use. I think that is the intent of the "shall completely define" phrasing. (And we want to enforce that it will indeed be consistent with itself.)

But those are still ancillary chunks.

@fintelia

If a decoder sees that mDCv and cLLi are present but ignores them and does tone mapping based on the image contents instead, has it violated the spec?

@palemieux
Contributor Author

If a decoder sees that mDCv and cLLi are present but ignores them and does tone mapping based on the image contents instead, has it violated the spec?

I think it should be strongly discouraged at the very least.

@palemieux
Contributor Author

I.e. we aren't defining what tone mapping algorithm the decoder should use. Just that it should be consistent with itself.

Yes, the idea is that the HDR metadata, if present, should set all internal parameters of the algorithm, no matter what they are.

@ProgramMax
Collaborator

It is a little tricky because the HDR metadata in cLLi is itself dependent on the image contents.

I could imagine a scenario where a new tone mapping algorithm is developed and the mDCv and cLLi chunks are not sufficient. So even though they are provided in an image, the decoder might ignore them.

However, that theoretical future decoder will run into the problem that spawned this issue: Two images next to each other or in a sequence like a flipbook might have a jarring, unintentional jump if they use different tone mapping algorithms.

The goal is to make the decoder consistent with itself. It can use the new tone mapping algorithm both times. That's fine. Said another way: So long as the decoder ignores both chunks consistently and always applies its tone mapping algorithm, that is fine.

@palemieux
Contributor Author

Said another way: So long as the decoder ignores both chunks consistently and always applies its tone mapping algorithm, that is fine.

I think it should be a little bit more stringent: it should be possible for the author to strongly hint that multiple images should be tone-mapped ignoring their contents.

@palemieux
Contributor Author

It is a little tricky because the HDR metadata in cLLi is itself dependent on the image contents.

cLLi is for an image sequence, not an individual image.

@svgeesus
Contributor

It is a little tricky because the HDR metadata in cLLi is itself dependent on the image contents.

cLLi is for an image sequence, not an individual image.

It's a good thing the PNG spec defines some terms, so that we have the option of using them consistently.

A PNG datastream consists of a static image and, optionally, a frame-based sequence (which may or may not include the static image as the first frame).
https://w3c.github.io/PNG-spec/#static-and-animated-images
https://w3c.github.io/PNG-spec/#dfn-frame

cLLi is defined on frames. We don't need the term 'image sequence'.

I just noticed that the casual reader might conclude that a static (non-animated) PNG has no frames and thus that they can't use cLLi to define the brightest pixel in a static PNG.

The spec is actually clear on that, because MaxCLL is defined on "the entire playback sequence", but I think we could be a bit more explicit that MaxCLL can indeed be defined on a static PNG.

@simontWork
Contributor

Do we need to define what happens if a user creates an mDCv tag that differs from the reference monitor specified for the colour space? For example, sRGB specifies an 80 nit monitor, Adobe RGB a 160 nit monitor, BT.709 and BT.2020 both have 100 nit defined in BT.2035, and HLG has a variable monitor brightness with a corrective gamma adjustment. How do you actually use the mDCv tag to maintain subjective appearance?
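For the HLG case mentioned above, the corrective gamma adjustment is the one specified in BT.2100: the HLG system gamma scales with the display's nominal peak luminance. A minimal sketch of that formula:

```python
import math


def hlg_system_gamma(peak_luminance_nits: float) -> float:
    """HLG system gamma for a display of the given nominal peak luminance.

    BT.2100 specifies gamma = 1.2 at a 1000 cd/m^2 reference display, with
    the extension gamma = 1.2 + 0.42 * log10(Lw / 1000) for other peaks.
    """
    return 1.2 + 0.42 * math.log10(peak_luminance_nits / 1000.0)
```

So a 2000 nit display uses a slightly higher gamma (about 1.33) and a 500 nit display a lower one (about 1.07), which is how HLG keeps subjective appearance roughly constant across monitor brightnesses.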

@jbowler

jbowler commented Jan 31, 2024

If a decoder sees that mDCv and cLLi are present but ignores them and does tone mapping based on the image contents instead, has it violated the spec?

I think it should be strongly discouraged at the very least.

Then the chunks should be MDCv and CLLI (etc): this is the precise difference between an ancillary chunk and a critical chunk. An ancillary chunk may be ignored, a critical chunk shall not be ignored.

@svgeesus
Contributor

svgeesus commented Feb 5, 2024

this is the precise difference between an ancillary chunk and a critical chunk. An ancillary chunk may be ignored, a critical chunk shall not be ignored.

True, although the PNG spec has always interpreted critical to mean "no image of any sort can be displayed without it", not "needed to display the image correctly". Which is why we have only 4 critical chunks: IHDR, IDAT, IEND and PLTE - and the last of those, as originally defined, was only sort-of-critical, hence the need to add sPLT to fix that.

Notably, chunks which are needed to make the image display correctly, like gAMA (if that is the only colorspace-related info) or iCCP are ancillary. So I certainly don't think mDCv and cLLi need to become critical.

@palemieux
Contributor Author

So I certainly don't think mDCv and cLLi need to become critical.

+1

@jbowler

jbowler commented Feb 5, 2024

So I certainly don't think mDCv and cLLi need to become critical.

+1

I can't see ChrisL's original comment here on GitHub; however, my point was not that the chunks should be critical (they shouldn't) but that unless they are critical all decoders are completely free to ignore them; it's just a QoI issue. At the least, the use of the word SHALL in the OP comment creates a massive conflict in the specification: the chunks are ancillary, but the decoder "shall" use them to control tone mapping. In effect it's making them backdoor critical chunks. Hence my comment.

@svgeesus
Contributor

We have plenty of existing cases where an ancillary chunk includes normative wording (shall, must, should). In general, this seems to mean "you can get some sort of image without this, so it is ancillary" and also "if you use this chunk (readers) or create this chunk (writers) then ...". A few examples:

tRNS

Each entry indicates that pixels of the corresponding palette index shall be treated as having the specified alpha value.

Encoders should set the other bits to 0, and decoders must mask the other bits to 0 before the value is used.

A tRNS chunk shall not appear for color types 4 and 6, since a full alpha channel is already present in those cases.

iCCP

When the iCCP chunk is present, PNG decoders that recognize it and are capable of color management shall ignore the gAMA and cHRM chunks and use the iCCP chunk instead and interpret it according to [ICC].

Unless a cICP chunk exists, a PNG datastream should contain at most one embedded profile, whether specified explicitly with an iCCP or implicitly with an sRGB chunk.

sBIT

Each depth specified in sBIT shall be greater than zero and less than or equal to the sample depth (which is 8 for indexed-color images, and the bit depth given in IHDR for other color types).

tEXt

Newlines in the text string should be represented by a single linefeed character (decimal 10).
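The tRNS masking requirement quoted above ("decoders must mask the other bits to 0 before the value is used") can be illustrated with a one-liner. A sketch; the helper name is mine:

```python
def trns_sample(raw_16bit: int, bit_depth: int) -> int:
    """Mask a tRNS sample down to its significant bits.

    tRNS stores each grayscale/RGB sample in 2 bytes even when the image
    bit depth is lower; only the low bit_depth bits are significant, and
    the spec requires decoders to mask the unused high bits to 0.
    """
    return raw_16bit & ((1 << bit_depth) - 1)
```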

@svgeesus
Contributor

If present, mDCv and cLLi SHALL completely define the tone mapping algorithm used by the decoder when rendering the image to a display.

I just realized that a display system with an ambient light detector (e.g. an HDR TV) whose tone mapping responds to the current HDR headroom would not comply with that requirement. Also, in some cases it is the display, not the decoder, which does the tone mapping, and the display does not know what PNG chunks were in the image (or even that it was a PNG image).

@jbowler

jbowler commented Mar 6, 2024

I just realized that a display system with an ambient light detector

Chris, honestly; what on earth does this have to do with PNG? Like, dude, what on earth does tone mapping have to do with PNG?

@digitaltvguy
Contributor

digitaltvguy commented Mar 6, 2024

I just realized that a display system with an ambient light detector

Chris, honestly; what on earth does this have to do with PNG? Like, dude, what on earth does tone mapping have to do with PNG?

PNGs that are HDR can be displayed on an SDR display (and often will be), so tone mapping is important. This occurs in macOS EDR right now, so you can place HDR and SDR images in different application windows.

@svgeesus
Contributor

svgeesus commented Mar 6, 2024

Chris, honestly; what on earth does this have to do with PNG? Like, dude, what on earth does tone mapping have to do with PNG?

John, honestly, you must have heard of HDR? You may have missed that PNG now supports HDR images as well as SDR ones. Here is an explainer

@simontWork
Contributor

simontWork commented Mar 7, 2024

If present, mDCv and cLLi SHALL completely define the tone mapping algorithm used by the decoder when rendering the image to a display.

I just realized that a display system with an ambient light detector (eg an HDR TV) whose tone mapping responds to the current HDR headroom, would not comply with that requirement. Also, in some cases it is the display not the decoder which does the tone mapping, and the display does not know what PNG chunks were in the image (or even that it was a PNG image).

You may find that even without an ambient light detector, it doesn't comply. TV manufacturers want their products to look different in a showroom, to let consumers choose based on look, and most will need to follow regional power-saving regulations. There are a number of initiatives to get a more standardised look.

As well as HDR to SDR tone-mapping, PQ HDR systems require HDR to HDR tone mapping to adjust a display referred signal to a lower capability monitor, which is where the metadata added to PNG originated.

There was some discussion in the W3C Color on the Web group on a minimum viable tone-mapping and we presented a relatively simple technique for HDR-SDR mapping complete with ambient light adaptation: https://bbc.github.io/w3c-tone-mapping-demo/ This could obviously be built upon, for example a better gamut reduction algorithm could be included, which I know is something that @svgeesus has been investigating.
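For readers unfamiliar with the building blocks, here is a toy sketch of HDR-to-SDR mapping: the ST 2084 PQ EOTF to recover absolute luminance, followed by a naive Reinhard-style compression. This is purely illustrative and is not the BBC technique mentioned above.

```python
# PQ constants from SMPTE ST 2084
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_eotf(e: float) -> float:
    """PQ signal in [0, 1] -> absolute luminance in cd/m^2 (ST 2084 EOTF)."""
    p = e ** (1 / M2)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)


def tone_map_to_sdr(e: float, sdr_peak: float = 203.0) -> float:
    """Toy Reinhard-style HDR->SDR mapping (illustrative only).

    Maps PQ-coded luminance to a [0, 1) SDR signal relative to an assumed
    203 cd/m^2 reference white. NOT the BBC algorithm, and with no gamut
    handling at all.
    """
    n = pq_eotf(e) / sdr_peak
    return n / (1.0 + n)
```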

@digitaltvguy
Contributor

digitaltvguy commented Jan 4, 2025

mDCv and cLLi do not define a tone mapping algorithm. They are INFORMATIVE to the tone mapping that a manufacturer decides to use. The algorithm applied is entirely subjective and determined by each manufacturer for their product. Its purpose was for implementation (tone mapping) on the final display side only. It forms a relationship between the content creation characteristics and the final target display.

Please take a look at SMPTE ST 2067-21 and CTA 861.3. There's no mention of a specific algorithm that will be implemented to tone map.

@jbowler

jbowler commented Jan 4, 2025

Just to clarify. @digitaltvguy is Chris Seeger, right?

@jbowler

jbowler commented Jan 4, 2025

Regardless, I offered this apology to Chris Seeger on the W3C mailing list (open to anyone, I assume, since they let me in). I've omitted most of the apology since it's technically off topic. This is the set of conclusions I came to based on the discussions on the list and, as will be revealed, elsewhere. Like I said at the end, this is the best I can do.

If I put sRGB data into a 2020 (colour primary) container and I don't write an mDCV chunk then I expect no detectable colour shift because the sRGB adapted white point, D65, matches the 2020 cICP(9,) D65 encoding/adopted white point.

If I put wide gamut data with an adapted white point some way away from D65, e.g. D50, into a REC 2020 container I will certainly expect a bad colour shift if I don't include mDCV.  In fact I would require the colour shift; the only adapted white the decoder can assume is the container one, D65.

Why would I do this?  Because I have no choice: We use the tools we are given, not the ones we want.  There are only three ways to include higher dynamic range data in PNG just using publicly defined chunks:
  1. Use a gAMA value of around 1/20 (i.e. a screen gamma of "20"). The precise choice depends on what the aim is. The problem is that while doing this produces a completely conformant PNG, decoders (including libpng) typically cannot handle it. My long and detailed explanation with very precise numbers is here: png_set_cHRM() fails when using ACEScg coordinates pnggroup/libpng#578 (comment)
  2. Use a cICP chunk with the transfer function set to 16 (PQ) and the original PQ depth (record it in sBIT) chosen carefully.  If you go too high you gain no perceptual benefit but the noise zaps the PNG compression (not that 16-bit compression is that good in any case.)
  3. Use a cICP chunk with the transfer function set to 18 (HLG) and sBIT to 10 - at least that seems to be the only defined choice at present.  Be very careful to follow the mandatory instructions in H.273 about scaling to 16 bits.
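As a concrete illustration of options 2 and 3 above (a sketch, assuming the H.273 code points named there and the cICP payload of four one-byte fields), here is how the chunk bytes would be built:

```python
import struct
import zlib


def png_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 over type+data."""
    return (struct.pack(">I", len(data)) + chunk_type + data
            + struct.pack(">I", zlib.crc32(chunk_type + data)))


# Option 2: BT.2020 primaries (9), PQ transfer (16),
# identity matrix / RGB (0), full range (1).
cicp_pq = png_chunk(b"cICP", bytes([9, 16, 0, 1]))

# Option 3: same primaries with the HLG transfer function (18).
cicp_hlg = png_chunk(b"cICP", bytes([9, 18, 0, 1]))
```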
So then I have the transfer function in a chunk that is fully approved by the W3C, the broadcast industry and others! However, now I have to choose from one of the numbers in Table 2. There is very little choice if the data is not only "HDR" but is also wide gamut. I can only see two choices; additions to my list are very welcome, and I haven't checked through the H.273 chromaticities to see whether any others are wide gamut:
  1. 10: CIEXYZ.  Completely complete, every colour of the rainbow and all the others too.  Horribly inefficient as an encoding in PNG but, in fact, that probably improves the 16-bit compression.
  2. 9: REC-2020 again.  Not complete.  The question is whether the overlap in tristimulus colour space (measured in CIELab or CIELuv) is sufficient.  This is a tricky calculation which I haven't done.  I'd be tempted more by CIEXYZ despite the maybe weird adopted white.
I'm not trying to persuade anyone of anything beyond my normal aim, which is to present arguments in the forum. Based on what I have just read and just learned, mDCv at least is essential to cICP in the broader context of all possible valid PNG usage.

This is just my opinion.

@jbowler

jbowler commented Jan 4, 2025

As well as HDR to SDR tone-mapping, PQ HDR systems require HDR to HDR tone mapping to adjust a display referred signal to a lower capability monitor, which is where the metadata added to PNG originated.

I didn't read that. Thank you. This is what I want; I didn't see it because it wasn't a reply to me. Here's the relevant part of the apology that I omitted but which is now clearly on topic (I summarized your response, readers need to read the whole thing, I think you quoted Poynton there :-). Oh, this next quote is political, sensitive souls please tune out, the quoted sentence is someone else's.

I've always assumed that the broadcast industry would want to ensure that original broadcast data was reproduced consistently across all devices so that two devices with the same capabilities would display the same picture. "Most video has existed without both mDCV and cLLI for many years." And inconsistent results existed even though the output device capabilities (television receiver and, indeed monitor) were substantially the same. So I jumped to the conclusion that the ITU had, in fact, formalised tonemapping because everyone can see that all current and, maybe, practical RGB devices can't handle the colour gamut of 2020 (colour primaries 9 in cICP).
