Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose and conserve the color palette for indexed png images. #2485

Merged
merged 14 commits into from
Oct 3, 2023

Conversation

JimBobSquarePants
Copy link
Member

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

This builds upon #2455 and should be merged after.

Changes the following in PngMetadata:

  • Exposes a new ReadonlyMemory<Color> ColorTable property.
  • Merges TransparentRgb24, TransparentRgb48, TransparentL8 and, TransparentL16 into a new nullable TransParentColor Color? property.
  • Removes HasTransparency since we now know that from the palette or transparent color.

On index Png encode, as with Gif if a palette is found in the metadata the quantizer will use that instead.

@JimBobSquarePants
Copy link
Member Author

@SixLabors/core This one next please 🙏

for (int i = 0; i < alpha.Length; i++)
{
ref Color color = ref colorTable[i];
color = color.WithAlpha(alpha[i] / 255F);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If perf is important here, then

Suggested change
color = color.WithAlpha(alpha[i] / 255F);
color = color.WithAlpha(alpha[i] * (1F / 255F));

will produce a mul instead of div.

Even if perf isn't critical here I think that I'd change the code, as

  • it's almost as readable
  • it's not only about the perf of this piece of code, but rather for the whole system, as the cpu-ports can be utilized better when there's no division clogging it

Note: it's Roslyn to blame here for not being able to emit 1 / 255F automatically -- though the JIT could do that optimization too, but it has more stringent timing constraints...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would not be a valid optimization for either compiler to make, as the results are not the same (due to 1/255f not being precisely representable in float). See https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEUCuA7AHwAEAmARgFgAoIgBgAIiyA6AGQEs8BHAbmqIDMjEvQDC9AN7V6M4SKlVZS+gDNoMAIZgAFvQAUnDPQBu9TvQCieHAFsYUDcAA2MZgCUNeAOYw9tNPQkAKxBAJShktLK0ewq+qYA9IEhcQCEALwm9ABU+mQJwUEq4VHRZYxkAJx6ACQARBLGAL4gkonJRU30GW05eQUpoU11oXyKZU2l9JNUTUA===

For that reason, unless it's a hot path, this optimization loses precision for no measurable benefit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as the results are not the same

Yep, but the error is in the region of 1e-8 or less (demo)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that value range, yes, but the magnitude of the error is relative to the magnitude of the value -- that's how floating point works. Whether the precision loss is acceptable in any given case is a question of the trade-off between performance (~12 cycles for divss vs ~4 for mulss) and precision.

I'm not familiar enough with the code to know whether that trade-off is worthwhile here (though it looks questionable). Mostly, I was commenting on your suggestion that Roslyn and RyuJIT were somehow missing an optimization one of them could have made automatically, which is incorrect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which is incorrect

You are right (and I have to apologize here by Roslyn).
I had a look what GCC does (before writing the first comment on this) and it emitted the mul -- but I had -Ofast compiler switch set which disregards strict standards compliance (as I re-read just now). With -O{2,3} it's a div.

Thanks for the heads up!

@@ -875,7 +875,7 @@ private void WriteGammaChunk(Stream stream)
// 4-byte unsigned integer of gamma * 100,000.
uint gammaValue = (uint)(this.gamma * 100_000F);

BinaryPrimitives.WriteUInt32BigEndian(this.chunkDataBuffer.Span.Slice(0, 4), gammaValue);
BinaryPrimitives.WriteUInt32BigEndian(this.chunkDataBuffer.Span[..4], gammaValue);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: the Slice here was IIRC intentional, as it produces better code than the slice. There are issues in Roslyn and runtime to care about that, but for .NET 8 these aren't resolved (when I'm not mistaken).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're just taking the hit here. One day Roslyn will catch up, but we need greater consistency to ease the (massive) maintenance burden. I'm hoping come v4/.NET8 with the SIMD migration improvements I have planned we won't notice codegen issues.

@@ -1175,7 +1177,7 @@ private void WriteChunk(Stream stream, PngChunkType type, Span<byte> data, int o

stream.Write(buffer);

uint crc = Crc32.Calculate(buffer.Slice(4)); // Write the type buffer
uint crc = Crc32.Calculate(buffer[4..]); // Write the type buffer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

@JimBobSquarePants
Copy link
Member Author

@antonfirsov The changes here are based upon the changes made to the gif format API. If you have a chance, I'd love to get your feedback please.

@JimBobSquarePants
Copy link
Member Author

@SixLabors/core Can I get a review here please.

@antonfirsov
Copy link
Member

@JimBobSquarePants will do it by EOW if that's good enough.

@JimBobSquarePants
Copy link
Member Author

@JimBobSquarePants will do it by EOW if that's good enough.

Thanks! Appreciate it.

Copy link
Member

@antonfirsov antonfirsov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few suggestions, but mostly LGTM.

Comment on lines +936 to +940
ReadOnlySpan<Rgb24> rgbTable = MemoryMarshal.Cast<byte, Rgb24>(palette);
for (int i = 0; i < colorTable.Length; i++)
{
colorTable[i] = new Color(rgbTable[i]);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I see several loops like this one in codecs now. We can maybe define a public static Color.FromPixel<TPixel>(Span<TPixel>, Span<Color>) analogous to the bulk ToPixel<TPixel>(...) method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without knowing the component precision of the TPixel we don't know which boxed pixel to assign to. We could do it in v4 though if we added additional information to PixelTypeInfo #2534 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah this indeed goes well beyond this PR, nevermind then.

@JimBobSquarePants JimBobSquarePants merged commit e905b0a into main Oct 3, 2023
7 checks passed
@JimBobSquarePants JimBobSquarePants deleted the js/png-pallete branch October 3, 2023 23:00
@JimBobSquarePants JimBobSquarePants mentioned this pull request Oct 3, 2023
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Signifies a binary breaking change. enhancement formats:png
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants