
Timeline for IANA dictionary registry? #1669

Closed
cldellow opened this issue Jul 2, 2019 · 11 comments

@cldellow

cldellow commented Jul 2, 2019

Hello - apologies if this isn't the best venue for this question. Please redirect me if that's the case!

https://tools.ietf.org/html/rfc8478#section-6.3 alludes to work in progress to provide pre-built dictionaries designed to optimize compressing certain types of content.

I have a use case (compressing many HTML files) that benefits from a dictionary -- even a small dictionary -- trained on HTML files. However, I'm hesitant to define an out-of-band process for distributing the dictionary and to become the steward for such a file, especially if such a standard dictionary may be coming soon, anyway.
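For concreteness, here is a minimal sketch of the effect a shared dictionary has on small documents like these. It uses Python's stdlib `zlib` preset-dictionary support as a stand-in for zstd (the mechanism is analogous), and the boilerplate "dictionary" and sample page are made-up illustrations, not a real trained dictionary:

```python
import zlib

# Hypothetical "dictionary": boilerplate shared by many HTML pages.
# A real zstd dictionary would be trained on a corpus of samples;
# zlib's preset-dictionary support just illustrates the same effect.
html_dict = (b'<!DOCTYPE html><html><head><meta charset="utf-8">'
             b'<title></title></head><body><div class="content">')

page = (b'<!DOCTYPE html><html><head><meta charset="utf-8">'
        b'<title>Example</title></head><body><div class="content">'
        b'Hello, world!</div></body></html>')

def compress(data, zdict=None):
    c = zlib.compressobj(level=9, zdict=zdict) if zdict else zlib.compressobj(level=9)
    return c.compress(data) + c.flush()

plain = compress(page)
with_dict = compress(page, html_dict)

# The dictionary lets the compressor back-reference the shared
# boilerplate instead of emitting it as literals.
print(len(plain), len(with_dict))

d = zlib.decompressobj(zdict=html_dict)
assert d.decompress(with_dict) + d.flush() == page
```

The gain is largest for small files, where shared boilerplate dominates the payload — which is exactly the case a trained HTML dictionary targets.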

I suspect an HTML dictionary would be one of the standard ones registered, so I was wondering - is there any publicly-shareable timeline for when such dictionaries may be available?

Thanks!

@Cyan4973
Contributor

Cyan4973 commented Jul 3, 2019

Unfortunately, don't expect it to happen "soon".
To serve your own needs, you are better off today designing your own mechanism (we currently do the same at Facebook).
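Such a private mechanism might look like the following sketch (all names hypothetical; again using stdlib `zlib` in place of zstd): each payload is prefixed with the ID of the dictionary it was compressed with, and the dictionaries themselves are distributed out of band to both client and server.

```python
import hashlib
import zlib

# Hypothetical out-of-band registry: dict_id -> dictionary bytes.
# Both sides must obtain the dictionaries through a separate channel.
DICTS = {}

def register(dictionary: bytes) -> bytes:
    # Content-address the dictionary so IDs are stable across deployments.
    dict_id = hashlib.sha256(dictionary).digest()[:8]
    DICTS[dict_id] = dictionary
    return dict_id

def pack(data: bytes, dict_id: bytes) -> bytes:
    # Prefix the payload with the dictionary ID it was compressed with.
    c = zlib.compressobj(zdict=DICTS[dict_id])
    return dict_id + c.compress(data) + c.flush()

def unpack(payload: bytes) -> bytes:
    dict_id, body = payload[:8], payload[8:]
    d = zlib.decompressobj(zdict=DICTS[dict_id])
    return d.decompress(body) + d.flush()

did = register(b'<html><head><title>')
msg = b'<html><head><title>Hi</title></head></html>'
assert unpack(pack(msg, did)) == msg
```

This is essentially the stewardship burden cldellow described: whoever runs the registry owns dictionary versioning and distribution.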

The topic of generic dictionaries for web content has been discussed and stalled here:
mozilla/standards-positions#105

I still believe it's a good idea, especially as we have been able to measure gains, and they were substantial. But it will require some time to win the argument.

@cldellow
Author

cldellow commented Jul 8, 2019

Thank you for the context + link to that thread! I can see how it's a bit of a political mess. :(

@cldellow cldellow closed this as completed Jul 8, 2019
@andrew-aladev

Hello.

Unfortunately, don't expect it to happen "soon".

It is frustrating. I thought Facebook knew what competition it was jumping into. The developers created an awesome compression library with support for multiple dictionaries. Now Facebook needs to do the kind of epic, continually repeated research that Google did for brotli, and create a registry of web-optimized dictionaries.

In the long run Facebook could beat Google here, because the web mutates and a static brotli dictionary goes stale. But Facebook has stopped and doesn't want to fund such research.

Please let me know if I am wrong. I will be happy to be wrong.

@Cyan4973
Contributor

The limitations do not come from Facebook side.

Actually, Facebook is already able to use dictionary compression over HTTP, though it is constrained to its own private environment (own client, own server). The HTTP ecosystem is very large, and it takes a lot of time to convince all actors that this innovation is a good thing for the web. Expect some progress on this topic in the future (we are actively working on it; it's not abandoned), but at a pace compatible with the size of an ecosystem as gigantic as HTTP.

@andrew-aladev

@Cyan4973, hello. Can you please ask at Facebook for something like a rough estimate of the target integration steps? Thank you.

@Cyan4973
Contributor

This does not depend on Facebook.
More critical actors for such a topic are standards bodies such as the IETF and the W3C.

I would love to have such an estimate; in fact, we are working our way toward one through direct contributions, with the active participation of @felixhandte.

But at this stage, it's too early, and we don't have any yet.

@felixhandte
Contributor

Hi @andrew-aladev,

I am working on this topic. I'm not sure how you came to the conclusion that we had abandoned this direction. (Certainly, my bank account disproves your assertion that Facebook isn't spending money to figure this out.)

What do you want me to tell you? It's a hard problem, both technically (especially w.r.t. security) and in terms of driving consensus and adoption. We welcome (constructive) contributions on either of those fronts.

At any rate, I'll be at IETF 106 to discuss the progress we've made and the plan going forward.

@andrew-aladev

Hi @felixhandte, I have a suggestion for the RFC: please add a special encoding type, zstd-no-dictionary. That would make it possible to integrate it everywhere today (web browsers too). Otherwise it is not possible, because regular zstd would require a dictionary for decompression.

Web browsers with regular zstd support released today won't be able to decompress content from 2025, because it will require dictionaries.

@felixhandte
Contributor

The plan is the opposite: as I described in the caniuse thread, the RFC does not standardize the use of a dictionary. Responses with Content-Encoding: zstd should not use a dictionary. If and when a dictionary-based scheme is standardized for HTTP, it will use a different content-coding identifier.

@andrew-aladev

andrew-aladev commented Oct 14, 2019

Sorry, it is not clear to me; I can't see anything about dictionaries in section 6.2 (Content Encoding) of the RFC. Are you sure about that?

@felixhandte
Contributor

I agree that the text of the RFC is not as clear about this as it could have been. I can look into inserting something in a future version of the document.

I am pretty confident that I can speak authoritatively about Zstd and HTTP. We are not going to ship an extension to the spec that breaks existing clients and servers... That would be pretty obviously stupid. So any standardized way of using dictionaries will require at least one of: a totally different content-coding identifier, or extra negotiation beyond Accept-Encoding: zstd.
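The compatibility point can be demonstrated concretely. In this stdlib `zlib` sketch (standing in for zstd; the sample dictionary and data are made up), a payload compressed against a preset dictionary cannot be decoded by a dictionary-unaware client — which is exactly why serving it under the plain content-coding identifier would break existing clients:

```python
import zlib

dictionary = b'<html><body><div class="content">'
data = b'<html><body><div class="content">hello</div></body></html>'

# Compress against a preset dictionary.
c = zlib.compressobj(zdict=dictionary)
payload = c.compress(data) + c.flush()

# A client that only implements the plain coding cannot decode it.
plain_client_failed = False
try:
    zlib.decompress(payload)
except zlib.error:
    plain_client_failed = True

# A client that negotiated the dictionary out of band can.
d = zlib.decompressobj(zdict=dictionary)
roundtrip = d.decompress(payload) + d.flush()
assert plain_client_failed and roundtrip == data
```

Hence the requirement for either a distinct content-coding identifier or extra negotiation: the dictionary-unaware decoder must never receive a dictionary-compressed payload.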
