fix speed of --patch-from mode at high compression levels #4276
base: dev
improves compression ratio at low levels
c92d015 to e2aac8f
avoid duplicating in memory a dictionary that was passed by reference,
thus saving a bit of memory and a little CPU time
--patch-from no longer blocked on first job dictionary loading
} else {
/* note : a loadPrefix becomes an internal CDict */
mtctx->cdictLocal = ZSTD_createCDict_advanced(dict, dictSize,
dictLoadMethod, dictContentType,
We don't actually care about the dictLoadMethod here, right? If it was ZSTD_dlm_byCopy, the outer ZSTD_CCtx must have already taken ownership of the buffer. So we can use byRef here.
- dictLoadMethod, dictContentType,
+ ZSTD_dlm_byRef, dictContentType,
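To illustrate the ownership argument in this comment, here is a minimal, self-contained sketch. It does not use zstd's API: `load_method`, `dict_holder`, `dict_holder_init`, `dict_holder_free` and `inner_init_from_outer` are hypothetical names invented for the example. The assumption being modelled is that the outer context honours the caller's load method (copying when asked to), so an inner worker context can always reference the outer context's buffer without making a second copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration only; this is NOT zstd's API. */
typedef enum { LOAD_BY_COPY, LOAD_BY_REF } load_method;

typedef struct {
    const void* content;  /* buffer that readers use */
    void*       owned;    /* non-NULL only when this holder allocated a copy */
    size_t      size;
} dict_holder;

/* Initialise a holder. Returns 0 on success, -1 on allocation failure. */
static int dict_holder_init(dict_holder* h, const void* src, size_t size,
                            load_method method)
{
    assert(h != NULL && src != NULL);
    h->owned = NULL;
    h->content = src;
    h->size = size;
    if (method == LOAD_BY_COPY) {
        /* Take ownership: duplicate the caller's buffer. */
        h->owned = malloc(size ? size : 1);
        if (h->owned == NULL) return -1;
        memcpy(h->owned, src, size);
        h->content = h->owned;
    }
    return 0;
}

/* Release the copy, if any. A by-reference holder frees nothing. */
static void dict_holder_free(dict_holder* h)
{
    free(h->owned);
    h->owned = NULL;
    h->content = NULL;
    h->size = 0;
}

/* The inner (worker) holder always references the outer holder's content.
 * If the caller asked for a copy, the outer holder already made it, so the
 * inner one never needs a second duplicate. */
static int inner_init_from_outer(dict_holder* inner, const dict_holder* outer)
{
    return dict_holder_init(inner, outer->content, outer->size, LOAD_BY_REF);
}
```

In this sketch the inner holder stays valid only as long as the outer one is alive, which is the same lifetime constraint a by-reference load would rely on.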
@@ -1232,7 +1248,7 @@ static size_t ZSTDMT_computeOverlapSize(const ZSTD_CCtx_params* params)

size_t ZSTDMT_initCStream_internal(
ZSTDMT_CCtx* mtctx,
- const void* dict, size_t dictSize, ZSTD_dictContentType_e dictContentType,
+ const void* dict, size_t dictSize, ZSTD_dictContentType_e dictContentType, ZSTD_dictLoadMethod_e dictLoadMethod,
Since we can always use byRef, we don't need to pass a dictLoadMethod in here.
const void* dict;
size_t dictSize;
ZSTD_dictContentType_e dictContentType;
} ZSTD_prefixDict;
ZSTD_dictLoadMethod_e loadMethod;
Since we can always use byRef in ZSTDMT, we don't need to store it here.
When employing --patch-from in combination with very high compression levels, compression speed plummets and becomes extremely slow.
This patch mitigates the issue and noticeably reduces compression time for this scenario.
Benchmark: patching linux-6.7.tar (~1.4 GB) from linux-6.6.tar on an M1 Pro laptop (employing 2 threads by default)
dev
The impact is even more pronounced with a larger number of threads.
Benchmark on a Core i7-9700K with 6 threads:
dev
Also: improved memory usage of --patch-from, by skipping the creation of an internal CDict that was duplicating reference content in memory.