-
Notifications
You must be signed in to change notification settings - Fork 219
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Zstd compression support to S3 plugin #439
Conversation
Thanks for this enhancement. |
fluent-plugin-s3.gemspec
Outdated
@@ -26,4 +26,5 @@ Gem::Specification.new do |gem| | |||
# aws-sdk-core requires one of ox, oga, libxml, nokogiri or rexml, | |||
# and rexml is no longer default gem as of Ruby 3.0. | |||
gem.add_development_dependency "rexml" | |||
gem.add_development_dependency 'zstd-ruby' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be gem.add_dependency
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. I also think it should add gem into runtime dependency instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your valuable feedback. I have updated the fluent-plugin-s3.gemspec
file by moving zstd-ruby
from development dependencies to runtime dependencies as you suggested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ddukbg
Sorry, I was wrong.
It should be development_dependency
.
Please take a look at the following PR for details.
This fix will require manual installation of zstd-ruby
gem.
0d0bf95
to
6af3b5d
Compare
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4. - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](actions/checkout@v3...v4) --- updated-dependencies: - dependency-name: actions/checkout dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: yongwoo.kim <[email protected]>
Signed-off-by: yongwoo.kim <[email protected]>
Signed-off-by: ddukbg <[email protected]>
@ddukbg Thanks! I will review this soon. FYI: We are just now getting Zstd support on Fluentd's side. |
@daipom hello :) |
ruby/setup-ruby action has installed proper
So, I think you can replace
Then, I think the tests might be successful. |
Before: - name: Install dependencies run: gem install bundler rake After: - name: Install dependencies run: gem install rake Signed-off-by: ddukbg <[email protected]>
@Watson1978 Thank you for your valuable feedback. I have modified the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this enhancement!
Could you please check my following comments?
Signed-off-by: ddukbg <[email protected]>
…mments Moved ZstdCompressor tests from test_in_s3.rb to test_out_s3.rb as they relate to the out_s3 plugin. Signed-off-by: ddukbg <[email protected]>
…omments Added tests for ZstdCompressor to test_out_s3.rb following the maintainer's suggestions. Signed-off-by: ddukbg <[email protected]>
Add ZstdCompressor to S3 Plugin and Fix Tests According to Maintainer's Feedback
@daipom Thank you for your valuable feedback :) Changes Made:
Fluentd Log
testcode
testcode(rspec)
Notes:
Please review the changes, and let me know if any further adjustments are needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Sorry for my late response.
Basically, it looks good to me.
Could you please fix the following?
Remove redundant spaces to improve code readability and consistency Co-authored-by: Daijiro Fukuda <[email protected]> Signed-off-by: ddukbg <[email protected]> refactor: Simplify data compression logic refactor: Simplify data compression logic Remove duplicate file reading and streamline compression process Co-authored-by: Daijiro Fukuda <[email protected]> Signed-off-by: ddukbg <[email protected]>
@daipom
Could you please review the changes when you have a chance? |
4d2dbff
to
b1efca1
Compare
Signed-off-by: ddukbg <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry to bother you, but I think it would be better to implement the compressor in a separate file.
Could you please confirm the following comments?
Add test cases to verify: - Default compression level (3) - Custom compression level setting Co-authored-by: Daijiro Fukuda <[email protected]> Signed-off-by: ddukbg <[email protected]>
…e ending and encoding handling - Fix compression logic using chunk.write_to method - Properly implement ZstdCompressor in separate file The previous implementation mistakenly included explicit line ending and encoding handling during troubleshooting. This has been removed as the proper implementation using chunk.write_to handles the data correctly without such manipulation. Signed-off-by: ddukbg <[email protected]> Signed-off-by: ddukbg <[email protected]>
@daipom The real issue was not with line endings or encoding, but with the compression implementation itself. After fixing the compression logic to use Changes in this commit:
I can confirm that the compressed files now have the correct ZSTD magic number (
|
Use direct chunk.read method instead of intermediate StringIO. This simplification maintains the same functionality while making the code more straightforward and easier to understand. Co-authored-by: Daijiro Fukuda <[email protected]> Signed-off-by: ddukbg <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks so much for this enhancement!
Summary
This PR adds support for Zstd compression in the Fluentd S3 plugin.
Changes
zstd-ruby
library.ZstdCompressor
class to handle log compression before uploading to S3.store_as zstd
.Zstd
module is properly loaded to avoid uninitialized constant errors.Testing
Test Code
Result
store_as (Zstd) Test
Test Data
echo '{"message": "'$(head -c 1000000 </dev/zero | tr '\0' 'A')'"}' | fluent-cat test.tag
fluentd log
2024-10-18 17:49:37 +0900 [info]: #0 fluent/log.rb:362:info: [Aws::S3::Client 200 0.162773 0 retries] head_object(bucket:"fluent-test-yw",key:"logs/20241018_0.zst")
S3 Data
Why this feature?
Zstd compression provides a better compression ratio and performance compared to gzip, making it a valuable option for users who want efficient log storage on S3.