s3_object_key_format without index overwrites (not append) log files #435

Open
arepusb opened this issue Jan 29, 2024 · 11 comments

Comments


arepusb commented Jan 29, 2024

Describe the bug

Hello there!
I use fluentd v1.16 with fluent-plugin-s3. According to my configuration, it should send logs from a few files to MinIO once per day.
Here is the configuration:

<match projectname>
    @type s3
    aws_key_id "#{ENV['MINIO_ROOT_USER']}"
    aws_sec_key  "#{ENV['MINIO_ROOT_PASSWORD']}"
    s3_bucket tenants
    format json
    force_path_style true
    s3_endpoint "http://#{ENV['MINIO_HOST']}:#{ENV['MINIO_PORT']}/"
    path "#{ENV['TENANT_ID']}/logs/projectname-"     # This prefix is added to each file
    time_slice_format %Y%m%d%H%M  # This timestamp is added to each file name
    #s3_object_key_format %{path}%{time_slice}.%{file_extension} # Kept commented out: without %{index} the target log file is overwritten several times and logs are lost.

    <buffer tag,time>
        @type file
        path /fluentd/logs/
        timekey 1440m  
        timekey_wait 10m
        flush_mode lazy
        timekey_use_utc true
        chunk_limit_size 256m
    </buffer>
</match>

I noticed that fluent-plugin-s3 often creates more than one file per day on MinIO.

Example:
projectname-202401240532_3.gz
projectname-202401240532_2.gz
projectname-202401240532_1.gz
projectname-202401240532_0.gz

I would like to have a single log file on MinIO per day.
To achieve this, I tried changing the s3_object_key_format property.
Default value: %{path}%{time_slice}_%{index}.%{file_extension}
I changed it to %{path}%{time_slice}.%{file_extension}. As a result, I lost part of the logs. It looks like the target log file was overwritten several times, and I saw only the data from the latest upload.

How can I force fluent-plugin-s3 to create only a single file on MinIO when the timekey condition is met, without losing data?
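
The overwrite described above can be illustrated with a small simulation. This is only a sketch of the placeholder expansion, not the plugin's code: `expand_key` and `upload_all` are hypothetical helpers, a plain dict stands in for the bucket, and the only assumptions are that the index is increased until an unused key is found and that storing under an existing key replaces the object.

```python
DEFAULT_FORMAT = "%{path}%{time_slice}_%{index}.%{file_extension}"
NO_INDEX_FORMAT = "%{path}%{time_slice}.%{file_extension}"


def expand_key(fmt, path, time_slice, index, file_extension="gz"):
    """Fill the placeholders quoted in this issue into an object key."""
    values = {
        "%{path}": path,
        "%{time_slice}": time_slice,
        "%{index}": str(index),
        "%{file_extension}": file_extension,
    }
    key = fmt
    for placeholder, value in values.items():
        key = key.replace(placeholder, value)
    return key


def upload_all(fmt, chunks, path="projectname-", time_slice="202401240532"):
    """Store each chunk in a dict that stands in for the bucket (no network)."""
    bucket = {}
    for body in chunks:
        index = 0
        key = expand_key(fmt, path, time_slice, index)
        # Only a format containing %{index} can move on to an unused key.
        while key in bucket and "%{index}" in fmt:
            index += 1
            key = expand_key(fmt, path, time_slice, index)
        # Storing under an existing key replaces the previous object.
        bucket[key] = body
    return bucket


chunks = [b"flush 0\n", b"flush 1\n", b"flush 2\n", b"flush 3\n"]
with_index = upload_all(DEFAULT_FORMAT, chunks)
without_index = upload_all(NO_INDEX_FORMAT, chunks)
print(sorted(with_index))
print(without_index)
```

With the default format the four flushes land under four distinct keys. Without `%{index}` they all map to the same key, so only the last body survives.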

To Reproduce

Use provided configuration and check the logs on MinIO.

Expected behavior

It would be great if new data were appended to the log file instead of overwriting it.

Your Environment

- Fluentd version: v1.16
- TD Agent version:
- Operating system:
- Kernel version:

Your Configuration

Same as the configuration under "Describe the bug" above.

Your Error Log

There is no error log.

Additional context

No response

daipom transferred this issue from fluent/fluentd Jan 30, 2024
Contributor

daipom commented Jan 30, 2024

I have moved the issue here because it is about out_s3.

Contributor

daipom commented Jan 30, 2024

The out_s3 plugin does not support appending to an existing object.
So each upload needs its own object key; file names must not be duplicated.

> I would like to have single log file on MinIO per day.

You can make out_s3 upload files once a day
(if you can tolerate a very slow upload frequency...).
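
Since objects cannot be appended to, one possible workaround when a single daily file is needed downstream is to merge the uploaded parts afterwards. The sketch below is an assumption about such a post-processing step, not something out_s3 provides: it relies on the fact that concatenated gzip members form a valid gzip stream, works on in-memory bytes only, and `merge_gzip_parts` is a hypothetical helper name.

```python
import gzip


def merge_gzip_parts(parts):
    """Concatenate gzip members in order; the result is itself a gzip stream."""
    return b"".join(parts)


# Stand-ins for the bodies of projectname-..._0.gz, _1.gz and _2.gz.
part_bodies = [
    b'{"message":"a"}\n',
    b'{"message":"b"}\n',
    b'{"message":"c"}\n',
]
parts = [gzip.compress(body) for body in part_bodies]

merged = merge_gzip_parts(parts)
restored = gzip.decompress(merged)  # reads every member, not just the first
print(restored.decode(), end="")
```

Fetching the parts and uploading the merged object is left out on purpose, since it depends on the client library in use.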

Author

arepusb commented Jan 30, 2024

> You can make out_s3 upload files once a day
> (if you can tolerate a very slow upload frequency...).

Fluentd is already configured to upload logs to MinIO once per day. I attached the config in the description.
Here is the relevant part of the config:

timekey 1440m
timekey_wait 10m

Even so, I often get more than one file per day. Their timestamps differ by no more than two minutes (much less than 10m).

Contributor

daipom commented Jan 30, 2024

Hmm, the tag chunk key or chunk_limit_size could be the cause.

<buffer tag,time>
chunk_limit_size 256m

Author

arepusb commented Jan 30, 2024

The log files are about 200-300 bytes each, which is much less than 256m.

Contributor

daipom commented Jan 31, 2024

Then, please try removing the tag key:

- <buffer tag,time>
+ <buffer time>
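
To see what the chunk keys change, here is a simplified model of how events are grouped into chunks. It is a sketch under assumptions, not Fluentd's buffer code: `chunk_key` and `group` are hypothetical helpers, the "other" tag is made up for contrast, and the time bucket is computed as the timestamp rounded down to a multiple of `timekey` in UTC.

```python
from collections import defaultdict
from datetime import datetime, timezone

TIMEKEY_SECONDS = 1440 * 60  # timekey 1440m from the configuration


def chunk_key(event, keys):
    """Build the grouping key for one event from the configured chunk keys."""
    parts = []
    for name in keys:
        if name == "time":
            ts = int(event["time"].timestamp())
            parts.append(ts - ts % TIMEKEY_SECONDS)  # start of the UTC day
        else:
            parts.append(event[name])
    return tuple(parts)


def group(events, keys):
    """Group events into chunks; one dict entry models one buffer chunk."""
    chunks = defaultdict(list)
    for event in events:
        chunks[chunk_key(event, keys)].append(event)
    return chunks


def at(hour, minute):
    return datetime(2024, 1, 24, hour, minute, tzinfo=timezone.utc)


events = [
    {"tag": "projectname", "time": at(5, 30)},
    {"tag": "projectname", "time": at(5, 32)},
    {"tag": "projectname", "time": at(23, 59)},
]
by_tag_time = group(events, ("tag", "time"))
by_time = group(events, ("time",))

# A second tag (made up here) is what makes the two settings differ.
mixed = events + [{"tag": "other", "time": at(6, 0)}]
mixed_by_tag_time = group(mixed, ("tag", "time"))
mixed_by_time = group(mixed, ("time",))
print(len(by_tag_time), len(by_time), len(mixed_by_tag_time), len(mixed_by_time))
```

In this model a single tag, as matched by `<match projectname>`, yields one chunk per day with either setting. The tag key only adds chunks when several tags reach the same output.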

Author

arepusb commented Feb 2, 2024

> Then, please try removing the tag key:
>
> - <buffer tag,time>
> + <buffer time>

To be honest, I tested both variants before creating this issue. Thank you for trying to help!


github-actions bot commented Mar 3, 2024

This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 7 days

github-actions bot added the stale label Mar 3, 2024
ashie removed the stale label Mar 3, 2024

github-actions bot commented Apr 3, 2024

This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 7 days

github-actions bot added the stale label Apr 3, 2024
daipom removed the stale label Apr 3, 2024
Contributor

daipom commented Apr 3, 2024

@arepusb
Sorry for the delay.
Did you find out anything?

Contributor

kenhys commented Dec 18, 2024

Is it still reproducible?
