bug: The requested DurationSeconds exceeds the MaxSessionDuration #814

Closed
kaiohenricunha opened this issue Jun 26, 2023 · 3 comments · Fixed by #820

Comments

kaiohenricunha commented Jun 26, 2023

Describe the issue

My fluent-operator setup had been running for a few months with Fluent Bit and Fluentd.

Just recently I stumbled upon this Fluentd error:

level=error msg="Fluentd exited" error="exit status 1"
level=info msg=backoff delay=0s
level=info msg="backoff timer done" actual=25.533µs expected=0s
level=info msg="Fluentd started"
2023-06-26 21:36:43 +0000 [info]: init supervisor logger path=nil rotate_age=nil rotate_size=nil
/usr/lib/ruby/gems/3.1.0/gems/aws-sdk-core-3.175.0/lib/seahorse/client/plugins/raise_response_errors.rb:17:in `call': The requested DurationSeconds exceeds the MaxSessionDuration set for this role. (Aws::STS::Errors::ValidationError)

After some investigation I found that my customPlugin had automatically installed the latest release of fluent-plugin-opensearch:
https://github.com/fluent/fluent-plugin-opensearch

$ kubectl exec fluentd-0 -- gem list --local
fluent-plugin-opensearch (1.1.1)
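
To make this check repeatable, the installed version can be extracted from the `gem list --local` output and compared against the version you expect. A minimal sketch, assuming the `name (version)` output format shown above; `gem_version` is a hypothetical helper name, not part of fluent-operator:

```shell
# Reads `gem list --local` style output on stdin and prints the first
# listed version of the named gem (e.g. "1.1.1").
gem_version() {
  name=$1
  sed -n "s/^${name} (\([^,)]*\).*/\1/p"
}

# Example with the line from this report.
printf 'fluent-plugin-opensearch (1.1.1)\n' | gem_version fluent-plugin-opensearch
```

Against a live pod this would be used as `kubectl exec fluentd-0 -- gem list --local | gem_version fluent-plugin-opensearch`.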

This release changes the default session duration for Fluentd's authentication with OpenSearch to 5 hours:
https://github.com/fluent/fluent-plugin-opensearch/pull/78/files

This conflicted with the IAM role assigned to my Fluentd pod, which has a maxSessionDuration of 1 hour. To confirm, I raised the role's maxSessionDuration to 6 hours, and Fluentd started working again.
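
The mismatch is simple arithmetic: a 5-hour request is 18000 seconds, while a 1-hour role limit is 3600 seconds. A minimal shell sketch of that check follows. The function names are hypothetical, and the AWS CLI call assumes credentials that are allowed to read the role:

```shell
# Succeeds (exit 0) when the requested STS DurationSeconds exceeds the role limit.
exceeds_max_session() {
  requested_seconds=$1
  max_seconds=$2
  [ "$requested_seconds" -gt "$max_seconds" ]
}

# Hypothetical helper: prints a role's MaxSessionDuration in seconds via the AWS CLI.
role_max_session() {
  aws iam get-role --role-name "$1" --query 'Role.MaxSessionDuration' --output text
}

# 5h requested by the plugin vs. the 1h limit on the role from this report.
if exceeds_max_session $((5 * 3600)) 3600; then
  echo "requested duration exceeds MaxSessionDuration"
fi
```

Raising the limit to 6 hours, as done above, would look like `aws iam update-role --role-name my-fluentd-role --max-session-duration 21600`, where `my-fluentd-role` is a placeholder role name.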

Automatic update of plugin versions is a critical concern.

Is there any way to pin the versions of the plugins I'm using?

Here is my current ClusterOutput configuration:

apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  name: opensearch
  labels:
    output.fluentd.fluent.io/enabled: "true"
    output.fluentd.fluent.io/tenant: "core"
spec:
  outputs:
    - customPlugin:
        config: |
          <match **>
            @type copy
            <store>
              @type opensearch
              host "${FLUENT_OPENSEARCH_HOST}"
              port 443
              logstash_format true
              logstash_prefix logs-raas-core
              scheme https
              log_os_400_reason true
              @log_level ${FLUENTD_OUTPUT_LOGLEVEL:=error}
              <buffer>
                @type ${FLUENTD_BUFFER_TYPE:=memory}
                path ${FLUENTD_BUFFER_PATH:=/buffers/opensearch/raas-core}
                flush_mode ${FLUENTD_BUFFER_FLUSH_MODE:=interval}
                flush_interval ${FLUENTD_BUFFER_FLUSH_INTERVAL:=60s}
                flush_thread_count ${FLUENTD_BUFFER_FLUSH_THREAD_COUNT:=2}
                flush_at_shutdown ${FLUENTD_BUFFER_FLUSH_AT_SHUTDOWN:=true}
                retry_type ${FLUENTD_BUFFER_RETRY_TYPE:=exponential_backoff}
                retry_max_times ${FLUENTD_BUFFER_RETRY_MAX_TIMES:=10}
                retry_wait ${FLUENTD_BUFFER_RETRY_WAIT:=1s}
                retry_max_interval ${FLUENTD_BUFFER_RETRY_MAX_INTERVAL:=60s}
                chunk_limit_size ${FLUENTD_BUFFER_CHUNK_LIMIT_SIZE:=8M}
                total_limit_size ${FLUENTD_BUFFER_TOTAL_LIMIT_SIZE:=512MB}
                overflow_action ${FLUENTD_BUFFER_OVERFLOW_ACTION:=throw_exception}
                compress ${FLUENTD_BUFFER_COMPRESS:=text}
              </buffer>
              <endpoint>
                url "https://${FLUENT_OPENSEARCH_HOST}"
                region "${FLUENT_OPENSEARCH_REGION}"
                assume_role_arn "#{ENV['AWS_ROLE_ARN']}"
                assume_role_web_identity_token_file "#{ENV['AWS_WEB_IDENTITY_TOKEN_FILE']}"
              </endpoint>
            </store>
          </match>

To Reproduce

I'm not sure it can be reproduced on demand. If Fluentd has been running for a while, deleting its StatefulSet may bring it back with newer plugin versions, if any exist.

Expected behavior

Being able to pin plugin versions, both via customPlugin and via the templates.

Your Environment

- Fluent Operator version: 2.3.0
- Container Runtime: Docker
- Operating system: Linux
- Kernel version:

How did you install fluent operator?

Via Helm chart.

Additional context

Also, I noticed there was a push to the kubesphere/fluentd:v1.15.3 recently:

[image: screenshot of the recent push to kubesphere/fluentd:v1.15.3]

That must have overwritten the Fluentd 1.15.3 image and reinstalled the latest version of the plugin:

https://github.com/fluent/fluent-operator/blob/master/cmd/fluent-watcher/fluentd/base/Dockerfile#L43

@kaiohenricunha kaiohenricunha changed the title bug: Fluentd opensearch plugin version bug: fluentd automatic plugin version updates Jun 26, 2023
@kaiohenricunha kaiohenricunha changed the title bug: fluentd automatic plugin version updates bug: The requested DurationSeconds exceeds the MaxSessionDuration Jun 28, 2023

kaiohenricunha commented Jun 30, 2023

@wenchajun Any ideas?

I don't think we have to fix this problem, as it is a fluent-plugin-opensearch issue.

Just pinning the plugin's version to v1.1.0 would resolve it, or even better, making it possible for users to set and override the versions.
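
For illustration, pinning could come down to passing an exact version to `gem install` in the image build step. This is only a sketch: `install_pinned_gem` and the `DRY_RUN` switch are hypothetical, and the actual Dockerfile may install plugins differently.

```shell
# Installs a gem at an exact version; with DRY_RUN=1 it only prints the command.
install_pinned_gem() {
  name=$1
  version=$2
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "gem install $name -v $version"
  else
    gem install "$name" -v "$version"
  fi
}

# Show the command that would pin the plugin to the last known-good release.
DRY_RUN=1 install_pinned_gem fluent-plugin-opensearch 1.1.0
```

Taking the version as an argument would also leave room for users to override it, for example through a build argument.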

Can't we do it? I've seen the elasticsearch plugin version is pinned.

They've tried to fix it on the plugin side, but it didn't work. It's even worse:
fluent/fluent-plugin-opensearch#107

@wenchajun
Member

I agree with you. The latest version is probably unstable, so we can use a stable version. It should not be changed from the configuration; it should be changed in the Docker image.

@kaiohenricunha
Contributor Author

> I agree with you. The latest version is probably unstable, so we can use a stable version. It should not be changed from the configuration; it should be changed in the Docker image.

Seems like they have a fix now:
fluent/fluent-plugin-opensearch#107 (comment)

I'm not sure it's possible, but ideally we could let users override plugin versions.
