Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor logger initialization #4065

Merged
merged 3 commits into from
Mar 9, 2023

Conversation

daipom
Copy link
Contributor

@daipom daipom commented Feb 19, 2023

Which issue(s) this PR fixes:
I found this problem during fixing:

This fix should be done before fixing this issue.

What this PR does / why we need it:
In the supervisor process, Fluentd initializes the logger twice, at Supervisor::configure() and Supervisor.load_config().

Supervisor.load_config() is called in ServerEngine's reloading function, but Fluentd doesn't use the function even when SIGHUP or SIGUSR2.
So I can't see the reason for initializing the logger in that callback.

This also fixes a problem that ignore_same_log_interval is not passed to the Supervisor.load_config() and that option is not reflected to the logger of the supervisor process.

Docs Changes:
Not needed.

Release Note:

  • Fix bug that ignore_same_log_interval option is not reflected in the supervisor process.
  • Fix bug that suppress_repeated_stacktrace from SystemConfig is not reflected in the worker processes.

We should consider the release note with

@daipom daipom marked this pull request as draft February 21, 2023 14:28
In the supervisor process, Fluentd initializes the logger twice,
at `Supervisor::configure()` and `Supervisor.load_config()`.

`Supervisor.load_config()` is called in ServerEngine's reloading
function, but Fluentd doesn't use the function even when `SIGHUP`
or `SIGUSR2`.
So I can't see the reason for initializing the logger in the
callback.

In addition, I removed some needless parameters that were passed
to ServerEngine.

This also fixes a problem that `ignore_same_log_interval` is not
passed to the `Supervisor.load_config()` and that option is not
reflected to the logger of the supervisor process.

Signed-off-by: Daijiro Fukuda <[email protected]>
@daipom daipom force-pushed the refactor-logger-initialization branch from 39c6698 to afee4a4 Compare March 3, 2023 04:15
@daipom daipom changed the title Draft: Refactor logger initialization Refactor logger initialization Mar 3, 2023
@daipom daipom marked this pull request as ready for review March 3, 2023 04:16
@daipom
Copy link
Contributor Author

daipom commented Mar 3, 2023

I rebased this to apply #4064.

Please tell me if we need to investigate more about ServerEngine's reloading function.
If so, I will summarize Fluentd and ServerEngine's behavior related to this.

@daipom daipom requested a review from ashie March 3, 2023 04:21
@daipom
Copy link
Contributor Author

daipom commented Mar 3, 2023

I am concerned that the Ruby 2.7 on macos-latest test fails.

https://github.com/fluent/fluentd/actions/runs/4320183945/jobs/7540167787

 Failure: test_unwatched_files_should_be_removed(TailInputTest::path)
/Users/runner/work/fluentd/fluentd/test/plugin/test_in_tail.rb:1616:in `test_unwatched_files_should_be_removed'
     1613:       waiting(20) { sleep 0.1 until Dir.glob("#{@tmp_dir}/*.txt").size == 0 } # Ensure file is deleted on Windows
     1614:       waiting(5) { sleep 0.1 until d.instance.instance_variable_get(:@tails).keys.size <= 0 }
     1615: 
  => 1616:       assert_equal(
     1617:         {
     1618:           files: [],
     1619:           tails: []
<{:files=>[], :tails=>[]}> expected but was
<{:files=>[],
 :tails=>
  ["/Users/runner/work/fluentd/fluentd/test/plugin/../tmp/tail/dcebe801d2a39d7ba0fc/tail.txt"]}>

diff:
? {:files=>[], :tails=>[]}
+  :tails=>
+   ["/Users/runner/work/fluentd/fluentd/test/plugin/../tmp/tail/dcebe801d2a39d7ba0fc/tail.txt"]}

@ashie
Copy link
Member

ashie commented Mar 3, 2023

It's a known unstable test.
We should make it stable but not related with this change.

@ashie ashie added this to the v1.16.0 milestone Mar 3, 2023
@daipom
Copy link
Contributor Author

daipom commented Mar 3, 2023

It's a known unstable test. We should make it stable but not related with this change.

Thanks, I have created the issue:

@ashie
Copy link
Member

ashie commented Mar 3, 2023

This also fixes a problem that ignore_same_log_interval is not passed to the Supervisor.load_config() and that option is not reflected to the logger of the supervisor process.

You mean suppress_repeated_stacktrace?
According to your commit, the addressed parameter seems suppress_repeated_stacktrace, not ignore_same_log_interval.

@ashie
Copy link
Member

ashie commented Mar 3, 2023

Supervisor.load_config() is called in ServerEngine's reloading function, but Fluentd doesn't use the function even when SIGHUP or SIGUSR2. So I can't see the reason for initializing the logger in that callback.

Sorry, I can't remember the code around it soon.
I'm sorry to bother you but could you point out the rationale?

@daipom
Copy link
Contributor Author

daipom commented Mar 3, 2023

This also fixes a problem that ignore_same_log_interval is not passed to the Supervisor.load_config() and that option is not reflected to the logger of the supervisor process.

You mean suppress_repeated_stacktrace? According to your commit, the addressed parameter seems suppress_repeated_stacktrace, not ignore_same_log_interval.

No. Originally, ignore_same_log_interval is not passed here.

params = {
'main_cmd' => fluentd_spawn_cmd,
'daemonize' => @daemonize,
'inline_config' => @inline_config,
'log_path' => @log_path,
'log_rotate_age' => @log_rotate_age,
'log_rotate_size' => @log_rotate_size,
'chuser' => @chuser,
'chgroup' => @chgroup,
'use_v1_config' => @use_v1_config,
'conf_encoding' => @conf_encoding,
'signame' => @signame,
'fluentd_conf' => @conf.to_s,
'workers' => @system_config.workers,
'root_dir' => @system_config.root_dir,
'log_level' => @system_config.log_level,
'suppress_repeated_stacktrace' => @system_config.suppress_repeated_stacktrace,
'ignore_repeated_log_interval' => @system_config.ignore_repeated_log_interval,
'rpc_endpoint' => @system_config.rpc_endpoint,
'enable_get_dump' => @system_config.enable_get_dump,
'counter_server' => @system_config.counter_server,
'log_format' => @system_config.log.format,
'log_time_format' => @system_config.log.time_format,
'disable_shared_socket' => @system_config.disable_shared_socket,
'restart_worker_interval' => @system_config.restart_worker_interval,
}

So Fluent::Log::ignore_same_log_interval is always nil after Supervisor.load_config()

  • ignore_same_log_interval = params['ignore_same_log_interval']
  • log_opts = {suppress_repeated_stacktrace: suppress_repeated_stacktrace, ignore_repeated_log_interval: ignore_repeated_log_interval,
    ignore_same_log_interval: ignore_same_log_interval}
    logger_initializer = Supervisor::LoggerInitializer.new(
    log_path, log_level, chuser, chgroup, log_opts,
    log_rotate_age: log_rotate_age,
    log_rotate_size: log_rotate_size
    )

@daipom
Copy link
Contributor Author

daipom commented Mar 3, 2023

The reason why I fixed the codes about suppress_repeated_stacktrace is that originally the suppress_repeated_stacktrace value from the SytemConfig was applied in Supervisor.load_config().

'suppress_repeated_stacktrace' => @system_config.suppress_repeated_stacktrace,

Since this PR removes the initialization process of the logger from Supervisor.load_config(), now we need to apply the suppress_repeated_stacktrace value from the SytemConfig in here.

@log.apply_options(
format: @system_config.log.format,
time_format: @system_config.log.time_format,
log_dir_perm: @system_config.dir_permission,
ignore_repeated_log_interval: @system_config.ignore_repeated_log_interval,
ignore_same_log_interval: @system_config.ignore_same_log_interval
)

Note: The following logic applies the suppress_repeated_stacktrace value from the command line options.

log_opts = {suppress_repeated_stacktrace: opt[:suppress_repeated_stacktrace], ignore_repeated_log_interval: opt[:ignore_repeated_log_interval],
ignore_same_log_interval: opt[:ignore_same_log_interval]}
@log = LoggerInitializer.new(
@log_path, opt[:log_level], @chuser, @chgroup, log_opts,
log_rotate_age: @log_rotate_age,
log_rotate_size: @log_rotate_size
)

@daipom
Copy link
Contributor Author

daipom commented Mar 3, 2023

Thus, I thought I didn't change the specification of suppress_repeated_stacktrace,,, but now I'm concerned about if suppress_repeated_stacktrace of a worker process has been changed...
(Originally, was it applied to a worker process correctly??)
I will look into this.

@daipom
Copy link
Contributor Author

daipom commented Mar 3, 2023

Supervisor.load_config() is called in ServerEngine's reloading function, but Fluentd doesn't use the function even when SIGHUP or SIGUSR2. So I can't see the reason for initializing the logger in that callback.

Sorry, I can't remember the code around it soon. I'm sorry to bother you but could you point out the rationale?

Sure! Me too, sorry for the lack of explanation.
I will add more explanation about this.

@daipom
Copy link
Contributor Author

daipom commented Mar 8, 2023

Supervisor.load_config() is called in ServerEngine's reloading function, but Fluentd doesn't use the function even when SIGHUP or SIGUSR2. So I can't see the reason for initializing the logger in that callback.

Sorry, I can't remember the code around it soon. I'm sorry to bother you but could you point out the rationale?

Sure! Me too, sorry for the lack of explanation. I will add more explanation about this.

About ServerEngine logic

I'll check the Windows just to be sure.

@daipom
Copy link
Contributor Author

daipom commented Mar 8, 2023

On Windows, Fluentd overwrites Server::restart() and Server:reload(), so it does NOT use ConfigLoader::reload_config().

if Fluent.windows?
# Override some methods of ServerEngine::MultiSpawnWorker
# Since Fluentd's Supervisor doesn't use ServerEngine's HUP, USR1 and USR2
# handlers (see install_supervisor_signal_handlers), they should be
# disabled also on Windows, just send commands to workers instead.
def restart(graceful)
@monitors.each do |m|
m.send_command(graceful ? "GRACEFUL_RESTART\n" : "IMMEDIATE_RESTART\n")
end
end
def reload
@monitors.each do |m|
m.send_command("RELOAD\n")
end
end
end

@daipom
Copy link
Contributor Author

daipom commented Mar 8, 2023

On Ubuntu focal, I have confirmed that sending USR1, USR2, HUP to the supervisor process does not call Supervisor.load_config().

On Windows, I have confirmed that using fluent-ctl flush/restart/reload to the supervisor process does not call it.

@daipom
Copy link
Contributor Author

daipom commented Mar 8, 2023

@daipom
Copy link
Contributor Author

daipom commented Mar 8, 2023

It seems that Supervisor.load_config() is used by USR2 and HUP at the point of #880, but after #2716, Fluentd completely stopped using ServerEngine's reload_config() feature. (I'm not sure about Windows specification at this point.)

After #2716, there seems to be no longer any need to initialize the logger in Supervisor.load_config().

@daipom
Copy link
Contributor Author

daipom commented Mar 8, 2023

I tried Fluentd v1.8.1, which does not include #2716, and I confirmed that sending USR2 to the supervisor process call Supervisor.load_config().

From the above, I conclude as follows.

@daipom
Copy link
Contributor Author

daipom commented Mar 8, 2023

Thus, I thought I didn't change the specification of suppress_repeated_stacktrace,,, but now I'm concerned about if suppress_repeated_stacktrace of a worker process has been changed... (Originally, was it applied to a worker process correctly??) I will look into this.

I have checked this.
Surprisingly, suppress_repeated_stacktrace of SystemConfig was not applied to worker processes...
This PR fixes it too.

About suppress_repeated_stacktrace from SystemConfig:

@daipom
Copy link
Contributor Author

daipom commented Mar 8, 2023

I have organized all the information.
If you need more detailed information, please let me know.

Copy link
Member

@ashie ashie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your detailed explanation! I've understood the issue.
It looks good to me.
I've added a nitpick comment in lines.

lib/fluent/supervisor.rb Outdated Show resolved Hide resolved
daipom added 2 commits March 9, 2023 10:37
Signed-off-by: Daijiro Fukuda <[email protected]>
It is hard to understand what is this method and how it works.

Signed-off-by: Daijiro Fukuda <[email protected]>
Comment on lines +427 to 431
# For ServerEngine's `reload_config`.
# This is called only at the initilization of the supervisor process,
# since Fluentd overwrites all related SIGNAL(HUP,USR1,USR2) and have own
# reloading feature.
def self.load_config(path, params = {})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should rename or move this method.
It is hard to understand what this method is.
I will do it in another PR.

@ashie ashie merged commit 7b0e0cd into fluent:master Mar 9, 2023
@ashie
Copy link
Member

ashie commented Mar 9, 2023

Thanks!

@daipom daipom deleted the refactor-logger-initialization branch March 9, 2023 02:51
@daipom
Copy link
Contributor Author

daipom commented Mar 9, 2023

Thanks for your review!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants