out_s3: log_key configuration option implemented #3668

Merged · 1 commit · Jul 23, 2021

Conversation

@StephenLeeY commented Jun 19, 2021

Signed-off-by: Stephen Lee [email protected]



Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

Documentation

  • Documentation required for this feature

fluent/fluent-bit-docs#552


Fluent Bit is licensed under Apache 2.0; by submitting this pull request I understand that this code will be released under the terms of that license.

By default, the whole log record will be sent to S3.

If you specify a key name with this option, then only the value of that key
will be sent to S3.

For example, if you are using the Fluentd Docker log driver, you can specify
log_key log and only the log message will be sent to S3. If the key is not
found, the record is skipped.

This patch has been tested using test configuration files and various
input plugins (random, exec, etc.), as well as Valgrind. The resulting output,
as expected, only contained values of the specified log_key.

Record without log_key

{"date":"2021-06-16T19:56:28.441428Z","log":"Pi is roughly 3.141605814119434"}

Record with log_key log

Pi is roughly 3.141605814119434

Example Configuration File

[INPUT]
    name exec
    command date +"%Y-%m-%d %H:%M:%S,%3N"

[OUTPUT]
    name s3
    match *
    region us-west-2
    bucket bucket-name
    s3_key_format /test/$UUID.gz
    use_put_object true
    total_file_size 2M
    upload_timeout 10s
    compression gzip
    store_dir /tmp/fluent-bit/s3-output-buffer
    retry_limit 5
    log_key exec
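
For reference, the heart of this feature is a per-record key lookup over the msgpack map. A minimal sketch of what that lookup looks like (my illustration using the msgpack-c API, not the patch itself; find_log_key is a hypothetical helper name):

#include <msgpack.h>
#include <string.h>

/* Return the value object stored under `log_key` in one msgpack map
 * record, or NULL if the record is not a map or the key is absent
 * (an absent key means the record is skipped). */
static const msgpack_object *find_log_key(const msgpack_object *record,
                                          const char *log_key)
{
    size_t key_len = strlen(log_key);
    uint32_t i;

    if (record->type != MSGPACK_OBJECT_MAP) {
        return NULL;
    }
    for (i = 0; i < record->via.map.size; i++) {
        const msgpack_object *k = &record->via.map.ptr[i].key;
        if (k->type == MSGPACK_OBJECT_STR &&
            k->via.str.size == key_len &&
            strncmp(k->via.str.ptr, log_key, key_len) == 0) {
            return &record->via.map.ptr[i].val;
        }
    }
    return NULL;
}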

@StephenLeeY (Author)

When running with the example configuration file, Valgrind reports a number of errors even on a clean master branch, so I believe these errors come from the current master code rather than from this change. Below are the Valgrind logs for the example configuration file.

Example Configuration File Valgrind Output Logs

==13492== HEAP SUMMARY:
==13492==     in use at exit: 701,635 bytes in 5,374 blocks
==13492==   total heap usage: 126,115 allocs, 120,741 frees, 23,018,072 bytes allocated
==13492==
==13492== LEAK SUMMARY:
==13492==    definitely lost: 0 bytes in 0 blocks
==13492==    indirectly lost: 0 bytes in 0 blocks
==13492==      possibly lost: 0 bytes in 0 blocks
==13492==    still reachable: 701,635 bytes in 5,374 blocks
==13492==         suppressed: 0 bytes in 0 blocks
==13492== Rerun with --leak-check=full to see details of leaked memory
==13492==
==13492== For counts of detected and suppressed errors, rerun with: -v
==13492== ERROR SUMMARY: 16 errors from 11 contexts (suppressed: 0 from 0)

If Fluent Bit runs under Valgrind with a configuration that sets oneshot to true, there are no Valgrind errors. Here is a sample config file with oneshot true.

oneshot true Configuration File

[INPUT]
    name exec
    command date +"%Y-%m-%d %H:%M:%S,%3N"
    oneshot true

[OUTPUT]
    name s3
    match *
    region us-west-2
    bucket bucket-name
    s3_key_format /test/$UUID.gz
    use_put_object true
    total_file_size 2M
    upload_timeout 10s
    compression gzip
    store_dir /tmp/fluent-bit/s3-output-buffer
    retry_limit 5
    log_key exec

Below are the Valgrind Output Logs for this specific case.

oneshot true Valgrind Output Log

[2021/06/19 02:37:43] [ warn] [engine] service will stop in 5 seconds
[2021/06/19 02:37:48] [ info] [engine] service stopped
==12155==
==12155== HEAP SUMMARY:
==12155==     in use at exit: 701,635 bytes in 5,374 blocks
==12155==   total heap usage: 119,836 allocs, 114,462 frees, 14,688,145 bytes allocated
==12155==
==12155== Searching for pointers to 5,374 not-freed blocks
==12155== Checked 940,744 bytes
==12155==
==12155== LEAK SUMMARY:
==12155==    definitely lost: 0 bytes in 0 blocks
==12155==    indirectly lost: 0 bytes in 0 blocks
==12155==      possibly lost: 0 bytes in 0 blocks
==12155==    still reachable: 701,635 bytes in 5,374 blocks
==12155==         suppressed: 0 bytes in 0 blocks
==12155== Reachable blocks (those to which a pointer was found) are not shown.
==12155== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==12155==
==12155== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==12155== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@StephenLeeY marked this pull request as ready for review June 19, 2021 02:45
@StephenLeeY (Author)

@PettitWesley

@PettitWesley self-requested a review June 20, 2021 00:39
Comment on lines 1425 to 1427
flb_plg_error(ctx->ins, "Could not allocate enough "
              "memory to read record");
continue;

Contributor:

Always call flb_errno() after alloc failures. Also, it'd be more typical to just fail and return if an allocation fails. Remember that allocation failures shouldn't happen normally.
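
A minimal sketch of that pattern (illustrative; buf and size stand in for the real variables):

buf = flb_malloc(size);
if (buf == NULL) {
    flb_errno();   /* record errno immediately after the failed allocation */
    flb_plg_error(ctx->ins, "could not allocate memory to read record");
    return NULL;   /* fail fast instead of continuing the loop */
}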

Comment on lines 1445 to 1450
if (out_buf == NULL) {
    out_buf = flb_sds_create(val_buf);
}
else {
    out_buf = flb_sds_cat(out_buf, val_buf, strlen(val_buf));
}
out_buf = flb_sds_cat(out_buf, "\n", 1);

Contributor:

All of these functions can allocate memory IIRC, and need checks that they were successful.
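
For instance, the concatenation could be checked like this (a sketch; tmp is a scratch variable I am introducing, and the comments assume flb_sds_cat grows the buffer via realloc, leaving the old handle valid on failure):

flb_sds_t tmp;

tmp = flb_sds_cat(out_buf, val_buf, strlen(val_buf));
if (tmp == NULL) {
    flb_errno();
    flb_sds_destroy(out_buf);   /* release the original buffer */
    return NULL;
}
out_buf = tmp;   /* the buffer may have moved */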

if (strncmp(ctx->log_key, key_str, key_str_size) == 0) {
    found = FLB_TRUE;

    orig_val_buf = flb_malloc(bytes + bytes / 4);

Contributor:

Allocating 1.25 * bytes here doesn't make sense to me. bytes is the size of the entire msgpack payload, but this is inside an iteration of the loop, where you are only dealing with a single record.

If possible, write this code so that you do not allocate any memory inside the loop; that will make it much more efficient and faster. One thing with C code is that allocating a lot of memory should often be considered completely fine; the machines Fluent Bit runs on usually have a lot of memory. For efficiency, you should care more about keeping those allocs infrequent, because calls to allocate memory are much slower than any other computation.

And for this code, I see no reason why you can't alloc one large buffer outside the loop, and then use it within the loop. You might possibly need multiple buffers, some for copying and temporarily storing data. But the point is to try to see if you can do everything by alloc'ing a few buffers before/after the loop. Nothing inside of it.
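
Sketching that shape (illustrative; the sizing and variable names are my assumptions, not the final code):

msgpack_unpacked result;
size_t off = 0;
char *buf;

/* one allocation before the loop, sized from the whole payload
 * (illustrative sizing) */
buf = flb_malloc(bytes + 1);
if (buf == NULL) {
    flb_errno();
    return NULL;
}

msgpack_unpacked_init(&result);
while (msgpack_unpack_next(&result, data, bytes, &off) == MSGPACK_UNPACK_SUCCESS) {
    /* find the log_key value and copy it into buf at a tracked offset;
     * nothing in here calls an allocator */
}
msgpack_unpacked_destroy(&result);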

@StephenLeeY (Author)

Added documentation PR here: fluent/fluent-bit-docs#552 and added fixes.

Comment on lines 1460 to 1461
alloc_error = 1;
break;

Contributor:

I see alloc_error as a condition in the loop, but you also use a break (which seems better)...

Author:

Because this part is in a nested loop, I need to double break, hence alloc_error to break out of the outer while loop. I could use a goto, but I felt a conditional variable was better practice here!

Contributor:

I see. I think that works. I've also seen the use of goto with a well-named label like:

if (alloc_failed) {
    goto break_outer_loop;
}
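
Placed in context, the label sits just past the outer loop (illustrative):

while (msgpack_unpack_next(&result, data, bytes, &off) == MSGPACK_UNPACK_SUCCESS) {
    for (i = 0; i < map.via.map.size; i++) {
        buf = flb_malloc(len);
        if (buf == NULL) {
            flb_errno();
            goto break_outer_loop;   /* one jump exits both loops */
        }
        /* normal per-key processing */
    }
}
break_outer_loop:
msgpack_unpacked_destroy(&result);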

Comment on lines 1454 to 1455
if (out_buf == NULL) {
    out_buf = flb_sds_create(val_buf);

Contributor:

This code is very inefficient and I don't understand it.

Why do you need both val_buf and out_buf?

With out_buf, you initialize it from val_buf on the first iteration, and then you realloc it frequently with the calls to flb_sds_cat. So you are not doing efficient memory allocations.

Remember that the goal was to avoid any allocation in the loop.

I do not think you need two buffers. I think you can do this with only one buffer, allocated before the loop. You can track an offset in that single buffer and repeatedly write the logs and newlines to it. No need to copy to another buffer.
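
A sketch of that single-buffer approach (my illustration; assumes buf was sized before the loop to hold every value plus a newline per record):

/* inside the record loop, `val` is the msgpack string value for log_key */
memcpy(buf + offset, val->via.str.ptr, val->via.str.size);
offset += val->via.str.size;
buf[offset++] = '\n';   /* newline-delimit records; no allocation here */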

Comment on lines 1831 to 1832
"that key will be sent to S3. For example, if you are using "
"the Fluentd Docker log driver, you can specify log_key log and only "

Contributor:

Nit: "For example, if you are using Docker, you can specify log_key log and only"

The Fluentd Docker log driver is too specific, and it's not the only thing that adds this log key.

@StephenLeeY (Author)

@PettitWesley Addressed comments and ready for review.

@PettitWesley (Contributor)

@DrewZhang13 @zhonghui12 I'd like both of you to look at this:

  • Bare minimum: At least read through the code once and understand it
  • Points will be awarded for: Useful code review that catches issues or helps Stephen improve

Comment on lines 1483 to 1709
out_buf = flb_sds_create(val_buf);
if (out_buf == NULL) {
    flb_plg_error(ctx->ins, "Error creating buffer to store log_key contents.");
    flb_errno();
    return NULL;
}
flb_free(val_buf);

return out_buf;
}

Contributor:

Why tho? Why not just return val_buf?

Author:

Later on, we call flb_sds_destroy(json), which throws an error when used on a plain character buffer.
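
For context (my summary, not from the thread): an flb_sds_t keeps a small header immediately before the data pointer, so the two allocators cannot be mixed. Illustrative:

char *raw = flb_malloc(16);          /* plain heap buffer, no header   */
flb_sds_t s = flb_sds_create("hi");  /* header precedes the data bytes */

flb_free(raw);        /* correct pairing */
flb_sds_destroy(s);   /* correct: steps back to the header, then frees */
/* flb_sds_destroy(raw) would read garbage as a header: undefined behavior */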

@DrewZhang13 (Contributor) left a comment:

Code is pretty clean and clear, left some comments

@@ -624,6 +624,11 @@ static int cb_s3_init(struct flb_output_instance *ins,

}

tmp = flb_output_get_property("log_key", ins);

Contributor:

Do we consider length restriction for the log key config?
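
Hypothetically, such a guard could sit right after the property is read in cb_s3_init (a sketch only; the 512-byte limit and the error text are made up):

tmp = flb_output_get_property("log_key", ins);
if (tmp != NULL && strlen(tmp) > 512) {   /* illustrative limit */
    flb_plg_error(ctx->ins, "log_key is too long");
    return -1;
}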

(Two further review threads on plugins/out_s3/s3.c, resolved.)
@StephenLeeY (Author)

@PettitWesley Addressed zhonghui and drewzhang's comments!

PettitWesley previously approved these changes Jul 17, 2021
@PettitWesley (Contributor)

@StephenLeeY rebase with master and squash your commits, then we can merge this. Also put up a PR against the 1.8 branch.

@StephenLeeY (Author)

Ready for merge @PettitWesley

PettitWesley previously approved these changes Jul 23, 2021
@PettitWesley (Contributor)

@StephenLeeY needs rebase

By default, the whole log record will be sent to S3.

If you specify a key name with this option, then only the value of that key
will be sent to S3.

For example, if you are using the Fluentd Docker log driver, you can specify
log_key log and only the log message will be sent to S3. If the key is not
found, the record is skipped.

This patch has been tested using test configuration files and various
input plugins (random, exec, etc). The resulting output, as expected,
only contained values of the specified log_key.

Signed-off-by: Stephen Lee <[email protected]>