.With() function repeating keys/values #622

Open
henriquechehad opened this issue Aug 15, 2018 · 4 comments

Comments

@henriquechehad
The .With() function repeats the same key/value pair for every With call. Should it overwrite the previous values, or should there be a function to remove specific keys/values previously set?


package main

import "go.uber.org/zap"

func main() {
	logger, _ := zap.NewProduction()
	defer logger.Sync()
	sugar := logger.Sugar()

	sugar = sugar.With("test", "value")
	sugar.Info("message 01")

	// prints:
	// {"level":"info","ts":1534362222.643982,"caller":"tmp/main.go:11","msg":"message 01","test":"value"}

	sugar = sugar.With("test", "value")
	sugar.Info("message 02")

	// prints duplicated "test" "value"
	// {"level":"info","ts":1534362222.644039,"caller":"tmp/main.go:14","msg":"message 02","test":"value","test":"value"}
}
@henriquechehad henriquechehad changed the title .With() function repeating key/values .With() function repeating keys/values Aug 15, 2018
@akshayjshah
Contributor

This is as designed, and is documented in NewJSONEncoder. It's technically allowable by the JSON specification, and all deserialization code that I'm aware of (including Go's standard library) preserves only the last value.

Supporting a last-writer-wins policy as you suggest is remarkably difficult in zap (and similar projects): zap is fast because it encodes fields as they're added, without maintaining an intermediate representation (commonly a map[string]interface{}). It's technically possible, and could even be made reasonably fast in the case where there are no duplicates, but we haven't run into many situations where it's valuable. Usually we hit this when two developers are mistakenly stomping on each other's log data, so keeping both values in the output makes debugging easier.

In short, this is functioning as designed. If you're interested in a fairly complex PR, I can guide you through how we might implement this feature and benchmark the performance impact.
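In the meantime, a caller-side workaround is to deduplicate the key/value pairs yourself before handing them to With. The dedupPairs helper below is a hypothetical sketch (not part of zap's API) that applies last-writer-wins to the loosely typed pair list SugaredLogger.With accepts:

```go
package main

import "fmt"

// dedupPairs keeps only the last value for each string key in a flat
// key/value list like the one SugaredLogger.With takes. Illustrative
// caller-side workaround; not part of zap.
func dedupPairs(pairs []interface{}) []interface{} {
	last := map[string]int{} // key -> index of that key in out
	out := make([]interface{}, 0, len(pairs))
	for i := 0; i+1 < len(pairs); i += 2 {
		key, ok := pairs[i].(string)
		if !ok {
			// Non-string keys: pass through untouched.
			out = append(out, pairs[i], pairs[i+1])
			continue
		}
		if j, seen := last[key]; seen {
			out[j+1] = pairs[i+1] // last writer wins
			continue
		}
		out = append(out, key, pairs[i+1])
		last[key] = len(out) - 2
	}
	return out
}

func main() {
	fmt.Println(dedupPairs([]interface{}{"test", "value", "test", "value2"}))
	// [test value2]
}
```

You would then call sugar.With(dedupPairs(fields)...) instead of sugar.With(fields...); this only helps within a single With call site, since zap still can't see keys baked into earlier clones.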

@geoah

geoah commented Mar 3, 2021

Just a note:

This seems to be an issue when sending such logs to GCP's Stackdriver, which takes the duplicated fields and concatenates their values.

{
	"trace": "foo",
	"trace": "foo"
}

will be presented in stackdriver as

{
	"trace": "foofoo"
}

@akshayjshah
Contributor

Huh! I'm a little surprised by this, but perhaps Google's also trying to avoid dropping the duplicate data. I'm no longer at Uber, so the current maintainers have the final say on whether to make any changes to the current duplicate-handling code.

If this is particularly inconvenient for you, you can wrap zap's JSON encoder and fix it yourself. To keep performance reasonably good, you could use a Bloom filter:

  • The JSON encoder can keep track of all keys seen so far in a uint64 Bloom filter.
  • If it sees a key that's possibly a duplicate, it can unmarshal the JSON accumulated so far into a map[string]json.RawMessage, overwrite the existing data (if any), and re-serialize it.

That would keep serialization reasonably fast and zero-allocation when there are no duplicates. I haven't thought through how you'd handle duplicates in nested objects (created by zap.Namespace).
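A minimal sketch of the Bloom-filter piece of that idea. The bloom64 type and its hashing scheme below are assumptions for illustration, not zap code; a real encoder wrapper would consult maybeSeen before each AddString and fall back to the re-serialize path on a hit:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom64 is a tiny Bloom filter backed by a single uint64, cheap enough
// to copy when an encoder is cloned.
type bloom64 uint64

// bits derives two bit positions (0..63) from the FNV-64a hash of the key.
func bits(key string) [2]uint {
	h := fnv.New64a()
	h.Write([]byte(key))
	sum := h.Sum64()
	return [2]uint{uint(sum % 64), uint((sum >> 6) % 64)}
}

// maybeSeen reports whether key was possibly added before (false positives
// are possible, false negatives are not), then records the key.
func (b *bloom64) maybeSeen(key string) bool {
	var mask bloom64
	for _, bit := range bits(key) {
		mask |= 1 << bit
	}
	seen := *b&mask == mask
	*b |= mask
	return seen
}

func main() {
	var b bloom64
	fmt.Println(b.maybeSeen("test")) // false: first sighting, fast path
	fmt.Println(b.maybeSeen("test")) // true: possible duplicate, take the slow dedup path
}
```

With only 64 bits the false-positive rate climbs quickly as distinct keys accumulate, but a false positive just means one unnecessary trip through the slow re-serialization path, so correctness is preserved.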

@pohly

pohly commented Nov 21, 2024

This is as designed, and is documented in NewJSONEncoder. It's technically allowable by the JSON specification,

According to the RFC:

The names within an object SHOULD be unique. ... When the names within an object are not unique, the behavior of software that receives such an object is unpredictable.

zap is relying on readers implementing a "last one wins" approach, but that is not required by the spec.

You are right that SHOULD is not MUST, so it's not wrong to write JSON with duplicates - it's just not guaranteed to be interoperable.
