-
Notifications
You must be signed in to change notification settings - Fork 344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ingester (and kafka) support #168
Conversation
Codecov Report
@@ Coverage Diff @@
## master #168 +/- ##
=========================================
+ Coverage 96.46% 96.7% +0.24%
=========================================
Files 30 32 +2
Lines 1469 1638 +169
=========================================
+ Hits 1417 1584 +167
- Misses 40 41 +1
- Partials 12 13 +1
Continue to review full report at Codecov.
|
pkg/deployment/collector.go
Outdated
storageType := c.jaeger.Spec.Storage.Type | ||
// If ingester options have been defined, then change storage type | ||
// to Kafka, and the storage options will be used in the Ingester instead | ||
if len(c.jaeger.Spec.Ingester.Options.Map()) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting, do we assume ingester options have to be provided?
Or in other words are not we supporting the case where no conf props are needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the easiest approach, as I believe ingester options will always be required - but is not intuitive. As mentioned in the PR description, I think a better approach would be to have more distinct strategies that effectively define the architectural structure being used. So all-in-one, another for the current production setup, and a new third option used to represent the kafka based approach.
pkg/deployment/collector_test.go
Outdated
dep := collector.Get() | ||
|
||
envvars := []v1.EnvVar{ | ||
v1.EnvVar{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: types can be omitted here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
applies to other places in this repo as well
|
||
// NewIngester builds a new Ingester struct based on the given spec | ||
func NewIngester(jaeger *v1alpha1.Jaeger) *Ingester { | ||
if jaeger.Spec.Ingester.Size <= 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we move this to normalize func?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly, but currently this is consistent with the other components. So should be handled in one go in another PR.
} | ||
|
||
// Get returns a ingester pod | ||
func (i *Ingester) Get() *appsv1.Deployment { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about consistency here...
The other creators in this repo returns object not pointer. Maybe a better to standardize and include func which check if the ingester is needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly - but the other deployments in this package (query, collector, agent) all return pointer. So this is consistent with the other deployments at the moment - so if going to be changed, it should be done in separate PR for all components.
This PR is on hold until @jpkrohling returns so we can resolve how the operator should know when ingester is being used. |
I trust your judgment about which approach is better. Initially, I would say that having the ingester options are not a problem, but I agree that it's cleaner to have the three options as you listed them. The main motivation as I see it is that we could implement new features specific for the strategy in a way that is transparent to the user. This means that we are making the operator more opinionated, which I see as a good thing. About the naming, how about:
|
@jpkrohling I was hoping we could move away from So I think we are agreed that the strategy name is required for kafka/ingester - but just need to resolve the actual names to use. |
@jpkrohling Created a new strategy called If this approach seems reasonable, then next step would be to add some e2e tests and docs. |
+1 from my side! |
README.adoc
Outdated
|
||
The agent can be injected as a sidecar on the instrumented application or as a daemonset. | ||
|
||
The query and collector services are configured with a supported storage type - currently cassandra or elasticsearch. Multiple instances of each of these components can be provisions as required for performance and resilience purposes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/can be provisions/can be provisioned/
?
README.adoc
Outdated
|
||
=== Streaming | ||
|
||
The `streaming` strategy is designed to augment the `production` strategy by providing a streaming capability that effectively sits between the collector and the backend storage (e.g. cassandra or elasticaearch). This provides the benefit of reducing the pressure on the backend storage, under high load situations, and enables other trace post processing capabilities to tap into the real time span data directly from the streaming platform (kafka). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/elasticaearch/elasticsearch/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry should have re-read :)
Signed-off-by: Gary Brown <[email protected]>
Signed-off-by: Gary Brown <[email protected]>
Signed-off-by: Gary Brown <[email protected]>
Signed-off-by: Gary Brown <[email protected]>
Signed-off-by: Gary Brown <[email protected]>
…ger instance for streaming Signed-off-by: Gary Brown <[email protected]>
Signed-off-by: Gary Brown <[email protected]>
Signed-off-by: Gary Brown <[email protected]>
Signed-off-by: Gary Brown <[email protected]>
…estarts when just demo/testing Signed-off-by: Gary Brown <[email protected]>
@jpkrohling @pavolloffay Ready for re-review. Will follow up with e2e tests in another PR. Have locally tested with strimzi. |
Signed-off-by: Gary Brown <[email protected]>
Signed-off-by: Gary Brown <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The e2e test is missing, but it's coming as part of a future PR.
Reviewed 3 of 10 files at r1, 4 of 11 files at r3, 6 of 10 files at r5, 2 of 4 files at r6.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @objectiser and @pavolloffay)
README.adoc, line 190 at r6 (raw file):
The only additional information required is to provide the details for accessing the Kafka platform, which is configured in a new `ingester` component: [source,yaml]
Would it be too much to include a complete example here, instead of a snippet? If it's too much, then perhaps a note with a link to simple-streaming.yaml
?
pkg/deployment/collector.go, line 69 at r6 (raw file):
storageType := c.jaeger.Spec.Storage.Type // If sttategy is "streaming", then change storage type
typo: s/sttategy/strategy
pkg/deployment/ingester.go, line 27 at r1 (raw file):
Previously, objectiser (Gary Brown) wrote…
Possibly, but currently this is consistent with the other components. So should be handled in one go in another PR.
+1, this is being handled in another PR (the one that adds the Update capabilities)
pkg/deployment/ingester.go, line 31 at r6 (raw file):
if jaeger.Spec.Ingester.Image == "" { jaeger.Spec.Ingester.Image = fmt.Sprintf("%s:%s", viper.GetString("jaeger-ingester-image"), viper.GetString("jaeger-version"))
This will create a conflict with #174
pkg/strategy/controller.go, line 39 at r6 (raw file):
} if strings.ToLower(jaeger.Spec.Strategy) == "streaming" {
Could you create an issue where we replace all the ToLower(...) == ...
with the EqualFold()
that you used elsewhere in this PR?
pkg/strategy/streaming.go, line 36 at r6 (raw file):
} func (c *streamingStrategy) Create() []runtime.Object {
Is this the same as the production + ingester.Get()
? I wonder if it would make sense to reuse that. Perhaps not, as the code is very simple and it might be easier/more maintainable to duplicate it, but perhaps you've given some thought about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @jpkrohling and @pavolloffay)
README.adoc, line 190 at r6 (raw file):
Previously, jpkrohling (Juraci Paixão Kröhling) wrote…
Would it be too much to include a complete example here, instead of a snippet? If it's too much, then perhaps a note with a link to
simple-streaming.yaml
?
Done.
pkg/deployment/collector.go, line 70 at r1 (raw file):
Previously, objectiser (Gary Brown) wrote…
This was the easiest approach, as I believe ingester options will always be required - but is not intuitive. As mentioned in the PR description, I think a better approach would be to have more distinct strategies that effectively define the architectural structure being used. So all-in-one, another for the current production setup, and a new third option used to represent the kafka based approach.
Now explicit 'streaming' strategy is used to determine whether ingester is required.
pkg/deployment/collector.go, line 69 at r6 (raw file):
Previously, jpkrohling (Juraci Paixão Kröhling) wrote…
typo:
s/sttategy/strategy
Done.
pkg/deployment/collector_test.go, line 289 at r1 (raw file):
Previously, pavolloffay (Pavol Loffay) wrote…
applies to other places in this repo as well
Done.
pkg/deployment/ingester.go, line 27 at r1 (raw file):
Previously, jpkrohling (Juraci Paixão Kröhling) wrote…
+1, this is being handled in another PR (the one that adds the Update capabilities)
Done.
pkg/deployment/ingester.go, line 31 at r6 (raw file):
Previously, jpkrohling (Juraci Paixão Kröhling) wrote…
This will create a conflict with #174
We can sort out the conflict depending which gets merged first.
pkg/strategy/controller.go, line 39 at r6 (raw file):
Previously, jpkrohling (Juraci Paixão Kröhling) wrote…
Could you create an issue where we replace all the
ToLower(...) == ...
with theEqualFold()
that you used elsewhere in this PR?
Done #181
pkg/strategy/streaming.go, line 36 at r6 (raw file):
Previously, jpkrohling (Juraci Paixão Kröhling) wrote…
Is this the same as the production +
ingester.Get()
? I wonder if it would make sense to reuse that. Perhaps not, as the code is very simple and it might be easier/more maintainable to duplicate it, but perhaps you've given some thought about it?
Did consider whether could make reusable - but it is slightly different as kafka is the storage in collector. So thought for now better to keep separate - but can be refactored later, especially if we introduce other strategies that share common elements.
Signed-off-by: Gary Brown <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r7.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @objectiser and @pavolloffay)
pkg/strategy/streaming.go, line 36 at r6 (raw file):
Previously, objectiser (Gary Brown) wrote…
Did consider whether could make reusable - but it is slightly different as kafka is the storage in collector. So thought for now better to keep separate - but can be refactored later, especially if we introduce other strategies that share common elements.
No, that's fine. At most, a comment in the code would be nice, but it's not required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @pavolloffay)
Signed-off-by: Gary Brown <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r8.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @pavolloffay)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @pavolloffay)
pkg/deployment/collector.go, line 70 at r1 (raw file):
Previously, objectiser (Gary Brown) wrote…
Now explicit 'streaming' strategy is used to determine whether ingester is required.
Done.
@pavolloffay don't think any of your remaining points were blockers - but can be dealt with in separate PR if they are still issues. |
This is an initial suggestion for how we can easily switch from a simple production deployment (e.g. collector and query using a common backend persistent storage) to one where Kafka is used as a buffer between collectors (publishing to Kafka) and ingesters (consuming from Kafka and storing in the specified storage).
The difference between the two configurations would be:
and
This is a simple change to the configuration, i.e. defining the ingester options, but not necessarily clear this results in a more complex architecture.
Issues to consider:
storage
definitions - whereas the CR only has one area to define these. We could list storage types, e.g.type: elasticsearch, kafka
with options for both, but it is still not clear which should be used in collector and ingester.ingester
node in the CR, equivalent to the other top level nodes, to be able to override the image - but these are just used to define information about the individual components. Not necessarily how those components collaborate with each other.So the question is whether defining the ingester should imply using the
collector->kafka->ingester
approach and that the storage option therefore moves to the final component in the chain - i.e. ingester rather than collector.If the existence of ingester options is not considered appropriate, then I would recommend we define more explicit strategies:
allInOne
as now, single deployment with agent, collector and query(what we name the 2nd and 3rd strategies is another matter)
I think this is actually the best approach, as it is clear what architecture the user wants, and the CR can then be verified to ensure it has all of the required information defined for each of the components. It also means that the
ingester
node is optional - as the kafka options could just be listed in thestorage.options
, but would still result in the ingester being configured.Feedback on the current PR and proposed change would be appreciated - however I might not work on it much until new year.
(Also requires e2e tests)
Signed-off-by: Gary Brown [email protected]