Configuring a Kafka based Jaeger architecture #95

objectiser · 2018-11-07T19:40:54Z

Currently Kafka support is being added to Jaeger in two places, as a storage plugin and an ingester.

The aim of this approach is to have a collector configured with Kafka as storage, to publish spans to Kafka, and then ingesters that can consume those messages and store the spans in a real storage backend (e.g. elasticsearch/cassandra).

We need to consider how such a configuration would be defined in the operator's CR.

Currently kafka is listed as a storage type, but an operator CR can only support a single storage type, so either:

1. We treat this kafka based configuration as something else - i.e. the storage type is specified as the real storage used by the ingester, and the kafka link between the collector and the ingester is configured from a different part of the spec.
2. There are two separate CRs - one defining the collector with Kafka storage, and the other defining the ingester with real storage. The issue with this approach is that only a subset of the components may need to be configured in each CR - so query would only be defined in the second CR (as it also uses the same real storage), and agent may potentially be defined in the first, as it uses the collector.
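As a rough illustration of the two-CR alternative described above, the pair of resources might look something like the following. This is only a sketch: the resource names are hypothetical, the `ingester.enabled` field and the `kafka` option names are borrowed from proposals made later in this thread, and none of it reflects an implemented schema.

```yaml
# Hypothetical sketch only: the first CR owns the collector (and possibly the agent),
# publishing spans to Kafka.
apiVersion: io.jaegertracing/v1alpha1
kind: Jaeger
metadata:
  name: kafka-front            # hypothetical name
spec:
  strategy: production
  storage:
    type: kafka
    options:
      kafka:
        brokers: xyz           # placeholder broker list
        topic: spans
---
# Hypothetical sketch only: the second CR owns the ingester and query,
# both using the real storage backend.
apiVersion: io.jaegertracing/v1alpha1
kind: Jaeger
metadata:
  name: kafka-back             # hypothetical name
spec:
  strategy: production
  ingester:
    enabled: true              # assumed field, proposed later in the thread
  storage:
    type: es
    options:
      es:
        server-urls: http://elasticsearch:9200
      kafka:
        brokers: xyz           # must match the first CR
        topic: spans
```

The sketch makes the drawback visible: the Kafka settings are duplicated, and each CR only describes part of one logical Jaeger instance.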

Although Kafka is not yet fully supported, we need to consider how its introduction may impact the spec structure.

objectiser · 2018-11-07T19:46:29Z

My preference would be to go with option 1 - treating this Kafka based configuration as a virtual collector - so the current config still applies to the collector and backend storage used by the ingester.

The additional part is a Kafka based configuration that connects the two parts - and if defined, it would result in the collector being configured to use Kafka storage, and the ingester being deployed.

objectiser · 2018-11-08T08:05:18Z

There is also the option to have multiple storage plugins configured within the collector - e.g. kafka and elasticsearch.

So possibly we need more than two strategy values - one for allInOne but three or more 'production' type strategies. One for simple collector/storage, one for collector/multistorage and one for "distributed collector" using kafka.

The first two could potentially be collapsed into a single 'strategy', by simply allowing a comma separated list of storage types, with kafka options listed in the storage.options section.

jpkrohling · 2018-11-08T09:30:01Z

How about having a JaegerKafkaStorageSpec as a child of JaegerStorageSpec, like CassandraCreateSchema currently is?

This spec could then hold an Options object, related to the configuration of the backing storage.

apiVersion: io.jaegertracing/v1alpha1
kind: Jaeger
metadata:
  name: with-kafka
spec:
  strategy: all-in-one
  storage:
    type: kafka
    kafka:
      options:
        es:
          server-urls: http://elasticsearch:9200
          username: elastic
          password: changeme

objectiser · 2018-11-08T09:39:25Z

I think we just need to look at the different backend configurations (as in ways the components are organised) and see if there is a natural way to structure the information in the CR to provide the appropriate flexibility, but also clarity. I'll put together some examples soon.

objectiser · 2018-11-08T11:02:55Z

Possible suggestions. First for the use of kafka as a secondary storage plugin within the collector:

apiVersion: io.jaegertracing/v1alpha1
kind: Jaeger
metadata:
  name: with-es-and-kafka-storage
spec:
  strategy: production
  storage:
    type: es,kafka
    options:
      es:
        server-urls: http://elasticsearch:9200
        username: elastic
        password: changeme
      kafka:
        brokers: xyz
        topic: spans

and using the ingester approach:

apiVersion: io.jaegertracing/v1alpha1
kind: Jaeger
metadata:
  name: with-ingester
spec:
  strategy: production
  ingester:
    enabled: true
  storage:
    type: es
    options:
      es:
        server-urls: http://elasticsearch:9200
        username: elastic
        password: changeme
      kafka:
        brokers: xyz
        topic: spans

This means that, because the ingester has been enabled, the collector will use the kafka storage plugin (using the config from storage.options), and the ingester will use the actual storage type (i.e. elasticsearch in this case). So the change in deployment structure is triggered by ingester.enabled being true.

Note: the kafka options are optional if the defaults are appropriate - although it is likely that the kafka.brokers option would be required in practice.

jpkrohling · 2018-11-08T16:18:54Z

I like your suggestions. Just one thing to think about: what would happen if a user omits/forgets the ingester: enabled setting from the second example?

To me, it's still clear that the user intends to use the ingester there. In that case, the ingester option wouldn't be necessary.

objectiser · 2018-11-08T20:08:53Z

Two reasons why it wouldn't work - the current approach allows multiple storage configurations to be defined, with each one only used if it matches the storage.type - this is used in the istio helm chart as a way to define configurations for multiple potential storage backends, one of which is then selected based on the storage.type parameter.
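To make the first reason concrete, here is a sketch of a CR that defines options for several backends while selecting only one. The resource name and the cassandra values are illustrative assumptions. Under the behaviour described above, only the `es` block would take effect, so the presence of a `kafka` block on its own says nothing about whether an ingester is wanted.

```yaml
apiVersion: io.jaegertracing/v1alpha1
kind: Jaeger
metadata:
  name: multi-options          # hypothetical name
spec:
  strategy: production
  storage:
    type: es                   # only this backend's options are used
    options:
      es:
        server-urls: http://elasticsearch:9200
      cassandra:               # defined but unused while type is es
        servers: cassandra     # illustrative value
        keyspace: jaeger_v1_test   # illustrative value
      kafka:                   # defined but, without ingester.enabled, not a signal
        brokers: xyz
        topic: spans
```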

The other reason is that the kafka storage/ingester options have default values - so technically no options need to be specified - so we shouldn't rely on someone specifying storage.options.kafka.xxx as a signal that an ingester should be used.

Not sure having enabled elements is a bad thing - we might also want to support it under (for example) the query and collector components, to enable a Jaeger instance to be deployed with only some of the components. For example - in a particular namespace, we may only want to deploy query servers.
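A sketch of what per-component enabled flags could look like for the query-only case mentioned above. The `enabled` fields under `query` and `collector` are purely hypothetical here; they are an idea floated in this comment, not an existing part of the spec, and the resource name is made up.

```yaml
apiVersion: io.jaegertracing/v1alpha1
kind: Jaeger
metadata:
  name: query-only             # hypothetical name
spec:
  strategy: production
  query:
    enabled: true              # hypothetical flag: deploy query servers
  collector:
    enabled: false             # hypothetical flag: skip the collector in this namespace
  storage:
    type: es
    options:
      es:
        server-urls: http://elasticsearch:9200
```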

jpkrohling · 2018-11-09T09:28:27Z

> Two reasons why it wouldn't work

Agree with both. I guess there's no easy way to detect when the user wants to use the ingester without the flag, then.

objectiser · 2019-01-31T11:31:13Z

Implemented in #168

objectiser closed this as completed Jan 31, 2019
