Add kafka support for eventBus #1682
We are open to adopting a Kafka eventbus. A few things need to be addressed for the implementation:
|
Maybe we can assume both Kafka and the extra storage (e.g. a database) are managed by the providers (users). |
@whynowy what if the sensor itself was used for storage of the status? This has the benefit of mitigating your second point, and it could also be used to standardize the way information is persisted regardless of the event bus type. An (over-simplified) example:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Sensor
metadata:
  name: sensor-jetstream
spec:
  dependencies:
    - name: d1
      eventSourceName: es
      eventName: e1
    - name: d2
      eventSourceName: es
      eventName: e2
status:
  dependencies:
    d1: true
    d2: true
```
|
This means the application running in the sensor pod has the privilege to watch Sensor objects, which is something we try to avoid. |
Yes, however, the service account would only need namespace-scoped access to Sensor objects. I'm curious about the reason for avoiding this? If information is not shared via the Sensor object, I think the only other solution would be to share state through a database, which, as you point out, would need to be managed and maintained. Currently, if JetStream is used as the eventbus, then the JetStream key-value store is used for storage; this won't work with Kafka. Do you think it would be desirable to decouple this storage layer from the eventbus? |
Why not put the storage layer together with Kafka as the EventBus? |
Yes, this is possible and perhaps the best route given the circumstances, but the drawback here is tight coupling. If the storage layer and the eventbus are tightly coupled, then adding a new eventbus also requires adding a new storage technology; decoupling would have the benefit of simplifying argo-events (e.g. one code path for maintaining this state, versus duplicated code depending on whether the user chooses JetStream or Kafka as the eventbus). I understand and agree that mixing concerns between the controller and the sensor deployment is problematic. What I struggle to understand is why the sensor state is managed independently of the Sensor object. Events are consumed by the sensor deployment and transition the state of the sensor (and therefore its trigger conditions). However, the sensor state is opaque; it is not knowable by looking at the Sensor object. This feels off to me: the Sensor object itself feels like the most logical place to maintain this information. Do you think there is a way we could maintain the sensor state in the Sensor object without violating SRP (and security concerns)? For example, could the controller manage the sensor state? |
I think making it easy to use for the users is more important than anything else.
Don't confuse the Sensor state with the state of the workload running in the sensor pod; they are two different things. Sensor state represents the orchestration state, and once the orchestration is done, any change on the control plane (including the k8s control plane and your applications' control plane) should not impact your workload. Think about it: if you have an application running in a k8s Deployment, and somehow you have your application status stored in the Deployment object's status, then you will see your application become unreliable because of a control plane upgrade or an etcd issue; on the other hand, other people might also see the k8s control plane being unreliable because of the application's heavy operations.
The controller should only be responsible for orchestration. |
Thanks @whynowy, we would like to tackle implementing this. From what I gather from our conversation
|
Thank you so much @dfarr ! |
Hi, I'm wondering that here the
Cheers! |
Hi @dfarr thanks for sharing the links. I use the |
There hasn't been a release since the Kafka eventbus feature was merged. You will need to use the |
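For reference, one common way to run an unreleased build is to point the controller at a development image via a kustomize image override. This is only a sketch under the assumption that swapping the controller image is sufficient; the manifest path and image tag below are placeholders, not an official release.
```yaml
# Hypothetical kustomize overlay for running a pre-release argo-events build.
# The resource path and tag are placeholders; adjust to your own install manifests.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - install.yaml                      # your local copy of the argo-events install manifest
images:
  - name: quay.io/argoproj/argo-events
    newTag: <dev-tag>                 # placeholder for the unreleased image tag
```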
Hi @dfarr, I have now deployed the
Then deployed the
This seems to work now, but when looking into the state of the bus argo complains about exotic specs:
Created an eventsource (with Kafka too) and a sensor. The eventsource pod gets notified about an event when publishing a message, but no workflow has been triggered. I assume that is due to the skipped
Any chance to get this setup working? Thanks for your help! |
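For context, a minimal Kafka EventBus manifest might look roughly like the sketch below. This assumes an externally managed ("exotic") Kafka cluster; the broker address and topic are placeholders, and the exact field names may differ between versions of the CRD.
```yaml
# Rough sketch of a Kafka EventBus backed by an externally managed broker.
# URL and topic are placeholders; verify field names against your installed CRD version.
apiVersion: argoproj.io/v1alpha1
kind: EventBus
metadata:
  name: default
spec:
  kafka:
    url: kafka.example.svc:9092   # placeholder broker address
    topic: argo-eventbus          # placeholder topic (must exist unless auto-create is enabled)
```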
Do you have a kafka broker available at
If you don't have a kafka cluster (and topics created, unless you have configured auto-create), I would expect to see a connection failure in both the EventSource and Sensor. Can you post the logs from one or both of these? |
I cannot replicate this problem locally. Can you provide logs from both your EventSource and Sensor pods? I am wondering:
|
I forgot to mention |
Closed by #2502. |
Ok, so the
Thanks for your help and patience! |
Is your feature request related to a problem? Please describe.
Kafka is widely used, and managed offerings are readily available from cloud vendors such as AWS, Google Cloud, Azure, Aliyun, Tencent Cloud, etc.
It would be good if we could add Kafka support for eventBus.
Describe the solution you'd like
Add kafka as an alternative solution for eventbus.
Describe alternatives you've considered
In argo-events, NATS is used not only as the eventBus but also for leader election of the Sensor deployment, so the first step should be to move leader election from NATS to Kubernetes. Second, add Kafka support for eventBus.
Additional context
As discussed in #1163, using Kubernetes for leader election involves extra configuration like RBAC; this should also be taken into consideration. For me, data security is more important (a cloud vendor guarantees our data) than using self-maintained NATS.
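For illustration, Kubernetes-native leader election is typically built on a Lease object, which requires roughly the following namespace-scoped RBAC. This is a generic sketch, not taken from the argo-events manifests, and all names are placeholders.
```yaml
# Generic sketch of the RBAC needed for Lease-based leader election.
# Role and ServiceAccount names are placeholders.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: sensor-leader-election        # placeholder name
rules:
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: sensor-leader-election        # placeholder name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: sensor-leader-election
subjects:
  - kind: ServiceAccount
    name: argo-events-sa              # placeholder service account
```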
Message from the maintainers:
If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.