-
Notifications
You must be signed in to change notification settings - Fork 848
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Homepage performance improvement #277
Homepage performance improvement #277
Conversation
9757fd7
to
04e5395
Compare
I just now saw this one: https://github.com/obsidiandynamics/kafdrop/pull/142/files Seeing the comment there I probably need to be splitting up that TopicVO as well and see if I can keep the partitionInfoList out of the TopicVO since its only used internally to pass along the values, not for outputting |
9b99d8b
to
a160e93
Compare
It would be nice if someone can review and give some comments so that I can hopefully finalize this and get it merged. |
a160e93
to
56fc075
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what I can understand it seems to be a very good improvement. Thank you!
I just have some concerns on the code style. For example can we rewrite function
synchronized void setAllPartitionSizes(Map<String, List<PartitionInfo>> topicsMap, List<TopicVO> topics)
to something like
synchronized Map<String, List<PartitionInfo>> getPartitionsSizes(List<TopicVO> topics)
I suspect that you have done some of the changes for performance reasons, but these kind of functions can be difficult to test and can have some strange side effects.
What do you think?
All these changes are basically just reducing the number of call to fetch kafka data or fetch less data. It has been a while since I wrote this and I actually don't remember why I made it a side effecting function. Normally I never would have done that so I suspect I had a good reason at the time. That said I totally agree that is shouldn't have to be. The offsets are only part the The other option would I think be I'm curious what you think about which direction I should go. |
Consider that I'm not an expert on this project, I'm just trying to maintain it ;-), but from what I understand I prefer your first solution.
Also here some kind of unit test can be useful, but I agree that can be difficult ... consider it as a "nice to have"! Thank you very much for you help and explanation! |
do only 2 request to retrieve all partition sized instead of 2 per topic
d271a48
to
1a3fb19
Compare
@davideicardi Thanks for starting to maintain it and the same as you I'm also not an expert on the project. I'm also not a Java programmer and don't know the programming standards for this particular project, which is why I was asking input which route I should take. I only did some cleanup yet but haven't had the time to change the method signature yet. |
@davideicardi I have tried to go the route of returning a new TopicVO with the updated data. I did not finish writing the code because I felt it became way to ugly and it also requires a deepcopy of the TopicVO and all it partitionData. I thought that copying the data and returning an updated version would be trivial to do. At least this is what I'm used to in Scala. However making a deepcopy in Java doesn't seem to be trivial at all. I would need to add a lot more code. On the Taking this into account I feel that, with the current data model, I unfortunately don't see another option than to make it a side effecting function. So I left the method name as is, since it actually setting the partitionsizes. What are your thoughts about this? |
I mentioned the consumer group issue previously and worked on that a bit and removed the getTopicsWithOffsets method again in this PR: jorkzijlstra#1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Let's wait a couple of days to see if someone has any comment and then for me we can merge it.
…mance-improvement only fetch topic data that are part of the consumer groupId (10 times faster)
I also just now merged jorkzijlstra#1 since its also cleaning up a method. |
I haven't had the chance to have a look into adding unit / integrations test. Despite this are you considering that the couple of days have passed already? |
@davideicardi Many thanks for starting to maintain this repo. |
This refactoring makes it possible to load in my case 1600 topic with 11000 topic partition in 30 seconds (tested with a kafka stack that is other side of the world which means a higher ttl) and consists of several changes. Without these changes it was impossible to load the homepage.
Main point is that this can be achieved by not going to kafka where its not needed, or combine several calls into one.
Total kafka calls removed:
TopicInfo
When retrieving the topicInfo it was first doing the listTopics and subsequenty looping over them to retrieve the partitions. The partition data is however already part of the listTopics data. Instead of going to kafka again for the data for each topic I'm just getting it from the listTopics data.
Old version:
New version
PartitionSizes
When getting the partition sizes the code now get the partition start en end offset by doing 2 call per topic. For the kafka consumer call you only give it the partitions of the topic, not the topic. Instead of giving it the partition of 1 topic why not all partition at once.
Old version:
New version
Homepage improvement
On the homepage and other pages we actually don't need the partition sizes but its being retrieved regardless. So I splitted the getTopics in 2 methods, one with and one without the partition sizes
Old version:
New version
Fix #246
Fix #233