Promote Burrow 1.0 RC branch to master #283

Merged
merged 16 commits into from
Dec 1, 2017

toddpalino
Contributor

The time has come to move on. It's time for Burrow 1.0!

The new code is a significant improvement over the original version, and resolves a number of technical debt issues:

  • All the bits are modular, allowing for new pieces (like notifiers, or consumer modules) to be added easily without a big impact on the rest of the code
  • The internals have test coverage, which will make it a lot easier to accept PRs
  • Configuration has been moved from gcfg to viper, which will make it more flexible
  • Logging has been moved to uber/zap. This will look a lot different, as it's structured logging.
  • The code is all documented for godoc now, and the wiki docs around config are being cleaned up.
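
To make the modular layout described above a little more concrete, here is a minimal sketch of how a coordinator might register and start pluggable modules. Every name in it (`Module`, `Coordinator`, `Register`, `logNotifier`) is an assumption made up for illustration and is not taken from the Burrow code itself.

```go
package main

import (
	"fmt"
	"sort"
)

// Module is a hypothetical lifecycle interface (illustrative names only).
type Module interface {
	Configure(name string) error
	Start() error
	Stop() error
}

// logNotifier is a toy notifier module that only tracks its own state.
type logNotifier struct {
	name    string
	running bool
}

func (n *logNotifier) Configure(name string) error {
	if name == "" {
		return fmt.Errorf("module name must not be empty")
	}
	n.name = name
	return nil
}

func (n *logNotifier) Start() error { n.running = true; return nil }
func (n *logNotifier) Stop() error  { n.running = false; return nil }

// Coordinator (hypothetical) owns all modules of one kind.
type Coordinator struct {
	factories map[string]func() Module // module class -> constructor
	modules   map[string]Module        // configured name -> instance
}

func NewCoordinator() *Coordinator {
	return &Coordinator{
		factories: make(map[string]func() Module),
		modules:   make(map[string]Module),
	}
}

// Register makes a new module class available without touching other code.
func (c *Coordinator) Register(class string, factory func() Module) {
	c.factories[class] = factory
}

// Add builds and configures one named instance of a registered class.
func (c *Coordinator) Add(class, name string) error {
	factory, ok := c.factories[class]
	if !ok {
		return fmt.Errorf("unknown module class %q", class)
	}
	m := factory()
	if err := m.Configure(name); err != nil {
		return err
	}
	c.modules[name] = m
	return nil
}

// Start starts every configured module and stops at the first failure.
func (c *Coordinator) Start() error {
	for name, m := range c.modules {
		if err := m.Start(); err != nil {
			return fmt.Errorf("starting %s: %w", name, err)
		}
	}
	return nil
}

// Names returns the configured module names in sorted order.
func (c *Coordinator) Names() []string {
	names := make([]string, 0, len(c.modules))
	for name := range c.modules {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	c := NewCoordinator()
	c.Register("log", func() Module { return &logNotifier{} })
	if err := c.Add("log", "ops-log"); err != nil {
		fmt.Println("configure failed:", err)
		return
	}
	if err := c.Start(); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	fmt.Println("running modules:", c.Names())
}
```

The point of the sketch is that adding a new kind of notifier only needs one more `Register` call; the coordinator and the other modules stay unchanged.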

There are also a lot of feature updates and bug fixes:

  • Biggest of all, topic deletion is now supported in Kafka clusters
  • Evaluation logic has been fixed to have fewer false alerts on stopped partitions
  • PID files are now more thoroughly checked to see if the process is actually running
  • TLS and SASL support have been generalized, and are fully supported for Kafka connections
  • Ownership info is tracked for new consumers
  • Dependencies have all been updated, and dep has replaced gpm
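
As an illustration of the PID file item above, one common way to check on Unix-like systems whether the process named in a PID file is still running is to send it signal 0. This is a sketch under that assumption, not the code in this PR; `parsePID` and `processAlive` are hypothetical helper names, and Windows needs a different approach.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strconv"
	"strings"
	"syscall"
)

// parsePID extracts a positive process ID from the contents of a PID file.
func parsePID(contents string) (int, error) {
	pid, err := strconv.Atoi(strings.TrimSpace(contents))
	if err != nil {
		return 0, fmt.Errorf("invalid PID file contents: %w", err)
	}
	if pid <= 0 {
		return 0, fmt.Errorf("invalid PID %d", pid)
	}
	return pid, nil
}

// processAlive reports whether a process with the given PID exists.
// Unix-only assumption: signal 0 runs the kernel's existence and
// permission checks without actually delivering a signal.
func processAlive(pid int) bool {
	if pid <= 0 {
		return false
	}
	proc, err := os.FindProcess(pid)
	if err != nil {
		return false
	}
	err = proc.Signal(syscall.Signal(0))
	// EPERM means the process exists but is owned by another user.
	return err == nil || errors.Is(err, syscall.EPERM)
}

func main() {
	// Pretend the PID file holds our own PID, as it would for a live process.
	pid, err := parsePID(fmt.Sprintf("%d\n", os.Getpid()))
	if err != nil {
		fmt.Println("bad PID file:", err)
		return
	}
	fmt.Println("process running:", processAlive(pid))
}
```

A stale PID file is then one that parses cleanly but whose process is no longer alive, which means it can be replaced safely rather than blocking startup.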

We're also saying farewell to the Slack notifier. Slack messages can easily be sent with the HTTP notifier, and there are sample templates to do that. We'll be adding more docs later on setting that up. Most services can be handled with a generic HTTP notifier, so the direction will be to add samples and docs on how to do that, rather than creating custom notifiers for everything.
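
For a rough idea of what sending Slack messages over plain HTTP involves: a Slack incoming webhook accepts a JSON body with a `text` field, so a template only has to produce that text. The sketch below uses Go's standard `text/template`; the `groupStatus` struct, its field names, and the message wording are assumptions for illustration and are not the sample templates shipped with this PR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"text/template"
)

// groupStatus is a hypothetical stand-in for the data handed to a template.
type groupStatus struct {
	Cluster string
	Group   string
	Status  string
	MaxLag  uint64
}

// slackTemplate renders the message text (illustrative wording).
const slackTemplate = `Consumer group {{.Group}} on cluster {{.Cluster}} is {{.Status}} (max lag {{.MaxLag}})`

// renderSlackPayload builds the JSON body for a Slack incoming webhook.
func renderSlackPayload(s groupStatus) ([]byte, error) {
	tmpl, err := template.New("slack").Parse(slackTemplate)
	if err != nil {
		return nil, err
	}
	var text bytes.Buffer
	if err := tmpl.Execute(&text, s); err != nil {
		return nil, err
	}
	// json.Marshal takes care of escaping the rendered text.
	return json.Marshal(map[string]string{"text": text.String()})
}

func main() {
	body, err := renderSlackPayload(groupStatus{
		Cluster: "prod",
		Group:   "billing",
		Status:  "WARN",
		MaxLag:  1200,
	})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(string(body))
}
```

Delivering the message is then an ordinary HTTP POST of that body with a `Content-Type: application/json` header to the webhook URL, which is the kind of job a generic HTTP notifier can do.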

toddpalino and others added 16 commits November 10, 2017 17:08
* Replace burrow with the proposed 1.0 framework
Look, it's essentially a complete rewrite. There's almost nothing left of the original code here, and none of the modules have been fleshed out yet.

The overall changes:
* Make burrow itself a lib wrapped with main, so we can wrap it inside other applications
* Move to a modular framework with well-defined interfaces between components
* Switch logging to uber/zap and lumberjack
* Start with being able to have parallel operation (notifier active everywhere) so we can share load between instances

* Restructure a bit to resolve import cycles

* Make sure to gitignore the built binary

* Move modules to internal packages

* Tweak logging to work on windows

* Clean up coordinators a little more

* Fix syscalls for unix vs windows

* First pass at inmemory storage module

* tests for inmemory, and fixes found during testing

* Additional tests to make sure channels are closed after replies

* Actually start the mainLoop

* Ensure only 1 storage module is allowed, and add coordinator tests

* Fix storage code and tests for problems found while testing evaluators

* Add a fixture for storage to create a coordinator with storage module for testing code outside storage

* Fixes to evaluator code based on testing

* Tests for the evaluator coordinator and caching module

* Add a fixture for the evaluator that other testing can use

* Add start/stop and multiple request tests for the evaluator coordinator

* Remove extra parens

* Fix config name

* Add group whitelists to storage module, along with tests

* Fix a potential bug in min-distance where we would never create a new offset

* moar logging

* Add a group delete request for storage modules

* Added expiration of group data via lazy deletion on request

* First pass at cluster module for kafka with limited tests

* Add a shim interface for sarama.Client and sarama.Broker

* Switch kafka cluster module to use the shim interface for sarama

* Add tests for the rest of the kafka cluster module

* Add a storage request for setting partition owner for a group

* Add kafka_client consumer module and tests

* Add consumer coordinator tests

* Move the storage request send helper to a new file

* Refactor names for the sarama shims

* Add a shim for go-zookeeper so we'll be able to test

* Implement the kafkazk consumer module and tests

* Add tests for validation routines

* comment fix

* Add tests for helpers

* Add whitelist support to consumers

* Have the PID creator also check if the process exists before exiting

* Restructure main ZK as a coordinator to use the common interface

* Start notifiers, clean up some testing

* Add tests for HTTP notifier module

* Refactor notifier coordinator to move common logic out of the modules

* Refactor notifier whitelist and threshold accept logic to coordinator

* Move template execution up to a coordinator method for consistency

* Email notifier

* Slack notifier and tests

* Use asserts instead of panics for the HTTP tests

* Fix a case in the storage fixture where it won't get all the commits

* Check http notifier profile configs

* Make maxlag template helper use the CurrentLag field

* Rename NotifierModule to just Module

* Rename StorageModule to just Module

* Rename EvaluatorModule to just Module

* Add support for ZK locks, as well as tests

* Add a ticker that can be stopped and restarted

* Make the notifier coordinator use a ZK lock with the restartable ticker

* Add HTTP server and tests

* Update dependencies

* Clean up HTTP tests so we test the router configuration code

* Few more HTTP server tests, and flesh out log level set/get

* Reorder imports

* Fix copyright comments

* Formatting cleanup

* Set httprouter to master, since it hasn't released in 2 years

* touch up logging

* Remember to set the config as valid

* Use master branch of testify

* Updates found in testing

* Check for null fields in member metadata

* Fixes to metadata handling

* Add a worker pool for inmemory to consistently process groups

* Remove the kafka_client mainLoop, as it's not useful

* Fix formatting and a duplicate logging field

* Add support for CORS headers on the HTTP server

* Add a template helper for formatting timestamps using normal Time format strings

* Add support for basic auth in the HTTP notifier

* Refactor config to use viper instead of gcfg

* add more logging in Kafka clients, and fix config loading

* fix typo in client-id config string

* Catch errors when starting coordinators

* Log the http listener info

* Clean up some of the logging
* Fix how the extras field is pulled into the HTTP response structs

* Make sure the module accept group is always called

* Pause before testing stop on the storage coordinator

* Fix conditions where notifications are sent, and add a much more robust test
* Change the loop that starts evaluations to be timed with jitter for each consumer group

* reorder imports
* Add owners to consumer group status response

* If no storage module configured, use a default

* If no evaluator module configured, use a default

* Fix default http server

* ConfigurationValid gets set by Start, not before

* cleanup methods that don't need to be exported
* Add group blacklists

* Reduce logging level for storage purging expired groups

* Start evaluator and httpserver before clusters/consumers

* Remove the requirement that you must have a cluster and consumer module defined
* Lag values should always be unsigned ints

* unnecessary cast

* Update deps
* Remove slack notifier
* Add example slack templates
* Godoc docs for everything, and resolve all golint issues
* Update example configuration files

* Fix example email template

* Update docs
@toddpalino
Contributor Author

Oh, and if it isn't obvious, this is basically a complete rewrite.

This will close the following issues:
#13 #82 #127 #146 #150 #155 #165 #167 #169 #170 #190 #200 #203 #204 #205 #210 #216 #218 #223 #225 #228 #233 #237 #241 #242 #252 #254 #256 #257

This will close the following PRs:
#156 #177 #179 #184 #186 #189 #192 #193 #211 #217 #224 #229 #244 #245


2 participants