Limiting or extending Kafka messaging
By default, Kafka retains the past 7 days of messages, which can take up a fair amount of disk space in a production system.
To change the retention time for Kafka messages, we'll need to update the Kafka config file: kafka/config/server.properties
In the case of the datasift-connector, Chef is responsible for building the Kafka config and will rewrite kafka/config/server.properties each time it rebuilds, so editing that file directly will not persist. Instead, we'll need to update the local datasift-connector/chef/nodes/datasift-connector.json, adding a "log_retention_hours" value to the kafka.broker object.
For example:
"kafka": {
  "ulimit_file": 128000,
  "broker": {
    "log_dirs": [
      "/mnt"
    ],
    "log_retention_hours": 48,
    "zookeeper_connect": [
      "localhost:2181"
    ],
    "zookeeper_connection_timeout_ms": 15000
  }
}
....
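The same edit can be scripted instead of made by hand. The following is a minimal sketch, not part of datasift-connector: the function name set_log_retention_hours is hypothetical, and it assumes the node file is a single JSON object with "kafka" as a top-level key. Note that it rewrites the file with two-space indentation.

```python
import json
from pathlib import Path


def set_log_retention_hours(node_file, hours):
    """Set kafka.broker.log_retention_hours in a Chef node JSON file.

    Hypothetical helper: assumes the file holds one JSON object with
    "kafka" as a top-level key.
    """
    path = Path(node_file)
    node = json.loads(path.read_text())
    # Create the "kafka" and "broker" objects if they are missing,
    # leaving any existing keys (log_dirs, zookeeper_connect, ...) intact.
    broker = node.setdefault("kafka", {}).setdefault("broker", {})
    broker["log_retention_hours"] = hours
    # Write the whole document back, re-indented with two spaces.
    path.write_text(json.dumps(node, indent=2) + "\n")
    return node
```

For example, set_log_retention_hours("datasift-connector/chef/nodes/datasift-connector.json", 48) would apply the change shown above; the connector still needs to be rebuilt for it to take effect.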
When datasift-connector is rebuilt, /opt/kafka/config/server.properties (in your Vagrant / EC2 instance) will contain log.retention.hours=48
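To confirm the rebuild picked up the new value, the generated file can be inspected on the instance. This is a small sketch with a hypothetical helper, read_properties, that assumes the file uses plain key=value lines with # comments:

```python
def read_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=".
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props
```

Run on the instance, read_properties(open("/opt/kafka/config/server.properties").read()).get("log.retention.hours") should return the string "48" after the change above; values are returned as strings, not integers.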
Other Kafka config properties can be added to the kafka.broker object in the same way. For the available properties, see: http://kafka.apache.org/07/configuration.html
Note: we recommend EC2 instances with at least 2GB of memory and 20GB of storage, which should provide sufficient disk space in most cases.
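As a rough way to check whether a retention setting fits the available storage, retained data can be approximated as ingest rate multiplied by retention time. The sketch below is back-of-the-envelope arithmetic only: the function name and the 100 MB/hour rate are illustrative assumptions, not measurements from datasift-connector, and real usage can be higher because Kafka deletes whole log segments rather than individual messages.

```python
def estimated_log_size_gb(ingest_mb_per_hour, retention_hours):
    """Approximate retained Kafka log size in GB (1 GB = 1024 MB).

    Ignores index files and segment granularity, so treat the result
    as a lower bound rather than an exact figure.
    """
    return ingest_mb_per_hour * retention_hours / 1024


# Illustrative rate of 100 MB/hour (an assumption, not a measured value):
default_size = estimated_log_size_gb(100, 7 * 24)  # default 7-day retention
reduced_size = estimated_log_size_gb(100, 48)      # log_retention_hours = 48
```

At that assumed rate, the default 7-day retention works out to roughly 16.4 GB, against about 4.7 GB with 48 hours.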