We collect the following YARN metrics (a collection sketch follows the list):
- Active Nodes
- Containers Allocated
- Containers Pending
- Cluster Memory Usage
- Unhealthy Nodes
- Apps Running
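A minimal sketch of how these metrics can be pulled from the YARN ResourceManager REST API (`/ws/v1/cluster/metrics`), assuming the ResourceManager is reachable on its default HTTP port. The output metric names and the memory-usage percentage are illustrative choices, not fixed by the original tool.

```python
# Sketch: collect the YARN metrics listed above from the ResourceManager REST API.
import requests

RM_METRICS_URL = "http://localhost:8088/ws/v1/cluster/metrics"  # adjust host/port for your cluster

def collect_yarn_metrics():
    # The clusterMetrics object returned by the endpoint maps directly
    # onto the metrics listed above.
    m = requests.get(RM_METRICS_URL, timeout=10).json()["clusterMetrics"]
    return {
        "ActiveNodes": m["activeNodes"],
        "ContainersAllocated": m["containersAllocated"],
        "ContainersPending": m["containersPending"],
        # Express memory usage as a percentage of total cluster memory.
        "ClusterMemoryUsage": (100.0 * m["allocatedMB"] / m["totalMB"]) if m["totalMB"] else 0.0,
        "UnhealthyNodes": m["unhealthyNodes"],
        "AppsRunning": m["appsRunning"],
    }
```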
This is a simple Python program that is expected to run via cron on the JobTracker machine. We also collect the following Hadoop metrics (a collection sketch follows the list):
- Total Map Slots
- Total Reduce Slots
- Total Nodes
- Number of Map Slots Required
- Number of Reduce Slots Required
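A sketch of pulling the MRv1 slot metrics from the JobTracker's JMX JSON servlet (`/jmx`, available in Hadoop 1.x on the JobTracker HTTP port). The bean and attribute names below are assumptions and should be checked against the actual `/jmx` output on your cluster; "slots required" is computed here as occupied plus waiting slots.

```python
# Sketch: collect the MRv1 slot metrics listed above from the JobTracker.
# NOTE: the bean name and attribute names are assumptions -- verify them
# against http://<jobtracker>:50030/jmx on your cluster.
import requests

JT_JMX_URL = "http://localhost:50030/jmx"  # JobTracker HTTP port; adjust as needed

def collect_hadoop_metrics():
    beans = requests.get(JT_JMX_URL, timeout=10).json()["beans"]
    # Locate the JobTracker metrics bean (name pattern is an assumption).
    jt = next(b for b in beans if b.get("name", "").endswith("name=JobTrackerMetrics"))
    return {
        "TotalMapSlots": jt["map_slots"],
        "TotalReduceSlots": jt["reduce_slots"],
        "TotalNodes": jt["trackers"],
        "MapSlotsRequired": jt["occupied_map_slots"] + jt["waiting_maps"],
        "ReduceSlotsRequired": jt["occupied_reduce_slots"] + jt["waiting_reduces"],
    }
```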
We push all these metrics to CloudWatch periodically (a sketch of the push follows below). You can then create alarms that trigger autoscaling of Hadoop clusters. When visualized as Demand vs. Supply, it looks like this:
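A minimal sketch of the CloudWatch push using boto3 (the original program may use a different AWS client library). The namespace "HadoopCluster" and the region are illustrative; alarms on metrics such as ContainersPending or MapSlotsRequired can then drive scale-out policies.

```python
# Sketch: publish the collected metrics to CloudWatch under a custom namespace.
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # region is illustrative

def push_metrics(metrics):
    # Each key/value pair becomes one custom metric; alarms and autoscaling
    # policies can then be attached to these metrics.
    cloudwatch.put_metric_data(
        Namespace="HadoopCluster",
        MetricData=[
            {"MetricName": name, "Value": float(value), "Unit": "Count"}
            for name, value in metrics.items()
        ],
    )
```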
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0