Hyperparameter optimization is a method used to enhance the accuracy of a model, and tuning can make the difference between an average model and a highly accurate one. The goal of this project was to predict the value of a football player using a random forest on the GPU, optimizing the accuracy of the prediction based on features such as rating, skill rate, work rate, attacking rate, and position.
Link to view the Data
RAPIDS is a suite of open-source software libraries and APIs that lets you execute end-to-end data science and analytics pipelines entirely on GPUs. Imagine scikit-learn on steroids; that is RAPIDS.
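To make the "scikit-learn on steroids" point concrete, here is a minimal sketch (assuming a working RAPIDS install) of how the cuML estimator mirrors the scikit-learn one:

```python
# CPU version (scikit-learn) and GPU version (cuML) expose the same API;
# switching the import is often the only code change needed.
from sklearn.ensemble import RandomForestRegressor as skRF  # CPU
from cuml.ensemble import RandomForestRegressor as cuRF     # GPU

# Both are used the same way: model = cuRF(); model.fit(X, y); model.predict(X)
```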
Random Forest without HPO -> No-HPO (training on GPU)
Running the Random Forest with its default settings:
n_estimators = 100, max_depth = 16, n_bins = 8
Accuracy -> R2 score = 75%
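As a rough sketch of this baseline (synthetic data stands in for the player dataset, and the regressor follows cuML's documented signature), the no-HPO run looks something like:

```python
import cupy as cp
from cuml.ensemble import RandomForestRegressor
from cuml.metrics import r2_score

# Synthetic stand-in for the FIFA player data; in the real run, X holds
# features such as rating, skill rate and work rate, and y is player value.
X = cp.random.rand(5000, 12).astype(cp.float32)
y = cp.random.rand(5000).astype(cp.float32)

# The default settings quoted above.
model = RandomForestRegressor(n_estimators=100, max_depth=16, n_bins=8)
model.fit(X, y)
print("R2 score:", r2_score(y, model.predict(X)))
```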
This is where the Ray Tune module comes into the picture. Ray.Tune
number of samples = 10, number of folds = 3, n_estimators = 500 - 1500, max_depth = 10 - 20, max_features = 0.5 - 1.0, n_bins = 18
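A sketch of how these ranges might be declared as a Ray Tune search space (the variable name is illustrative):

```python
from ray import tune

# tune.uniform draws floats from the given range, which is why the trial
# table below shows fractional values for max_depth and n_estimators.
config = {
    "n_estimators": tune.uniform(500, 1500),
    "max_depth": tune.uniform(10, 20),
    "max_features": tune.uniform(0.5, 1.0),
    "n_bins": 18,  # kept fixed rather than searched
}
```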
The configuration here is restricted because the machine used for this experiment runs only a GTX 1060. If you have a more powerful GPU, you can test a wider range of configurations.
Ray randomly samples a value from each range and passes it to the model as a hyperparameter:
# curfc is cuML's RandomForestClassifier (typically imported as
# `from cuml.ensemble import RandomForestClassifier as curfc`).
# Ray samples n_estimators and max_depth as floats, so cast them to ints.
self.rf_model = curfc(
    n_estimators=int(self._model_params["n_estimators"]),
    max_depth=int(self._model_params["max_depth"]),
    n_bins=self._model_params["n_bins"],
    max_features=self._model_params["max_features"],
)
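The driver that ties this trainable to the search space is not shown above; a hedged sketch of it follows (the `WrappedTrainable` name comes from the trial table below, while the metric name, scheduler choice, and resource settings are assumptions):

```python
from ray import tune
from ray.tune.schedulers import ASHAScheduler

# num_samples=10 matches the configuration above; ASHA is one early-stopping
# scheduler that would explain trials terminating after a single iteration.
analysis = tune.run(
    WrappedTrainable,              # the Trainable that builds self.rf_model
    config=config,                 # the search space sketched earlier
    num_samples=10,
    scheduler=ASHAScheduler(metric="r2", mode="max"),
    resources_per_trial={"cpu": 2, "gpu": 1},
)
print(analysis.get_best_config(metric="r2", mode="max"))
```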
A total of 10 trials were executed, but some stopped after fewer iterations due to the early-stopping condition.
+---------------------+------------+-------+-------------+----------------+----------------+--------+------------------+
| Trial name | status | loc | max_depth | max_features | n_estimators | iter | total time (s) |
|---------------------+------------+-------+-------------+----------------+----------------+--------+------------------|
| WrappedTrainable_1 | TERMINATED | | 13.7454 | 0.975357 | 1231.99 | 3 | 234.135 |
| WrappedTrainable_2 | TERMINATED | | 15.9866 | 0.578009 | 655.995 | 1 | 35.5271 |
| WrappedTrainable_3 | TERMINATED | | 10.5808 | 0.933088 | 1101.12 | 1 | 58.8539 |
| WrappedTrainable_4 | TERMINATED | | 17.0807 | 0.510292 | 1469.91 | 1 | 98.2842 |
| WrappedTrainable_5 | TERMINATED | | 18.3244 | 0.60617 | 681.825 | 3 | 180.687 |
| WrappedTrainable_6 | TERMINATED | | 11.834 | 0.652121 | 1024.76 | 3 | 124.095 |
| WrappedTrainable_7 | TERMINATED | | 14.3195 | 0.645615 | 1111.85 | 3 | 149.505 |
| WrappedTrainable_8 | TERMINATED | | 11.3949 | 0.646072 | 866.362 | 1 | 36.0093 |
| WrappedTrainable_9 | TERMINATED | | 14.5607 | 0.892588 | 699.674 | 1 | 43.3045 |
| WrappedTrainable_10 | TERMINATED | | 15.1423 | 0.796207 | 546.45 | 3 | 112.048 |
+---------------------+------------+-------+-------------+----------------+----------------+--------+------------------+
The output for all trials is stored in trials.csv.
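Assuming trials.csv holds one row per trial with the sampled hyperparameters and the resulting score (the "r2" column name is an assumption), the best trial can be pulled out with pandas:

```python
import pandas as pd

trials = pd.read_csv("trials.csv")
best = trials.sort_values("r2", ascending=False).iloc[0]  # highest R2 first
print(best[["max_depth", "max_features", "n_estimators", "r2"]])
```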
The best-performing parameters came from trial number 6:
max_depth=11, max_features=0.6, n_estimators=1024
With these hyperparameters, the model's accuracy (R2 score) increased to 83%. As you can see, finding better parameters makes the model more accurate; all that is left now is to work on feature engineering and re-run the HPO to increase the accuracy further.
- Install RAPIDS
- Install Ray:
pip install 'ray[tune]' torch torchvision
- Clone this repo
- Run:
python3 random_forest_hpo.py