update hyperparameter recommendations

interpretml · Apr 4, 2024 · d06e9f0
1 parent c0dc714
commit d06e9f0
Showing 1 changed file with 10 additions and 10 deletions.
diff --git a/docs/interpret/hyperparameters.md b/docs/interpret/hyperparameters.md
@@ -19,7 +19,7 @@ guidance: For max_interaction_bins, more is not necessarily better, unlike with
 ## interactions
 default: 0.95
 
-hyperparameters: [0, 0.25, 0.5, 0.75, 0.95]
+hyperparameters: [0, 0.25, 0.5, 0.75, 0.95, 5, 10, 25, 50, 100, 250]
 
 guidance: Introducing more interactions tends to improve model accuracy. Values from 0 up to but not including 1.0 are interpreted as a fraction of the number of features. For example, a dataset with 100 features and an interactions value of 0.75 will automatically detect and use 75 interactions. Values of 1 or higher give the exact number of interactions to detect: 1 creates 1 interaction, and 50 creates 50.
 
@@ -37,7 +37,7 @@ ideal: 50 (diminishing returns beyond this point)
 
 hyperparameters: [50]
 
-guidance: We suggest increasing the number of outer bags if computational resources permit, ideally up to 50 outer bags where improvements plateau.
+guidance: We suggest increasing the number of outer bags if computational resources permit, ideally up to 50-100 outer bags. As with bagging, improvement starts to plateau around 25, and there is usually little advantage to going above 100.
 
 ## inner_bags
 default: 0
@@ -48,7 +48,7 @@ ideal: 50 (diminishing returns beyond this point)
 
 hyperparameters: [0] OR if you can afford it [0, 50]
 
-guidance: The default inner_bags value of 0 disables inner bagging. Setting this parameter to 1 or other low values will typically make the model worse since model fitting will then only use a subset of the data. Increasing the number of inner bags to 50 can improve model accuracy at the cost of significantly longer training times. If computation time is not a constraint, we suggest trying both 0 and 50, but not other values in between.
+guidance: The default inner_bags value of 0 disables inner bagging. Setting this parameter to 1 or other low values will typically make the model worse since model fitting will then only use a subset of the data but not do enough inner bagging to compensate. Increasing the number of inner bags to 50 can improve model accuracy at the cost of significantly longer training times. If computation time is not a constraint, we suggest trying both 0 and 50, but not other values in between.
 
 ## learning_rate
 default: 0.01
@@ -76,23 +76,23 @@ default: 200
 
 hyperparameters: [0, 50, 100, 200, 500, 1000, 2000, 4000]
 
-guidance: The optimal smoothing_rounds value will vary depending on the dataset's characteristics. Adjust based on the prevalence of smooth feature response curves.
+guidance: This is an important hyperparameter to tune.  The optimal smoothing_rounds value will vary depending on the dataset's characteristics. Adjust based on the prevalence of smooth feature response curves.
 
 ## interaction_smoothing_rounds
 default: 50
 
 hyperparameters: [0, 50, 100, 500]
 
-guidance: interaction_smoothing_rounds appears to have only a minor impact on model accuracy. 0 is often the best choice.
+guidance: interaction_smoothing_rounds appears to have only a minor impact on model accuracy. 0 is often the most accurate choice, but the interaction shape plots will be smoother and easier to interpret with more interaction_smoothing_rounds.
 
 ## max_rounds
 default: 25000
 
-ideal: 1000000000 (early stopping should stop before this point)
+ideal: 1000000000 (early stopping should stop long before this point)
 
 hyperparameters: [1000000000]
 
-guidance: The max_rounds parameter serves as a limit to prevent excessive training on datasets where improvements taper off. Set this parameter sufficiently high to avoid curtailing the early stopping mechanism. Consider increasing it if small yet consistent gains are observed in longer trainings.
+guidance: The max_rounds parameter serves as a limit to prevent excessive training on datasets where improvements taper off. Set this parameter high enough that training ends through early stopping rather than by hitting the round limit. Consider increasing it if small yet consistent gains are observed in longer trainings.
 
 ## early_stopping_rounds
 default: 50
@@ -114,13 +114,13 @@ guidance: The default value usually works well, however experimenting with sligh
 ## min_hessian
 default: 0.0001
 
-hyperparameters: [1.0, 0.01, 0.0001, 0.000001]
+hyperparameters: [0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001]
 
-guidance: The default min_hessian is a solid starting point. While 1.0 may not be optimal in most scenarios, testing different thresholds could yield improvements depending on dataset characteristics.
+guidance: The default min_hessian is a solid starting point.
 
 ## max_leaves
 default: 3
 
 hyperparameters: [3, 4]
 
-guidance: Generally, the default setting is effective, but it's worth checking if an increment to 4 can offer better accuracy on your specific data.
+guidance: Generally, the default setting is effective, but it's worth checking if an increment to 4 can offer better accuracy on your specific data. The max_leaves parameter only applies to main effects.
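
The interactions guidance in the first hunk describes two regimes: values below 1.0 are a fraction of the feature count, and values of 1 or higher are an exact count. A minimal sketch of that rule follows. The helper name `resolve_interaction_count` is hypothetical and is not part of interpret, and rounding fractional results up is an assumption, since the guidance only gives an example that divides evenly (100 features at 0.75 yields 75).

```python
import math


def resolve_interaction_count(interactions: float, n_features: int) -> int:
    """Translate an `interactions` setting into a number of interaction terms.

    Hypothetical helper for illustration only; it is not part of interpret.
    """
    if interactions < 0:
        raise ValueError("interactions must be non-negative")
    if interactions < 1.0:
        # Fractional values scale with the feature count. Rounding up is an
        # assumption; the guidance does not say how fractions are rounded.
        return math.ceil(interactions * n_features)
    # Values of 1 or higher are taken as an exact count.
    return int(interactions)


if __name__ == "__main__":
    # Walk through some of the recommended values for a 100-feature dataset.
    for value in [0, 0.25, 0.5, 0.75, 5, 50]:
        print(value, "->", resolve_interaction_count(value, n_features=100))
```

Under this reading, the fractional entries in the recommended list grow with the dataset's width, while the newly added integer entries (5 through 250) request the same number of interactions regardless of feature count.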