org.sonar.l10n.py.rules.python.S6973.html
This rule raises an issue when a machine learning estimator or optimizer is instantiated without specifying the important hyperparameters.

Why is this an issue?

When instantiating an estimator or an optimizer, default values for any hyperparameters that are not specified will be used. Relying on the default values can lead to non-reproducible results across different versions of the library.

Furthermore, the default values might not be the best choice for the specific problem at hand and can lead to suboptimal performance.
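To make the reproducibility concern concrete, here is a minimal, purely illustrative sketch. The constructors and default values below are hypothetical (they stand in for two releases of the same library), not taken from scikit-learn or PyTorch:

```python
# Hypothetical constructors standing in for the same estimator in two
# library versions, where the upstream default changed between releases.
def make_classifier_v1(n_neighbors=5):
    return {"n_neighbors": n_neighbors}

def make_classifier_v2(n_neighbors=7):  # default changed upstream
    return {"n_neighbors": n_neighbors}

# Relying on the default silently yields a different model per version:
assert make_classifier_v1() != make_classifier_v2()

# Passing the hyperparameter explicitly keeps behaviour stable:
assert make_classifier_v1(n_neighbors=5) == make_classifier_v2(n_neighbors=5)
```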

Here are the estimators and the parameters considered by this rule:

Scikit-learn - Estimator Hyperparameters

AdaBoostClassifier: learning_rate
AdaBoostRegressor: learning_rate
GradientBoostingClassifier: learning_rate
GradientBoostingRegressor: learning_rate
HistGradientBoostingClassifier: learning_rate
HistGradientBoostingRegressor: learning_rate
RandomForestClassifier: min_samples_leaf, max_features
RandomForestRegressor: min_samples_leaf, max_features
ElasticNet: alpha, l1_ratio
NearestNeighbors: n_neighbors
KNeighborsClassifier: n_neighbors
KNeighborsRegressor: n_neighbors
NuSVC: nu, kernel, gamma
NuSVR: C, kernel, gamma
SVC: C, kernel, gamma
SVR: C, kernel, gamma
DecisionTreeClassifier: ccp_alpha
DecisionTreeRegressor: ccp_alpha
MLPClassifier: hidden_layer_sizes
MLPRegressor: hidden_layer_sizes
PolynomialFeatures: degree, interaction_only

PyTorch - Optimizer Hyperparameters

Adadelta: lr, weight_decay
Adagrad: lr, weight_decay
Adam: lr, weight_decay
AdamW: lr, weight_decay
SparseAdam: lr
Adamax: lr, weight_decay
ASGD: lr, weight_decay
LBFGS: lr
NAdam: lr, weight_decay, momentum_decay
RAdam: lr, weight_decay
RMSprop: lr, weight_decay, momentum
Rprop: lr
SGD: lr, weight_decay, momentum
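As an illustrative sketch (not SonarSource's actual analyzer implementation), the tables above can be read as a mapping from constructor name to the keyword arguments that must be passed explicitly. The helper below is hypothetical and lists only a few entries:

```python
# Hypothetical subset of the rule's table: constructor name -> required
# hyperparameters that must appear as explicit keyword arguments.
REQUIRED_HYPERPARAMETERS = {
    "KNeighborsClassifier": {"n_neighbors"},
    "SVC": {"C", "kernel", "gamma"},
    "AdamW": {"lr", "weight_decay"},
}

def missing_hyperparameters(constructor_name, given_kwargs):
    """Return the required hyperparameters absent from a call, sorted."""
    required = REQUIRED_HYPERPARAMETERS.get(constructor_name, set())
    return sorted(required - set(given_kwargs))

print(missing_hyperparameters("SVC", {"C": 1.0}))  # ['gamma', 'kernel']
print(missing_hyperparameters("AdamW", {"lr": 1e-3, "weight_decay": 0.01}))  # []
```

A call is compliant exactly when `missing_hyperparameters` returns an empty list for it.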

How to fix it in Scikit-Learn

Specify the hyperparameters when instantiating the estimator.

Code examples

Noncompliant code example

from sklearn.neighbors import KNeighborsClassifier

clf = KNeighborsClassifier()  # Noncompliant: n_neighbors is not specified; different values can change the behaviour of the predictor significantly

Compliant solution

from sklearn.neighbors import KNeighborsClassifier

clf = KNeighborsClassifier( # Compliant
    n_neighbors=5
)

How to fix it in PyTorch

Specify the hyperparameters when instantiating the optimizer.

Code examples

Noncompliant code example

from my_model import model
from torch.optim import AdamW

optimizer = AdamW(model.parameters(), lr=0.001)  # Noncompliant: weight_decay is not specified; different values can change the behaviour of the optimizer significantly

Compliant solution

from my_model import model
from torch.optim import AdamW

optimizer = AdamW(model.parameters(), lr=0.001, weight_decay=0.003)  # Compliant

Resources

Articles & blog posts

  • Probst, P., Boulesteix, A. L., & Bischl, B. (2019). Tunability: Importance of Hyperparameters of Machine Learning Algorithms. Journal of Machine Learning Research, 20(53), 1-32.
  • van Rijn, J. N., & Hutter, F. (2018, July). Hyperparameter importance across datasets. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 2367-2376).
