Avoid costly re-building of pipelines #443
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs for the next 7 days. Thank you for your contributions.
Currently, SMAC suggests hyperparameter configurations that are independent of the dataset size. For example, the hyperparameter `classifier:max_features`, which is specified between zero and one, is transformed according to `max_features = int(n_features ** classifier:max_features)`. If the dataset in question has only 10 features, SMAC does not know that most values of the tuned hyperparameter map to the same value once it is applied to the actual model. Therefore, one needs to track the 'actual' hyperparameters after transformation, check whether they have been used before, and, if so, return a cached function value to SMAC.

Initial experiments suggest that 1-2% of the overall runs are actually re-optimizations.
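A minimal sketch of the caching idea, assuming the objective can be wrapped in a plain Python callable. The names `effective_config` and `CachedEvaluator` are hypothetical and are not part of SMAC or auto-sklearn; only the `max_features` transformation is taken from the description above.

```python
from typing import Callable, Dict, Hashable, Tuple

EffectiveKey = Tuple[Tuple[str, Hashable], ...]


def effective_config(config: Dict[str, float], n_features: int) -> EffectiveKey:
    """Map a suggested configuration to the values the model actually receives.

    Hypothetical helper: only the max_features transformation from the
    issue text is handled; every other hyperparameter is passed through.
    """
    effective = dict(config)
    if "classifier:max_features" in effective:
        effective["classifier:max_features"] = int(
            n_features ** effective["classifier:max_features"]
        )
    # A sorted tuple of items is hashable and independent of key order.
    return tuple(sorted(effective.items()))


class CachedEvaluator:
    """Wrap an expensive objective and reuse results for identical effective configs."""

    def __init__(self, evaluate: Callable[[Dict[str, Hashable]], float], n_features: int):
        self.evaluate = evaluate      # assumed: fits the pipeline and returns a loss
        self.n_features = n_features
        self.cache: Dict[EffectiveKey, float] = {}
        self.hits = 0                 # number of runs answered from the cache

    def __call__(self, config: Dict[str, float]) -> float:
        key = effective_config(config, self.n_features)
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        loss = self.evaluate(dict(key))
        self.cache[key] = loss
        return loss
```

With `n_features = 10`, the raw values 0.5 and 0.55 both transform to `max_features = 3`, so the second call would be answered from the cache instead of re-fitting the pipeline. The `hits` counter gives a rough way to measure how many runs are duplicates on a given dataset.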