Hyperpipe
The PHOTONAI Hyperpipe class lets you create a custom pipeline. It also defines the relevant analysis parameters, such as the cross-validation scheme, the hyperparameter optimization strategy, and the performance metrics of interest.
So-called PHOTONAI PipelineElements can be added to the Hyperpipe, each of them being a data-processing method or a learning algorithm. By choosing and combining data-processing methods and algorithms and arranging them with the PHOTONAI classes, both simple and complex pipeline architectures can be designed rapidly.
The PHOTONAI Hyperpipe automates the nested training, test and hyperparameter optimization procedures.
- The Hyperpipe monitors the nested-cross-validated training and test procedure,
- communicates with the hyperparameter optimization strategy,
- streams information between the pipeline elements,
- logs all results obtained and evaluates the performance,
- guides the hyperparameter optimization process by a so-called best config metric, which is used to select the best-performing hyperparameter configuration.
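The nested procedure listed above can be illustrated without PHOTONAI. The following is a minimal sketch in plain Python, not the library's implementation: the fold splitter, the toy "shrunken mean" model, and the `shrinkage` hyperparameter are all hypothetical stand-ins. It shows the outer loop producing a test set, the inner loop producing validation sets, and the best config metric (here mean squared error, which is minimized) selecting one configuration per outer fold.

```python
from statistics import mean


def k_fold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs for contiguous, unshuffled folds."""
    sizes = [n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
             for i in range(n_splits)]
    start = 0
    for size in sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


def fit_shrunken_mean(y_train, shrinkage):
    """Toy model (hypothetical): predict the training mean, shrunk towards zero."""
    return (1.0 - shrinkage) * mean(y_train)


def mean_squared_error(y_true, prediction):
    return mean((y - prediction) ** 2 for y in y_true)


def nested_cv(y, configs, outer_splits=3, inner_splits=3):
    """Return one (best_config, test_score) pair per outer fold."""
    results = []
    for outer_train, outer_test in k_fold_indices(len(y), outer_splits):
        y_train = [y[i] for i in outer_train]
        # Inner loop: score every configuration on validation sets only.
        config_scores = {}
        for shrinkage in configs:
            fold_scores = []
            for inner_train, inner_val in k_fold_indices(len(y_train), inner_splits):
                model = fit_shrunken_mean([y_train[i] for i in inner_train], shrinkage)
                y_val = [y_train[i] for i in inner_val]
                fold_scores.append(mean_squared_error(y_val, model))
            config_scores[shrinkage] = mean(fold_scores)
        # Best config metric: mean squared error, so the lowest value wins.
        best = min(config_scores, key=config_scores.get)
        # Refit on the whole outer training set, then score once on the test set.
        model = fit_shrunken_mean(y_train, best)
        y_test = [y[i] for i in outer_test]
        results.append((best, mean_squared_error(y_test, model)))
    return results


targets = [10.0, 12.0, 11.0, 9.0, 10.5, 11.5, 10.0, 9.5, 12.5]
folds = nested_cv(targets, configs=[0.0, 0.5, 0.9])
```

Because the test set of each outer fold is never seen while configurations are compared, the test score is not biased by the hyperparameter search.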
Example
from sklearn.model_selection import KFold
from photonai.base import Hyperpipe, OutputSettings
from photonai.optimization import MinimumPerformance

my_pipe = Hyperpipe(name='basic_svm_pipe_no_performance',
                    optimizer='sk_opt',
                    optimizer_params={'n_configurations': 25},
                    metrics=['mean_squared_error', 'pearson_correlation'],
                    best_config_metric='mean_squared_error',
                    outer_cv=KFold(n_splits=3, shuffle=True),
                    inner_cv=KFold(n_splits=3),
                    eval_final_performance=True,
                    output_settings=OutputSettings(project_folder='./result_folder',
                                                   mongodb_connect_url="mongodb://localhost:27017/photon_results",
                                                   save_output=True,
                                                   plots=True),
                    cache_folder='./tmp',
                    verbosity=1,
                    random_seed=42,
                    performance_constraints=[MinimumPerformance('mean_squared_error', 35, 'first'),
                                             MinimumPerformance('pearson_correlation', 0.7, 'all')])
Parameters
Parameter | Type | Description |
---|---|---|
name | str | Name of object instance |
inner_cv | BaseCrossValidator | Cross validation strategy to test hyperparameter configurations, generates the validation set |
outer_cv | BaseCrossValidator | Cross validation strategy to use for the hyperparameter search itself, generates the test set |
eval_final_performance | bool, default=True | Whether the metrics should be calculated for the test set; otherwise the test set is separated but not used |
optimizer | str or object, default="grid_search" | Hyperparameter optimization strategy to use |
metrics | [list of metric names as str] | Performance metrics of interest to calculate |
best_config_metric | str | The metric that should be maximized or minimized. It is used to choose the best hyperparameter configuration |
output_settings | OutputSettings object | Defines persistence parameters, such as the project_folder, mongodb_connect_url, save_output and plots arguments shown in the example |
verbosity | int | Level of output quantity: 0 = only PHOTONAI system logs, 1 = normal PHOTONAI output, 2 = extensive PHOTONAI output |
random_seed | int | Controls the random seed, which is passed to both numpy and each PipelineElement |
performance_constraints | [list of threshold objects] | To save computational resources and time, the evaluation of a hyperparameter configuration can be cut short: if a configuration performs worse than an expected minimum, all further inner cross-validation folds are skipped. The minimum can be a fixed threshold given by the user or a dynamic threshold that expects the configuration to be a certain percentage better than the dummy estimator's performance. |
cache_folder | str | URL or path to a directory in which PHOTONAI is allowed to write cache files. If the parameter is given, PHOTONAI caches fold- and config-wise. |
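The early stopping behind performance_constraints can be sketched in a few lines. This is an illustration under stated assumptions, not PHOTONAI's code: `evaluate_with_constraint`, `dynamic_threshold`, and the rule of checking the running mean after every fold are hypothetical, and the dynamic threshold assumes positive scores.

```python
from statistics import mean


def evaluate_with_constraint(fold_runners, threshold, greater_is_better=True):
    """Run inner folds one by one and stop once the constraint is violated.

    fold_runners: callables, each evaluating one inner fold and returning a score.
    Returns (scores, skipped), where skipped counts the folds that never ran.
    Assumption: the running mean of the scores is compared to the threshold.
    """
    scores = []
    for run_fold in fold_runners:
        scores.append(run_fold())
        current = mean(scores)
        failed = current < threshold if greater_is_better else current > threshold
        if failed:
            break  # all further inner folds are skipped
    return scores, len(fold_runners) - len(scores)


def dynamic_threshold(dummy_score, margin, greater_is_better=True):
    """Threshold demanding a relative improvement over the dummy estimator.

    Assumes positive scores; margin=0.1 means "10 percent better than dummy".
    """
    factor = 1 + margin if greater_is_better else 1 - margin
    return dummy_score * factor


executed = []  # records which folds were actually evaluated


def make_fold(score):
    """Build a stand-in for an expensive fold evaluation."""
    def run_fold():
        executed.append(score)
        return score
    return run_fold


# Fixed threshold: a correlation of at least 0.7 is expected.
runners = [make_fold(s) for s in (0.55, 0.80, 0.85)]
scores, skipped = evaluate_with_constraint(runners, threshold=0.7)

# Dynamic threshold: the error must be 10 percent below the dummy's error of 40.
error_limit = dynamic_threshold(40.0, 0.1, greater_is_better=False)
```

In this run the first fold already falls below the threshold, so only one of the three folds is evaluated and the other two are skipped.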
Attributes
Attribute | Type | Description |
---|---|---|
elements | [list of PipelineElement objects] | The sequence of preprocessing methods and algorithms that defines the pipeline |
best_config | dict | Dictionary containing the hyperparameters of the best configuration. Keys follow the sklearn convention model_name__parameter_name and map to the parameter value |
results | MDBHyperpipe | Object containing all information about the performed hyperparameter search. Holds the training and test metrics for all outer folds, inner folds and configurations, as well as additional information. |
optimum_pipe | PhotonPipeline | A pipeline object fitted with the complete feature matrix according to the best hyperparameter configuration found. |
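Since best_config uses the model_name__parameter_name key format, it can be regrouped per pipeline element for easier reading. The helper below is a small sketch; `group_by_element` and the example dictionary are hypothetical and not part of PHOTONAI.

```python
from collections import defaultdict


def group_by_element(best_config):
    """Split 'element__parameter' keys into {element: {parameter: value}}."""
    grouped = defaultdict(dict)
    for key, value in best_config.items():
        # Split on the first double underscore only; for nested elements
        # the remainder keeps its own '__' separators.
        element, _, parameter = key.partition('__')
        grouped[element][parameter] = value
    return dict(grouped)


# Hypothetical best configuration, for illustration only.
example_config = {'SVR__C': 1.0, 'SVR__kernel': 'rbf', 'PCA__n_components': 10}
grouped = group_by_element(example_config)
```

Here `grouped` holds one dictionary per element name, so the parameters chosen for each element can be read at a glance.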