cvopt.model_selection.SimpleoptCV

class SimpleoptCV(estimator, param_distributions, scoring=None, cv=5, max_iter=32, random_state=None, n_jobs=1, pre_dispatch='2*n_jobs', verbose=0, logdir=None, save_estimator=0, saver='sklearn', model_id=None, cloner='sklearn', refit=True, backend='hyperopt', **kwargs)

A wrapper around the cross-validation optimizer classes.

This class provides a unified interface to the different backends.

For details on a backend optimizer class, refer to that class's page.

Parameters:

estimator – A scikit-learn-like estimator.
param_distributions (dict.) – Search space.
scoring (string or sklearn.metrics.make_scorer.) – Evaluation metric for the search. When scoring is None, the estimator's default scorer is used, and a greater score is treated as better.
cv (scikit-learn cross-validator or int(number of folds), default=5.) – Cross validation setting.
max_iter (int, default=32.) – Number of search iterations.
random_state (int or None, default=None.) – The seed used by the random number generator.
n_jobs (int, default=1.) – Number of jobs to run in parallel.
pre_dispatch (int or string, default="2*n_jobs".) – Controls the number of jobs that get dispatched during parallel execution.
verbose (int(0, 1 or 2), default=0.) –
Controls the verbosity.

0: don’t display status.

1: display status on stdout.

2: display status as a graph.
logdir (str or None, default=None.) –
Path of the directory in which log files are saved. When logdir is None, no log is saved.

[directory structure]

logdir
  |-cv_results
  |   |-{model_id}.csv : search log
  |   …
  |-cv_results_graph
  |   |-{model_id}.html : search log (graph)
  |   …
  |-estimators_{model_id}
      |-{model_id}_index{search count}_split{fold count}.pkl : an estimator fitted on a fold's training data
      …
      |-{model_id}_index{search count}_test.pkl : an estimator fitted on the whole training data
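As a rough illustration of the logdir layout, the following sketch builds the expected file paths with pathlib. The helper name `expected_log_paths` and its arguments are hypothetical and not part of cvopt. It only formats names according to the documented pattern and makes no claim about whether the search and fold counts start at 0 or 1.

```python
from pathlib import Path


def expected_log_paths(logdir, model_id, search_index, fold_indices):
    """Hypothetical helper: format the file paths of the documented logdir layout."""
    root = Path(logdir)
    est_dir = root / f"estimators_{model_id}"
    return {
        # Search log as CSV and as an HTML graph.
        "cv_results": root / "cv_results" / f"{model_id}.csv",
        "cv_results_graph": root / "cv_results_graph" / f"{model_id}.html",
        # One estimator file per cross-validation fold of a given search.
        "fold_estimators": [
            est_dir / f"{model_id}_index{search_index}_split{fold}.pkl"
            for fold in fold_indices
        ],
        # Estimator fitted on the whole training data for that search.
        "whole_train_estimator": est_dir / f"{model_id}_index{search_index}_test.pkl",
    }
```

Which of these files actually exist depends on logdir and save_estimator as described for those parameters.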
save_estimator (int, default=0.) –
Estimator save setting.

0: No estimator is saved.

1: An estimator fitted on each fold's training data is saved per CV fold.

2: In addition to 1, an estimator fitted on the whole training data is saved per cross-validation.
saver (str or function, default="sklearn".) –
Estimator's saver.
- sklearn: use sklearn.externals.joblib.dump. Intended mainly for scikit-learn estimators.
- function: a function whose arguments are the model and the save path.
Examples
```
>>> # save_model is not defined here; it stands for a saving function such as Keras's save_model.
>>> def saver(model, path):
...     save_model(model, path + ".h5")
```
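For models without a dedicated saving function, a custom saver might look like the sketch below. It uses only the standard library, and the name `pickle_saver` and the ".pkl" suffix are illustrative assumptions rather than anything cvopt prescribes. The only part taken from the description above is the (model, path) signature.

```python
import pickle


def pickle_saver(model, path):
    """Hypothetical saver: serialize `model` to `path` + ".pkl" with pickle."""
    # `path` is treated as a string without a file extension,
    # in the same way as the `path + ".h5"` example.
    with open(path + ".pkl", "wb") as f:
        pickle.dump(model, f)
```

Such a function would then be passed as `saver=pickle_saver`.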
model_id (str or None, default=None.) – Used in log file names. When model_id is None, it is generated from the current date and time.
cloner (str or function, default="sklearn".) –
Estimator's cloner.
- sklearn: try sklearn.base.clone and, if that fails, fall back to copy.deepcopy. Intended mainly for scikit-learn estimators.
- function: a function whose argument is the model.
Examples
```
>>> def cloner(model):
...     return clone_model(model)
```
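The try/except behaviour described for the "sklearn" option can be sketched as a custom cloner. This is an illustration of the pattern, not cvopt's actual implementation, and the name `clone_with_fallback` is hypothetical.

```python
import copy


def clone_with_fallback(model):
    """Hypothetical cloner: try sklearn.base.clone, else deep-copy the model."""
    try:
        # The import may fail if scikit-learn is not installed, and clone()
        # raises TypeError for objects that are not scikit-learn estimators.
        from sklearn.base import clone
        return clone(model)
    except Exception:
        return copy.deepcopy(model)
```

Note that sklearn.base.clone returns an unfitted estimator with the same parameters, whereas copy.deepcopy copies the whole object, including any fitted state.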
refit (bool, default=True.) – Whether to refit an estimator with the best parameters found, using all training data (=X).
backend (str, default="hyperopt".) –
Backend optimizer. The following backends are supported.
- hyperopt: Sequential Model Based Global Optimization
- bayesopt: Bayesian Optimization
- gaopt: Genetic Algorithm
- randomopt: Random Search

cv_results_: dict of numpy (masked) ndarrays – A dict with keys as column headers and values as columns, that can be imported into a pandas DataFrame.

best_estimator_: estimator or dict – Estimator that was chosen by the search.

best_score_: float – Cross-validated score of the best_estimator.

best_params_: dict – Parameter setting that gave the best results on the hold out data.

Methods

__init__(estimator, param_distributions[, …])