cvopt.utils.mk_metafeature
mk_metafeature(X, y, logdir, model_id, target_index, cv, validation_data=None, feature_groups=None, estimator_method='predict', merge=True, loader='sklearn')
Make meta features for stacking (https://mlwave.com/kaggle-ensembling-guide/).
Parameters: - X (np.ndarray or pd.core.frame.DataFrame, shape(axis=0) = (n_samples)) – Features that were used in optimizer training. Details depend on the estimator. The meta feature corresponding to X is made using the estimators fitted during cross validation.
- y (np.ndarray or pd.core.frame.DataFrame, shape(axis=0) = (n_samples) or None, default=None.) – Target variable that was used in optimizer training. Details depend on the estimator.
- logdir (str.) – cvopt’s log directory path.
- model_id (str.) – cvopt’s model id.
- target_index (int.) – Logfile index (starting at 0). The estimator corresponding to this index is used to make the meta feature.
- cv (scikit-learn cross-validator) – Cross validation setting that was used in optimizer training.
- validation_data (tuple(X, y) or None, default=None.) – Details depend on the estimator. The meta feature corresponding to validation_data is made using the estimator fitted on the whole training data.
- feature_groups (array-like, shape = (n_samples,) or None, default=None.) – cvopt feature_groups that was used in optimizer training.
- estimator_method (str, default="predict".) – Name of the estimator method used to make the meta feature.
- merge (bool, default=True.) – If True, return a single matrix into which the per-fold results are merged.
- loader (str or function, default="sklearn".) –
Estimator loader.
- sklearn: use sklearn.externals.joblib.load. Mainly for scikit-learn estimators.
- function: a function whose argument is the estimator’s path.
Returns: X_meta, or (X_meta, X_meta_validation_data) – A tuple is returned when validation_data is given.
Return type: np.ndarray or tuple of np.ndarray.
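
The idea behind a stacking meta feature can be illustrated with a minimal sketch built on plain scikit-learn. This is not cvopt's implementation: `out_of_fold_meta_feature` is a hypothetical helper, and the `Ridge` model and synthetic data are assumptions chosen only for illustration. It shows the two behaviours described above: the meta feature for X is assembled from predictions of estimators fitted per cross-validation fold, while the meta feature for the validation data comes from an estimator fitted on the whole training data.

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold


def out_of_fold_meta_feature(estimator, X, y, cv, X_validation=None, method="predict"):
    """Hypothetical helper (not part of cvopt): stacking meta feature from fold predictions."""
    X = np.asarray(X)
    y = np.asarray(y)
    X_meta = np.zeros(len(X), dtype=float)
    for train_idx, test_idx in cv.split(X, y):
        # Fit a fresh copy on the training part of the fold, then predict the held-out part.
        fold_model = clone(estimator).fit(X[train_idx], y[train_idx])
        X_meta[test_idx] = getattr(fold_model, method)(X[test_idx])
    if X_validation is None:
        return X_meta
    # For validation data, use an estimator fitted on the whole training data.
    full_model = clone(estimator).fit(X, y)
    X_meta_validation = getattr(full_model, method)(np.asarray(X_validation))
    return X_meta, X_meta_validation


# Synthetic regression data (an assumption, for illustration only).
rng = np.random.RandomState(0)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=60)
X_val = rng.normal(size=(10, 3))

cv = KFold(n_splits=5, shuffle=True, random_state=0)
X_meta, X_meta_val = out_of_fold_meta_feature(Ridge(alpha=1.0), X, y, cv, X_validation=X_val)
```

Because every sample is predicted by a model that never saw it during fitting, `X_meta` can be passed to a second-level model without leaking the target. With mk_metafeature, the fitted estimators are instead loaded from cvopt's log directory via `logdir`, `model_id` and `target_index`.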