cvopt.utils.mk_metafeature

mk_metafeature(X, y, logdir, model_id, target_index, cv, validation_data=None, feature_groups=None, estimator_method='predict', merge=True, loader='sklearn')

Make meta features for stacking (https://mlwave.com/kaggle-ensembling-guide/).

Parameters:
	X (np.ndarray or pd.core.frame.DataFrame, shape(axis=0) = (n_samples)) – Features that were used in optimizer training. Details depend on the estimator. The meta feature corresponding to X is made using the cross-validation estimators.
	y (np.ndarray or pd.core.frame.DataFrame, shape(axis=0) = (n_samples) or None, default=None) – Target variable that was used in optimizer training. Details depend on the estimator.
	logdir (str) – cvopt's log directory path.
	model_id (str) – cvopt's model id.
	target_index (int) – Log file index (starting at 0). The estimator corresponding to this index is used to make the meta feature.
	cv (scikit-learn cross-validator) – Cross-validation setting that was used in optimizer training.
	validation_data (tuple(X, y) or None, default=None) – Details depend on the estimator. The meta feature corresponding to validation_data is made using the estimator fitted on the whole training data.
	feature_groups (array-like, shape = (n_samples,) or None, default=None) – cvopt feature_groups that was used in optimizer training.
	estimator_method (str, default="predict") – Name of the estimator method used to make the meta feature.
	merge (bool, default=True) – If True, return a single matrix into which the per-fold results are merged.
	loader (str or function, default="sklearn") – Loader for the estimator. "sklearn": use sklearn.externals.joblib.load (intended for scikit-learn estimators). function: a function that takes the estimator's path as its argument.
Returns:	X_meta, or (X_meta, X_meta_validation_data) – A tuple is returned when validation_data is given.
Return type:	np.ndarray or tuple of np.ndarray.
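
The idea behind the two outputs can be illustrated with a small, self-contained sketch. This is not cvopt's implementation: the names MeanRegressor, kfold_indices and out_of_fold_meta are hypothetical, the estimator is a toy that predicts the training mean, and estimators are refitted here rather than loaded from a log directory. It only shows how a meta feature for X can come from out-of-fold predictions, while a meta feature for validation data comes from an estimator fitted on all training data.

```python
from statistics import mean


class MeanRegressor:
    """Toy estimator (hypothetical): predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = mean(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs for contiguous, unshuffled folds."""
    sizes = [n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
             for i in range(n_splits)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        held_out = set(test_idx)
        train_idx = [i for i in range(n_samples) if i not in held_out]
        yield train_idx, test_idx
        start += size


def out_of_fold_meta(X, y, n_splits, validation_X=None):
    """Return out-of-fold predictions for X, plus predictions for validation_X if given."""
    X_meta = [None] * len(X)
    for train_idx, test_idx in kfold_indices(len(X), n_splits):
        # Fit on the training part of the fold only.
        est = MeanRegressor().fit([X[i] for i in train_idx],
                                  [y[i] for i in train_idx])
        # Each sample is predicted by an estimator that never saw it.
        preds = est.predict([X[i] for i in test_idx])
        for i, p in zip(test_idx, preds):
            X_meta[i] = p
    if validation_X is None:
        return X_meta
    # Validation data: use an estimator fitted on the whole training data.
    full = MeanRegressor().fit(X, y)
    return X_meta, full.predict(validation_X)


X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 3.0, 5.0, 7.0]
X_meta, X_meta_val = out_of_fold_meta(X, y, n_splits=2, validation_X=[[4.0]])
```

As with mk_metafeature, the sketch returns a single array when no validation data is passed and a tuple otherwise. In a stacking setup, X_meta would be used as an input feature for a second-level model.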