skmultiflow.core.Pipeline

class skmultiflow.core.Pipeline(steps)
[Experimental] Holds a set of sequential operations (transforms), followed by a single estimator.
It allows for easy manipulation of datasets that may require several transformation processes before being used by a learner. It also allows for the cross-validation of several steps.
Each of the intermediate steps should be an extension of the BaseTransform class, or at least implement either the transform and partial_fit functions or the partial_fit_transform function.
The last step should be an estimator (learner), so it should implement at least partial_fit and predict.
Since its last step is an estimator, the Pipeline acts like an estimator itself and can be passed directly to evaluation objects, as if it were a learner.
 Parameters
steps (list of tuple) – List of tuples containing the transforms and the final estimator. The list doesn’t need to contain any transform, but the estimator is required. Each tuple should have the format (‘name’, estimator).
 Raises
TypeError – If the intermediate steps or the final estimator do not implement the necessary functions for the pipeline to work.
NotImplementedError – Some of the functions are yet to be implemented.
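The contract behind the TypeError can be sketched as a duck-typing check. The helper below is a hypothetical illustration, not skmultiflow's own validation code: validate_steps and the toy classes are assumed names, and the required methods are taken from the class description above.

```python
def validate_steps(steps):
    """Check a list of (name, object) tuples against the documented contract.

    Hypothetical helper for illustration only; not part of skmultiflow.
    """
    if not steps:
        raise TypeError("steps must contain at least the final estimator")

    # Everything but the last tuple is treated as a transform.
    *transforms, (final_name, final_estimator) = steps

    for name, transform in transforms:
        has_separate = hasattr(transform, "partial_fit") and hasattr(transform, "transform")
        has_combined = hasattr(transform, "partial_fit_transform")
        if not (has_separate or has_combined):
            raise TypeError(
                f"Step '{name}' must implement partial_fit and transform, "
                "or partial_fit_transform"
            )

    # The final step must behave like a learner.
    for method in ("partial_fit", "predict"):
        if not hasattr(final_estimator, method):
            raise TypeError(f"Final step '{final_name}' must implement {method}")
    return True


class ToyTransform:
    def partial_fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


class ToyLearner:
    def partial_fit(self, X, y, classes=None):
        return self

    def predict(self, X):
        return [0 for _ in X]


class NotALearner:
    pass


validate_steps([("transform", ToyTransform()), ("learner", ToyLearner())])
```

A list holding only the estimator tuple passes the check as well, which matches the note that transforms are optional.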
Notes
This code is an experimental feature. Use with caution.
Examples
>>> # Imports
>>> from skmultiflow.lazy import KNNAdwin
>>> from skmultiflow.core import Pipeline
>>> from skmultiflow.data import FileStream
>>> from skmultiflow.evaluation import EvaluatePrequential
>>> from skmultiflow.transform import OneHotToCategorical
>>> # Setting up the stream
>>> stream = FileStream("skmultiflow/data/datasets/covtype.csv")
>>> stream.prepare_for_use()
>>> transform = OneHotToCategorical([[10, 11, 12, 13],
... [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
... 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53]])
>>> # Setting up the classifier
>>> classifier = KNNAdwin(n_neighbors=8, max_window_size=2000, leaf_size=40)
>>> # Setup the pipeline
>>> pipe = Pipeline([('transform', transform), ('passive_aggressive', classifier)])
>>> # Setup the evaluator
>>> evaluator = EvaluatePrequential(show_plot=True, pretrain_size=1000, max_samples=500000)
>>> # Evaluate
>>> evaluator.evaluate(stream=stream, model=pipe)
Methods
__init__(steps) – Initialize self.
fit(X, y) – Sequentially fit and transform data in all but the last step, then fit the model in the last step.
get_info() – Collects and returns the information about the configuration of the estimator.
get_params([deep]) – Get parameters for this estimator.
named_steps() – Generates a dictionary to access all the steps’ properties.
partial_fit(X, y[, classes]) – Sequentially partial fit and transform data in all but the last step, then partial fit data in the last step.
partial_fit_predict(X, y) – Partial fits and transforms data in all but the last step, then partial fits and predicts in the last step.
partial_fit_transform(X[, y]) – Partial fits and transforms data in all but the last step, then partial fits the last step.
predict(X) – Sequentially applies all transforms and then predicts with the last step.
reset() – Resets the estimator to its initial state.
set_params(**params) – Set the parameters of this estimator.

fit(X, y)
Sequentially fit and transform data in all but the last step, then fit the model in the last step.
 Parameters
X (numpy.ndarray of shape (n_samples, n_features)) – The data upon which the transforms/estimator will create their model.
y (An array_like object of length n_samples) – Contains the true class labels for all the samples in X.
 Returns
self
 Return type
Pipeline

get_info()
Collects and returns the information about the configuration of the estimator.
 Returns
Configuration of the estimator.
 Return type
string

get_params(deep=True)
Get parameters for this estimator.
 Parameters
deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
 Returns
params – Parameter names mapped to their values.
 Return type
mapping of string to any

named_steps()
Generates a dictionary to access all the steps’ properties.
 Returns
A steps dictionary, so that each step can be accessed by name.
 Return type
dictionary
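Conceptually, a name-to-step dictionary can be built straight from the list of tuples. The sketch below is a minimal illustration under that assumption; steps_by_name is a hypothetical helper, the strings stand in for real transform and estimator objects, and the uniqueness check is an addition for the example, not documented behaviour.

```python
def steps_by_name(steps):
    """Map each step name to its object (hypothetical helper, not skmultiflow code)."""
    names = [name for name, _ in steps]
    # Assumption for this sketch: duplicate names would silently shadow each
    # other in a dict, so reject them up front.
    if len(set(names)) != len(names):
        raise ValueError("step names must be unique")
    return dict(steps)


# Placeholder objects stand in for a real transform and estimator.
steps = [("transform", "transform object"), ("classifier", "estimator object")]
named = steps_by_name(steps)
classifier = named["classifier"]  # look a step up by its name
```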

partial_fit(X, y, classes=None)
Sequentially partial fit and transform data in all but the last step, then partial fit data in the last step.
 Parameters
X (numpy.ndarray of shape (n_samples, n_features)) – The features to train the model.
y (numpy.ndarray of shape (n_samples)) – An array-like with the class labels of all samples in X.
classes (numpy.ndarray) – Array with all possible/known class labels. This is an optional parameter, except for the first partial_fit call where it is compulsory.
 Returns
self
 Return type
Pipeline
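As a rough illustration of this flow, the sketch below chains one toy transform and one toy learner in plain Python. RunningMaxScaler, MajorityClass, and pipeline_partial_fit are hypothetical names invented for this example; the actual Pipeline may differ, for instance in how it forwards classes to each step.

```python
class RunningMaxScaler:
    """Toy transform: divides each feature by the largest absolute value seen so far."""

    def __init__(self):
        self.max_abs = None

    def partial_fit(self, X, y=None, classes=None):
        for row in X:
            if self.max_abs is None:
                self.max_abs = [abs(v) for v in row]
            else:
                self.max_abs = [max(m, abs(v)) for m, v in zip(self.max_abs, row)]
        return self

    def transform(self, X):
        return [[v / m if m else 0.0 for v, m in zip(row, self.max_abs)] for row in X]


class MajorityClass:
    """Toy learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = {}

    def partial_fit(self, X, y, classes=None):
        if classes is not None:
            # Register every known class, even those not yet observed.
            for c in classes:
                self.counts.setdefault(c, 0)
        for label in y:
            self.counts[label] = self.counts.get(label, 0) + 1
        return self

    def predict(self, X):
        best = max(self.counts, key=self.counts.get)
        return [best for _ in X]


def pipeline_partial_fit(steps, X, y, classes=None):
    """Update every transform on the batch, then update the final learner."""
    Xt = X
    for _, transform in steps[:-1]:
        Xt = transform.partial_fit(Xt, y).transform(Xt)
    steps[-1][1].partial_fit(Xt, y, classes=classes)
    return steps


steps = [("scale", RunningMaxScaler()), ("majority", MajorityClass())]
# First call: the known class labels are supplied.
pipeline_partial_fit(steps, [[2.0, 10.0], [4.0, 5.0]], [0, 1], classes=[0, 1])
# Later calls: classes can be omitted.
pipeline_partial_fit(steps, [[8.0, 1.0]], [1])
```

Each call updates the transform before the learner sees the batch, so the learner always trains on data scaled with the statistics available at that point in the stream.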

partial_fit_predict(X, y)
Partial fits and transforms data in all but the last step, then partial fits and predicts in the last step.
 Parameters
X (numpy.ndarray of shape (n_samples, n_features)) – All the samples we want to predict the label for.
y (An array_like object of length n_samples) – Contains the true class labels for all the samples in X.
 Returns
The predicted class label for all the samples in X.
 Return type

partial_fit_transform(X, y=None)
Partial fits and transforms data in all but the last step, then partial fits the last step.
 Parameters
X (numpy.ndarray of shape (n_samples, n_features)) – The data upon which the transforms/estimator will create their model.
y (An array_like object of length n_samples) – Contains the true class labels for all the samples in X.
 Returns
self
 Return type
Pipeline

predict(X)
Sequentially applies all transforms and then predicts with the last step.
 Parameters
X (numpy.ndarray of shape (n_samples, n_features)) – All the samples we want to predict the label for.
 Returns
The predicted class label for all the samples in X.
 Return type
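
The prediction path can be sketched in a few lines of plain Python. This is a minimal illustration, assuming that transforms are only applied (not updated) at prediction time; AddBias, SignOfSum, and pipeline_predict are hypothetical names, not skmultiflow classes.

```python
class AddBias:
    """Toy transform: appends a constant 1.0 feature to every sample."""

    def transform(self, X):
        return [list(row) + [1.0] for row in X]


class SignOfSum:
    """Toy estimator: predicts 1 when the feature sum is positive, else 0."""

    def predict(self, X):
        return [1 if sum(row) > 0 else 0 for row in X]


def pipeline_predict(steps, X):
    """Apply every transform in order, then predict with the final step."""
    Xt = X
    for _, transform in steps[:-1]:
        Xt = transform.transform(Xt)  # apply only; nothing is fitted here
    return steps[-1][1].predict(Xt)


steps = [("bias", AddBias()), ("sign", SignOfSum())]
preds = pipeline_predict(steps, [[-0.5, -0.2], [-3.0, 0.5]])
```

The first sample sums to a positive value once the bias feature is appended, so it is labelled 1; the second stays negative and is labelled 0.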

set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form
<component>__<parameter>
so that it’s possible to update each component of a nested object.
 Returns
self
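The `<component>__<parameter>` naming scheme can be illustrated by splitting each key on the first double underscore. The sketch below is an assumption about how such keys could be routed, not the library's code; route_params is a hypothetical helper and "classifier" is an assumed step name.

```python
def route_params(params):
    """Split keys of the form '<component>__<parameter>' into per-component dicts.

    Hypothetical helper for illustration only.
    """
    own, nested = {}, {}
    for key, value in params.items():
        component, sep, sub_key = key.partition("__")
        if sep:
            # Nested key: group the remainder under its component name.
            nested.setdefault(component, {})[sub_key] = value
        else:
            # Plain key: belongs to the outer object itself.
            own[key] = value
    return own, nested


own, nested = route_params(
    {"classifier__n_neighbors": 5, "classifier__leaf_size": 30, "steps": []}
)
```

Because only the first double underscore is split, a key such as outer__inner__param would be forwarded to the component outer as inner__param, which lets nested objects repeat the same routing one level down.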