skmultiflow.evaluation.EvaluateHoldout

class skmultiflow.evaluation.EvaluateHoldout(n_wait=10000, max_samples=100000, batch_size=1, max_time=inf, metrics=None, output_file=None, show_plot=False, restart_stream=True, test_size=5000, dynamic_test_set=False)

The holdout evaluation method or periodic holdout evaluation method.

Each arriving sample is analysed only to update the learner’s statistics; the evaluator neither predicts a label or regression value for it nor computes performance metrics on it.

Performance is evaluated after every n_wait analysed samples. At that point the evaluator tests the learner on a test set formed from samples it has not yet seen, which are used to measure performance but not to train the model.

It’s possible to use the same test set for every test made or to dynamically create test sets, so that they differ from each other. If dynamic test sets are enabled, we use the data stream to create test sets on the go. This process is more likely to generate test sets that follow the current concept, in comparison to static test sets.

Thus, if concept drift is known to be present in the stream, using dynamic test sets is recommended. If no concept drift is expected, disabling this parameter will speed up the evaluation process.
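The loop described above can be sketched in plain Python. This is a conceptual illustration only, not the library's implementation: MajorityClassLearner, toy_stream and periodic_holdout are hypothetical names, and the sketch assumes that max_samples counts training samples only and that a static test set is drawn once before training starts.

```python
import random


class MajorityClassLearner:
    """Toy incremental learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = {}

    def partial_fit(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict(self, x):
        if not self.counts:
            return 0
        return max(self.counts, key=self.counts.get)


def toy_stream(seed=1):
    """Endless synthetic stream: the label is 1 when the two features sum above 1."""
    rng = random.Random(seed)
    while True:
        x = (rng.random(), rng.random())
        yield x, int(x[0] + x[1] > 1.0)


def periodic_holdout(stream, learner, n_wait, max_samples, test_size,
                     dynamic_test_set):
    """Return a list of (samples_trained, accuracy) pairs, one per test."""
    # Assumption: a static test set is drawn once and reused for every test.
    test_set = None
    if not dynamic_test_set:
        test_set = [next(stream) for _ in range(test_size)]
    results = []
    for trained in range(1, max_samples + 1):
        x, y = next(stream)
        learner.partial_fit(x, y)  # training samples are never scored
        if trained % n_wait == 0:
            if dynamic_test_set:
                # Fresh, unseen samples follow the stream's current concept.
                test_set = [next(stream) for _ in range(test_size)]
            correct = sum(learner.predict(tx) == ty for tx, ty in test_set)
            results.append((trained, correct / test_size))
    return results


results = periodic_holdout(toy_stream(), MajorityClassLearner(),
                           n_wait=100, max_samples=500, test_size=50,
                           dynamic_test_set=True)
for trained, accuracy in results:
    print(trained, round(accuracy, 3))
```

With dynamic_test_set=True every test consumes test_size extra samples from the stream, which is the cost that the static option avoids.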

Parameters
  • n_wait (int (Default: 10000)) – The number of samples to process between each test. Also defines when to update the plot if show_plot=True. Note that setting n_wait too small can significantly slow the evaluation process.

  • max_samples (int (Default: 100000)) – The maximum number of samples to process during the evaluation.

  • batch_size (int (Default: 1)) – The number of samples to pass at a time to the model(s).

  • max_time (float (Default: float("inf"))) – The maximum duration of the simulation (in seconds).

  • metrics (list, optional (Default: ['accuracy', 'kappa'])) –

    The list of metrics to track during the evaluation. Also defines the metrics that will be displayed in plots and/or logged into the output file. Valid options are:

    Classification
      • 'accuracy'
      • 'kappa'
      • 'kappa_t'
      • 'kappa_m'
      • 'true_vs_predicted'
      • 'precision' *
      • 'recall' *
      • 'f1' *
      • 'gmean' *
      * binary classification only

    Multi-target Classification
      • 'hamming_score'
      • 'hamming_loss'
      • 'exact_match'
      • 'j_index'

    Regression
      • 'mean_square_error'
      • 'mean_absolute_error'
      • 'true_vs_predicted'

    Multi-target Regression
      • 'average_mean_squared_error'
      • 'average_mean_absolute_error'
      • 'average_root_mean_square_error'

    Experimental
      • 'running_time'
      • 'model_size'

  • output_file (string, optional (Default: None)) – File name to save the summary of the evaluation.

  • show_plot (bool (Default: False)) – If True, a plot will show the progress of the evaluation. Warning: Plotting can slow down the evaluation process.

  • restart_stream (bool, optional (Default: True)) – If True, the stream is restarted once the evaluation is complete.

  • test_size (int (Default: 5000)) – The size of the test set.

  • dynamic_test_set (bool (Default: False)) – If True, will continuously change the test set, otherwise will use the same test set for all tests.
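As a rough guide to what the two default metrics measure, the sketch below computes them by hand. It assumes that 'kappa' denotes Cohen's kappa, the usual chance-corrected agreement statistic. The helper names and the sample labels are illustrative and are not part of skmultiflow.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), agreement corrected for chance."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: both sides pick the same label independently.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case (a single label everywhere); returning 0.0 is a
        # convention chosen for this sketch.
        return 0.0
    return (p_o - p_e) / (1.0 - p_e)


# Illustrative labels, not evaluation results.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(round(acc, 3), round(kappa, 3))
```

On a balanced stream a learner that guesses at random scores an accuracy near 0.5 but a kappa near 0, which is why the two are usually tracked together.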

Notes

  1. This evaluator can process a single learner to track its performance, or multiple learners at a time to compare different models on the same stream.

Examples

>>> # The first example demonstrates how to evaluate one model
>>> from skmultiflow.data import SEAGenerator
>>> from skmultiflow.trees import HoeffdingTree
>>> from skmultiflow.evaluation import EvaluateHoldout
>>>
>>> # Set the stream
>>> stream = SEAGenerator(random_state=1)
>>> stream.prepare_for_use()
>>>
>>> # Set the model
>>> ht = HoeffdingTree()
>>>
>>> # Set the evaluator
>>> evaluator = EvaluateHoldout(max_samples=100000,
...                             max_time=1000,
...                             show_plot=True,
...                             metrics=['accuracy', 'kappa'],
...                             dynamic_test_set=True)
>>>
>>> # Run evaluation
>>> evaluator.evaluate(stream=stream, model=ht, model_names=['HT'])

>>> # The second example demonstrates how to compare two models
>>> from skmultiflow.data import SEAGenerator
>>> from skmultiflow.trees import HoeffdingTree
>>> from skmultiflow.bayes import NaiveBayes
>>> from skmultiflow.evaluation import EvaluateHoldout
>>>
>>> # Set the stream
>>> stream = SEAGenerator(random_state=1)
>>> stream.prepare_for_use()
>>>
>>> # Set the model
>>> ht = HoeffdingTree()
>>> nb = NaiveBayes()
>>>
>>> # Set the evaluator
>>> evaluator = EvaluateHoldout(max_samples=100000,
...                             max_time=1000,
...                             show_plot=True,
...                             metrics=['accuracy', 'kappa'],
...                             dynamic_test_set=True)
>>>
>>> # Run evaluation
>>> evaluator.evaluate(stream=stream, model=[ht, nb], model_names=['HT', 'NB'])

Methods

  • __init__([n_wait, max_samples, batch_size, …]) – Initialize self.

  • evaluate(stream, model[, model_names]) – Evaluates a learner or set of learners on samples from a stream.

  • evaluation_summary()

  • get_current_measurements([model_idx]) – Get current measurements from the evaluation (measured on last n_wait samples).

  • get_info() – Collects and returns the information about the configuration of the estimator.

  • get_mean_measurements([model_idx]) – Get mean measurements from the evaluation.

  • get_measurements([model_idx]) – Get measurements from the evaluation.

  • get_params([deep]) – Get parameters for this estimator.

  • partial_fit(X, y[, classes, sample_weight]) – Partially fit all the learners on the given data.

  • predict(X) – Predicts with the estimator(s) being evaluated.

  • reset() – Resets the estimator to its initial state.

  • set_params(**params) – Set the parameters of this estimator.

  • update_progress_bar(curr, total, steps, time)

evaluate(stream, model, model_names=None)

Evaluates a learner or set of learners on samples from a stream.

Parameters
  • stream (Stream) – The stream from which to draw the samples.

  • model (StreamModel or list) – The learner or list of learners to evaluate.

  • model_names (list, optional (Default=None)) – A list with the names of the learners.

Returns

The trained learner(s).

Return type

StreamModel or list

get_current_measurements(model_idx=None)

Get current measurements from the evaluation (measured on last n_wait samples).

Parameters

model_idx (int, optional (Default=None)) – Indicates the index of the model as defined in evaluate(model). If None, returns a list with the measurements for each model.

Returns

Current measurements. If model_idx is None, returns a list with the measurements for each model.

Return type

measurements or list

Raises

IndexError – If the index is invalid.

get_info()

Collects and returns the information about the configuration of the estimator.

Returns

Configuration of the estimator.

Return type

string

get_mean_measurements(model_idx=None)

Get mean measurements from the evaluation.

Parameters

model_idx (int, optional (Default=None)) – Indicates the index of the model as defined in evaluate(model). If None, returns a list with the measurements for each model.

Returns

Mean measurements. If model_idx is None, returns a list with the measurements for each model.

Return type

measurements or list

Raises

IndexError – If the index is invalid.

get_measurements(model_idx=None)

Get measurements from the evaluation.

Parameters

model_idx (int, optional (Default=None)) – Indicates the index of the model as defined in evaluate(model). If None, returns a list with the measurements for each model.

Returns

Mean and current measurements. If model_idx is None, each member of the tuple is a list with the measurements for each model.

Return type

tuple (mean, current)

Raises

IndexError – If the index is invalid.
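The model_idx contract shared by the three measurement getters can be sketched as follows. This is an illustration under stated assumptions, not the library's code: select_measurements is a hypothetical helper, plain dicts stand in for the library's measurement objects, the numbers are placeholders rather than results, and treating negative indices as invalid is a choice made for this sketch.

```python
def select_measurements(per_model, model_idx=None):
    """Return every model's entry when model_idx is None, else a single entry."""
    if model_idx is None:
        return list(per_model)
    # Assumption: only indices 0 .. n_models - 1 are valid in this sketch.
    if not 0 <= model_idx < len(per_model):
        raise IndexError("invalid model index: {}".format(model_idx))
    return per_model[model_idx]


# Placeholder values for two models, in the order they were passed to evaluate().
mean = [{'accuracy': 0.91}, {'accuracy': 0.87}]
current = [{'accuracy': 0.93}, {'accuracy': 0.85}]


def get_measurements(model_idx=None):
    """Mirror the (mean, current) tuple shape described above."""
    return (select_measurements(mean, model_idx),
            select_measurements(current, model_idx))


print(get_measurements(1))  # measurements of the second model only
print(get_measurements())   # one list per tuple member, covering all models
```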

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

mapping of string to any

partial_fit(X, y, classes=None, sample_weight=None)

Partially fit all the learners on the given data.

Parameters
  • X (numpy.ndarray of shape (n_samples, n_features)) – The data upon which the algorithm will create its model.

  • y (Array-like) – An array-like containing the classification labels / target values for all samples in X.

  • classes (list) – Stores all the classes that may be encountered during the classification task. Not used for regressors.

  • sample_weight (Array-like) – Sample weights. If not provided, uniform weights are assumed.

Returns

self

Return type

EvaluateHoldout

predict(X)

Predicts with the estimator(s) being evaluated.

Parameters

X (numpy.ndarray of shape (n_samples, n_features)) – All the samples we want to predict the label for.

Returns

Model(s) predictions

Return type

list of numpy.ndarray

reset()

Resets the estimator to its initial state.

Returns

self

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns

self
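The <component>__<parameter> naming used by set_params can be illustrated with a small key splitter. This is a sketch of the convention only, not the method's implementation; split_nested_params is a hypothetical helper, and clf, max_depth and tol are made-up names.

```python
def split_nested_params(params):
    """Separate an object's own parameters from those aimed at nested components."""
    own = {}
    nested = {}
    for key, value in params.items():
        # Split on the first double underscore only, so deeper nesting
        # (a__b__c) stays intact for the component to resolve.
        component, sep, sub_key = key.partition('__')
        if sep:
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested


own, nested = split_nested_params({
    'n_wait': 500,          # applies to the object itself
    'clf__max_depth': 3,    # applies to the hypothetical component 'clf'
    'clf__tol': 0.1,
})
print(own)
print(nested)
```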