skmultiflow.evaluation.evaluate_prequential

Classes

 EvaluatePrequential([n_wait, max_samples, …]) The prequential evaluation method, or interleaved test-then-train method, is an alternative to the traditional holdout evaluation, inherited from batch setting problems.
class skmultiflow.evaluation.evaluate_prequential.EvaluatePrequential(n_wait=200, max_samples=100000, batch_size=1, pretrain_size=200, max_time=inf, metrics=None, output_file=None, show_plot=False, restart_stream=True, data_points_for_classification=False)[source]

The prequential evaluation method, or interleaved test-then-train method, is an alternative to the traditional holdout evaluation, inherited from batch setting problems.

The prequential evaluation is designed specifically for stream settings, in the sense that each sample serves two purposes, and that samples are analysed sequentially, in order of arrival, and become immediately inaccessible.

This method uses each sample first to test the model, that is, to make a prediction, and then to train the model (partial fit). This way the model is always tested on samples that it has not seen yet.
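The test-then-train loop described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `MajorityClassLearner` and `prequential_accuracy` are hypothetical names introduced here, and samples are fed one at a time (the equivalent of batch_size=1, with no pretraining).

```python
from collections import Counter


class MajorityClassLearner:
    """Hypothetical stand-in learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Before any training, fall back to a fixed default label (0).
        if not self.counts:
            return 0
        return self.counts.most_common(1)[0][0]

    def partial_fit(self, x, y):
        # Incremental update: only the label count changes.
        self.counts[y] += 1
        return self


def prequential_accuracy(samples, learner):
    """Test on each sample first, then train on that same sample."""
    correct = 0
    total = 0
    for x, y in samples:
        if learner.predict(x) == y:  # test step: the sample is still unseen
            correct += 1
        total += 1
        learner.partial_fit(x, y)    # train step: the sample is now consumed
    return correct / total if total else 0.0


# A tiny made-up stream of (features, label) pairs.
stream = [([0.1], 1), ([0.4], 1), ([0.9], 0), ([0.2], 1), ([0.7], 1)]
accuracy = prequential_accuracy(stream, MajorityClassLearner())
print(accuracy)
```

Because the prediction always precedes the update, no sample contributes to the model before it has been used for testing.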

Parameters
• n_wait (int (Default: 200)) – The number of samples to process between each test. Also defines when to update the plot if show_plot=True. Note that setting n_wait too small can significantly slow the evaluation process.

• max_samples (int (Default: 100000)) – The maximum number of samples to process during the evaluation.

• batch_size (int (Default: 1)) – The number of samples to pass at a time to the model(s).

• pretrain_size (int (Default: 200)) – The number of samples to use to train the model before starting the evaluation. Used to enforce a ‘warm’ start.

• max_time (float (Default: float("inf"))) – The maximum duration of the simulation (in seconds).

• metrics (list, optional (Default: ['accuracy', 'kappa'])) – The list of metrics to track during the evaluation. Also defines the metrics that will be displayed in plots and/or logged into the output file. Valid options are:

Classification: 'accuracy', 'kappa', 'kappa_t', 'kappa_m', 'true_vs_predicted'
Multi-target Classification: 'hamming_score', 'hamming_loss', 'exact_match', 'j_index'
Regression: 'mean_square_error', 'mean_absolute_error', 'true_vs_predicted'
Multi-target Regression: 'average_mean_squared_error', 'average_mean_absolute_error', 'average_root_mean_square_error'
Experimental: 'running_time', 'model_size'

• output_file (string, optional (Default: None)) – File name to save the summary of the evaluation.

• show_plot (bool (Default: False)) – If True, a plot will show the progress of the evaluation. Warning: Plotting can slow down the evaluation process.

• restart_stream (bool, optional (Default: True)) – If True, the stream is restarted once the evaluation is complete.

• data_points_for_classification (bool (Default: False)) – If True, the visualization used is a cloud of data points (only works for classification).
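As an illustration of one of the classification metrics listed under metrics, the sketch below computes a kappa statistic from true and predicted labels. It assumes that 'kappa' denotes Cohen's kappa, i.e. (p_observed - p_chance) / (1 - p_chance); the function name `cohen_kappa` is hypothetical and the code is not taken from the library.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    if n == 0:
        return 0.0
    # Observed agreement: fraction of matching label/prediction pairs.
    p_observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_chance = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_chance == 1.0:
        # Degenerate case (a single class everywhere); returning 0.0 is a choice
        # made for this sketch.
        return 0.0
    return (p_observed - p_chance) / (1.0 - p_chance)


y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
kappa = cohen_kappa(y_true, y_pred)
print(kappa)
```

Unlike plain accuracy, the statistic stays near zero for a model that only reproduces the class distribution, which makes it more informative on imbalanced streams.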

Notes

1. This evaluator can process a single learner to track its performance; or multiple learners at a time, to compare different models on the same stream.

2. The metric ‘true_vs_predicted’ is intended to be informative only. It corresponds to evaluations at a specific moment which might not represent the actual learner performance across all instances.

Examples

>>> # The first example demonstrates how to evaluate one model
>>> from skmultiflow.data import SEAGenerator
>>> from skmultiflow.trees import HoeffdingTree
>>> from skmultiflow.evaluation import EvaluatePrequential
>>>
>>> # Set the stream
>>> stream = SEAGenerator(random_state=1)
>>> stream.prepare_for_use()
>>>
>>> # Set the model
>>> ht = HoeffdingTree()
>>>
>>> # Set the evaluator
>>>
>>> evaluator = EvaluatePrequential(max_samples=10000,
>>>                                 max_time=1000,
>>>                                 show_plot=True,
>>>                                 metrics=['accuracy', 'kappa'])
>>>
>>> # Run evaluation
>>> evaluator.evaluate(stream=stream, model=ht, model_names=['HT'])

>>> # The second example demonstrates how to compare two models
>>> from skmultiflow.data import SEAGenerator
>>> from skmultiflow.trees import HoeffdingTree
>>> from skmultiflow.bayes import NaiveBayes
>>> from skmultiflow.evaluation import EvaluatePrequential
>>>
>>> # Set the stream
>>> stream = SEAGenerator(random_state=1)
>>> stream.prepare_for_use()
>>>
>>> # Set the models
>>> ht = HoeffdingTree()
>>> nb = NaiveBayes()
>>>
>>> evaluator = EvaluatePrequential(max_samples=10000,
>>>                                 max_time=1000,
>>>                                 show_plot=True,
>>>                                 metrics=['accuracy', 'kappa'])
>>>
>>> # Run evaluation
>>> evaluator.evaluate(stream=stream, model=[ht, nb], model_names=['HT', 'NB'])

>>> # The third example demonstrates how to evaluate one model
>>> # and visualize the predictions using data points.
>>> # Note: In this case you cannot compare multiple models
>>> from skmultiflow.data import SEAGenerator
>>> from skmultiflow.trees import HoeffdingTree
>>> from skmultiflow.evaluation import EvaluatePrequential
>>> # Set the stream
>>> stream = SEAGenerator(random_state=1)
>>> stream.prepare_for_use()
>>> # Set the model
>>> ht = HoeffdingTree()
>>> # Set the evaluator
>>> evaluator = EvaluatePrequential(max_samples=200,
>>>                                 n_wait=1,
>>>                                 pretrain_size=1,
>>>                                 max_time=1000,
>>>                                 show_plot=True,
>>>                                 metrics=['accuracy'],
>>>                                 data_points_for_classification=True)
>>> # Run evaluation
>>> evaluator.evaluate(stream=stream, model=ht, model_names=['HT'])

evaluate(stream, model, model_names=None)[source]

Evaluates a learner or set of learners on samples from a stream.

Parameters
• stream (Stream) – The stream from which to draw the samples.

• model (StreamModel or list) – The learner or list of learners to evaluate.

• model_names (list, optional (Default=None)) – A list with the names of the learners.

Returns

The trained learner(s).

Return type

StreamModel or list

get_class_type()[source]

The class type is a string that identifies the type of object generated by that module.

Returns

The class type.

Return type

string

get_current_measurements(model_idx=None)[source]

Get current measurements from the evaluation (measured on last n_wait samples).

Parameters

model_idx (int, optional (Default=None)) – Indicates the index of the model as defined in evaluate(model). If None, returns a list with the measurements for each model.

Returns

Current measurements. If model_idx is None, returns a list with the measurements for each model.

Return type

measurements or list

Raises

IndexError – If the index is invalid.

get_info()[source]

Returns

Evaluator description.

Return type

string

get_mean_measurements(model_idx=None)[source]

Get mean measurements from the evaluation.

Parameters

model_idx (int, optional (Default=None)) – Indicates the index of the model as defined in evaluate(model). If None, returns a list with the measurements for each model.

Returns

Mean measurements. If model_idx is None, returns a list with the measurements for each model.

Return type

measurements or list

Raises

IndexError – If the index is invalid.

get_measurements(model_idx=None)[source]

Get measurements from the evaluation.

Parameters

model_idx (int, optional (Default=None)) – Indicates the index of the model as defined in evaluate(model). If None, returns a list with the measurements for each model.

Returns

Mean and current measurements. If model_idx is None, each member of the tuple is a list with the measurements for each model.

Return type

tuple (mean, current)

Raises

IndexError – If the index is invalid.
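The difference between mean and current measurements can be sketched as follows: the mean is accumulated over every sample evaluated so far, while the current value only covers the last n_wait samples. `AccuracyTracker` is a hypothetical helper written for this illustration and is not part of skmultiflow; it tracks accuracy only.

```python
from collections import deque


class AccuracyTracker:
    """Hypothetical helper: cumulative accuracy plus accuracy over a sliding window."""

    def __init__(self, n_wait):
        # The deque drops the oldest result once n_wait results are stored.
        self.window = deque(maxlen=n_wait)
        self.correct = 0
        self.total = 0

    def add_result(self, y_true, y_pred):
        hit = int(y_true == y_pred)
        self.window.append(hit)
        self.correct += hit
        self.total += 1

    def mean_accuracy(self):
        # Accuracy over all samples seen so far.
        return self.correct / self.total if self.total else 0.0

    def current_accuracy(self):
        # Accuracy over the most recent n_wait samples only.
        return sum(self.window) / len(self.window) if self.window else 0.0


tracker = AccuracyTracker(n_wait=3)
for y_true, y_pred in [(1, 0), (1, 0), (1, 1), (0, 0), (1, 1)]:
    tracker.add_result(y_true, y_pred)

mean_acc = tracker.mean_accuracy()
current_acc = tracker.current_accuracy()
print(mean_acc, current_acc)
```

In this made-up run the learner starts poorly and then improves, so the current value is higher than the mean; the windowed figure reacts faster to changes in the stream.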

partial_fit(X, y, classes=None, weight=None)[source]

Partially fit all the learners on the given data.

Parameters
• X (numpy.ndarray of shape (n_samples, n_features)) – The data upon which the algorithm will create its model.

• y (Array-like) – An array-like containing the classification targets for all samples in X.

• classes (list) – Stores all the classes that may be encountered during the classification task.

• weight (Array-like) – Instance weight. If not provided, uniform weights are assumed.

Returns

self

Return type

EvaluatePrequential

predict(X)[source]

Predicts the labels of the X samples, by calling the predict function of all the learners.

Parameters

X (numpy.ndarray of shape (n_samples, n_features)) – All the samples we want to predict the label for.

Returns

A list containing the predicted labels for all instances in X in all learners.

Return type

list
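Both partial_fit and predict fan a single call out to every learner under evaluation. The sketch below shows that pattern with two toy learners; all class and function names are hypothetical, plain lists stand in for numpy arrays, and the code is an illustration rather than the library's implementation.

```python
from collections import Counter


class MajorityLearner:
    """Hypothetical learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def partial_fit(self, X, y):
        self.counts.update(y)
        return self

    def predict(self, X):
        label = self.counts.most_common(1)[0][0] if self.counts else 0
        return [label for _ in X]


class LastLabelLearner:
    """Hypothetical learner: predicts the most recently seen label."""

    def __init__(self):
        self.last = 0

    def partial_fit(self, X, y):
        if len(y) > 0:
            self.last = y[-1]
        return self

    def predict(self, X):
        return [self.last for _ in X]


def partial_fit_all(learners, X, y):
    # Train every learner on the same batch.
    for learner in learners:
        learner.partial_fit(X, y)


def predict_all(learners, X):
    # One list of predictions per learner, in the order the learners were given.
    return [learner.predict(X) for learner in learners]


learners = [MajorityLearner(), LastLabelLearner()]
partial_fit_all(learners, [[0.1], [0.2], [0.3]], [1, 1, 0])
predictions = predict_all(learners, [[0.5], [0.6]])
print(predictions)
```

The outer list is indexed by learner and each inner list by sample, which matches the description of the return value as the predicted labels for all instances in X in all learners.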

set_params(parameter_dict)[source]

This function allows users to change some of the evaluator’s parameters by passing a dictionary whose keys are the parameter names and whose values are the new parameter values.

Parameters

parameter_dict (Dictionary) – A dictionary where the keys are the names of attributes the user wants to change, and the values are the new values of those attributes.
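A dictionary-driven parameter update of this kind can be sketched as below. `SketchEvaluator` is a hypothetical stand-in with only three attributes, and rejecting unknown names with a ValueError is an assumption of this sketch; the library may handle unknown keys differently.

```python
class SketchEvaluator:
    """Hypothetical stand-in for an evaluator; not the skmultiflow class."""

    def __init__(self, n_wait=200, max_samples=100000, show_plot=False):
        self.n_wait = n_wait
        self.max_samples = max_samples
        self.show_plot = show_plot

    def set_params(self, parameter_dict):
        # Assumption of this sketch: only existing instance attributes may be
        # changed, and unknown names raise instead of being silently ignored.
        for name, value in parameter_dict.items():
            if name not in vars(self):
                raise ValueError("Unknown parameter: {}".format(name))
            setattr(self, name, value)
        return self


evaluator = SketchEvaluator()
evaluator.set_params({'n_wait': 500, 'show_plot': True})
print(evaluator.n_wait, evaluator.show_plot, evaluator.max_samples)
```

Parameters that are not named in the dictionary keep their previous values.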