skmultiflow.drift_detection.ADWIN

class skmultiflow.drift_detection.ADWIN(delta=0.002)

Adaptive Windowing method for concept drift detection.

Parameters

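delta (float (default=0.002)) – The delta parameter for the ADWIN algorithm: the confidence value used in the change test. Smaller values make the detector more conservative, so it signals fewer changes.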

Notes

ADWIN [1] (ADaptive WINdowing) is an adaptive sliding window algorithm for detecting change and keeping updated statistics about a data stream. ADWIN allows algorithms that were not designed for drifting data to become resistant to drift.

The general idea is to keep statistics from a window of variable size while detecting concept drift.

The algorithm decides the size of the window by cutting the statistics window at different points and comparing the average of some statistic over the two resulting sub-windows. If the absolute value of the difference between the two averages surpasses a pre-defined threshold, change is detected at that point and all data before that point is discarded.
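The cut test described above can be sketched in plain Python. This is a minimal illustration, not the skmultiflow implementation: the class name NaiveAdwin is hypothetical, the window is stored as a plain list instead of buckets, values are assumed to lie in [0, 1], and the threshold is assumed to be the Hoeffding-style bound given in the ADWIN paper by Bifet and Gavaldà.

```python
import math


class NaiveAdwin:
    """Hypothetical, simplified adaptive window (no bucket compression)."""

    def __init__(self, delta=0.002):
        self.delta = delta
        self.window = []  # oldest value first

    def add(self, value):
        """Append a value; return True if old data was discarded."""
        self.window.append(float(value))
        changed = False
        # Keep cutting until no split point shows a significant difference.
        while self._drop_stale_prefix():
            changed = True
        return changed

    def _drop_stale_prefix(self):
        n = len(self.window)
        if n < 2:
            return False
        total = sum(self.window)
        delta_prime = self.delta / n
        left_sum = 0.0
        for cut in range(1, n):
            left_sum += self.window[cut - 1]
            n0, n1 = cut, n - cut
            mean0 = left_sum / n0
            mean1 = (total - left_sum) / n1
            # Assumed threshold: sqrt(ln(4 / delta') / (2 m)), where m is
            # the harmonic-style combination of the two sub-window sizes.
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            eps_cut = math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))
            if abs(mean0 - mean1) > eps_cut:
                del self.window[:cut]  # discard everything before the cut
                return True
        return False


detector = NaiveAdwin(delta=0.002)
stream = [0.0] * 300 + [1.0] * 300  # abrupt change at index 300
for i, x in enumerate(stream):
    if detector.add(x):
        print("Change detected at index", i, "- window width:", len(detector.window))
```

Scanning every split point costs time linear in the window size for each new element, which is why a practical implementation would summarise the window instead of storing each value.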

References

[1] Bifet, Albert, and Ricard Gavaldà. “Learning from time-changing data with adaptive windowing.” In Proceedings of the 2007 SIAM International Conference on Data Mining, pp. 443-448. Society for Industrial and Applied Mathematics, 2007.

Examples

>>> # Imports
>>> import numpy as np
>>> from skmultiflow.drift_detection.adwin import ADWIN
>>> adwin = ADWIN()
>>> # Simulating a data stream of uniformly random 0's and 1's
>>> data_stream = np.random.randint(2, size=2000)
>>> # Changing the data concept from index 999 to 1999 (values in 4..7)
>>> for i in range(999, 2000):
...     data_stream[i] = np.random.randint(4, high=8)
>>> # Adding stream elements to ADWIN and verifying if drift occurred
>>> for i in range(2000):
...     adwin.add_element(data_stream[i])
...     if adwin.detected_change():
...         print('Change detected in data: ' + str(data_stream[i]) + ' - at index: ' + str(i))

__init__(delta=0.002)

Initialize the detector with the given delta.

Methods

__init__([delta])

Initialize self.

add_element(value)

Add a new element to the sample window.

bucket_size(row)

delete_element()

Delete an Item from the bucket list.

detected_change()

Detects concept change in a drifting data stream.

detected_warning_zone()

If the change detector supports the warning zone, this function will return whether it’s inside the warning zone or not.

get_change()

Get drift

get_info()

Collects and returns the information about the configuration of the estimator

get_length_estimation()

Returns the length estimation.

get_params([deep])

Get parameters for this estimator.

reset()

Reset detectors

reset_change()

set_clock(clock)

set_params(**params)

Set the parameters of this estimator.

Attributes

MAX_BUCKETS

estimation

estimator_type

n_detections

total

variance

width

width_t

add_element(value)

Add a new element to the sample window.

Apart from adding the element value to the window, by inserting it in the correct bucket, it will also update the relevant statistics, in this case the total sum of all values, the window width and the total variance.
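As an illustration of the bookkeeping described above, the following sketch keeps a running total, width and variance term as values arrive. It is a simplification under stated assumptions: the class name RunningStats is hypothetical, there are no buckets, and the variance attribute is assumed here to hold the sum of squared deviations from the mean, not a normalised variance.

```python
class RunningStats:
    """Hypothetical helper: incremental total, width and variance term."""

    def __init__(self):
        self.width = 0        # number of values seen
        self.total = 0.0      # sum of all values
        self.variance = 0.0   # sum of squared deviations from the mean

    def add(self, value):
        if self.width > 0:
            mean = self.total / self.width
            # Incremental update: no need to revisit earlier values.
            self.variance += self.width * (value - mean) ** 2 / (self.width + 1)
        self.width += 1
        self.total += value


stats = RunningStats()
for v in [0, 1, 1, 0, 1, 1, 1, 0]:
    stats.add(v)
print(stats.width, stats.total, stats.variance / stats.width)
```

Dividing the variance term by the width gives the population variance of the values seen so far.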

Parameters

value (int or float) – The numeric value to add to the window.

Notes

The value parameter can be any numeric value relevant to the analysis of concept change. For the learners in this framework we are using either 0’s or 1’s, which are interpreted as follows:

0: The learner’s prediction was wrong.

1: The learner’s prediction was correct.

This function should be called for every new sample analysed.
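The 0/1 encoding above can be produced from labels and predictions with a small helper. The function name correctness_stream is hypothetical and not part of the library; the sketch only shows the convention of 1 for a correct prediction and 0 for a wrong one.

```python
def correctness_stream(y_true, y_pred):
    """Yield 1 when the prediction matches the label, 0 otherwise."""
    for truth, pred in zip(y_true, y_pred):
        yield 1 if truth == pred else 0


labels = [1, 0, 1, 1]
predictions = [1, 1, 1, 0]
encoded = list(correctness_stream(labels, predictions))
print(encoded)
```

Each encoded value would then be passed to add_element, one call per sample.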

delete_element()

Delete an Item from the bucket list.

Deletes the last Item and updates relevant statistics kept by ADWIN.

Returns

The bucket size from the updated bucket

Return type

int

detected_change()

Detects concept change in a drifting data stream.

The ADWIN algorithm is described in Bifet and Gavaldà’s ‘Learning from Time-Changing Data with Adaptive Windowing’. The general idea is to keep statistics from a window of variable size while detecting concept drift.

This function is responsible for analysing different cutting points in the sliding window, to verify if there is a significant change in concept.

Returns

bln_change – Whether change was detected or not

Return type

bool

Notes

If change was detected, verify the new window size by reading the width property.

detected_warning_zone()

If the change detector supports the warning zone, this function will return whether it’s inside the warning zone or not.

Returns

Whether the change detector is in the warning zone or not.

Return type

bool

get_change()

Get drift

Returns

Whether or not a drift occurred

Return type

bool

get_info()

Collects and returns the information about the configuration of the estimator

Returns

Configuration of the estimator.

Return type

str

get_length_estimation()

Returns the length estimation.

Returns

The length estimation

Return type

int

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (boolean, optional) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

mapping of string to any

reset()

Reset detectors

Resets the statistics and ADWIN’s window.

Returns

self

Return type

ADWIN

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns

self
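The <component>__<parameter> naming convention can be illustrated with a short sketch. The classes Component and Wrapper are hypothetical stand-ins written for this example; they are not library classes, and the routing shown is a simplified assumption about how a nested name is split and forwarded.

```python
class Component:
    """Hypothetical simple estimator with a single parameter."""

    def __init__(self, delta=0.002):
        self.delta = delta

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)
        return self


class Wrapper:
    """Hypothetical nested object holding a component named 'detector'."""

    def __init__(self, detector, threshold=1.0):
        self.detector = detector
        self.threshold = threshold

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, sub_name = key.partition("__")
            if sep:
                # '<component>__<parameter>': forward to the nested object.
                getattr(self, name).set_params(**{sub_name: value})
            else:
                setattr(self, name, value)
        return self


wrapper = Wrapper(Component())
wrapper.set_params(detector__delta=0.01, threshold=2.0)
print(wrapper.detector.delta, wrapper.threshold)
```

Returning self from set_params allows calls to be chained.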