skmultiflow.drift_detection.adwin

Classes

ADWIN([delta])

ADWIN method for concept drift detection

Item([next_item, previous_item])

Item to be used by the List object.

List()

A linked list object for ADWIN algorithm.

class skmultiflow.drift_detection.adwin.ADWIN(delta=0.002)

ADWIN method for concept drift detection

Parameters

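delta (float (default=0.002)) – The confidence parameter (delta) of the ADWIN algorithm [1]. A smaller value makes the change test stricter, which gives fewer false alarms but slower detection.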

Notes

ADWIN [1] (ADaptive WINdowing) is an adaptive sliding window algorithm for detecting change and keeping updated statistics about a data stream. ADWIN allows algorithms that were not designed for drifting data to become resistant to this phenomenon.

The general idea is to keep statistics from a window of variable size while detecting concept drift.

The algorithm decides the size of the window by cutting the statistics’ window at different points and comparing the average of the monitored statistic over the two resulting sub-windows. If the absolute difference between the two averages exceeds a threshold derived from delta and the sub-window sizes, change is detected at that point and all data before that point is discarded.
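The cut test described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's implementation: it stores every value in a list instead of ADWIN's compressed buckets, assumes values in [0, 1], and uses the Hoeffding-based threshold from [1]. The names find_cut and naive_adwin are hypothetical.

```python
import math


def find_cut(window, delta=0.002):
    """Return the first split index whose two sub-window means differ
    by more than the threshold, or None if no split qualifies."""
    n = len(window)
    total = sum(window)
    head_sum = 0.0
    for split in range(1, n):
        head_sum += window[split - 1]
        n0, n1 = split, n - split
        mean0 = head_sum / n0
        mean1 = (total - head_sum) / n1
        # Threshold from [1]: m combines both sub-window sizes and
        # delta is divided by n to account for the n candidate splits.
        m = 1.0 / (1.0 / n0 + 1.0 / n1)
        eps_cut = math.sqrt(math.log(4.0 * n / delta) / (2.0 * m))
        if abs(mean0 - mean1) > eps_cut:
            return split
    return None


def naive_adwin(stream, delta=0.002):
    """Yield (index, new_window_length) each time old data is discarded."""
    window = []
    for i, value in enumerate(stream):
        window.append(value)
        cut = find_cut(window, delta)
        while cut is not None:
            window = window[cut:]  # discard everything before the cut
            yield i, len(window)
            cut = find_cut(window, delta)


# An abrupt change from all 0's to all 1's after 200 samples.
events = list(naive_adwin([0] * 200 + [1] * 200))
print(events)
```

Scanning every split of an uncompressed window costs time linear in the window length for each new element, which is why the real algorithm keeps its statistics in buckets.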

References

[1] Bifet, Albert, and Ricard Gavaldà. “Learning from time-changing data with adaptive windowing.” In Proceedings of the 2007 SIAM International Conference on Data Mining, pp. 443-448. Society for Industrial and Applied Mathematics, 2007.

Examples

>>> # Imports
>>> import numpy as np
>>> from skmultiflow.drift_detection.adwin import ADWIN
>>> adwin = ADWIN()
>>> # Simulating a data stream as a uniform random sequence of 0's and 1's
>>> data_stream = np.random.randint(2, size=2000)
>>> # Changing the data concept from index 999 to 1999 (values 4 to 7)
>>> for i in range(999, 2000):
...     data_stream[i] = np.random.randint(4, high=8)
>>> # Adding stream elements to ADWIN and verifying if drift occurred
>>> for i in range(2000):
...     adwin.add_element(data_stream[i])
...     if adwin.detected_change():
...         print('Change detected in data: ' + str(data_stream[i]) + ' - at index: ' + str(i))
add_element(value)

Add a new element to the sample window.

Apart from adding the element value to the window, by inserting it in the correct bucket, it will also update the relevant statistics, in this case the total sum of all values, the window width and the total variance.
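As a rough illustration of the statistics mentioned above, the sketch below updates a width, a total sum and a variance incrementally for each new value. It is a minimal sketch that uses a Welford-style update and ignores bucket handling. The RunningStats class is hypothetical and is not part of the library.

```python
class RunningStats:
    """Hypothetical helper that tracks width, total and variance incrementally."""

    def __init__(self):
        self.width = 0
        self.total = 0.0
        self.sq_dev = 0.0  # sum of squared deviations from the running mean

    def add(self, value):
        if self.width > 0:
            old_mean = self.total / self.width
            # Welford-style increment of the sum of squared deviations.
            self.sq_dev += self.width * (value - old_mean) ** 2 / (self.width + 1)
        self.width += 1
        self.total += value

    @property
    def variance(self):
        """Population variance of the values seen so far."""
        return self.sq_dev / self.width if self.width else 0.0


stats = RunningStats()
for value in (1, 2, 3, 4):
    stats.add(value)
print(stats.width, stats.total, stats.variance)
```

Updating the statistics in place avoids rescanning the window each time an element arrives.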

Parameters

value (int or float) – The new element to add to the sample window.

Notes

The value parameter can be any numeric value relevant to the analysis of concept change. The learners in this framework use 0’s and 1’s, interpreted as follows:

  • 0: the learner’s prediction was wrong

  • 1: the learner’s prediction was correct

This function should be used at every new sample analysed.
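The 0/1 encoding described in these notes could be produced as follows. This is a small sketch in which correctness_stream is a hypothetical helper and the labels are made up for illustration. Each resulting value would then be passed to add_element, one per sample.

```python
def correctness_stream(y_true, y_pred):
    """Yield 1 when the prediction was correct and 0 when it was wrong."""
    for truth, guess in zip(y_true, y_pred):
        yield int(truth == guess)


labels = ["a", "b", "b", "a"]  # illustrative true labels
preds = ["a", "b", "a", "a"]   # illustrative learner predictions
encoded = list(correctness_stream(labels, preds))
print(encoded)  # [1, 1, 0, 1]
```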

delete_element()

Delete an Item from the bucket list.

Deletes the last Item and updates relevant statistics kept by ADWIN.

Returns

The bucket size from the updated bucket

Return type

int

detected_change()

Detects concept change in a drifting data stream.

The ADWIN algorithm is described in Bifet and Gavaldà’s ‘Learning from Time-Changing Data with Adaptive Windowing’. The general idea is to keep statistics from a window of variable size while detecting concept drift.

This function is responsible for analysing different cutting points in the sliding window, to verify if there is a significant change in concept.

Returns

bln_change – Whether change was detected or not

Return type

bool

Notes

If change was detected, one should verify the new window size, by reading the width property.
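A monitoring loop that follows this note might look like the sketch below. The monitor function is hypothetical and relies only on the add_element, detected_change and width members documented on this page. The ThresholdDetector class is a trivial stand-in so the example runs without the library installed; it is not ADWIN, and an ADWIN instance could be passed in its place.

```python
def monitor(detector, stream):
    """Feed values to a detector and record (index, width) at each change."""
    changes = []
    for i, value in enumerate(stream):
        detector.add_element(value)
        if detector.detected_change():
            # Read the window size right after a change is reported.
            changes.append((i, detector.width))
    return changes


class ThresholdDetector:
    """Hypothetical stand-in, not ADWIN: flags a change when a value jumps
    by more than 1 from the previous one, then restarts its window."""

    def __init__(self):
        self.width = 0
        self._last = None
        self._changed = False

    def add_element(self, value):
        self._changed = self._last is not None and abs(value - self._last) > 1
        if self._changed:
            self.width = 0  # drop the old window
        self.width += 1
        self._last = value

    def detected_change(self):
        return self._changed


changes = monitor(ThresholdDetector(), [0, 0, 0, 5, 5])
print(changes)
```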

detected_warning_zone()

If the change detector supports the warning zone, this function will return whether it’s inside the warning zone or not.

Returns

Whether the change detector is in the warning zone or not.

Return type

bool

get_change()

Get drift

Returns

Whether or not a drift occurred

Return type

bool

get_class_type()

The class type is a string that identifies the type of object generated by that module.

Returns

The class type

Return type

string

get_info()

Collect information about the concept drift detector.

Returns

Configuration for the concept drift detector.

Return type

string

get_length_estimation()

Returns the length estimation.

Returns

The length estimation

Return type

int

reset()

Reset detectors

Resets statistics and adwin’s window.

Returns

self

Return type

ADWIN

class skmultiflow.drift_detection.adwin.Item(next_item=None, previous_item=None)

Item to be used by the List object.

The Item and List objects are the two main data structures used for storing the relevant statistics for the ADWIN change detection algorithm.

Parameters
  • next_item (Item object) – Reference to the next Item in the List

  • previous_item (Item object) – Reference to the previous Item in the List

get_class_type()

The class type is a string that identifies the type of object generated by that module.

Returns

The class type

Return type

string

get_info()

A sum-up of all important characteristics of a class.

The default format of the return string is as follows: ClassName: attribute_one: value_one - attribute_two: value_two - info_one: info_one_value

Returns

A string with the class’ relevant information.

Return type

string
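The default format described above could be assembled as in the sketch below. Both format_info and ExampleDetector are hypothetical names used only to demonstrate the layout of the string. This is not the library's own code.

```python
class ExampleDetector:
    """Hypothetical class used only to demonstrate the string format."""

    def __init__(self, delta=0.002):
        self.delta = delta
        self.width = 0


def format_info(obj, attribute_names):
    """Build 'ClassName: attr_one: value_one - attr_two: value_two'."""
    details = " - ".join(
        f"{name}: {getattr(obj, name)}" for name in attribute_names
    )
    return f"{type(obj).__name__}: {details}"


info = format_info(ExampleDetector(), ["delta", "width"])
print(info)  # ExampleDetector: delta: 0.002 - width: 0
```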

reset()

Reset the algorithm’s statistics and window

Returns

self

Return type

ADWIN

class skmultiflow.drift_detection.adwin.List

A linked list object for ADWIN algorithm.

Stores ADWIN’s bucket list and is composed of Item objects. It acts as a doubly linked list, where each element points to its predecessor and successor.
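A linked list of this shape, where each element references both neighbours, can be sketched as follows. The class and method names (Node, DoublyLinkedList, add_to_head, remove_from_tail) are assumptions chosen for illustration and are not claimed to match the library's Item and List interfaces.

```python
class Node:
    """One element holding a payload and links to both neighbours."""

    def __init__(self, payload):
        self.payload = payload
        self.next = None
        self.previous = None


class DoublyLinkedList:
    """Minimal list that grows at the head and shrinks at the tail."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.count = 0

    def add_to_head(self, payload):
        node = Node(payload)
        node.next = self.head
        if self.head is not None:
            self.head.previous = node
        self.head = node
        if self.tail is None:
            self.tail = node  # first element is both head and tail
        self.count += 1
        return node

    def remove_from_tail(self):
        node = self.tail
        if node is None:
            raise IndexError("remove from empty list")
        self.tail = node.previous
        if self.tail is None:
            self.head = None  # list is now empty
        else:
            self.tail.next = None
        node.previous = None
        self.count -= 1
        return node.payload

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node.payload
            node = node.next


buckets = DoublyLinkedList()
for value in (1, 2, 3):
    buckets.add_to_head(value)
oldest = buckets.remove_from_tail()
print(oldest, list(buckets))
```

Keeping links in both directions lets new entries be added at one end and old entries removed at the other in constant time.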

get_class_type()

The class type is a string that identifies the type of object generated by that module.

Returns

The class type

Return type

string

get_info()

A sum-up of all important characteristics of a class.

The default format of the return string is as follows: ClassName: attribute_one: value_one - attribute_two: value_two - info_one: info_one_value

Returns

A string with the class’ relevant information.

Return type

string