A machine learning package for streaming data in Python

    Incremental learning
    Stream learning models are created incrementally and are updated continuously. They are suitable for big data applications where real-time response is vital.
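
    As a rough illustration of incremental learning, the sketch below updates a binary perceptron one sample at a time instead of fitting it on a full dataset. The class `OnlinePerceptron`, its `update` and `predict` methods, and the toy stream are hypothetical names chosen for this example; they are not part of the scikit-multiflow API.

```python
class OnlinePerceptron:
    """Hypothetical binary classifier that learns from one sample at a time."""

    def __init__(self, n_features, learning_rate=1.0):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        # Linear score followed by a hard threshold at zero
        score = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        return 1 if score > 0 else 0

    def update(self, x, y):
        # Classic perceptron rule: adjust the model only when the prediction is wrong
        error = y - self.predict(x)
        if error != 0:
            step = self.learning_rate * error
            self.weights = [w + step * xi for w, xi in zip(self.weights, x)]
            self.bias += step
        return self


# Toy stream (an assumption for illustration): the label is 1 when x[0] > x[1]
stream = [((2.0, 1.0), 1), ((1.0, 3.0), 0), ((4.0, 0.5), 1), ((0.5, 2.0), 0)] * 5

model = OnlinePerceptron(n_features=2)
for x, y in stream:
    model.update(x, y)  # the model is usable after every single update
```

    The model never stores past samples, so memory use stays constant regardless of how long the stream runs.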

    Adaptive learning
    Changes in data distribution harm learning. Adaptive methods are specifically designed to be robust to concept drift in dynamic environments.

    Resource-wise efficient
    Streaming techniques efficiently handle resources such as memory and processing time given the unbounded nature of data streams.
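
    One common way to keep memory bounded on an unbounded stream is to maintain summary statistics rather than the data itself. The sketch below uses Welford's online algorithm to track the mean and sample variance in constant memory. The class name `RunningStats` and the sample values are assumptions made for this example.

```python
class RunningStats:
    """Hypothetical helper: running mean and sample variance in O(1) memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        # Welford's update: numerically stable and needs only three numbers
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance (n - 1 in the denominator); 0.0 until two samples are seen
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)
```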

    Easy to use
    scikit-multiflow is designed for users with any experience level. Experiments are easy to design, set up, and run. Existing methods are easy to modify and extend.

    Stream learning tools
    In its current state, scikit-multiflow contains data generators, multi-output/multi-target stream learning methods, change detection methods, evaluation methods, and more.

    Open source
    Distributed under the BSD 3-Clause license, scikit-multiflow is developed and maintained publicly on GitHub by an active, diverse, and growing community.

    Use cases

    Learning tasks supported in scikit-multiflow

    Supervised learning Used when working with labeled data. Depending on the target type, the task is either classification (discrete values) or regression (continuous values).

    Single/multi output Single-output methods predict a single target label (binary or multi-class) for classification or a single target value for regression. Multi-output methods simultaneously predict multiple variables given an input.

    Concept drift detection Changes in data distribution can harm learning. Drift detection methods are designed to raise an alarm in the presence of drift and are used alongside learning methods to improve their robustness against this phenomenon in evolving data streams.
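
    To make the idea of raising an alarm concrete, here is a minimal sketch of a Page-Hinkley style change detector written in plain Python. It is an independent illustration, not the library's implementation; the class name, the default values for `delta`, `threshold`, and `min_samples`, and the synthetic stream are all assumptions.

```python
class PageHinkley:
    """Hypothetical detector for an upward shift in the mean of a stream."""

    def __init__(self, delta=0.005, threshold=5.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change
        self.threshold = threshold      # alarm level for the test statistic
        self.min_samples = min_samples  # warm-up period before alarms are allowed
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0
        self.minimum = 0.0

    def update(self, x):
        """Consume one value; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        # Accumulate deviations from the running mean, minus the tolerance
        self.cumulative += x - self.mean - self.delta
        self.minimum = min(self.minimum, self.cumulative)
        if self.n < self.min_samples:
            return False
        return self.cumulative - self.minimum > self.threshold


# Synthetic stream (an assumption): the monitored value jumps from 0 to 1 at index 100
stream = [0.0] * 100 + [1.0] * 100

detector = PageHinkley()
first_alarm = None
for i, value in enumerate(stream):
    if detector.update(value):
        first_alarm = i
        break
```

    In practice the monitored value is often the learner's per-sample error, so an alarm can trigger a model reset or retraining.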

    Unsupervised learning Used when working with unlabeled data. An example is anomaly detection, where the goal is to identify rare events or samples that differ significantly from the majority of the data.
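
    A simple way to picture streaming anomaly detection is a z-score test against a sliding window of recent values. The sketch below is only one of many possible approaches and is not taken from scikit-multiflow; the function name `is_anomaly`, the window size, the threshold, and the data are assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def is_anomaly(window, x, z_threshold=3.0):
    """Return True if x lies more than z_threshold standard deviations from the window mean."""
    if len(window) < 2:
        return False  # not enough history to judge
    sigma = pstdev(window)
    if sigma == 0:
        return False  # conservative choice: no spread observed yet, so no alarm
    return abs(x - mean(window)) / sigma > z_threshold


# Unlabeled toy stream (an assumption): regular values with a single spike at index 30
stream = [10.0, 11.0] * 15 + [50.0] + [10.0, 11.0] * 5

window = deque(maxlen=20)  # only the 20 most recent values are kept in memory
anomalies = []
for i, value in enumerate(stream):
    if is_anomaly(window, value):
        anomalies.append(i)
    window.append(value)
```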

    Monitor performance

    Prequential evaluation example
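
    Prequential evaluation is also known as interleaved test-then-train: each incoming sample is first used to test the model and only then to update it. The sketch below shows that loop in plain Python with a trivial majority-class learner. The names `MajorityClass` and `prequential_accuracy` and the toy stream are hypothetical and stand in for a real learner and data stream.

```python
from collections import Counter


class MajorityClass:
    """Hypothetical baseline that predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Before any label has been seen there is nothing to predict
        return self.counts.most_common(1)[0][0] if self.counts else None

    def update(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(model, stream):
    """Test on each sample first, then train on it; return the overall accuracy."""
    correct = 0
    total = 0
    for x, y in stream:
        if model.predict(x) == y:  # 1) test on the unseen sample
            correct += 1
        model.update(x, y)         # 2) then use the same sample for training
        total += 1
    return correct / total if total else 0.0


# Toy stream (an assumption): every fourth sample has label 0, the rest have label 1
stream = [((i,), 0 if i % 4 == 0 else 1) for i in range(20)]
accuracy = prequential_accuracy(MajorityClass(), stream)
```

    Because every sample is tested before the model has seen it, no separate hold-out set is needed and the metric can be tracked continuously as the stream evolves.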

    Sponsors
    Collaborating institutions/groups
    ICMC
    Citing

    If you want to cite scikit-multiflow, please use the following JMLR paper.

    Montiel, J., Read, J., Bifet, A., & Abdessalem, T. (2018). Scikit-multiflow: A multi-output streaming framework. The Journal of Machine Learning Research, 19(72):1−5.