# Stream class

The Stream class is in charge of “providing” data inside scikit-multiflow. The most important method of the Stream class is next_sample(batch_size).

The shape $$(n, m)$$ of the $$X$$ and $$Y$$ arrays returned by next_sample depends on batch_size and on the type of learning problem.

## Supervised learning

next_sample(batch_size) returns a feature array $$X$$ and its corresponding target array $$Y$$.

The number of samples $$n$$ is defined by batch_size, which is 1 by default.

The total number of features $$m$$ in $$X$$ is equal to the number of numerical features plus the number of categorical features: $$X_m = n_{num} + n_{cat}$$
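The shape rules above can be illustrated with a small, dependency-free sketch. ToyStream and shape are hypothetical names introduced only for this example; they are not part of scikit-multiflow, and the toy represents $$X$$ and $$Y$$ as nested lists rather than arrays. It only mimics the contract described in the text: batch_size rows, and n_num + n_cat feature columns.

```python
import random


class ToyStream:
    """Hypothetical stand-in for a stream; NOT the scikit-multiflow Stream class."""

    def __init__(self, n_num=3, n_cat=2, n_targets=1, seed=0):
        self.n_num = n_num          # number of numerical features
        self.n_cat = n_cat          # number of categorical features
        self.n_targets = n_targets  # number of target columns in Y
        self._rng = random.Random(seed)

    def next_sample(self, batch_size=1):
        """Return (X, Y) as nested lists with batch_size rows each."""
        X, Y = [], []
        for _ in range(batch_size):
            numerical = [self._rng.random() for _ in range(self.n_num)]
            categorical = [self._rng.randint(0, 2) for _ in range(self.n_cat)]
            X.append(numerical + categorical)
            Y.append([self._rng.randint(0, 1) for _ in range(self.n_targets)])
        return X, Y


def shape(rows):
    """Return (n, m) for a rectangular list of lists."""
    n = len(rows)
    m = len(rows[0]) if rows else 0
    return (n, m)


stream = ToyStream(n_num=3, n_cat=2)

X, Y = stream.next_sample()              # default batch_size is 1
print(shape(X), shape(Y))                # (1, 5) (1, 1)

X, Y = stream.next_sample(batch_size=4)
print(shape(X), shape(Y))                # (4, 5) (4, 1)
```

With 3 numerical and 2 categorical features, every batch has $$m = 5$$ feature columns, while $$n$$ follows batch_size.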

The number of columns $$m$$ in $$Y$$ determines the number of targets to learn. Consider the following examples:

• S_bc: A binary classification stream
    • Number of targets: Y_m = 1
    • Unique target values: [0, 1]

• S_mc: A multi-class classification stream with 3 classes (0, 1, 2)
    • Number of targets: Y_m = 1
    • Unique target values: [0, 1, 2]

• S_mtc: A multi-target classification stream with 2 targets, where classes (0, 1, 2) correspond to the first target and classes (1, 2) to the second target
    • Number of targets: Y_m = 2
    • Unique target values: [[0, 1, 2], [1, 2]]

• S_r: A regression stream
    • Number of targets: Y_m = 1
    • Target values indicate the data type: [float]

• S_mtr: A multi-target regression stream with 3 targets
    • Number of targets: Y_m = 3
    • Target values indicate the data type: [float, float, float]
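
As a rough illustration of how the examples above relate to a batch of targets, the following sketch derives the number of targets and the unique values per target from $$Y$$. The helper summarize_targets and the sample batches are hypothetical, written for this example only, and assume $$Y$$ is given as a non-empty list of rows with one column per target.

```python
def summarize_targets(Y):
    """Return (number of targets, sorted unique values per target column).

    Assumes Y is a non-empty, rectangular list of rows.
    """
    n_targets = len(Y[0])
    unique_values = [sorted({row[j] for row in Y}) for j in range(n_targets)]
    return n_targets, unique_values


# Multi-target classification batch: 2 target columns per sample.
Y_mtc = [[0, 1], [1, 2], [2, 1], [0, 2]]
print(summarize_targets(Y_mtc))  # (2, [[0, 1, 2], [1, 2]])

# Binary classification batch: a single target column.
Y_bc = [[0], [1], [1], [0]]
print(summarize_targets(Y_bc))   # (1, [[0, 1]])
```

Note that the unique values seen in a single batch may be a subset of the classes the stream can produce, so a summary like this is only reliable over enough samples.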