# 1.4. Neural network models (supervised)

Warning

This implementation is not intended for large-scale applications. In particular, scikit-learn offers no GPU support. For much faster, GPU-based implementations, as well as frameworks offering much more flexibility to build deep learning architectures, see the related projects.

## 1.4.1. Multi-layer Perceptron

A **multi-layer perceptron (MLP)** is a supervised learning algorithm that learns
a function \(f(\cdot): \mathbb{R}^m \rightarrow \mathbb{R}^o\) by training on a dataset,
where \(m\) is the number of input dimensions and \(o\) is the
number of output dimensions. Given a set of features \(X = \{x_1, x_2, \ldots, x_m\}\)
and a target \(y\), it can learn a non-linear function approximator for either
classification or regression.
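To make the mapping \(f(\cdot): \mathbb{R}^m \rightarrow \mathbb{R}^o\) concrete, here is a minimal NumPy sketch of a forward pass through a network with one hidden layer. The function name `mlp_forward`, the hidden size `k`, the `tanh` activation, and the random weights are illustrative assumptions, not part of the library.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Map inputs of dimension m to outputs of dimension o via one hidden layer."""
    hidden = np.tanh(x @ W1 + b1)  # non-linear hidden activation (assumed: tanh)
    return hidden @ W2 + b2        # linear output layer

rng = np.random.default_rng(0)
m, k, o = 3, 5, 2                  # input, hidden, and output dimensions (illustrative)
W1, b1 = rng.normal(size=(m, k)), np.zeros(k)
W2, b2 = rng.normal(size=(k, o)), np.zeros(o)

x = rng.normal(size=m)             # a single input vector in R^m
y_hat = mlp_forward(x, W1, b1, W2, b2)
print(y_hat.shape)                 # (2,)
```

The same function also accepts a batch of shape `(n_samples, m)` and returns an array of shape `(n_samples, o)`, because the matrix products broadcast over the leading axis.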

## 1.4.2. Classification

The class `MLPClassifier` implements a multi-layer perceptron (MLP) algorithm that trains using backpropagation.
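As a rough illustration of what backpropagation computes, the following NumPy sketch applies the chain rule to a one-hidden-layer network with a squared-error loss and then takes a single gradient step. It is not the library's implementation: the helper `loss_and_grads`, the `tanh` activation, the omission of bias terms, and the learning rate `lr` are all assumptions chosen to keep the example short.

```python
import numpy as np

def loss_and_grads(X, y, W1, W2):
    """Squared-error loss and its gradients for a one-hidden-layer network (no biases)."""
    H = np.tanh(X @ W1)              # forward: hidden activations, shape (n, k)
    y_hat = H @ W2                   # forward: predictions, shape (n,)
    err = y_hat - y
    loss = 0.5 * np.mean(err ** 2)
    # backward: propagate the error from the output layer back to the first layer
    d_yhat = err / len(y)            # dL/dy_hat
    gW2 = H.T @ d_yhat               # dL/dW2, shape (k,)
    dH = np.outer(d_yhat, W2)        # dL/dH, shape (n, k)
    gW1 = X.T @ (dH * (1 - H ** 2))  # dL/dW1, using tanh'(z) = 1 - tanh(z)^2
    return loss, gW1, gW2

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))          # toy data, for illustration only
y = rng.normal(size=8)
W1 = rng.normal(size=(2, 4))
W2 = rng.normal(size=4)

loss_before, gW1, gW2 = loss_and_grads(X, y, W1, W2)
lr = 1e-3                            # small step size (assumed)
loss_after, _, _ = loss_and_grads(X, y, W1 - lr * gW1, W2 - lr * gW2)
print(loss_before, loss_after)
```

A small step against the gradient should lower the loss; comparing `gW1` with a finite-difference estimate is a simple way to check that the backward pass is correct.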

MLP trains on two arrays: an array `X` of shape `(n_samples, n_features)`, which holds the training samples represented as floating-point feature vectors, and an array `y` of shape `(n_samples,)`, which holds the target values (class labels) for the training samples:

```
>>> from pmlearn.neural_network import MLPClassifier
>>> X = [[0., 0.], [1., 1.]]
>>> y = [0, 1]
>>> clf = MLPClassifier()
...
>>> clf.fit(X, y)
>>> clf.predict([[2., 2.], [-1., -2.]])
array([1, 0])
```
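The expected array shapes can be checked before fitting. This short sketch uses only NumPy and the same toy data as above; it assumes nothing about the classifier itself.

```python
import numpy as np

X = np.asarray([[0., 0.], [1., 1.]])  # shape (n_samples, n_features)
y = np.asarray([0, 1])                # shape (n_samples,)

n_samples, n_features = X.shape
assert y.shape == (n_samples,)        # one class label per training sample
print(X.shape, y.shape, X.dtype)      # (2, 2) (2,) float64
```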

References:

- “Learning representations by back-propagating errors.” Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams.
- “Stochastic Gradient Descent” L. Bottou - Website, 2010.
- “Backpropagation” Andrew Ng, Jiquan Ngiam, Chuan Yu Foo, Yifan Mai, Caroline Suen - Website, 2011.
- “Efficient BackProp” Y. LeCun, L. Bottou, G. Orr, K. Müller - In Neural Networks: Tricks of the Trade 1998.
- “Adam: A method for stochastic optimization.” Kingma, Diederik, and Jimmy Ba. arXiv preprint arXiv:1412.6980 (2014).