


General description

In mathematical terms, a neural network is simply a function with some parameters called weights. This function can be built from a simple element called a neuron. A neuron can be represented as in figure 2.8.

Figure 2.8: Model of a neuron. The output $z$ is calculated using some parameters $\left\{c_i\right\}_{i\in\left[1,p\right]}$ and a function $g$: $z=g\left(x_1,x_2,\ldots,x_n,c_1,c_2,\ldots,c_p\right)$.
\begin{figure}\begin{center}
\epsfbox{neuronef.ps}
\end{center}
\end{figure}

The function $g$ is the composition of a linear combination of the inputs, called the potential, with an activation function $f$. The potential $\nu$ of a neuron is defined by:


\begin{displaymath}\nu=c_0+\sum_{i=1}^nc_i\cdot x_i\end{displaymath}

where $c_0$ is a constant called the bias. The bias can be folded into the sum by adding a new input to the neuron whose value is always 1; the bias then becomes the weight associated with this input. So we can write:


\begin{displaymath}\nu=\sum_{i=0}^nc_i\cdot x_i\end{displaymath}

where $x_0=1$. We can then compute the output value $z$ using the formula:


\begin{displaymath}z=f(\nu)=f\left(\sum_{i=0}^nc_i\cdot x_i\right)\end{displaymath}

The activation function used here is the hyperbolic tangent, a sigmoid function. Other sigmoid functions, such as the logistic function $x\mapsto\frac{1}{1+e^{-x}}$, could also be used.
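As a concrete illustration, here is a minimal sketch of such a neuron in Python; the function name neuron_output and the example values are our own, not part of the original text.

\begin{verbatim}
import numpy as np

def neuron_output(x, c, f=np.tanh):
    # Prepend the constant input x0 = 1 so that the bias c[0]
    # is absorbed into the sum, as in the formula above.
    x = np.concatenate(([1.0], np.asarray(x, dtype=float)))
    nu = np.dot(c, x)   # potential: nu = sum_{i=0}^{n} c_i * x_i
    return f(nu)        # activation: z = f(nu)

# A neuron with two inputs, bias c0 = 0.1 and weights 0.8, -0.3:
z = neuron_output([0.5, -1.2], np.array([0.1, 0.8, -0.3]))
\end{verbatim}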

These neurons can be combined into a network by feeding the outputs of some neurons in as inputs of others. Such a construction is called a neural network. A neural network can therefore have different numbers of inputs and outputs, and different architectures.

We are particularly interested in an architecture called the multi-layer Perceptron, shown in figure 2.9. It is composed of $p$ layers, $l_1$ to $l_p$, each fully connected to the next. The neurons of the output layer have a linear activation function; all other neurons have a sigmoidal activation function.

Figure 2.9: Model of a neural network. The output layer uses a linear activation function and the other neurons use a sigmoid activation function. Each layer is fully connected to the next layer.
\begin{figure}\begin{center}
\epsfbox{mlpfig.ps}
\end{center}
\end{figure}
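To make the architecture concrete, here is a sketch of the forward pass of such a multi-layer Perceptron; the representation of a layer as a weight matrix whose first column holds the biases is our own choice, not prescribed by the text.

\begin{verbatim}
import numpy as np

def mlp_forward(x, layers):
    # layers[k] is a weight matrix of shape (n_out, n_in + 1);
    # column 0 holds the biases (the weights of the input x0 = 1).
    a = np.asarray(x, dtype=float)
    for k, W in enumerate(layers):
        a = np.concatenate(([1.0], a))         # constant input x0 = 1
        nu = W @ a                             # potentials of the layer
        is_output = (k == len(layers) - 1)
        a = nu if is_output else np.tanh(nu)   # linear output layer
    return a

# Example: 2 inputs, one hidden layer of 3 neurons, 1 linear output.
rng = np.random.default_rng(0)
net = [rng.normal(size=(3, 3)), rng.normal(size=(1, 4))]
y = mlp_forward([0.5, -1.2], net)
\end{verbatim}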

We also need a way to assess the performance of the learned mapping. The error is evaluated using the sum-of-squares error function over a set $T$. The set $T$ is a set of pairs, each composed of an input and the corresponding desired output value (target) of the neural network. The sum-of-squares error is:

\begin{displaymath}E_T(w)=\frac{1}{N}\sum_{i=1}^NE_i(w)\end{displaymath}

where $w$ is a vector of weights, $N$ is the number of samples in the set $T$ and

\begin{displaymath}E_i(w)=\left(d_i-z_i(w)\right)^2\end{displaymath}

$z_i(w)$ is the output of the network for the $i^{th}$ input of the set $T$, computed with the weights $w$, and $d_i$ is the corresponding desired value. $E_i(w)$ is thus the squared difference between the result of the neural network for a particular input and the desired value.
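A direct transcription of this error into Python might look as follows; the names sum_of_squares_error and forward are illustrative, with forward playing the role of the forward-pass sketch above.

\begin{verbatim}
import numpy as np

def sum_of_squares_error(w, samples, forward):
    # samples is the set T: a list of (input, target) pairs.
    # forward(x, w) returns z_i(w), the network output for input x.
    N = len(samples)
    total = 0.0
    for x_i, d_i in samples:
        z_i = forward(x_i, w)
        # E_i(w) = (d_i - z_i(w))^2, summed over output components
        # if the output is a vector.
        total += float(np.sum((np.asarray(d_i) - z_i) ** 2))
    return total / N            # E_T(w) = (1/N) * sum_i E_i(w)
\end{verbatim}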

This error function has to be minimized with respect to $w$ over a learning set. The result of this minimization is a set of weights, which forms the trained network. For any sufficiently regular function, one can find a neural network that fits it with arbitrary precision (the universal approximation property).
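As a rough illustration of this minimization, here is a plain gradient-descent sketch. The gradient is estimated by finite differences purely for clarity; backpropagation, described in the next section, computes it far more efficiently. The callable error stands for $E_T$ viewed as a scalar function of a flat weight vector, an assumption of this sketch.

\begin{verbatim}
import numpy as np

def gradient_descent(error, w0, lr=0.05, steps=200, eps=1e-6):
    # Minimize error(w) for a flat weight vector w.
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(steps):
        grad = np.empty_like(w)
        e0 = error(w)
        for j in range(w.size):
            w[j] += eps
            grad[j] = (error(w) - e0) / eps   # finite difference
            w[j] -= eps
        w -= lr * grad      # step against the gradient of E_T
    return w
\end{verbatim}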

We can assess the performance of the training using a test set that is distinct from the learning set. Computing the sum-of-squares error over this test set gives us a measure of the performance of the neural network on unseen data.
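In terms of the sketches above, this amounts to evaluating the same error function on held-out data; trained_w and test_set are hypothetical names standing for the minimization result and the held-out pairs.

\begin{verbatim}
# Same error function, evaluated on a set disjoint from the learning set.
test_error = sum_of_squares_error(trained_w, test_set, forward)
\end{verbatim}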

