During the early years of neural network research, the perceptron was considered the most successful type of network. The training algorithm used by perceptron networks is based on mathematics, not on biological phenomena. In spite of this, the algorithm looks surprisingly like the Hebb rule. In fact, the only difference between the perceptron network and the Hebb Net is a slight adjustment of the learning rule and a new activation function. Instead of using a simple bipolar or binary threshold function, the perceptron uses a new function that allows for a boundary region instead of a boundary line between the two output categories. The new activation function is the following:
f(net) = 1 if net > θ, 0 if -θ ≤ net ≤ θ, -1 if net < -θ,
where θ is a small positive threshold that sets the width of the boundary region.
This activation function allows an output node in a perceptron to have three responses, as opposed to the two we were limited to in a Hebb Net. If we think of one as meaning "yes" and negative one as meaning "no," zero would then mean "I don't know." If a trained perceptron network is presented with data that doesn't seem to fit into the categories it was trained for, the network will probably output 0 instead of forcing the input into a category it doesn't belong to.
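To make this concrete, here is a minimal sketch of such an activation function in Python. The threshold value theta, which sets the width of the boundary region, is a parameter assumed here for illustration; it is not something the perceptron learns.

def activation(net, theta=0.2):
    # Returns 1 ("yes") if the net input clearly exceeds the threshold,
    # -1 ("no") if it falls clearly below it, and 0 ("I don't know")
    # when the net input lands inside the boundary region.
    if net > theta:
        return 1
    if net < -theta:
        return -1
    return 0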
The training rule for a perceptron, as was stated earlier, is very similar to the mathematical interpretation of the Hebb rule. The perceptron training rule simply adds a learning rate constant. The new rule looks like this:
w_i(new) = w_i(old) + a·t·x_i,
where t is the target output value and a is the learning rate constant. The purpose of the learning rate constant is to scale the size of each weight change, which allows the weights to take on a greater range of values and gives the network more flexibility during training. However, if the changes it produces are too large, the weights converge upon the ideal set of weights more slowly. When using the perceptron training rule, instead of always updating the weights, you only update them when the output values calculated by the network differ from the desired output values. This prevents the network from unlearning information learned from previous training patterns. There is one last difference in the perceptron training process. Instead of going through each training pattern once, you must repeat the process with the same set of training patterns until no weight changes occur. If no weight changes occur for an entire epoch (one pass through all of the training patterns), then the network is completely trained and training stops.
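As a rough sketch of the whole training process described above, the following Python function trains a single output node until an entire epoch passes with no weight changes. The pattern and target lists, the learning rate alpha, and the use of a bias weight are assumptions made for this example rather than details taken from this page.

def activation(net, theta=0.2):
    # Same activation function as the earlier sketch, repeated here
    # so this example runs on its own.
    if net > theta:
        return 1
    if net < -theta:
        return -1
    return 0

def train_perceptron(patterns, targets, alpha=1.0, theta=0.2):
    # Train a single output node with the perceptron learning rule.
    n = len(patterns[0])
    weights = [0.0] * n          # start with every weight at zero
    bias = 0.0
    changed = True
    while changed:               # one pass of this loop is one epoch
        changed = False
        for x, t in zip(patterns, targets):
            net = bias + sum(w * xi for w, xi in zip(weights, x))
            y = activation(net, theta)
            if y != t:           # update only when the output is wrong
                for i in range(n):
                    weights[i] += alpha * t * x[i]
                bias += alpha * t
                changed = True
    return weights, bias

# The bipolar AND function is linearly separable, so by the convergence
# theorem below the weights should stabilize after a couple of epochs.
patterns = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
targets = [1, -1, -1, -1]
print(train_perceptron(patterns, targets))

Running this on the bipolar AND example stops after the second epoch: the first epoch makes all of the weight changes, and the second epoch simply confirms that none are needed.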
Since this training algorithm isn't based on biology, why does it work? The answer to this question is unfortunately very complicated. In fact, the proof that shows it will work is more complex than the proofs for some of the more powerful training algorithms. Reading the proof gives very little insight into why the neural network works. It is just a lot of vector and matrix mathematics combined with some amazing deductive logic. Because of its complexity, the proof for this training process is going to be omitted. However, just remember that there is a theorem, the Perceptron Learning Rule Convergence Theorem, that proves this training algorithm will work if a set of weights that solves the given problem exists. Just for reference, a statement of the theorem is as follows:
If there is a weight vector w* such that f(x(p)·w*) = t(p) for all p, then for any starting vector w, the perceptron learning rule will converge to a weight vector (not necessarily unique and not necessarily w*) that gives the correct response for all training patterns, and it will do so in a finite number of steps. (Fausett 77)