havBpNet:J


Training a Network

We are now ready to train our newly created network.

Typically, network training involves the following steps, shown here in the form of a code excerpt:


in.PutValue(tin[pat_list[i]]);    // put input pattern in input layer
hid.Cycle();                      // feed forward through the network
out.Cycle();
out.PutExpected(tex[pat_list[i]]);// supply expected response to the output layer
out.Train();                      // Calculate error and update weights
hid.Train();                      // NOTE: uses Interleaved weight updates
 

These steps are normally performed inside of a training loop. One loop iteration is performed for each training pattern in the training set.

Of course, there are other duties performed by the training loop, such as determining when to stop training, but, for now, let’s examine what is happening in each of the above messages.

In the xortrain.java example, we have defined the training set as an array of training patterns (tin[][]) where each row of the array is a training pattern. Each training pattern consists of three input values. We have defined a separate, parallel array of expected value patterns (tex[][]). In our example, since the output layer contains only one node, each expected value pattern contains a single value.
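As a rough sketch of this layout, the following stand-alone Java fragment declares parallel input and expected-value arrays. The actual values used in xortrain.java are not reproduced here: the three-input odd-parity data, the class name TrainingSet, and the helper methods are assumptions made purely for illustration.

```java
// Illustrative only: data values and names are assumptions, not taken from xortrain.java.
class TrainingSet {
    // Each row is one training pattern with three input values.
    static final double[][] tin = {
        {0, 0, 0}, {0, 0, 1}, {0, 1, 0}, {0, 1, 1},
        {1, 0, 0}, {1, 0, 1}, {1, 1, 0}, {1, 1, 1}
    };

    // Parallel array: row i is the expected response for tin[i] (one output node).
    static final double[][] tex = buildParity(tin);

    // Expected value is 1 when an odd number of inputs are 1 (assumed target function).
    static double[][] buildParity(double[][] inputs) {
        double[][] expected = new double[inputs.length][1];
        for (int i = 0; i < inputs.length; i++) {
            int ones = 0;
            for (double v : inputs[i]) {
                if (v > 0.5) ones++;
            }
            expected[i][0] = ones % 2;
        }
        return expected;
    }

    // Builds a pattern list holding the order in which patterns are presented.
    static int[] patList(int n) {
        int[] list = new int[n];
        for (int i = 0; i < n; i++) list[i] = i;
        return list;
    }
}
```

Keeping the pattern order in a separate index list means the order of presentation can be changed without touching either array.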

We begin by presenting the input pattern to the input layer. This particular form of the PutValue(...) method passes a vector of double values (one value for each node) to the input layer. Recall that we have already assigned a fixed value to the bias layer’s node.

Feeding the data forward to a layer is accomplished by sending the receiving layer a Cycle() message. When a layer receives the Cycle() message, it will cause each of its nodes to process input from all input connections. Notice, since we established the transfer function used by the nodes in a layer when we created the layer, we don’t have to concern ourselves with it here.
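To make the forward step concrete, here is a minimal sketch in plain Java of what one layer does when it cycles. It does not use the havBpNet classes; the class name ForwardPass, the weight layout, and the choice of a logistic transfer function are all assumptions.

```java
// Illustrative only: plain Java, not the havBpNet classes.
class ForwardPass {
    // Logistic transfer function (an assumed choice for this sketch).
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // weights[j][i] is the weight on the connection from source node i to node j.
    static double[] cycle(double[] source, double[][] weights) {
        double[] out = new double[weights.length];
        for (int j = 0; j < weights.length; j++) {
            double net = 0.0;
            for (int i = 0; i < source.length; i++) {
                net += weights[j][i] * source[i];   // sum input over all input connections
            }
            out[j] = sigmoid(net);                  // apply the layer's transfer function
        }
        return out;
    }
}
```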

The PutExpected(...) message is used to provide the values that the output layer will use for determining the overall network error. This message can be sent either before or after data is cycled forward to the output layer, but must be sent before the layer is trained.

Once data has been processed through to the output layer and the output layer has been given the expected response for the current training pattern, we are ready to calculate the network error and adjust the network’s connection weights. These steps are accomplished by sending the Train() message to certain layers in the network. Since training a layer involves adjustment of the layer’s input weights, the Train() message will only be sent to layers with input connections.

Notice that training is performed beginning with the output layer and proceeding backwards through the layers toward the input layer. If a network has more than one output layer and several of these output layers are fed by the same hidden layer, then all of these output layers must be trained before that hidden layer is trained to ensure correct error calculation in the hidden layer.

When the output layer receives the Train() message, it will calculate the overall network error as the difference between the expected response and the actual response of the network. This error will then be used to update (train) the output layer’s input connections.
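The calculation described above can be sketched in plain Java as follows. This is not the library's implementation: the class name OutputTraining, the learningRate parameter, and the use of the logistic derivative actual * (1 - actual) are assumptions chosen to show a standard delta-rule step.

```java
// Illustrative only: a standard delta-rule step, not the havBpNet source.
class OutputTraining {
    // Error for each output node: expected response minus actual response.
    static double[] error(double[] expected, double[] actual) {
        double[] err = new double[expected.length];
        for (int k = 0; k < expected.length; k++) {
            err[k] = expected[k] - actual[k];
        }
        return err;
    }

    // Adjusts the input weights of a logistic output layer in place.
    // weights[k][i] connects source node i to output node k.
    static void updateWeights(double[][] weights, double[] source,
                              double[] actual, double[] err, double learningRate) {
        for (int k = 0; k < weights.length; k++) {
            double delta = err[k] * actual[k] * (1.0 - actual[k]);
            for (int i = 0; i < source.length; i++) {
                weights[k][i] += learningRate * delta * source[i];
            }
        }
    }
}
```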

Notice in our example that the weights between the hidden and output layers are updated before error is propagated backwards to the hidden layer. This is an example of what we call interleaved weight updates. The effect is that the error in one layer is backpropagated over weights that are not the same as they were when the network forward cycled its response. Many believe that this is acceptable since error is calculated only locally for each node. However, should you want to avoid this condition, you may choose non-interleaved weight updates by replacing the two Train() messages (shown above) with CalculateUpdates() and UpdateWeights() pairs, as in the following code excerpt:


in.PutValue(tin[pat_list[i]]);    // put input pattern in input layer
hid.Cycle();                      // feed forward through the network
out.Cycle();
out.PutExpected(tex[pat_list[i]]);// supply expected response to the output layer
out.CalculateUpdates();           // Calculate error and INPUT weight updates for all layers
hid.CalculateUpdates();
out.UpdateWeights();              // apply weight updates to all layers
hid.UpdateWeights();
 

Notice in this new code excerpt that weight adjustments are first calculated but not applied. This is done using the CalculateUpdates() messages in place of the Train() messages. In this way, error is propagated backwards across weights that were used in the forward processing. Only after all weight adjustments for the entire network are calculated are weight adjustments applied using UpdateWeights() messages.
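The difference between the two orderings can be seen with a single hidden-to-output connection. The sketch below is plain Java with hypothetical names (UpdateOrder, hiddenError, compare) and an assumed learning-rate parameter; it only illustrates which weight value the backpropagated error passes over.

```java
// Illustrative only: one connection, hypothetical names, not the havBpNet classes.
class UpdateOrder {
    // Error reaching a hidden node across one connection of weight w
    // from an output node whose error term is outDelta.
    static double hiddenError(double w, double outDelta) {
        return w * outDelta;
    }

    // Returns {hidden error with interleaved updates, hidden error with non-interleaved updates}.
    static double[] compare(double w, double hiddenOut, double outDelta, double rate) {
        double step = rate * outDelta * hiddenOut;   // adjustment for the hidden-to-output weight

        // Interleaved: the weight is adjusted first, so error crosses the new weight.
        double interleaved = hiddenError(w + step, outDelta);

        // Non-interleaved: error crosses the weight used in the forward pass;
        // the adjustment is applied only afterwards.
        double nonInterleaved = hiddenError(w, outDelta);

        return new double[] {interleaved, nonInterleaved};
    }
}
```

With w = 0.5, a hidden output of 1.0, an output error term of 0.2 and a rate of 0.5, the adjustment is 0.1, so the hidden node receives an error of 0.12 in the interleaved case and 0.10 in the non-interleaved case.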

The “type” of the layer will determine how errors are calculated. A layer of type havOUTPUT will calculate the error as the difference between the expected and actual network response. All other layers will retrieve error across their output connections (from “higher” layers).

This is an important consideration when building composite networks which consist of several smaller networks. If each component network is to be trained with its own expected values, then the output layers of the component networks must be of type havOUTPUT. If, on the other hand, certain component networks are designed to train based on error received from other component networks, then the output layer of these first networks must not be of type havOUTPUT.

If you are creating a network that consists of several sub-nets, consider how the sub-nets should be connected. Consider the example where two sub-nets (A and B) are created with the hidden layer of A connected as input to the hidden layer of B. If this connection is made using the Connect(...) member function, then errors calculated in B will propagate backwards into A. If this is not what you want, then use an intervening layer (call it tmp) as the input to the hidden layer of B in place of the hidden layer of A, and use the Copy(...) member function to copy the values from the hidden layer of A to tmp prior to cycling and training B.

Another point of interest related to the training process and available in havBpNet is the use of a so-called training “epoch.”

Typical pattern-by-pattern weight adjustment causes a change to the global error surface after each training pattern. Often, it is useful to determine the weight adjustments needed for several training patterns based upon the same configuration of the error surface. This can be accomplished by setting the epoch size of a layer to some value greater than 0.

An epoch value may be assigned to a layer at any time. The epoch controls when connection weight adjustments are actually applied. If, for example, the epoch of a layer is set to “10”, then weight adjustment will only occur on every tenth receipt of the Train() message. On the other nine occurrences, the amount of weight adjustment required is calculated and summed so that on the tenth occurrence the averaged sum of all ten adjustments is applied.

Imagine some global error surface with some global minimum - M. The net is positioned (by the value of its weights) at some origin - O. When the first pattern is processed, the resulting error would indicate some change in position on the surface (a vector A from O). If the net's weights are changed between each pattern, the effect is that of changing the origin between each pattern.

If, on the other hand, the net's weights are changed only after some number of patterns have been processed, the same origin is used for all patterns between the updates - and when updates are applied, the summed update will be a sum of the change vectors, all of which were calculated from the same origin on the error surface.
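The bookkeeping behind an epoch can be sketched as a small accumulator. This is plain Java with a hypothetical class name (EpochAccumulator). The text above does not say exactly how the averaged sum is formed, so the sketch assumes it is the sum of the adjustments divided by the epoch size.

```java
// Illustrative only: assumes "averaged sum" means sum of adjustments / epoch size.
class EpochAccumulator {
    private final int epoch;      // number of train() calls per applied update
    private final double[] sum;   // running sum of per-pattern adjustments
    private int count = 0;

    EpochAccumulator(int epoch, int numWeights) {
        this.epoch = epoch;
        this.sum = new double[numWeights];
    }

    // Adds one pattern's adjustments. Returns true only on the call
    // that actually applies the averaged sum to the weights.
    boolean train(double[] weights, double[] adjustment) {
        for (int i = 0; i < sum.length; i++) {
            sum[i] += adjustment[i];
        }
        count++;
        if (count < epoch) {
            return false;             // weights unchanged: same origin for the next pattern
        }
        for (int i = 0; i < sum.length; i++) {
            weights[i] += sum[i] / epoch;
            sum[i] = 0.0;
        }
        count = 0;
        return true;
    }
}
```

With an epoch of 2 and a starting weight of 1.0, adjustments of 0.5 and 1.5 leave the weight untouched after the first call and move it to 2.0 after the second.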

Determining Network Training Error

It is the user’s responsibility to determine when a network has been sufficiently trained; however, havBpNet provides several methods that help in the determination.

First, when testing network performance, we want to use a static version of the network. That is to say, we don’t want weight updates occurring between patterns (or epochs) while testing. havBpNet allows the user to disable weight adjustments by sending the SetTraining(havSys.havOFF) message to a layer as illustrated below.


out.SetTraining(havSys.havOFF);   // disable weight updates while testing
 

With weight adjustment thus disabled for the output layer, we are able to send the Train() message to the output layer to calculate the layer’s error without causing weight adjustments. Since, typically, when testing network performance, we are only interested in the overall network error (i.e. that error in the output layer), there is no need to calculate errors in the hidden layer (i.e. no reason to send the Train() message to the hidden layer) and, therefore, no reason to disable weight updates for the hidden layer.
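The behaviour described here, where error is still calculated but weights are left alone, can be modelled with a simple flag. The class below is a hypothetical stand-in (GatedLayer, with a single linear node and one weight); it is not the havBpLayer class and only mirrors the semantics described in the text.

```java
// Illustrative only: a one-weight linear "layer" that mimics the training switch.
class GatedLayer {
    private boolean training = true;
    double weight;       // single input weight, for brevity
    double lastError;    // error from the most recent train() call

    GatedLayer(double weight) {
        this.weight = weight;
    }

    void setTraining(boolean on) {
        training = on;
    }

    // Always computes the error; adjusts the weight only while training is enabled.
    void train(double input, double expected, double rate) {
        double actual = weight * input;
        lastError = expected - actual;
        if (training) {
            weight += rate * lastError * input;
        }
    }
}
```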

Having calculated the network error, we need a way to allow applications to retrieve the output layer’s error for use in determining network performance. The havBpLayer class provides four methods for retrieving errors from a layer. The user can ...

The user can use any or all of these methods in an application depending on specific needs. For example, one might wish to report the error for each individual node but use the layer’s squared error for overall performance determination.

It is the user’s responsibility, if using the GetError(...) method to retrieve an array of values, to ensure that the array passed as an argument contains at least as many elements as there are nodes in the layer.
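The array-size requirement and the idea of a layer-wide squared error can be illustrated in plain Java. The names below (LayerErrors, getError, squaredError) are hypothetical and do not reflect the actual havBpLayer signatures; the squared error is assumed here to be the plain sum of squared node errors.

```java
// Illustrative only: hypothetical helpers, not the havBpLayer methods.
class LayerErrors {
    // Copies per-node errors into a caller-supplied array, which must hold
    // at least as many elements as there are nodes.
    static void getError(double[] nodeErrors, double[] dest) {
        if (dest.length < nodeErrors.length) {
            throw new IllegalArgumentException(
                "dest needs at least " + nodeErrors.length + " elements");
        }
        System.arraycopy(nodeErrors, 0, dest, 0, nodeErrors.length);
    }

    // Sum of squared node errors (assumed definition of the layer's squared error).
    static double squaredError(double[] nodeErrors) {
        double sum = 0.0;
        for (double e : nodeErrors) {
            sum += e * e;
        }
        return sum;
    }
}
```

Checking the destination length up front turns a silent overrun into an immediate, descriptive failure.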

Finally, when we have finished testing the static network’s performance, we must re-enable weight adjustments for the output layer and continue training. This is also accomplished using the SetTraining(...) method, as shown below:


out.SetTraining(havSys.havON);    // re-enable weight updates
 




Copyright © 1998 by hav.Software. All Rights Reserved.