In this project, we analyze 3 tasks with neural networks:
1. Normal XOR task:

| Input 1 | Input 2 | Output 1 |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |

2. XOR task with 2 output nodes: a new output node is added to task 1, and the output of output node 2 is the same as input node 1.

| Input 1 | Input 2 | Output 1 | Output 2 |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 0 | 1 | 1 | 0 |
| 1 | 0 | 1 | 1 |
| 1 | 1 | 0 | 1 |

3. XOR task with 3 output nodes: a new output node is added to task 2, and the new output node replicates input node 2.

| Input 1 | Input 2 | Output 1 | Output 2 | Output 3 |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 0 | 1 | 1 | 0 | 1 |
| 1 | 0 | 1 | 1 | 0 |
| 1 | 1 | 0 | 1 | 1 |

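The three truth tables can also be generated programmatically, which makes the relation between the tasks explicit: output 1 is the XOR of the inputs, output 2 copies input 1, and output 3 copies input 2. The following is a minimal sketch; the function name `make_task` is hypothetical and is not part of the Neural Network GUI.

```python
from itertools import product


def make_task(n_outputs):
    """Return (inputs, targets) for the task with 1, 2, or 3 output nodes."""
    inputs, targets = [], []
    for x1, x2 in product((0, 1), repeat=2):
        # Output 1: XOR, output 2: copy of input 1, output 3: copy of input 2.
        all_outputs = [x1 ^ x2, x1, x2]
        inputs.append((x1, x2))
        targets.append(tuple(all_outputs[:n_outputs]))
    return inputs, targets


for n in (1, 2, 3):
    patterns, targets = make_task(n)
    for x, t in zip(patterns, targets):
        print(n, x, t)
```
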
We solve task 1 with a neural network of 2 hidden units. The parameters are: time step 0.5, momentum 0.5, stop criterion 0.01, and random seed 5; the hidden layer activation function is sigmoid*2-1 and the output layer activation function is sigmoid. The training curve is shown in figure 1. The network is trained after 1520 training cycles (NOT epochs). The hidden layer representation is shown in figure 2. The x-axis represents the output of hidden unit one and the y-axis represents the output of hidden unit two. Blue points are the records with target value 0, and red points are the records with target value 1. The blue line marks the hidden unit values at which the output node has value 0.5, that is, where the weighted sum at the output node is zero. So, when the network is trained well, the blue line separates the blue points from the red points.
We solve task 2 with a neural network of 2 hidden units and the same parameters as in task 1, except that it has 2 output units. The network is trained after 1810 cycles. The training curve is shown in figure 3 and the hidden layer representation in figure 4. In figure 4, the blue line is defined by the weights from the hidden layer to output node 1 together with the bias of output unit 1, and the green line is defined by the weights to output node 2 together with the bias of output unit 2.
Figure 1. Training curve of task 1
Figure 2. Hidden layer representation of task 1
Figure 3. Training curve of task 2
Figure 4. Hidden layer representation of task 2
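The lines drawn in the hidden layer plots follow directly from the output unit's weights: the sigmoid output equals 0.5 exactly where the weighted sum w1*h1 + w2*h2 + b is zero. The sketch below illustrates this with the activation functions named in the text. The function names, the list-based weight layout, and the numeric weights are all hypothetical and chosen only for illustration; they are not the trained values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_act(z):
    # Hidden activation from the text: sigmoid*2-1, with range (-1, 1).
    return 2.0 * sigmoid(z) - 1.0


def forward(x, W_h, b_h, W_o, b_o):
    """Return (hidden outputs, network outputs) for one input pattern x."""
    h = [hidden_act(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W_h, b_h)]
    y = [sigmoid(sum(w * hi for w, hi in zip(row, h)) + b)
         for row, b in zip(W_o, b_o)]
    return h, y


def boundary_h2(h1, w, b):
    """h2 on the line w[0]*h1 + w[1]*h2 + b = 0 (requires w[1] != 0)."""
    return -(w[0] * h1 + b) / w[1]


# Hypothetical output weights and bias, for illustration only.
W_o, b_o = [[1.5, -2.0]], [0.3]
h1 = 0.4
h2 = boundary_h2(h1, W_o[0], b_o[0])
# A point on the line gives an output of (approximately) 0.5.
print(sigmoid(W_o[0][0] * h1 + W_o[0][1] * h2 + b_o[0]))
```

With more than one output unit, each row of `W_o` and its bias give one such line, which is how the blue and green lines of figure 4 arise.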
For task 3, we first use a neural network with the same parameters as in tasks 1 and 2, except that there are 3 output units. This network is never trained. Observing the output values of the network, we found that output units 2 and 3 are already trained, but output unit 1 is not. So we copy the weights and biases of the trained network of task 1 into the current network, leave the other weights and biases random, and train the network again. The network is then trained after 1420 cycles. After training, each line separates two points from the other two. This experiment shows that task 3 can be solved with a neural network of only 2 hidden units.
Before training
After training
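The weight-copying step described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary layout, the function name `init_task3_from_task1`, the uniform initialisation range of plus or minus 0.5, and the example numbers are hypothetical, and Python's random generator will not reproduce the GUI's seed 5.

```python
import random


def init_task3_from_task1(task1, seed=5):
    """Build task 3 parameters: reuse task 1 weights, randomise the rest."""
    rng = random.Random(seed)
    n_hidden = len(task1["W_h"])

    def random_row():
        # Assumed initialisation range; the source does not state one.
        return [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]

    return {
        # Hidden layer weights and biases are copied from the task 1 network.
        "W_h": [row[:] for row in task1["W_h"]],
        "b_h": task1["b_h"][:],
        # Output unit 1 (XOR) is copied; output units 2 and 3 start random.
        "W_o": [task1["W_o"][0][:], random_row(), random_row()],
        "b_o": [task1["b_o"][0], rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)],
    }


# Hypothetical task 1 parameters, for illustration only.
task1 = {"W_h": [[1.0, 2.0], [3.0, 4.0]], "b_h": [0.1, 0.2],
         "W_o": [[5.0, 6.0]], "b_o": [0.7]}
net = init_task3_from_task1(task1)
print(net["W_o"][0], len(net["W_o"]))
```
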
Dr. Munro proposed a method to solve task 3, that is, adding weights to the error function:
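A form consistent with the definitions that follow, assuming the usual factor of 1/2 and a sum over the set C of output neurons (both are assumptions, not stated in the text), is:

$$
E(n) = \frac{1}{2} \sum_{j \in C} a_j \, e_j^2(n)
$$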
In this equation, E(n) is the total error energy, e_j is the error signal of neuron j, and a_j is the training weight of neuron j. We choose training weights 0.8, 0.1, 0.1 for the 3 output units. The training curve is shown in figure 7 and the hidden layer representation in figure 8. In figure 8 we can see that output node 1 is trained after only 900 cycles, which means that the error of output node 1 is small enough. Figure 7 shows the same thing: before 900 cycles the error decreases very fast, and after that it decreases slowly.
(a) Before training
(b) Output node 1 is trained after 900 cycles
(c) Output node 2 is trained after 7000 cycles
(d) Task is solved after 18100 cycles
Then we change the training weights and train the neural network with the same parameters and initial weights. The training curves are shown in figure 9.
(a) Training weights: 0.9, 0.05, 0.05
(b) Training weights: 0.7, 0.15, 0.15
When we set the training weight of output node 1 to 0.6 or less (the training weights always sum to 1), the network is still not trained after 10 million cycles. Observing the results, we find that output node 1 (that is, the XOR task) is never trained.
The XOR task at output node 1 is more difficult than the tasks at the other two output nodes. If all three nodes have the same training weight, the two easier tasks are trained quickly and then exert a large force that keeps the XOR task from being trained. So if we give the difficult task a larger training weight, the problem can be solved.
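The effect of the training weights on the update can be illustrated with a short sketch. It assumes a weighted sum-of-squares error of the form E = 1/2 * sum_j a_j * e_j^2 with sigmoid output units; that form and the function names are assumptions for illustration, not the GUI's code. Under this assumption each output unit's local gradient is scaled by its training weight, so with weights 0.8, 0.1, 0.1 the XOR output contributes most of the signal propagated back to the hidden layer.

```python
def weighted_error(targets, outputs, a):
    """E = 1/2 * sum_j a_j * e_j^2, with e_j = target_j - output_j."""
    return 0.5 * sum(aj * (t - y) ** 2 for aj, t, y in zip(a, targets, outputs))


def output_deltas(targets, outputs, a):
    """Local gradients at sigmoid output units: a_j * e_j * y_j * (1 - y_j)."""
    return [aj * (t - y) * y * (1.0 - y) for aj, t, y in zip(a, targets, outputs)]


a = (0.8, 0.1, 0.1)          # training weights used in the text
targets = (1, 0, 1)          # task 3 targets for input (0, 1)
outputs = (0.5, 0.5, 0.5)    # hypothetical outputs of an untrained network

print(weighted_error(targets, outputs, a))
print(output_deltas(targets, outputs, a))
```

In this example all three error signals have the same magnitude, yet the local gradient of output node 1 is eight times larger than those of the other two nodes.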
We generate an XML result file with the Neural Network GUI, then use a MATLAB program to read the XML result file and generate an AVI animation file. The source code that generates the animation is a .m file: here.
The .m file calls an XSLT file named math.xsl: here.
A sample XML result file is shown here.
Here is a list of the result xml files and generated avi files:
| Description | Result XML file | AVI file |
|---|---|---|
| XOR task alone | xor0res.xml | xor.avi |
| XOR task alone with one replicated output | xor1res.xml | xor1.avi |
| XOR task alone with two replicated outputs (copy weights from XOR task) | xor3copyres.xml | xor3copy.avi |
| XOR task alone with two replicated outputs (training weights 0.9, 0.05, 0.05) | xor39res.xml | xor39.avi |
| XOR task alone with two replicated outputs (training weights 0.8, 0.1, 0.1) | xor38res.xml | xor38.avi |
| XOR task alone with two replicated outputs (training weights 0.7, 0.15, 0.15) | xor37res.xml | xor37.avi |