CS030 -- Introduction to Computer Science II
Fall, 2005
Assignment 7
(updated 10/18/2005)


Programming Project:
This week, you will extend your Neural Network project so that it can learn to classify inputs, or in other words to compute an arbitrary function.

First, in order to train a network, we need to distinguish the actual activation from the symbolic value we associate with a particular activation.  For example, an activation of 0.91 may be treated as an output of "1" but we will need the 0.91 when it comes time to train the network.

Second, it is convenient to treat the threshold as just another weight with an artificial input that is constantly -1.  Thus, w0 acts as the effective threshold, paired with a constant input x0 = -1.  Then w1 through wn are the weights for the actual inputs to the LTU.  When you compute the weighted sum from 0 to n, the sum will be greater than 0 exactly when the weighted sum of the actual inputs exceeds the threshold (the weight on x0).
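
As a rough illustration of the threshold-as-weight idea, here is a minimal sketch.  The class and method names are made up for this example and are not part of the required interface; it also assumes the weights array is one longer than the inputs array, with w0 stored at index 0.

```java
// Hypothetical helper for illustration only; your LTU class may be organized differently.
final class WeightedSumSketch {
    /**
     * weights[0] is the threshold w0, paired with the constant input x0 = -1.
     * weights[1..n] pair with the actual inputs, stored in inputs[0..n-1].
     */
    static double weightedSum(double[] weights, double[] inputs) {
        if (weights.length != inputs.length + 1) {
            throw new IllegalArgumentException("expected one weight per input plus w0");
        }
        double sum = weights[0] * -1.0;          // threshold term: w0 * x0
        for (int i = 0; i < inputs.length; i++) {
            sum += weights[i + 1] * inputs[i];   // w(i+1) * x(i+1)
        }
        return sum;
    }
}
```

With weights {0.5, 1.0, 1.0} and inputs {1, 0}, the actual inputs sum to 1.0, which exceeds the threshold 0.5, so the full sum is positive.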

Third, rather than using the step function we used before to determine a unit's output, we can use a sigmoid function, which is smooth and differentiable everywhere (training depends on this).  The commonly used function is: sigma(x) = 1/(1 + e^(-x)), where x is the weighted sum for a given node.  For example, a hidden node j would have weighted sum Xj (including the threshold term, w0 times x0), and would have an output yj = 1/(1 + e^(-Xj)).
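
A minimal sketch of the sigmoid, together with one possible way to map an activation back to a symbolic 0 or 1.  The names here are hypothetical, and the 0.5 cutoff is an assumption for illustration, not something the assignment fixes.

```java
// Hypothetical helper for illustration only.
final class SigmoidSketch {
    /** sigma(x) = 1 / (1 + e^(-x)); the result always lies strictly between 0 and 1. */
    static double sigma(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    /**
     * Maps a real-valued activation to a symbolic output.
     * The 0.5 cutoff is an assumed convention; keep the raw activation for training.
     */
    static int symbolicOutput(double activation) {
        return activation >= 0.5 ? 1 : 0;
    }
}
```

Note that sigma(0) is exactly 0.5, so a weighted sum of 0 sits right on the boundary between the two symbolic outputs.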

Now, we get to training.  An error signal for node k is the difference between the desired output and the actual output (as given by the sigmoid function).  Thus, ek = dk - yk, where ek is the error for desired output dk and actual output yk.  Now we want to use the error to follow the gradient toward the ideal weight settings across the network.  The derivative of the sigmoid output y with respect to its weighted sum is y (1-y), so we get an error gradient gk = yk (1-yk) ek for each output node k, and gj = yj (1-yj) sum(wjk gk) for each hidden layer node j, where the sum runs over all output nodes k.  Here, wjk is the weight on the network link between hidden unit j and output unit k.
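
The two gradient formulas can be sketched as follows.  The class, method, and parameter names are assumptions made for this example; in particular, it assumes the weights leaving hidden node j are available as an array indexed by output node k.

```java
// Hypothetical helper for illustration only.
final class GradientSketch {
    /** gk = yk (1 - yk) ek, where ek = dk - yk. */
    static double outputGradient(double yk, double dk) {
        double ek = dk - yk;                 // error signal for output node k
        return yk * (1.0 - yk) * ek;
    }

    /**
     * gj = yj (1 - yj) * sum over k of (wjk * gk).
     * wjk[k] is the weight from hidden node j to output node k;
     * gk[k] is the error gradient already computed for output node k.
     */
    static double hiddenGradient(double yj, double[] wjk, double[] gk) {
        double sum = 0.0;
        for (int k = 0; k < gk.length; k++) {
            sum += wjk[k] * gk[k];
        }
        return yj * (1.0 - yj) * sum;
    }
}
```

For instance, an output node with activation 0.5 and desired output 1 has error 0.5 and gradient 0.5 * 0.5 * 0.5 = 0.125.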

Finally, during training, wij is replaced by wij + a xi gj for the weight wij on the link between input unit i and hidden unit j, where a is a learning rate parameter that you can hold constant at some small positive value less than 1.0.  Similarly, weight wjk becomes wjk + a yj gk for the weight on the link between hidden node j and output node k, where yj is the sigmoid activation for node j and gk is the error gradient for output node k.
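
One way to sketch the update rule for all the weights coming into a single node is shown below.  The names are hypothetical.  The sketch also assumes that the threshold weight w0 is updated by the same rule, using its constant -1 input; that follows from treating the threshold as just another weight, but it is an assumption of this example.

```java
// Hypothetical helper for illustration only.
final class WeightUpdateSketch {
    /**
     * Updates, in place, every weight feeding one node: w[i] = w[i] + a * in[i] * g.
     * For a hidden node, in[] holds the network inputs xi and g is gj.
     * For an output node, in[] holds the hidden activations yj and g is gk.
     * Assumption: in[0] is the constant -1 that pairs with the threshold weight w[0].
     */
    static void updateIncomingWeights(double[] w, double[] in, double g, double a) {
        if (w.length != in.length) {
            throw new IllegalArgumentException("weights and inputs must have the same length");
        }
        for (int i = 0; i < w.length; i++) {
            w[i] += a * in[i] * g;
        }
    }
}
```

Because the hidden gradients gj depend on the weights wjk, it is probably safest to compute all of the gradients first and only then change any weights.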

So modify your LTU and NeuralNetwork classes to accommodate training.  Add a method, trainOutputs(int[]), to your NeuralNetwork that consumes the desired array of outputs and alters the weights throughout the network accordingly.  [If you want, you may assume that setInputs and getOutputs have already been called.  --wfi 10/18, 6:14pm]  I want you to consider a seven-segment LED that displays the digits 0-9.  Associate one input with each of the LED segments.  Try a network with four outputs, with a binary encoding of the output signals determining the output value.  For example, reading the most significant bit first, an output of 1,0,0,0 could represent the digit 8.  Then train your network to produce the correct encoding for each of the ten digits.
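
To make the training data concrete, here is one possible layout, offered as a sketch rather than a required design.  It assumes the conventional segment labels a through g (top, upper right, lower right, bottom, lower left, upper left, middle) in that order, and a four-bit target with the most significant bit first, so that 8 is encoded as 1,0,0,0.  Any consistent ordering would serve equally well.  The class and method names are made up for this example.

```java
// Hypothetical training data for illustration only.
final class SevenSegmentData {
    /** SEGMENTS[d] lists segments a,b,c,d,e,f,g for digit d (1 = lit, 0 = dark). */
    static final int[][] SEGMENTS = {
        {1, 1, 1, 1, 1, 1, 0},  // 0
        {0, 1, 1, 0, 0, 0, 0},  // 1
        {1, 1, 0, 1, 1, 0, 1},  // 2
        {1, 1, 1, 1, 0, 0, 1},  // 3
        {0, 1, 1, 0, 0, 1, 1},  // 4
        {1, 0, 1, 1, 0, 1, 1},  // 5
        {1, 0, 1, 1, 1, 1, 1},  // 6
        {1, 1, 1, 0, 0, 0, 0},  // 7
        {1, 1, 1, 1, 1, 1, 1},  // 8
        {1, 1, 1, 1, 0, 1, 1},  // 9
    };

    /** Four-bit binary encoding of a digit 0-9, most significant bit first (assumed order). */
    static int[] targetBits(int digit) {
        if (digit < 0 || digit > 9) {
            throw new IllegalArgumentException("digit must be between 0 and 9");
        }
        int[] bits = new int[4];
        for (int b = 0; b < 4; b++) {
            bits[b] = (digit >> (3 - b)) & 1;
        }
        return bits;
    }
}
```

A training pass could then present SEGMENTS[d] as the inputs for each digit d and hand targetBits(d) to trainOutputs, repeating over all ten digits until the outputs settle.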

Submission Instructions:
On the machine where you are doing your homework, create a folder called <your email name> followed by "A7".  For example, someone with email address "cjones" would create a folder called "cjonesA7".  Inside that folder, place plain text file(s) containing your answers to any exercises.  Also, place whatever Java files are necessary for your Programming Projects in the same folder.  Then, either tar or zip the folder so that when I extract it, the folder "<emailname>A7" will be created.  Finally, submit via Eureka.