CS030 -- Introduction to Computer Science II
Fall, 2006
Homework 7
(updated 10/14/2006)


Exercises:
[none]

Programming Project:
This week, you will create a NeuralNet class that implements the NeuralNetI interface.  Your NeuralNet class will consist partly of collections of instances of your LTU class.  You will train the NeuralNet to recognize the digits displayed on a seven-segment LED.

Background information
In Machine Learning applications, we often have a collection of examples that consist of attributes and a class label.  When using a neural network, the attributes would be fed into the network as inputs and the output of the network should correspond to the class label.  If they do not correspond, we would update the weights so they do.  Because of the nature of neural networks and the learning rate parameter, the desired and actual outputs might not correspond after a single update.  Thus, we would train the network by repeatedly presenting inputs and updating weights.  If we want to evaluate how well we've learned to classify the data, we set the inputs, compute the outputs and compare to the desired outputs given by the class label.

The important characteristics of a NeuralNet include the number of input units, the number of hidden units, and the number of output units.  Typically, neural networks have an input layer of primitive units, a single hidden layer of units, and an output layer.  Usually (and for our purposes), the network is fully connected between layers; that is, every input unit provides a signal to every hidden unit, and every hidden unit provides an input to every output unit.  When initializing your NeuralNet, you may set weights randomly.  The NeuralNetI interface mandates a method to set inputs and determine outputs.  Naturally, determining outputs should be a cascaded computation: the values at the input layer feed the hidden units' inputs, the hidden units' outputs are computed and fed to the output units' inputs, and finally the output units' outputs are computed.
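For concreteness, the cascade might look roughly like the sketch below.  The fields hiddenLayer and outputLayer and the NeuralNetUnit methods setInput and computeOutput are placeholder names; substitute whatever your own LTU class actually provides.

    // Sketch of the cascaded feed-forward computation (placeholder names).
    public double[] computeOutputs(double[] inputs) {
        // Every input value feeds every hidden unit (fully connected).
        double[] hiddenOut = new double[hiddenLayer.length];
        for (int j = 0; j < hiddenLayer.length; j++) {
            for (int i = 0; i < inputs.length; i++)
                hiddenLayer[j].setInput(i, inputs[i]);
            hiddenOut[j] = hiddenLayer[j].computeOutput();
        }
        // The hidden outputs then feed every output unit.
        double[] out = new double[outputLayer.length];
        for (int k = 0; k < outputLayer.length; k++) {
            for (int j = 0; j < hiddenOut.length; j++)
                outputLayer[k].setInput(j, hiddenOut[j]);
            out[k] = outputLayer[k].computeOutput();
        }
        return out;
    }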

Since the input layer units are a bit different from the hidden and output layer units, your code for a single LTU should work with both types.  (Hint: An elegant way to do this is to create an InputUnit type that extends NeuralNetUnit and then treat the inputs as NeuralNetUnit objects.)  Your constructor for the LTU should consume an argument for the number of inputs to the unit (if it does not already do so).
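For example, a minimal InputUnit might look like the sketch below; the superclass constructor taking a number of inputs matches the hint above, but setValue and computeOutput are assumed method names you should adapt to your own design.

    // Sketch: an input unit is a degenerate NeuralNetUnit whose
    // "output" is simply the raw value it was handed.
    public class InputUnit extends NeuralNetUnit {
        private double value;

        public InputUnit() {
            super(0);  // an input unit has no incoming weighted links
        }

        public void setValue(double value) {
            this.value = value;
        }

        public double computeOutput() {
            return value;  // pass the stored input straight through
        }
    }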

Modify your implementation of the LTU and write an implementation of the NeuralNet.  Your NeuralNet should have a constructor that takes three integers representing the number of input units, hidden units, and output units respectively.  It will also be responsible for stitching together the layers in a fully-connected manner as described above.
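One plausible shape for that constructor is sketched below; the field names are illustrative, and the InputUnit and LTU constructors are assumed to follow the hints above (with the LTU setting its own weights randomly).

    // Sketch of the fully connected wiring: each hidden unit gets one
    // incoming link per input unit, each output unit one per hidden unit.
    public class MyNeuralNet implements NeuralNetI {
        private final NeuralNetUnit[] inputLayer;
        private final NeuralNetUnit[] hiddenLayer;
        private final NeuralNetUnit[] outputLayer;

        public MyNeuralNet(int numInputs, int numHidden, int numOutputs) {
            inputLayer = new NeuralNetUnit[numInputs];
            hiddenLayer = new NeuralNetUnit[numHidden];
            outputLayer = new NeuralNetUnit[numOutputs];
            for (int i = 0; i < numInputs; i++)
                inputLayer[i] = new InputUnit();
            for (int j = 0; j < numHidden; j++)
                hiddenLayer[j] = new LTU(numInputs);   // random initial weights
            for (int k = 0; k < numOutputs; k++)
                outputLayer[k] = new LTU(numHidden);   // random initial weights
        }

        // ... methods required by NeuralNetI go here ...
    }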

Although the foundation is in place to train your NeuralNet, we need a bit of extra machinery in order to propagate the error backwards through the network.  (You are implementing what is known as the back-propagation algorithm.)  To repeat and extend the previous instructions:

... in order to train a network, we need to distinguish the actual activation from the symbolic value we associate with a particular activation.  For example, an activation of 0.53 may be treated as an output of "1" but we will need the 0.53 when it comes time to train the network.

... rather than using the step function we used before to determine a unit's output, we can use a sigmoid function that has certain nice properties.  The commonly used function is sigma(x) = 1/(1 + e^(-x)), where x is the actual weighted sum for a given node.  For example, a hidden node j would have weighted sum x_j (including the threshold term, x_0), and would have output y_j = 1/(1 + e^(-x_j)).  This expression ranges between 0 and 1 as its input ranges from negative to positive infinity.
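In Java the sigmoid is a one-liner:

    // Sigmoid activation: maps any weighted sum into the open interval (0, 1).
    public static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }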

Now, back to training.  The error signal for an output node k is the difference between the desired output and the actual output (as given by the sigmoid function).  Thus e_k = d_k - y_k, where e_k is the error for desired output d_k and actual output y_k.  Now we want to use the error to follow the gradient toward the ideal weight settings across the network.  Taking the derivative, we get an error gradient g_k = y_k·(1 - y_k)·e_k for each output node k, and g_j = y_j·(1 - y_j)·sum_k(w_jk·g_k) for each hidden layer node j.  Here, w_jk is the weight on the network link between hidden unit j and output unit k.

Finally, during training, w_ij gets replaced by w_ij + a·x_i·g_j for the weight w_ij on the link between input unit i and hidden unit j, where a is a learning rate parameter that you can hold constant at some small value less than 1.0.  Similarly, weight w_jk becomes w_jk + a·y_j·g_k for the weight on the link between hidden node j and output node k, where y_j is the sigmoid activation for node j and g_k is the error gradient for output node k.
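Putting the last three paragraphs together, a single training step might look roughly like the sketch below.  It assumes the forward pass has already been run, and it writes the weights as two plain arrays (wIH from input to hidden, wHO from hidden to output) purely to show the arithmetic; your own units presumably store their weights internally.  Threshold weights are updated the same way, using a fixed x_0 input.

    // Sketch of one back-propagation step for a single-hidden-layer network.
    public class BackPropStep {
        static final double ALPHA = 0.1;  // learning rate a, held constant

        // x: input activations; yHidden, yOut: sigmoid activations from the
        // forward pass; d: desired outputs; wIH[i][j], wHO[j][k]: weights.
        static void backPropagate(double[] x, double[] yHidden, double[] yOut,
                                  double[] d, double[][] wIH, double[][] wHO) {
            // Output-layer gradients: g_k = y_k·(1 - y_k)·(d_k - y_k).
            double[] gOut = new double[yOut.length];
            for (int k = 0; k < yOut.length; k++)
                gOut[k] = yOut[k] * (1 - yOut[k]) * (d[k] - yOut[k]);

            // Hidden-layer gradients: g_j = y_j·(1 - y_j)·sum_k(w_jk·g_k),
            // computed from the old hidden-to-output weights.
            double[] gHidden = new double[yHidden.length];
            for (int j = 0; j < yHidden.length; j++) {
                double sum = 0.0;
                for (int k = 0; k < yOut.length; k++)
                    sum += wHO[j][k] * gOut[k];
                gHidden[j] = yHidden[j] * (1 - yHidden[j]) * sum;
            }

            // w_jk becomes w_jk + a·y_j·g_k ...
            for (int j = 0; j < yHidden.length; j++)
                for (int k = 0; k < yOut.length; k++)
                    wHO[j][k] += ALPHA * yHidden[j] * gOut[k];

            // ... and w_ij becomes w_ij + a·x_i·g_j.
            for (int i = 0; i < x.length; i++)
                for (int j = 0; j < yHidden.length; j++)
                    wIH[i][j] += ALPHA * x[i] * gHidden[j];
        }
    }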

Requirement summary
So modify (as needed) your NeuralNetUnit implementation and write a class, MyNeuralNet, that implements the NeuralNetI interface.  Your method, MyNeuralNet.trainOutputs(int[], int[]), should consume an array of input values and the corresponding array of desired outputs.  As a side-effect, it should alter the weights throughout the network so that the actual outputs more closely correspond to the desired outputs.

I want you to consider a seven-segment LED that displays the digits 0-9.  Associate one input with each of the LED segments.  Try a network with four outputs, using a binary encoding of the output signals to determine the output value.  For example, an output of 1,0,0,0 could represent the digit 8.  Then train your network to correctly classify all ten digits.  For an "era" of size n (say 100), repeat the following n times: randomly select a digit between 0 and 9 (inclusive), set your network's inputs to the segments that are lit for that digit, and train the network on those inputs and the desired outputs.  At the end of each era, test your network using each of the 10 digits once and display how many your network classified correctly.  Continue training eras until you get all 10 digits correct.
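A sketch of that loop appears below.  The SEGMENTS table uses one standard segment ordering (top, top-right, bottom-right, bottom, bottom-left, top-left, middle), and the hidden-layer size of 5 and the classify method (which would run the inputs forward and decode the four outputs back into a digit) are illustrative assumptions.

    import java.util.Random;

    // Sketch of the era-based training loop for the seven-segment task.
    public class SevenSegmentTrainer {
        // Lit segments for each digit 0-9, in the order
        // top, top-right, bottom-right, bottom, bottom-left, top-left, middle.
        static final int[][] SEGMENTS = {
            {1,1,1,1,1,1,0},  // 0
            {0,1,1,0,0,0,0},  // 1
            {1,1,0,1,1,0,1},  // 2
            {1,1,1,1,0,0,1},  // 3
            {0,1,1,0,0,1,1},  // 4
            {1,0,1,1,0,1,1},  // 5
            {1,0,1,1,1,1,1},  // 6
            {1,1,1,0,0,0,0},  // 7
            {1,1,1,1,1,1,1},  // 8
            {1,1,1,1,0,1,1}   // 9
        };

        // Desired 4-bit binary encoding, most significant bit first
        // (so 8 encodes as 1,0,0,0, as in the example above).
        static int[] binary(int digit) {
            return new int[] { (digit >> 3) & 1, (digit >> 2) & 1,
                               (digit >> 1) & 1, digit & 1 };
        }

        public static void main(String[] args) {
            MyNeuralNet net = new MyNeuralNet(7, 5, 4);  // hidden size is a guess
            Random rng = new Random();
            int correct = 0, era = 0;
            while (correct < 10) {
                for (int n = 0; n < 100; n++) {          // one era of 100 updates
                    int digit = rng.nextInt(10);
                    net.trainOutputs(SEGMENTS[digit], binary(digit));
                }
                correct = 0;                             // test on all ten digits
                for (int digit = 0; digit < 10; digit++)
                    if (net.classify(SEGMENTS[digit]) == digit)
                        correct++;
                era++;
                System.out.println("Era " + era + ": " + correct + "/10 correct");
            }
        }
    }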

Optionally, you may want to compare this to a network with 10 outputs where each digit has a dedicated output unit.  That is, when the input segments representing the 8 are "on", the ninth output unit should produce 1, while all the other units should produce 0.


Submission Instructions:
On your machine where you are doing your homework, create a folder called <your email name> followed by "A7".  For example, someone with email address "cjones" would create a folder called "cjonesA7".  Inside that folder, place plain text file(s) containing your answers to any exercises.  Also, place whatever Java files are necessary for your Programming Projects in the same folder.  Finally, either tar or zip the folder so that when I extract it, the folder "<emailname>A7" will be created, and submit via Eureka.