Homework Five: Adaptive LTUs and More

Part 1: More with NeuralNetUnits

This week, you will revise your LTU implementation so that it can learn a particular output function.

First, if you did not do this already, you should treat the threshold as just another weight with an artificial input that is held constant at -1. That is, w0 serves as the effective threshold and is associated with a constant -1 input, x0, while w1 through wn are the weights for the actual inputs to the LTU, as before. Now, when you compute the weighted sum, sum from 0 through n so that the threshold is included. This total sum is greater than 0 exactly when the weighted sum over only the actual inputs exceeds the threshold (i.e., weight w0).
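For illustration, here is a minimal sketch of that computation in Java. The method name and the use of plain arrays are assumptions for this example, not part of the required interface; it assumes the constant -1 is stored in x[0].

    // Hypothetical sketch: the threshold weight is w[0], paired with the constant input x[0] = -1;
    // w[1]..w[n] pair with the actual inputs x[1]..x[n].
    static double weightedSum(double[] w, double[] x) {
        double sum = 0.0;
        for (int i = 0; i < w.length; i++) {   // summing from 0 through n includes the threshold term
            sum += w[i] * x[i];
        }
        return sum;
    }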

Second, rather than using the step function we used before to determine a unit's output, we can use a sigmoid function, which has certain desirable properties that you can learn about by taking CS150 Machine Learning. The commonly used function is sigma(X) = 1/(1 + e^(-X)), where X is the weighted sum for this LTU (computed as described above). For example, an LTU with weighted sum X (including the threshold term, -1*w0) would have an activation Y = 1/(1 + e^(-X)) from the sigmoid function.
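In Java this is a one-liner using Math.exp (the method name here is just illustrative):

    // Sigmoid activation for a weighted sum x: sigma(x) = 1 / (1 + e^(-x)).
    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }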

Third, in order to train an LTU, we need to distinguish the activation (i.e., the value of the sigmoid function given the linear weighted sum) from the symbolic output value that we associate with a particular activation (e.g., either 0 or 1). In other words, an activation of 0.91 may be treated as an output of ‘1’, but we will need the 0.91 when it comes time to train the weights of the LTU.
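One common convention, which is an assumption here rather than a requirement, is to round the activation at 0.5 when reporting the symbolic output:

    // Hypothetical convention: an activation of 0.5 or more is reported as output 1, otherwise 0.
    static int outputFor(double activation) {
        return (activation >= 0.5) ? 1 : 0;
    }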

At last, we get to training, where we have a set of examples, each with particular inputs and an associated desired output value. Recall that we want the LTU to compute a function from inputs to desired outputs. We obtain an error signal for an LTU as the difference between the desired output and the actual activation (as given by the sigmoid function). Thus, the error, e, is given as e = d - Y, for desired output d and actual activation Y obtained from the sigmoid function. Now we want to use the error to follow the gradient toward the ideal weight settings for the function that maps inputs to desired outputs. Taking the derivative of the sigmoid, we get an error gradient, g = Y(1 - Y)e, where Y is the output of the sigmoid and e is the error as described above.
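In code, the error and gradient for a single example amount to a couple of lines (a sketch with illustrative names):

    // Error gradient for one example: g = Y(1 - Y)(d - Y),
    // where y is the sigmoid activation and d is the desired output.
    static double gradientFor(double y, double d) {
        double error = d - y;           // e = d - Y
        return y * (1.0 - y) * error;   // g = Y(1 - Y)e
    }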

Finally, during training, wi gets replaced by wi + a*xi*g for each weight wi and its corresponding input xi of the LTU in question. The variable a is a learning rate parameter that you can hold constant at some small value less than 1.0 (0.1 or 0.05 might be a good starting place).
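A minimal sketch of that update, again assuming the weights and inputs live in plain arrays with x[0] = -1 for the threshold:

    // Update every weight in place: w_i <- w_i + a * x_i * g.
    // g is the error gradient computed above and a is the learning rate.
    static void updateWeights(double[] w, double[] x, double g, double a) {
        for (int i = 0; i < w.length; i++) {
            w[i] += a * x[i] * g;
        }
    }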

Requirements

For this week's assignment, use the above information to modify your MyLTU class as follows.

  1. Revise MyLTU to reflect the updated NeuralNetUnit interface. As mentioned in my forum post, you should implement the new method, getOutput(); this should amount to changing the name of one of your existing methods. Also, modify your implementation of getActivation() so that it returns a double instead of an int.
  2. Complete NeuralNetUnit interface. You should now implement the three methods from the NeuralNetUnit interface that we ignored in Assignment Four. These are: updateWeights, computeGradient, and getGradient.
  3. Constructing an LTU. Your constructor(s) should initialize an LTU to have random weights. I suggest values perhaps in the range [-0.1,0.1].
  4. Exercise the training of your LTU. You probably want to write a main method (or a driver class) that starts with a new LTU (having random weights and threshold), and then trains it by looping through a set of examples so that it eventually computes a basic boolean function on the given inputs (see the sketch following this list).
  5. Test your LTU. You should update your JUnit test class for MyLTU so that it tests the training functionality you have implemented for this assignment. Ideally, this will absorb your code from the previous item. We will test your program with a driver class that (a) creates an instance of your MyLTU with an arbitrary number of inputs, (b) sets those inputs to random values, and then (c) attempts to train your MyLTU on an arbitrary boolean function of our choice.
  6. Documentation. Write the appropriate javadoc comments and generate the html documentation.
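To make item 4 concrete, here is a rough sketch of such a training driver for logical AND. It assumes a MyLTU constructor that takes the number of inputs and methods named setInputs, computeGradient, updateWeights, and getOutput; the exact signatures must come from the NeuralNetUnit interface, so treat everything below as illustrative rather than prescribed.

    // Hypothetical driver: trains an LTU on logical AND over two inputs.
    public class TrainAndDriver {
        public static void main(String[] args) {
            double[][] inputs  = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
            double[]   desired = {  0,      0,      0,      1     };
            MyLTU ltu = new MyLTU(2);                  // starts with small random weights
            for (int epoch = 0; epoch < 10000; epoch++) {
                for (int ex = 0; ex < inputs.length; ex++) {
                    ltu.setInputs(inputs[ex]);         // present one example
                    ltu.computeGradient(desired[ex]);  // error and gradient for this example
                    ltu.updateWeights();               // w_i += a * x_i * g
                }
            }
            for (int ex = 0; ex < inputs.length; ex++) {
                ltu.setInputs(inputs[ex]);
                System.out.println(java.util.Arrays.toString(inputs[ex]) + " -> " + ltu.getOutput());
            }
        }
    }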

Part 2: Recursive Algorithms

Requirements

Write a public static method, lessThan, that takes as its argument an array of ints and returns a boolean value. Your method must be recursive and may not contain any for or while loops. The method returns true only when the array elements are in strictly increasing order from index 0 up to n-1 for an array, a, with n elements. That is, if a[0] < a[1] < a[2] < ... < a[n-2] < a[n-1], your method should return true; otherwise, if any of these pairwise relations does not hold, it should return false. Place this method in a class named RecursiveLT (you are welcome to write additional helper methods). You should write a main method for your own testing, as we will grade you, in part, on your own testing. You should also write a JUnit test class for RecursiveLT. We will also base your grade on how your code works with our own driver that calls your static method directly (i.e., without creating an instance).
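For example, calls like the following illustrate the expected behavior. This is a sketch only; it assumes your finished RecursiveLT class, and the demo class name is just illustrative. Note that a one-element array has no pairwise relation to violate, so returning true for it is the natural reading of the specification.

    // Illustrative expected behavior of RecursiveLT.lessThan.
    public class LessThanDemo {
        public static void main(String[] args) {
            System.out.println(RecursiveLT.lessThan(new int[]{1, 2, 5, 9})); // true: strictly increasing
            System.out.println(RecursiveLT.lessThan(new int[]{1, 2, 2, 9})); // false: 2 < 2 fails
            System.out.println(RecursiveLT.lessThan(new int[]{7}));          // true: no pair to violate
        }
    }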

Submission instructions:

Do not forget to use the javadoc constructs for your documentation and include the html files produced by the javadoc tool in your final submission. Make sure that you create a folder named with your Westmont email name followed by “HW5”. For example, someone with email address “cjones” would create a folder called “cjonesHW5”. Make sure that the folder contains, for both parts of the assignment, your java source files and the html documentation files produced by javadoc. Finally, either tar or zip the folder so that when we extract it, the folder “<emailname>HW5” will be created. Submit the packaged file for assignment 5 on Eureka.