CSCE 633 Project 2 - Neural Network
Due: Tuesday, March 20, 2012

Implement a neural network using the backpropagation algorithm (in any language you like). You will have to handle both discrete and continuous input attributes, but you may assume the target feature is discrete (i.e., classification problems, though they may have more than two classes). Important implementation decisions include the threshold function, the input and output encodings, and the stopping criterion (you will want to monitor mean-squared error (MSE) on a validation set). You might also consider implementing momentum or an adaptive learning rate, and test whether it improves performance. The number of hidden nodes and hidden layers should be options in your code. As a special case, you should implement a perceptron (equivalent to zero hidden layers) and compare it to a multi-layer network. Note that past experience has shown that training a neural network often requires tuning the learning rate for each dataset by trial and error; report the chosen learning rate and final MSE.

To test your algorithm, run it on datasets from the UCI Machine Learning Repository. At minimum, present results on the following databases for comparison: voting, mushroom, heart disease, and iris. You may include any other datasets that interest you; other good ones include census, LED, secondary structure, promoters, chess, and wine.

Evaluate your algorithm using 10-fold cross-validation. Report appropriate statistics for estimating true accuracy and confidence intervals. Also, compare the performance of your neural network with that of your decision tree: evaluate the statistical significance of any difference in accuracy, and try to interpret/explain it.

What to turn in: a written report, plus a print-out of your code.
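To make the weight-update and momentum requirements concrete, here is a minimal sketch of batch backpropagation for one hidden layer of sigmoid units. This is an illustration, not a prescribed design: the function names, the learning rate, momentum value, hidden-layer size, and weight-initialization range are all arbitrary defaults you would tune per dataset, and NumPy is just one convenient choice of language.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_backprop(X, T, n_hidden=4, lr=0.5, momentum=0.9, epochs=2000, seed=0):
    """Batch backprop with momentum for one hidden layer of sigmoid units.
    All hyperparameter defaults here are illustrative, not assignment
    requirements. Returns (W1, W2, mse_history)."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    # Small random initial weights; the extra row holds the bias weight.
    W1 = rng.uniform(-0.5, 0.5, (n_in + 1, n_hidden))
    W2 = rng.uniform(-0.5, 0.5, (n_hidden + 1, n_out))
    dW1, dW2 = np.zeros_like(W1), np.zeros_like(W2)
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append bias input of 1
    history = []
    for _ in range(epochs):
        # Forward pass.
        H = sigmoid(Xb @ W1)
        Hb = np.hstack([H, np.ones((H.shape[0], 1))])
        Y = sigmoid(Hb @ W2)
        err = T - Y
        history.append(np.mean(err ** 2))           # MSE, for monitoring
        # Backward pass: standard sigmoid deltas for squared error.
        delta_out = err * Y * (1 - Y)
        delta_hid = (delta_out @ W2[:-1].T) * H * (1 - H)
        # Momentum blends the previous step into the current gradient step.
        dW2 = lr * (Hb.T @ delta_out) + momentum * dW2
        dW1 = lr * (Xb.T @ delta_hid) + momentum * dW1
        W2 += dW2
        W1 += dW1
    return W1, W2, history
```

In practice you would stop not after a fixed epoch count but when the MSE on a held-out validation set stops improving, and a perceptron is the special case with the hidden layer removed (inputs connected directly to the sigmoid output units).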
Focus on a brief description of your implementation (methodological details such as the weight-update equations and stopping criterion you use, not coding details like data structures). Put most of your effort into a Results section in which you present experiments evaluating different aspects of your implementation (for example: how does the number of hidden nodes affect accuracy? Does momentum help convergence?), along with statistics such as accuracy and number of training epochs for each dataset. At minimum, you should vary the number of hidden nodes and report the effect of one other aspect of your implementation.
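For the evaluation, the 10-fold cross-validation, confidence-interval, and significance-testing steps can be sketched as below. This is one reasonable recipe under stated assumptions, not the required one: `train_fn` and `predict_fn` are placeholders for your own learner, the normal-approximation 95% interval and the paired t statistic are common choices, and NumPy is again just for illustration.

```python
import numpy as np

def cross_val_accuracy(X, y, train_fn, predict_fn, k=10, seed=0):
    """k-fold cross-validation; returns the per-fold test accuracies.
    train_fn(Xtr, ytr) -> model and predict_fn(model, Xte) -> labels
    are placeholders for whatever learner you are evaluating."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train_fn(X[train], y[train])
        accs.append(np.mean(predict_fn(model, X[test]) == y[test]))
    return np.array(accs)

def confidence_interval(accs, z=1.96):
    """Approximate 95% CI for the mean accuracy (normal approximation)."""
    half = z * accs.std(ddof=1) / np.sqrt(len(accs))
    return accs.mean() - half, accs.mean() + half

def paired_t(accs_a, accs_b):
    """Paired t statistic over matched folds, e.g. neural network vs.
    decision tree evaluated on the same cross-validation splits."""
    d = accs_a - accs_b
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))
```

Because both learners are scored on identical folds, a paired test is more appropriate than comparing two independent means; compare the resulting t statistic against a t distribution with k-1 degrees of freedom when judging significance.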