CSCE 633 Project 2 - Neural Network
Due: Tuesday, March 20, 2012

Implement a neural network using the backpropagation algorithm (in any language you like). You will have to handle both discrete and continuous input attributes, but you may assume the target feature is discrete (i.e., classification problems, though they may have more than two classes). Important implementation decisions include the threshold function, the input and output encodings, and the stopping criterion (you will want to monitor mean-squared error (MSE) on a validation set). You might also consider implementing momentum or an adaptive learning rate, and test whether it improves performance. The number of hidden nodes and hidden layers should be options in your code. As a special case, you should implement a perceptron (equivalent to zero hidden layers) and compare it to a multi-layer network. Note that past experience has shown that training a neural network often requires tuning the learning rate for each dataset by trial and error; report the chosen learning rate and final MSE.

To test your algorithm, run it on datasets from the UCI Machine Learning Repository. At minimum, present results on the following databases for comparison: voting, mushroom, heart disease, and iris. You may include any other datasets that interest you; other good ones include census, LED, secondary structure, promoters, chess, and wine.

Evaluate your algorithm using 10-fold cross-validation. Report appropriate statistics for estimating true accuracy and confidence intervals. Also, compare the performance of your neural network with that of your decision tree: evaluate the statistical significance of any difference in accuracy, and try to interpret/explain it.

What to turn in: a written report, plus a print-out of your code.
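To make the weight-update and momentum requirements concrete, here is a minimal sketch of batch backpropagation for one hidden layer of sigmoid units. This is an illustration, not a prescribed design: the function names, the learning rate, momentum value, hidden-layer size, and weight-initialization range are all arbitrary defaults you would tune per dataset, and NumPy is just one convenient choice of language.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_backprop(X, T, n_hidden=4, lr=0.5, momentum=0.9, epochs=2000, seed=0):
    """Batch backprop with momentum for one hidden layer of sigmoid units.
    All hyperparameter defaults here are illustrative, not assignment
    requirements. Returns (W1, W2, mse_history)."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    # Small random initial weights; the extra row holds the bias weight.
    W1 = rng.uniform(-0.5, 0.5, (n_in + 1, n_hidden))
    W2 = rng.uniform(-0.5, 0.5, (n_hidden + 1, n_out))
    dW1, dW2 = np.zeros_like(W1), np.zeros_like(W2)
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append bias input of 1
    history = []
    for _ in range(epochs):
        # Forward pass.
        H = sigmoid(Xb @ W1)
        Hb = np.hstack([H, np.ones((H.shape[0], 1))])
        Y = sigmoid(Hb @ W2)
        err = T - Y
        history.append(np.mean(err ** 2))           # MSE, for monitoring
        # Backward pass: standard sigmoid deltas for squared error.
        delta_out = err * Y * (1 - Y)
        delta_hid = (delta_out @ W2[:-1].T) * H * (1 - H)
        # Momentum blends the previous step into the current gradient step.
        dW2 = lr * (Hb.T @ delta_out) + momentum * dW2
        dW1 = lr * (Xb.T @ delta_hid) + momentum * dW1
        W2 += dW2
        W1 += dW1
    return W1, W2, history
```

In practice you would stop not after a fixed epoch count but when the MSE on a held-out validation set stops improving, and a perceptron is the special case with the hidden layer removed (inputs connected directly to the sigmoid output units).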
Focus on a brief description of your implementation (methodological details such as the weight-update equations and stopping criterion you use, not coding details like data structures). Put most of your effort into a Results section in which you present experiments evaluating different aspects of your implementation (for example: how does the number of hidden nodes affect accuracy? Does momentum help convergence?), along with statistics such as accuracy and number of training epochs for each dataset. At minimum, you should vary the number of hidden nodes and report the effect of one other aspect of your implementation.
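For the evaluation, the 10-fold cross-validation, confidence-interval, and significance-testing steps can be sketched as below. This is one reasonable recipe under stated assumptions, not the required one: `train_fn` and `predict_fn` are placeholders for your own learner, the normal-approximation 95% interval and the paired t statistic are common choices, and NumPy is again just for illustration.

```python
import numpy as np

def cross_val_accuracy(X, y, train_fn, predict_fn, k=10, seed=0):
    """k-fold cross-validation; returns the per-fold test accuracies.
    train_fn(Xtr, ytr) -> model and predict_fn(model, Xte) -> labels
    are placeholders for whatever learner you are evaluating."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train_fn(X[train], y[train])
        accs.append(np.mean(predict_fn(model, X[test]) == y[test]))
    return np.array(accs)

def confidence_interval(accs, z=1.96):
    """Approximate 95% CI for the mean accuracy (normal approximation)."""
    half = z * accs.std(ddof=1) / np.sqrt(len(accs))
    return accs.mean() - half, accs.mean() + half

def paired_t(accs_a, accs_b):
    """Paired t statistic over matched folds, e.g. neural network vs.
    decision tree evaluated on the same cross-validation splits."""
    d = accs_a - accs_b
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))
```

Because both learners are scored on identical folds, a paired test is more appropriate than comparing two independent means; compare the resulting t statistic against a t distribution with k-1 degrees of freedom when judging significance.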