CS 4793: Introduction to Artificial Neural Networks

Syllabus for Fall 2003.

Frequently Asked Questions about Neural Networks.


Project 1. You can download the initial source code as a zip file or as a tar file. Here is the iris dataset.

Project 2.


Homework 1, Homework 2, Homework 3.


1.2.1. Basics of Neural Networks, data used for graphs (iris1, iris2, iris3), gnuplot commands to generate various plots:
plot [0:7] [0:3] "iris1" notitle, "iris2" notitle, "iris3" notitle, \
(-0.6*x + 2.0)/0.8 notitle, -0.5*x + 4.0 notitle

sigmoid(x) = 1.0/(1.0 + exp(-x))
unit1(x,y) = sigmoid(0.95022*x + 1.2699*y - 2.7849)
unit2(x,y) = sigmoid(-3.6089*x - 11.5877*y + 38.2507)
out1(x,y) = sigmoid(-22.7877*unit1(x,y) + 0.26565*unit2(x,y) + 5.023)
out2(x,y) = sigmoid(16.2471*unit1(x,y) + 7.9134*unit2(x,y) - 18.1983)
out3(x,y) = sigmoid(16.196*unit1(x,y) - 37.7073*unit2(x,y) - 12.8456)

splot [0:7] [0:3] unit1(x,y), unit2(x,y)
splot [0:7] [0:3] out1(x,y), out2(x,y), out3(x,y)
plot [-5:5] sigmoid(x)
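
The gnuplot functions above describe a small network with two inputs, two hidden sigmoid units, and three sigmoid outputs. As a rough sketch of the same forward pass outside gnuplot, the Python below reuses those weights. The names HIDDEN, OUTPUT, forward, and predict are illustrative and not part of the course code, and the sample points are hypothetical inputs chosen inside the plotted range.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# (weight on x, weight on y, bias) for unit1 and unit2, copied from the gnuplot definitions
HIDDEN = [(0.95022, 1.2699, -2.7849), (-3.6089, -11.5877, 38.2507)]
# (weight on unit1, weight on unit2, bias) for out1, out2, out3
OUTPUT = [(-22.7877, 0.26565, 5.023),
          (16.2471, 7.9134, -18.1983),
          (16.196, -37.7073, -12.8456)]

def forward(x, y):
    """Return [out1, out2, out3] for one input point."""
    hidden = [sigmoid(wx * x + wy * y + b) for wx, wy, b in HIDDEN]
    return [sigmoid(w1 * hidden[0] + w2 * hidden[1] + b) for w1, w2, b in OUTPUT]

def predict(x, y):
    """Index (0, 1, or 2) of the largest output unit."""
    outs = forward(x, y)
    return outs.index(max(outs))

# Hypothetical sample points inside the plotted range [0:7] x [0:3]
for point in [(1.4, 0.2), (4.3, 1.3), (6.0, 2.5)]:
    print(point, predict(*point), forward(*point))
```

Taking the largest output as the predicted class is one common convention; the notes themselves only plot the output surfaces.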
1.3.1. Approximation. The first four graphs were generated using the book's Matlab software; the fifth graph was generated using the gnuplot command:
plot [-1:2] x**2 + 2*x**3 title "f", 5.2*x + 0.9 title "L2", \
3.62*x + 0.94 title "L1", 7.0*x + 0.98 title "Linf"
I calculated the norms using Maple to do some integration, algebra, and calculations for me.
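
The norms can also be checked numerically without Maple. The sketch below is only an approximation by the midpoint rule, and it assumes the approximation interval is [-1, 2], which is the plotted range and is not stated explicitly here. The function and variable names are illustrative.

```python
def f(x):
    return x**2 + 2 * x**3

# (slope, intercept) of the three lines from the gnuplot command above
LINES = {"L1": (3.62, 0.94), "L2": (5.2, 0.9), "Linf": (7.0, 0.98)}

def error_norms(slope, intercept, a=-1.0, b=2.0, n=30000):
    """Approximate the L1, L2, and Linf norms of f(x) - (slope*x + intercept) on [a, b]."""
    h = (b - a) / n
    l1 = l2 = linf = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h          # midpoint of the i-th subinterval
        e = abs(f(x) - (slope * x + intercept))
        l1 += e * h
        l2 += e * e * h
        linf = max(linf, e)
    return {"L1": l1, "L2": l2 ** 0.5, "Linf": linf}

norms = {name: error_norms(*coef) for name, coef in LINES.items()}
for name, values in norms.items():
    print(name, values)
```

Under that assumption, each line should have the smallest error in the norm it is named after, which is a quick sanity check on the coefficients.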

To do the least-squares calculation, I started with the iris1, iris2, iris3 files. For each line of two inputs, x1 and x2, I generated a line of six inputs and one output. The six inputs are the zero-, first-, and second-order terms, and the output is either 0 or 1. The resulting file was input to a Perl hack of mine for calculating the least squares coefficients (not to be used seriously). The last plot used those coefficients and these three files (iris11, iris22, iris33). Here are the gnuplot commands to create the plot:

set contour base
set view 0,0
set cntrparam levels incremental 0,0.25,1
h(x,y) = -1.30465212333162 + x*1.15333621204672 - y*0.0687323088406203 \
 - x*x*0.197952522130871 - y*y*0.744328027286469 + x*y*0.396016730830849
splot [0:7] [0:3] "iris11" notitle, "iris22" notitle, \
"iris33" notitle, h(x,y) notitle
1.3.2 Nonlinear Approximation. I describe how three of the figures were generated.

The noise in the 1+2x function is from a uniform distribution between -1 and 1. Using gnuplot, the error on the data was calculated by:

err(x,y,w1,w2) = (y - (w1 + w2*x))**2
sumerr(w1,w2) = \
  err(-4,-7.6193314,w1,w2) + err(-3,-4.653893,w1,w2) + \
  err(-2,-3.201543,w1,w2) + err(-1,-0.22456038,w1,w2) + \
  err(0,0.4883707,w1,w2) + err(1,3.3397593,w1,w2) + \
  err(2,4.6062903,w1,w2) + err(3,7.4832516,w1,w2) + \
The error surface was generated by:
set nocontour
set view 60,60
set surface
set xlabel "w0"
set ylabel "w1"
splot [-2:4] [0:4] sumerr(x,y) notitle lw 2
The error contours were generated by:
set contour base
set cntrparam levels discrete 5,10,20,50
set view 0,0
set nosurface
set xlabel "w0"
set ylabel "w1"
splot [-2:4] [0:4] sumerr(x,y) lw 2
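
The bottom of the error surface can also be found in closed form. The following sketch uses only the eight data points that appear in the sumerr listing above and applies the standard least-squares formulas for a line; DATA and fit_line are illustrative names, not part of the course code.

```python
# The eight (x, y) pairs that appear in the sumerr definition
DATA = [(-4, -7.6193314), (-3, -4.653893), (-2, -3.201543), (-1, -0.22456038),
        (0, 0.4883707), (1, 3.3397593), (2, 4.6062903), (3, 7.4832516)]

def sumerr(w1, w2, data=DATA):
    """Sum of squared errors of the line y = w1 + w2*x on the data."""
    return sum((y - (w1 + w2 * x)) ** 2 for x, y in data)

def fit_line(data=DATA):
    """Return (intercept, slope) minimizing sumerr."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

w1_hat, w2_hat = fit_line()
print(w1_hat, w2_hat, sumerr(w1_hat, w2_hat))
```

Because the noise is bounded by 1, the fitted weights should land near the true values 1 and 2, inside the innermost contour.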
Creating a .eps file was done by:
set terminal postscript eps color "Timesroman" 24
set output "linerr.eps"
replot
set output
set terminal x11
I used xfig to convert .eps files to .pdf files.

1.4. Regression and Classification

The first graph was generated with the following gnuplot commands:

sx = sqrt(2.1)
sy = sqrt(1.2)
sxy = -0.9
p = sxy/(sx*sy)
pi = 2.0*acos(0)
P(x,y) = 1.0/(2*pi*sx*sy*sqrt(1-p*p))*exp(-0.5/(1-p*p)* \
   (((x-2.5)/sx)**2 - 2*p*(x-2.5)/sx*(y-1.5)/sy + ((y-1.5)/sy)**2))
splot [0:4.5] [-1:4] P(x,y)
It was saved to a .eps file by these additional gnuplot commands:
set terminal postscript eps color "Timesroman" 24
set output "mndist.eps"
replot
set output
set terminal x11
I used xfig to create a .pdf file from the .eps file.
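
As a quick check on the density P(x,y) defined above, the sketch below evaluates the same bivariate normal formula in Python and integrates it numerically. The integration box (8 standard deviations each way) and the grid size are arbitrary choices of mine, and the names are illustrative.

```python
import math

MU_X, MU_Y = 2.5, 1.5
VAR_X, VAR_Y, COV_XY = 2.1, 1.2, -0.9

def density(x, y):
    """Bivariate normal density with the parameters used in the gnuplot commands."""
    sx, sy = math.sqrt(VAR_X), math.sqrt(VAR_Y)
    p = COV_XY / (sx * sy)                      # correlation coefficient
    zx, zy = (x - MU_X) / sx, (y - MU_Y) / sy
    q = zx * zx - 2.0 * p * zx * zy + zy * zy
    norm = 2.0 * math.pi * sx * sy * math.sqrt(1.0 - p * p)
    return math.exp(-0.5 * q / (1.0 - p * p)) / norm

def total_mass(half_width_sd=8.0, n=300):
    """Midpoint-rule integral of the density over a box around the mean."""
    sx, sy = math.sqrt(VAR_X), math.sqrt(VAR_Y)
    hx = 2.0 * half_width_sd * sx / n
    hy = 2.0 * half_width_sd * sy / n
    x0 = MU_X - half_width_sd * sx
    y0 = MU_Y - half_width_sd * sy
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            total += density(x, y0 + (j + 0.5) * hy)
    return total * hx * hy

print(density(MU_X, MU_Y), total_mass())
```

The integral should come out very close to 1, and the peak value equals 1/(2*pi*sqrt(det Sigma)), where det Sigma = 2.1*1.2 - 0.9*0.9 = 1.71.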

The contour plot was generated by still more gnuplot commands:

set view 0,0
set contour base
set surface
splot [0:4.5] [-1:4] P(x,y) lw 2, "multinorm.line" notitle with lines lw 2
The file multinorm.line contains points on the line that can be derived from the book's formulas.

The last graph is generated from the gnuplot commands:

plot "multinorm.data" using 2:3 notitle ps 2, \
  3.412 - 0.687*x notitle lw 2, \
  1.5 - 0.9*(x-2.5)/2.1 notitle lw 2
The file multinorm.data contains points generated randomly using Matlab. In the Matlab command window:
Sigma = [2.1 -0.9; -0.9 1.2];
mu = [2.5 1.5];
mvnrnd(mu, Sigma)
will generate a single random point from a normal distribution with mean mu and covariance matrix Sigma. Repeatedly calling mvnrnd(mu, Sigma) will generate more random points. I used my Perl hack to find the least-squares line for these points.
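
For readers without Matlab, the same kind of sample can be drawn with a Cholesky factor of Sigma. This is a sketch of one standard technique, not what mvnrnd does internally; the seed, sample size, and function names are my own choices.

```python
import math
import random

MU = (2.5, 1.5)
SIGMA = ((2.1, -0.9), (-0.9, 1.2))

def sample_point(rng, mu=MU, sigma=SIGMA):
    """Draw one point from N(mu, sigma) using the 2x2 Cholesky factor of sigma."""
    l11 = math.sqrt(sigma[0][0])
    l21 = sigma[1][0] / l11
    l22 = math.sqrt(sigma[1][1] - l21 * l21)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2

def sample_stats(points):
    """Return (mean_x, mean_y, var_x, cov_xy) of a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    return mean_x, mean_y, var_x, cov_xy

rng = random.Random(0)                      # fixed seed so the run is repeatable
points = [sample_point(rng) for _ in range(20000)]
mean_x, mean_y, var_x, cov_xy = sample_stats(points)
slope = cov_xy / var_x                      # least-squares slope of y on x
intercept = mean_y - slope * mean_x
print(slope, intercept)
```

With many points, the least-squares line approaches the theoretical regression line 1.5 - 0.9*(x-2.5)/2.1 plotted above; with only a handful of points, as in the notes, the two lines can differ noticeably.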

1.4.2. Classification with Normal Distributions

In gnuplot, I define the three probability distribution and decision functions as follows:

sa1 = sqrt(0.0301)
sa2 = sqrt(0.0115)
sa12 = 0.0057
pa = sa12/(sa1*sa2)
pi = 2.0*acos(0)
PA(x1,x2) = 1.0/(2.0*pi*sa1*sa2*sqrt(1.0-pa*pa))*exp(-0.5/(1.0-pa*pa)* \
(((x1-1.464)/sa1)**2 - 2*pa*((x1-1.464)/sa1)*((x2-0.244)/sa2) + \
((x2-0.244)/sa2)**2))
DA(x1,x2) = -log(sa1*sa2) - 0.5*log(1.0-pa*pa) - 0.5/(1.0-pa*pa)* \
(((x1-1.464)/sa1)**2 - 2*pa*((x1-1.464)/sa1)*((x2-0.244)/sa2) + \
((x2-0.244)/sa2)**2)

sb1 = sqrt(0.2208)
sb2 = sqrt(0.0391)
sb12 = 0.0731
pb = sb12/(sb1*sb2)
pi = 2.0*acos(0)
PB(x1,x2) = 1.0/(2.0*pi*sb1*sb2*sqrt(1.0-pb*pb))*exp(-0.5/(1.0-pb*pb)* \
(((x1-4.260)/sb1)**2 - 2*pb*((x1-4.260)/sb1)*((x2-1.326)/sb2) + \
((x2-1.326)/sb2)**2))
DB(x1,x2) = -log(sb1*sb2) - 0.5*log(1.0-pb*pb) - 0.5/(1.0-pb*pb)* \
(((x1-4.260)/sb1)**2 - 2*pb*((x1-4.260)/sb1)*((x2-1.326)/sb2) + \
((x2-1.326)/sb2)**2)

sc1 = sqrt(0.3046)
sc2 = sqrt(0.0754)
sc12 = 0.0488
pc = sc12/(sc1*sc2)
pi = 2.0*acos(0)
PC(x1,x2) = 1.0/(2.0*pi*sc1*sc2*sqrt(1.0-pc*pc))*exp(-0.5/(1.0-pc*pc)* \
(((x1-5.552)/sc1)**2 - 2*pc*((x1-5.552)/sc1)*((x2-2.026)/sc2) + \
((x2-2.026)/sc2)**2))
DC(x1,x2) = -log(sc1*sc2) - 0.5*log(1.0-pc*pc) - 0.5/(1.0-pc*pc)* \
(((x1-5.552)/sc1)**2 - 2*pc*((x1-5.552)/sc1)*((x2-2.026)/sc2) + \
((x2-2.026)/sc2)**2)
PA, PB, and PC are the probability distribution functions, and DA, DB, and DC are slightly simpler decision functions. The plot of the distributions was done by:
splot [0:7] [0:3] [0:2] PA(x,y) notitle, PB(x,y) notitle, PC(x,y) notitle
and the decision boundaries by:
set contour base
set view 0,0
set cntrparam levels discrete 0.0
set cntrparam points 100
splot [0:7] [0:3] "iris11" notitle ps 2, "iris22" notitle ps 2, \
 "iris33" notitle ps 2, \
  DA(x,y) - DB(x,y) notitle lw 2, DC(x,y) - DB(x,y) notitle lw 2
where iris11, iris22, and iris33 are the same three data files used for the least-squares plot above.

The contour for the higher loss was done by:

set cntrparam levels discrete log(100)

The linear decision boundaries were graphed by changing some of the variables:

sa1 = sqrt(0.1852)
sa2 = sqrt(0.0420)
sa12 = 0.0425
sb1 = sqrt(0.1852)
sb2 = sqrt(0.0420)
sb12 = 0.0425
sc1 = sqrt(0.1852)
sc2 = sqrt(0.0420)
sc12 = 0.0425
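recomputing pa, pb, and pc from the new values (gnuplot evaluates an assignment once, so they do not update on their own), and doing the same kind of plots as before.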
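
To see why a shared covariance gives linear boundaries, here is a small Python sketch of the decision rule using the class means and the shared variances listed above. It writes the decision function in covariance-matrix form rather than the correlation form used in the gnuplot functions, and it assumes equal class priors; the names MEANS, POOLED, discriminant, and classify are illustrative.

```python
import math

# Class means used in the gnuplot functions, and the shared (var x1, var x2, cov) values
MEANS = [(1.464, 0.244), (4.260, 1.326), (5.552, 2.026)]
POOLED = (0.1852, 0.0420, 0.0425)

def discriminant(x1, x2, mean, cov):
    """Log of the normal density at (x1, x2), dropping the constant -log(2*pi)."""
    v1, v2, c12 = cov
    det = v1 * v2 - c12 * c12
    d1, d2 = x1 - mean[0], x2 - mean[1]
    # Quadratic form (x - mean)^T Sigma^-1 (x - mean) for a 2x2 covariance matrix
    quad = (v2 * d1 * d1 - 2.0 * c12 * d1 * d2 + v1 * d2 * d2) / det
    return -0.5 * math.log(det) - 0.5 * quad

def classify(x1, x2, covs=(POOLED, POOLED, POOLED)):
    """Index of the class with the largest decision function (equal priors assumed)."""
    scores = [discriminant(x1, x2, m, c) for m, c in zip(MEANS, covs)]
    return scores.index(max(scores))

def boundary_ab(x1, x2):
    """Difference of the first two decision functions; zero on their boundary."""
    return (discriminant(x1, x2, MEANS[0], POOLED)
            - discriminant(x1, x2, MEANS[1], POOLED))

print(classify(1.5, 0.3), classify(4.0, 1.2), classify(6.0, 2.2))
```

When every class uses the same covariance, the quadratic terms cancel in the difference of two decision functions, so boundary_ab is an affine function of (x1, x2) and its zero set is a straight line.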

3.1 Perceptrons, Project Framework, Perceptron Convergence in Nonseparable Case, 3.2 Adaline, Adaline Convergence.

4.1 Multilayer Perceptrons, Universal Approximation, 4.3 Practical Aspect of Neural Networks.

Here is the original glass dataset and a description of its attributes. After scaling and reformatting, here is the glass dataset that I used to produce my notes.

Introduction to Support Vector Machines and Kernel Functions, A Simple Algorithm that Combines Examples, Statistical Learning Theory, Hyperplane Classification, Support Vector Classification, SMO Algorithm for SVMs, SVM Examples with a Gaussian Kernel, Support Vector Regression.

Here is a program for learning SVMs (zip file) (tar file). Read the README file. This has been replaced (11/14/03) with a new version with fewer bugs.

Error and Risk, Train and Test, Comparing Algorithms and an Example, Other Aspects of Evaluation.

Matlab Neural Network Toolbox

We will be able to run many neural network algorithms using Matlab's Neural Network Toolbox. Matlab provides extensive help on this software. Click on "Full Product Family Help" in the Help menu. For testing your newfound skills, here is the Iris dataset in a Matlab-readable format. I suggest you create a matlab subdirectory containing this file and start Matlab from within that directory. Then issue the following command in Matlab's command window to load the Iris dataset:
[irisp irist] = iris(0)
The 0 is a useless argument because I didn't take the time to figure out how to write a function with zero arguments. Now irisp contains the inputs (patterns) and irist contains the desired outputs (targets).

There is also Matlab software in conjunction with the book to do some simple experiments. Unzip this file in your matlab directory, and change the name of the new subdirectory from LearnSCp to learnsc. This will help you to follow the book's instructions on how to use the software.


Back to Tom Bylander's Home Page.