Bently Wong

XI-19-2001

CIS 203 Artificial Intelligence

Artificial Neural Networks

Artificial Neural Networks concentrate on imitating humans rather than acting as rational agents.  The original goal of neural networks was to imitate the learning that takes place in the brain.  Since the brain is not fully understood, an ANN is simply an approximation of its function based on what can be observed.  This modeling allows ANNs to actually learn, a core concept of Artificial Intelligence.  Learning is important in Artificial Intelligence because it means the system can get smarter.  An ANN can be trained to do a job, and once properly trained, the system will retain its training and accomplish its job with a consistent level of accuracy. 

An Artificial Neural Network consists of multiple interconnected nodes laid out in layers.  The node is the basic element of an ANN and is usually referred to as a unit.  The units are connected to each other through links, and each link has a weight associated with it.  Some of the units are connected to the outside world; these are designated as input or output units.  The weights are a very important component of an ANN: they are the means of long-term storage in the system, and learning takes place by updating them.  So in a typical ANN, input, called the activation vector, comes in through the input units and travels along the links from unit to unit.  The weights of the links transform the activation values at each step until the output units are reached. 

Each unit is a simple computing element.  It has input links coming from other units and output links going to other units.  Intuitively, input arrives through the input links, the unit performs its calculation, and the result is sent out through the output links.  Each unit also maintains a current activation level and a formula for computing the next activation level given its inputs and weights.  The purpose of each unit is to perform a simple computation: it receives activation levels from its input links and computes a new activation level, which it sends along its output links.  This computation takes place in two steps.  First, the input function calculates the weighted sum of the unit's input values; second, the activation function transforms that weighted sum into the final value that serves as the unit's activation level. 
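The two-step computation described above can be sketched in Python.  This is a minimal illustration of my own; the sigmoid is just one common choice of activation function, and the function names are not from the sources:

```python
import math

def sigmoid(x):
    # A common activation function: squashes the weighted sum into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def unit_activation(inputs, weights):
    # Step 1: the input function computes the weighted sum of the input values.
    weighted_sum = sum(w * a for w, a in zip(weights, inputs))
    # Step 2: the activation function transforms the weighted sum
    # into the unit's new activation level.
    return sigmoid(weighted_sum)
```

For example, with inputs [1.0, 0.5] and link weights [0.4, -0.2], the weighted sum is 0.3, which the sigmoid maps to an activation of about 0.57.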

An Artificial Neural Network learns when the weights of its links are adjusted.  Learning takes place when an input is fed in and the output we get is compared to the desired output.  The difference between the desired output and the actual output is measured and then distributed among the weights so that adjustments can be made.  This is repeated over many inputs.  When the difference between the desired outputs and the actual outputs becomes minimal, we consider the neural network to have learned.
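One simple form of this update, for a single linear unit, is the delta rule.  This is an illustrative sketch of my own, not the exact procedure described in the sources: the error between desired and actual output is distributed to each weight in proportion to its input.

```python
def train_step(weights, inputs, desired, learning_rate=0.1):
    # Forward pass: compute the actual output from the current weights.
    actual = sum(w * a for w, a in zip(weights, inputs))
    # Measure the difference between the desired and the actual output.
    error = desired - actual
    # Distribute the error among the weights: each weight moves in
    # proportion to its input's contribution (the delta rule).
    return [w + learning_rate * error * a for w, a in zip(weights, inputs)]
```

Repeating this step over the training data shrinks the error; once it is minimal, the network is considered to have learned.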

There are two main network structures for Artificial Neural Networks: Feedforward and Feedback.  A Feedforward network is one where every link is unidirectional and the network is acyclic.  In a typical layered Feedforward network, each unit is linked only to units in the next layer; there are no links between units in the same layer, no links backward to a previous layer, and no links that skip a layer.  In addition, a Feedforward network can be either a perceptron or multilayered.  A perceptron is a single-layer network, meaning it has no hidden layers.  Perceptron learning happens very quickly, but only linearly separable functions can be represented.  In contrast, a multilayer network contains one or more hidden layers; it learns more slowly than a perceptron but is not limited to linearly separable functions. 
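The layered Feedforward structure can be sketched as follows.  This is an illustrative layout of my own (not from the sources): each layer is a list of units, and each unit is a list of weights, one per link from the previous layer, so activation only ever flows forward.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def feedforward(activations, layers):
    # Each layer is a list of units; each unit is a list of weights on the
    # links from the previous layer.  Activation flows strictly forward:
    # no links within a layer, none backward, none skipping a layer.
    for layer in layers:
        activations = [sigmoid(sum(w * a for w, a in zip(unit, activations)))
                       for unit in layer]
    return activations
```

With an empty list of layers this reduces to a perceptron-style pass-through of the input; adding inner lists adds hidden layers.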

The difference between a Feedforward and a Feedback network is that in a Feedback network, output can be directed back as input to units in previous layers or in the same layer.  In a Feedforward network, because each link is unidirectional, activation flows in only one direction.  A Recurrent network can be seen as a special case of a Feedback network in which the loops are closed.  Because of its ability to feed output back, a Feedback network is much more complex than a Feedforward network.  The computations that take place at the unit level are much less orderly than in a Feedforward network, and it is much harder to do learning in a Feedback network.  However, a Feedback network more accurately models the brain. 
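To illustrate the loop-back idea, here is a minimal sketch of my own (not from the sources) of a single unit whose output is fed back to itself as an extra input on the next step, so the unit's current activation depends on its past activations:

```python
import math

def feedback_unit(inputs, w_in, w_back):
    # w_in weighs the external input; w_back weighs the looped-back output.
    state = 0.0  # the unit's previous activation, initially zero
    outputs = []
    for x in inputs:
        # The unit's own previous output re-enters as an input here.
        state = math.tanh(w_in * x + w_back * state)
        outputs.append(state)
    return outputs
```

Even with a constant external input, the output changes from step to step because the fed-back state keeps shifting; this internal state is exactly what a Feedforward network lacks.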

The problem with early multilayered Artificial Neural Networks was adjusting the weights in the middle, or hidden, layers.  This problem was solved by the back-propagation algorithm.  The algorithm calculates the error between the desired test result and the result obtained from the network, determines which hidden nodes are responsible, and passes the results of these calculations back along the network to the responsible nodes.  The weights between the layers are then adjusted based on these results.  The algorithm steps back through each layer of the ANN, starting with the last layer, until it reaches the input layer.  During training, these adjustments are made, layer by layer, until learning is complete. 
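The back-propagation steps above can be sketched for a small network with one hidden layer.  This is a minimal illustration of my own, using sigmoid units and squared error; real implementations also include bias weights, which are omitted here for brevity:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def backprop_step(x, target, w_hidden, w_out, lr=0.5):
    # Forward pass: input layer -> hidden layer -> single output unit.
    hidden = [sigmoid(sum(w * a for w, a in zip(ws, x))) for ws in w_hidden]
    out = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))

    # Error at the output unit, scaled by the sigmoid's derivative.
    delta_out = (target - out) * out * (1.0 - out)

    # Pass the error back: each hidden node's share of the blame is
    # weighted by the strength of its link to the output.
    delta_hidden = [w_out[j] * delta_out * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]

    # Adjust weights layer by layer, starting from the last layer.
    new_w_out = [w + lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [[w + lr * delta_hidden[j] * a
                     for w, a in zip(w_hidden[j], x)]
                    for j in range(len(w_hidden))]
    return new_w_hidden, new_w_out, out
```

Each call performs one forward pass and one backward pass; repeating it moves the network's output toward the target.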

Artificial Intelligence itself spans many different fields.  Its byproducts and goals are far-reaching and involve many different aspects of Computer Science and the world.  The applications for which Artificial Intelligence can be useful are varied, and the same can be said for Artificial Neural Networks.  There are many practical applications, each different from the others.  This is one of the ANN's strengths: it is flexible enough to be adapted to different jobs.  Some of the applications ANNs can be used for are word pronunciation, handwritten character recognition, and driving.  ANNs have also been used successfully for classification tasks. 

An Artificial Neural Network is very flexible because of its capacity to learn.  Its learning is limited only by the type of data that can be input into the system.  Most of its power comes from its structure.  A typical ANN is made up of many units, each performing only simple computations, so a task is distributed among the units.  While each unit is simple, the sheer number of them allows many calculations to be made.  The multiple units in a network also mean that the units can work in parallel, improving the system's throughput.  The large number of units also makes each individual unit less important to the whole: the other units absorb any error that may exist in a single unit or weight in the system. 

Artificial Neural Networks aren't perfect, and there are times when an ANN shouldn't be used: when the data cannot be naturally represented as an activation vector, when there isn't enough data to properly train the network, when the training data are ambiguous and/or contradictory, when the input and output sets are dynamic, and when there isn't enough time to properly train the network.  However, ANNs can be used in conjunction with other AI systems to produce good results.  An example would be pairing an ANN with an Expert System: the output of an ANN can become a rule or fact in an Expert System, or an Expert System can perform an interpretation that is in turn fed into the ANN as input.  So an ANN is flexible in combination with other systems as well.

I think the most useful Artificial Neural Networks today are multilayered Feedforward systems, because of their relative simplicity and their learning capacity.  They can learn any function, though they cannot guarantee efficiency: learning takes much longer, and there is no way to be sure that the ANN has worked out the best formula.  Also, the back-propagation algorithm used for learning works very well with a multilayered acyclic system.  A perceptron is too simple for most practical applications because it has a serious limitation: it has no hidden layers, so all computation takes place through the links that connect the input units directly to the output units.  This is the major reason a perceptron cannot model more complex functions. 

A Feedback system is much more complicated than a Feedforward system.  Its loop-back attribute allows output from a unit to be sent back to the same unit or to a unit in the same layer.  However, since a Feedforward system can learn any function, I see little need for this extra complexity.  Even if there are some practical applications that a Feedback system could handle better than a Feedforward system, I think the extra cost of a Feedback system must also be weighed.  For most practical applications where an ANN may be necessary, I believe a multilayered Feedforward system that uses back-propagation to learn will do very well. 

Bibliography

- Neural Networks: http://www.messiah.edu/hpages/facstaff/grohrbau/class/csc418/notes/ch19_a_neural_networks_files/frame.htm, XI-12-2001

- ANNs: http://www.messiah.edu/hpages/facstaff/grohrbau/class/csc418/notes/ch19_b_neural_networks_files/frame.htm, XI-14-2001

- Neural Networks: http://www.messiah.edu/hpages/facstaff/grohrbau/class/csc418/notes/student/neural_networks_files/frame.htm, XI-05-2001

- Neural Network Applications: http://web.singnet.com.sg/~midaz/Intronn.htm

- Neural Networks, Connectionist Systems, and Neural Systems: http://www-2.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/neural/systems/0.html, II-05-1995