Artificial Neural Networks

by Manfred Füllsack

WHAT IS IT?

This model demonstrates the operation of a backpropagation neural network. For self-organizing maps look here.

Artificial neural networks are computer-generated adaptive systems whose functioning is thought to resemble that of biological neural networks, in particular the human brain. In economics and statistics, they are used to find structures and regularities in complex data sets and to predict future developments within certain limits.

Philosophically, they are interesting because they operate on the basis of a distributed representation of knowledge. Their knowledge (i.e. their "output" once they are thoroughly trained) is never stored in one or more distinct neurons (that is, in any kind of essentialist entity), but always in the form of patterns or constellations of activations and connections of neurons. This feature gives reason to reassess several fundamental assumptions of classical European philosophy. The scientific approach to this reassessment is termed connectionism.

HOW IT WORKS

Pushing one of the buttons <AND>, <OR>, <XOR>, <7>, <15>, <31>, and <ASCII> generates an artificial neural network prepared to "learn" the indicated function. <AND>, <OR> and <XOR> denote the corresponding logical operators. <7>, <15> and <31> denote decimal numbers whose binary code the neural network is supposed to learn. And <ASCII> denotes the letters a to z, whose binary ASCII code the neural network is supposed to learn.
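To make the idea of a training pattern concrete, here is a minimal Python sketch. It is not the applet's own code: the helper names `logic_patterns` and `ascii_bits` are hypothetical, and the pairing of inputs with expected outputs is only an assumption about how such patterns might be stored.

```python
# Hypothetical helpers for illustration; the applet's data structures are not documented here.

def logic_patterns(op):
    """Return (input_bits, expected_output) pairs for a two-input logical operator."""
    operator = {
        "AND": lambda a, b: a & b,
        "OR": lambda a, b: a | b,
        "XOR": lambda a, b: a ^ b,
    }[op]
    return [((a, b), operator(a, b)) for a in (0, 1) for b in (0, 1)]


def ascii_bits(letter):
    """Return the 7-bit binary ASCII code of a single character as a string."""
    return format(ord(letter), "07b")


xor_patterns = logic_patterns("XOR")
code_e = ascii_bits("e")
print(xor_patterns)
print(code_e)
```

The 7-bit code produced for the letter e is the same pattern, 1100101, that is used as an example in the description of the <train-step>-button.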

Pushing <train> assigns a percentage of (randomly chosen) corresponding training sets to the input-neurons of the network (the first row of white globes). This percentage can be set with the <subset-of-training-patterns>-slider. (The readout beneath this slider indicates which training patterns in particular have been chosen). When running, the readout on top of the network-display lets you watch the training patterns that are assigned in each training step. (Training input to the left, expected output to the right).
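The random choice of a percentage of the training patterns could be sketched as follows. This is an illustration under assumptions, not the applet's code: the function name `choose_training_subset` and the rule of always keeping at least one pattern are invented here.

```python
import random


def choose_training_subset(patterns, percentage, rng):
    """Pick a random subset holding roughly `percentage` percent of the patterns.

    Assumption: at least one pattern is always kept, so training never runs empty.
    """
    count = max(1, round(len(patterns) * percentage / 100))
    return rng.sample(patterns, count)


all_patterns = ["00", "01", "10", "11"]
subset = choose_training_subset(all_patterns, 50, random.Random(0))
print(subset)  # two of the four patterns, chosen at random
```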

Propagation

The input algorithm assigns the patterns once, then shuffles the order of the patterns and assigns them again. It continues to do so until you push <train> again to stop the learning process. The <train-all-patterns-once>-button assigns the patterns once, and the <train-step>-button assigns just one pattern, for example the input-pattern 01 in case one of the logical operator-buttons has been pushed, or 1100101 (denoting the letter e) in case the <ASCII>-button has been pushed.
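The order in which patterns are assigned can be sketched as a generator. This is a simplified illustration: the name `pattern_stream` and the fixed number of passes (standing in for "until <train> is pushed again") are assumptions, as is the choice to leave the very first pass unshuffled.

```python
import random


def pattern_stream(patterns, passes, rng):
    """Yield every pattern once per pass, reshuffling the order before each further pass."""
    order = list(patterns)
    for current_pass in range(passes):
        if current_pass > 0:
            rng.shuffle(order)  # new random order for every pass after the first
        for pattern in order:
            yield pattern


stream = list(pattern_stream(["00", "01", "10", "11"], passes=3, rng=random.Random(1)))
print(stream)
```

Each pass still contains every pattern exactly once; only the order changes.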

The respective input can also be seen as red digits on the first row of neurons (white globes), called input-neurons. From here, inputs are propagated via connections (blue and green links) to the next row of neurons, called hidden-neurons. Each of these connections has a certain (at first randomly assigned) weight (blue links > 0, green links < 0), which, during propagation, is multiplied by the input value of the connected input-neuron. Each hidden-neuron then sums up all of these weighted inputs coming from its individual in-links. It then passes this sum through a sigmoid function (in this case 1 / (1 + exp(-x))) in order to squash it into the interval between 0 and 1, and displays the result as its activation. The same is done once more if the network has a second layer of hidden-neurons (which you can set with the <two-layers>-switch). It is also repeated in the same way for the output-neuron (the last white globe at the bottom of the network display).
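The propagation step can be sketched in a few lines of Python. The weights below are arbitrary example values and the function name `propagate` is hypothetical; bias-neurons are left out here to keep the sketch short.

```python
import math


def sigmoid(x):
    """Squash any real number into the interval between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def propagate(inputs, weights):
    """Compute the activations of one layer.

    weights[j][i] is the weight of the connection from input i to neuron j.
    """
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


# Arbitrary example weights: two input-neurons, two hidden-neurons, one output-neuron.
hidden = propagate([0, 1], [[0.5, -0.3], [0.8, 0.2]])
output = propagate(hidden, [[1.0, -1.0]])
print(hidden, output)
```

The same function is applied layer by layer, each time taking the activations of the previous layer as its inputs.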

In order to be able to process zeros as inputs, the network needs so-called bias-neurons on each layer. (Without them, an all-zero input would contribute nothing to the weighted sums, whatever the weights, and thus could not alter the network.) The inputs (or activations, if you prefer) of these bias-neurons are constantly set to 1. Only the weights of their connections contribute to the "learning" of the network.
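The effect of a bias-neuron can be illustrated with a small sketch. The weights are arbitrary example values and the function `activation` is hypothetical. Without a bias, the weighted sum of an all-zero input is zero for any choice of weights, so the neuron is stuck at one and the same activation; with a bias weight the activation can be shifted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def activation(inputs, weights, bias_weight=None):
    """Activation of a single neuron; the bias-neuron, if used, always sends 1."""
    total = sum(w * x for w, x in zip(weights, inputs))
    if bias_weight is not None:
        total += 1.0 * bias_weight
    return sigmoid(total)


# All-zero input, two very different weight sets, no bias: identical result.
without_bias = [activation([0, 0], w) for w in ([0.3, -0.7], [5.0, 2.0])]
# Same input with a bias weight: the activation can now move away from that value.
with_bias = activation([0, 0], [0.3, -0.7], bias_weight=-4.0)
print(without_bias, with_bias)
```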

Backpropagation

Once the input has been propagated to the output-neuron, this neuron compares its activation (i.e. the result of the last propagation) with the expected training output, i.e. the output you want the network to learn (you can see this output on the right side of the readout on top of the network display). From this comparison the output-neuron computes an error, which is then used to alter the connection weights step by step, going backwards through the network. This process is called backpropagation.

In detail, the error of the output-neuron is found by subtracting the current activation of the neuron from the expected output and multiplying this difference by the activation and its complement (i.e. 1 - activation). Correspondingly, the error of a hidden-neuron is found by summing up the products of the output-neuron's error and the weights of the connections leading to the output-neuron. (In standard backpropagation, this sum is likewise multiplied by the hidden-neuron's own activation and its complement.) The weights are then altered by adding the learn-rate (set with the <learn-rate>-switch) multiplied by the respective error and the activation of the neuron to which the connection leads back (i.e. the neuron the connection comes from). The same procedure is repeated for each layer of neurons until all weights are updated. Then the next input pattern is assigned to the network and treated in the same way.
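The following sketch shows one training step for a network with a single hidden layer and one output-neuron. It follows the textbook form of backpropagation, in which the hidden error is also multiplied by the hidden-neuron's activation and its complement; whether the applet does exactly this is an assumption. The function name `train_step`, the starting weights and the convention of storing the bias weight as the last entry of each weight row are all invented for this illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_step(inputs, target, w_hidden, w_out, learn_rate):
    """One propagation and one backpropagation; updates the weights in place.

    The last weight in every row belongs to a bias-neuron whose activation is 1.
    Returns the output computed before the weights were changed.
    """
    x = list(inputs) + [1.0]
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
    h = hidden + [1.0]
    out = sigmoid(sum(w * v for w, v in zip(w_out, h)))

    # Error of the output-neuron: (expected - activation) * activation * (1 - activation).
    out_error = (target - out) * out * (1.0 - out)
    # Error of each hidden-neuron, computed with the weights as they were before the update.
    hidden_errors = [
        out_error * w_out[j] * hidden[j] * (1.0 - hidden[j]) for j in range(len(hidden))
    ]

    # Weight change: learn-rate * error * activation of the neuron the connection comes from.
    for j in range(len(h)):
        w_out[j] += learn_rate * out_error * h[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(x)):
            row[i] += learn_rate * hidden_errors[j] * x[i]
    return out


# Arbitrary starting weights: two inputs, two hidden-neurons, one output-neuron.
w_hidden = [[0.2, -0.4, 0.1], [0.7, 0.3, -0.2]]
w_out = [0.5, -0.6, 0.05]
first = train_step([0, 1], 1, w_hidden, w_out, learn_rate=0.5)
for _ in range(200):
    last = train_step([0, 1], 1, w_hidden, w_out, learn_rate=0.5)
print(first, last)
```

Repeating the step on the single pattern 01 with expected output 1 moves the output closer to 1. Real training cycles through all chosen patterns instead of repeating one.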

In this way, the error should gradually decline until all the expected outputs are correctly displayed. You can check this in two ways. Either push <train-step> and compare each training output with the final output. (The readout <output> displays the actual output, which is rounded to the nearest integer to give the final output.) Or write your own binary code into the <input>-window, press enter on your keyboard, push the button <check> and compare your input with the final output. (Pressing enter on your keyboard is necessary to make the applet understand that your input is finished.) If the final output is not correct, the network needs more training.
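The check described above amounts to rounding the actual output and comparing it with the expected value. A minimal sketch follows; the function names are hypothetical, the example outputs are invented for illustration, and treating exactly 0.5 as 1 is an assumption.

```python
def final_output(actual_output):
    """Round the actual output to 0 or 1 (assumption: exactly 0.5 counts as 1)."""
    return 1 if actual_output >= 0.5 else 0


def needs_more_training(actual_outputs, expected_outputs):
    """True if at least one rounded output differs from its expected output."""
    return any(final_output(a) != e for a, e in zip(actual_outputs, expected_outputs))


# Invented example readings for the four XOR patterns 00, 01, 10, 11.
actual = [0.08, 0.91, 0.47, 0.12]
expected_xor = [0, 1, 1, 0]
more_training = needs_more_training(actual, expected_xor)
print(more_training)
```

Here the third reading rounds to 0 although 1 is expected, so this hypothetical network would need more training.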

Note that while the <AND> and <OR> functions are "learned" relatively fast and with few hidden-neurons (AND and OR need just one hidden-neuron), the other functions can take some time and need several more neurons (in my trials, the <31> and the <ASCII> functions needed up to 10,000 training steps (ticks) to be learned correctly). The <XOR> function needs at least two hidden-neurons (this was one of the subjects of the famous objection by Marvin Minsky that drove AI research away from neural networks for some time in the 1970s). By varying the number of hidden-neurons and the learn-rate, you can try to optimize the learning process of the network.
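That two hidden-neurons are enough for XOR can be shown with hand-picked weights. The values below are not learned and are not taken from the applet; they are chosen so that one hidden-neuron behaves roughly like OR, the other roughly like AND, and the output-neuron fires when the first is active but the second is not.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def xor_network(a, b):
    """XOR with two hidden-neurons and hand-picked (not learned) weights."""
    h_or = sigmoid(20 * a + 20 * b - 10)    # close to 1 if a or b is 1
    h_and = sigmoid(20 * a + 20 * b - 30)   # close to 1 only if both are 1
    out = sigmoid(20 * h_or - 20 * h_and - 10)
    return 1 if out >= 0.5 else 0


results = [xor_network(a, b) for a in (0, 1) for b in (0, 1)]
print(results)  # [0, 1, 1, 0]
```

The constant terms -10 and -30 play the role of the bias weights described above.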

Another kind of artificial neural network is the self-organizing map. An applet and a description can be found here.

CREDITS AND REFERENCES

Code by Manfred Füllsack (source code on demand), March 2009 (to be improved and continued)

(Some) Literature (on connectionism):

Bechtel, William (1987): Connectionism and the Philosophy of Mind: An Overview; in: The Southern Journal of Philosophy, Supplement, pp. 17-41.

Bechtel, William / Abrahamsen, A. (1991): Connectionism and the Mind. An Introduction to Parallel Processing in Networks, Cambridge MA: Basil Blackwell.

Churchland, Paul M. (1989): A Neuro-Computational Perspective. The Nature of Mind and the Structure of Science, Cambridge MA: MIT Press.

Churchland, Paul M. (1995): The Engine of Reason, the Seat of the Soul: A Philosophical Journey into the Brain, Cambridge MA: MIT Press.

Fodor, Jerry / Lepore, Ernest (1992): Holism: A Shopper's Guide, Cambridge: Blackwell.

See also: The Mind Project