**Learning: Connectionist Approaches**

An NN (neural network) consists of interconnected nodes. Each node in the network is like an artificial neuron: it accepts input signals, takes a weighted sum of them as the total input to the unit, and generates an output according to a simple (but usually nonlinear) function, for example a threshold function of the input.
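The node just described can be sketched in a few lines. The function and parameter names below are illustrative, not from any particular library; the weights are chosen so the unit happens to compute logical AND.

```python
# A minimal artificial neuron: a weighted sum of the inputs passed
# through a simple nonlinear function (here, a step threshold).

def neuron(inputs, weights, threshold=0.0):
    total = sum(x * w for x, w in zip(inputs, weights))  # weighted sum
    return 1 if total > threshold else 0                 # threshold activation

# With these (hand-picked) weights the unit behaves like logical AND:
print(neuron([1, 1], [0.6, 0.6], threshold=1.0))  # -> 1
print(neuron([1, 0], [0.6, 0.6], threshold=1.0))  # -> 0
```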

Each time the NN is used, an input activation vector is applied to the input nodes, and then the activation values of the other nodes are calculated from the activation sent out by the input nodes. This process is repeated until a certain condition is satisfied, at which point the activation values of certain nodes are taken as the output. The weight values on the links may be adjusted according to a learning algorithm.

Overall, a neural network often corresponds to a function that maps input vectors to output vectors, though the function is not explicitly expressed by a formula; it is implicitly represented by the NN, in its structure and weight values.
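This "implicit function" view can be made concrete with a minimal sketch: a single layer of two nodes maps a 2-dimensional input vector to a 2-dimensional output vector. The weight values here are arbitrary illustrations; no formula for the mapping is written down anywhere, yet the structure plus the weights determine it completely.

```python
import math

def sigmoid(x):
    # a common smooth nonlinear activation function
    return 1.0 / (1.0 + math.exp(-x))

# one row of weights per output node (values chosen arbitrarily)
W = [[0.5, -0.3],
     [0.8,  0.1]]

def net(x):
    # each output component is a nonlinear function of a weighted sum
    return [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W]

print(net([1.0, 2.0]))  # an output vector, each component in (0, 1)
```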

For more complicated functions, one popular learning algorithm is "backpropagation" in multilayer perceptrons, which are *fully connected, layered, feedforward* networks. Typically, such a NN has an input layer, a hidden layer, and an output layer. The weights of the links are initialized to random numbers. Then, the network is trained by repeating the following procedure for each training case:

- Apply the input values to the input layer, and use the current weights to calculate the activation value of the hidden layer, then the output layer.
- Compute the difference between the actual output and the target output.
- Adjust the weights of the links connecting the hidden layer and the output layer to reduce the difference as much as possible (given the current activation of the hidden layer).
- Repeat the previous step for the links connecting the input layer and the hidden layer.
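The four steps above can be sketched on a tiny 2-2-1 multilayer perceptron. This is a hedged illustration, not the only way to implement backpropagation: the XOR training set, the learning rate, and the epoch count are all illustrative choices, and biases are folded in as extra weights.

```python
import math, random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# weights initialized to random numbers; index 2 is the bias weight
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_output = [random.uniform(-1, 1) for _ in range(3)]

def forward(x):
    # step 1: input layer -> hidden layer -> output layer
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
    y = sigmoid(w_output[0] * h[0] + w_output[1] * h[1] + w_output[2])
    return h, y

def train_step(x, target, lr=0.5):
    h, y = forward(x)
    # step 2: difference between actual output and target output
    delta_out = (y - target) * y * (1 - y)
    # error terms for the hidden nodes, computed with the current
    # hidden->output weights (before those weights are changed)
    delta_h = [delta_out * w_output[j] * h[j] * (1 - h[j]) for j in range(2)]
    # step 3: adjust hidden->output weights to reduce the difference
    for j in range(2):
        w_output[j] -= lr * delta_out * h[j]
    w_output[2] -= lr * delta_out
    # step 4: repeat the adjustment for the input->hidden weights
    for j in range(2):
        for i in range(2):
            w_hidden[j][i] -= lr * delta_h[j] * x[i]
        w_hidden[j][2] -= lr * delta_h[j]

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # XOR

def mse():
    return sum((forward(x)[1] - t) ** 2 for x, t in data) / len(data)

before = mse()
for _ in range(5000):
    for x, t in data:
        train_step(x, t)
print(mse() < before)  # repeated training reduces the error
```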

After repeated training, an "associative memory" is formed, such that when part of an input pattern is activated, the other part becomes active, too.

Hebbian learning can also be used for supervised learning, by remembering each input/output pair according to the Hebbian rule.
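One simple way to realize this is to add the outer product of each input/output pair to a weight matrix, so the pairs are "remembered" in the weights. This is a sketch under illustrative assumptions: bipolar (+1/-1) patterns and a sign function for recall are conventional choices, not mandated by the text.

```python
# Supervised Hebbian learning: store input/output pairs in a weight
# matrix, then recall the output from the input.

def hebbian_train(pairs, n_in, n_out):
    W = [[0.0] * n_in for _ in range(n_out)]
    for x, y in pairs:
        for i in range(n_out):
            for j in range(n_in):
                W[i][j] += y[i] * x[j]   # Hebbian rule: delta_w = y * x
    return W

def recall(W, x):
    # the sign of each weighted sum recovers the stored output pattern
    return [1 if sum(w * xi for w, xi in zip(row, x)) >= 0 else -1
            for row in W]

# store one bipolar pattern pair and recall it from the input half
W = hebbian_train([([1, -1, 1], [1, -1])], n_in=3, n_out=2)
print(recall(W, [1, -1, 1]))  # -> [1, -1]
```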

Deep learning refers to approaches that build multiple levels of features or representations of the data.

Hierarchical temporal memory (HTM) is a model that takes the large-scale structure of the brain into account.

Typical applications: categorization, pattern recognition, data mining, and so on.

Limitations (questions to ask before applying an NN):

- Can the problem be naturally represented as a mapping from an input vector to an output vector?
- Under the vector representation, should similar inputs produce similar outputs?
- Are there sufficient and stable training data?
- Is there enough time for the training?
- Do we need an explanation of the internal process?