CIS 5603. Artificial Intelligence

Learning as Optimization

Several learning techniques treat learning as an optimization process: an evaluation function is optimized so that the quality of the results improves according to a certain standard.

1. Unsupervised learning

Unsupervised learning can carry out classification or clustering from unlabeled instances. The objective of the learning algorithm is usually to find a partition of the dataset with minimum within-cluster distances and maximum between-cluster distances, though sometimes the requirements for a partition are relaxed by allowing overlapping or probabilistic classification.

Some techniques come from statistics:

K-means clustering: repeatedly assigning each instance to the one of the k clusters with the nearest mean, then recomputing each cluster's mean, until the assignments stop changing.
Expectation-maximization (EM): alternately estimating the probabilistic cluster memberships and re-estimating the model parameters to increase the likelihood of the data under a probabilistic model, which yields a "soft clustering".
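
The K-means loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name kmeans, the random initialization from the data points, the squared Euclidean distance, and the toy dataset are all choices made for this example, not part of the course material.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """A minimal K-means sketch on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Assumption: initialize the means with k distinct data points chosen at random.
    means = rng.sample(points, k)
    assignment = None
    for _ in range(iterations):
        # Assignment step: put each instance into the cluster with the nearest mean
        # (squared Euclidean distance).
        new_assignment = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, means[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no instance changed cluster, so the means are stable
        assignment = new_assignment
        # Update step: recompute each mean from its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old mean if a cluster becomes empty
                means[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return means, assignment


# Toy data (made up for illustration): two well-separated groups in the plane.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
means, labels = kmeans(data, k=2)
```

On data this well separated, the first three points end up sharing one label and the last three the other. On harder data the result depends on the initialization, which is why K-means is commonly restarted from several random starting points.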

Unsupervised learning is often used to supplement or provide training data for supervised learning:

Semi-supervised learning: combining a small amount of labeled data and a large amount of unlabeled data in training.
Self-supervised learning: deriving the training signal from the unlabeled data itself (for example, by predicting a hidden part of an input from the rest) to create a model, which is then refined using labeled training data.
Generative models, such as generative adversarial network (GAN): generating new data with the same statistics as the training set.

Many types of learning in the current machine learning research are distinguished with respect to the implicit assumption of a "label" that correctly categorizes each data item. Compared to concept learning in the human mind, many factors have not been taken into account in this assumption, such as the dependency on motivation and context, and the necessity of multiple inheritance.

Some artificial neural networks are used to discover patterns in data and can be used as associative memory:

Hebbian learning: "Neurons that fire together, wire together."
Hopfield network, Boltzmann machine, and free energy principle.
Self-organizing map: similar members are placed closer together.
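
As a rough illustration of associative memory, the sketch below stores two patterns in a small Hopfield network using a Hebbian rule and then recalls one of them from a corrupted cue. The function names train_hopfield and recall, the +1/-1 coding, the 1/P scaling of the weights, and the two stored patterns are assumptions made for this example.

```python
def train_hopfield(patterns):
    """Hebbian weights: w[i][j] grows when units i and j take the same value."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w


def recall(w, state, sweeps=5):
    """Update units one at a time until the state stops changing."""
    state = list(state)
    for _ in range(sweeps):
        changed = False
        for i in range(len(state)):
            net = sum(w[i][j] * state[j] for j in range(len(state)))
            new = 1 if net >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state


# Two made-up patterns in +1/-1 coding (chosen to be orthogonal).
stored = [
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, 1, 1, -1, -1, -1, -1],
]
weights = train_hopfield(stored)
noisy = [-1, -1, 1, -1, 1, -1, 1, -1]  # first pattern with its first unit flipped
restored = recall(weights, noisy)
```

Here restored equals the first stored pattern: the flipped unit is pulled back by the other seven. With more stored patterns, or patterns that overlap heavily, recall becomes unreliable, which reflects the limited capacity of this kind of memory.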

2. Reinforcement learning

The study of reinforcement started in psychology. A (human or animal) behavior can often be strengthened or weakened by reward and punishment, respectively. Example: operant conditioning.

In AI, reinforcement learning is defined as finding a state-action mapping that maximizes the expected total reward in a Markov decision process where initially only the set of possible states and the set of possible actions are known.

Conceptual issues:

Temporal difference and Q-learning: gradually update the value function by moving each estimate toward the observed reward plus the discounted estimated value of the next state.
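
A minimal tabular Q-learning sketch may make the update concrete. Everything about the environment is hypothetical: a five-state corridor with a goal at the right end, a reward of 1 on reaching it, and the parameter values alpha and gamma. The behavior policy is assumed to be uniformly random, which is allowed because Q-learning learns the greedy values regardless of how actions are chosen during training.

```python
import random

N_STATES = 5      # states 0..4 on a line; state 4 is the terminal goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical deterministic environment: reward 1 only on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Assumption: explore with a uniformly random behavior policy.
            action = rng.choice(ACTIONS)
            nxt, reward, done = step(state, action)
            # Target: observed reward plus discounted best estimate for the next state.
            target = reward if done else reward + gamma * max(q[nxt])
            # Temporal-difference update: move the estimate toward the target.
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


q = q_learning()
# Greedy policy read off the learned table for the non-terminal states.
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

After training, the greedy policy moves right in every non-terminal state, and the learned values fall off by a factor of gamma for each extra step away from the goal.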

Application: AlphaGo and AlphaZero.

Opinions still differ on the potential and limitations of reinforcement learning.

3. Evolutionary learning

Evolutionary computation: optimization via natural selection in a "species" of solutions for a given problem.

Genetic algorithm:

Each generation is a population of a constant number of solutions.
Each solution is coded by a sequence of "genes" (a binary string).
There is a fitness function that evaluates candidates.
Candidates with higher fitness have a higher probability of producing solutions for the next generation.
Reproduction happens via crossover between a pair of selected solutions.
Mutation may occur randomly during reproduction.
Some high-fitness solutions are retained in the next generation.
Repeat reproduction until the fitness converges or reaches a certain threshold, or until the affordable time runs out.
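
The steps above can be sketched as a small genetic algorithm. The problem it solves (maximize the number of 1s in a binary string, often called OneMax), the function names, and all parameter values are assumptions chosen to keep the example short. Selection here is fitness-proportionate and crossover is single-point, which are only two of several common choices.

```python
import random

def one_max(bits):
    """Hypothetical fitness function: the number of 1-genes in the string."""
    return sum(bits)


def genetic_algorithm(length=20, pop_size=30, generations=200,
                      mutation_rate=0.02, elite=2, seed=0):
    rng = random.Random(seed)
    # Each generation is a population of a constant number of binary strings.
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=one_max, reverse=True)
        if one_max(population[0]) == length:
            break  # fitness reached the threshold
        # Some high-fitness solutions are retained unchanged.
        next_gen = [ind[:] for ind in population[:elite]]
        # Higher fitness means a higher chance of being selected as a parent;
        # the +1 keeps every selection weight positive.
        weights = [one_max(ind) + 1 for ind in population]
        while len(next_gen) < pop_size:
            mother, father = rng.choices(population, weights=weights, k=2)
            cut = rng.randrange(1, length)  # single-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each gene with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=one_max)


best = genetic_algorithm()
```

Because the retained solutions are copied unchanged, the best fitness in the population never decreases from one generation to the next. Whether the run reaches the all-ones string within the generation limit depends on the random seed and the parameters.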

Examples: walkers, eaters

Limitations: it is not easy to represent a solution as a binary string with meaningful substrings; the learning time tends to be very long.

Genetic programming: approximating a function using automatically generated programs, usually represented as trees.

Related field: Artificial Life

Theoretical issue: the relationship among intelligence, evolution, and life.

Readings

Poole and Mackworth: Sections 4.8, 10.3, 13.1-5
Russell and Norvig: Sections 20.3, 21.7, Chapter 22
Luger: Sections 10.6-7, 11.5-6, 12.1-3