Learning as Optimization
Several learning techniques treat learning as the optimization of an evaluation function, so that the quality of the results improves according to a certain standard.
1. Unsupervised learning
Unsupervised learning can carry out classification or clustering from unlabeled instances. The objective of the learning algorithm is usually to find a partition of the dataset with minimum within-cluster distances and maximum between-cluster distances, though sometimes the requirements for a partition are relaxed by allowing overlapping or probabilistic classification.
Some techniques come from statistics:
- K-means clustering: repeatedly assigning each instance to the one of the k clusters that has the nearest mean, then recomputing the cluster means.
- Expectation-maximization (EM): iteratively estimating parameters to increase the predictive power of a probabilistic model for "soft clustering".
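The k-means loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name kmeans, the toy 2-D points, and the caller-supplied initial means are all hypothetical, and squared Euclidean distance is assumed as the distance measure.

```python
def kmeans(points, initial_means, max_iterations=100):
    """Plain k-means on tuples of numbers; k is len(initial_means)."""
    means = list(initial_means)
    k = len(means)
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to the cluster with the nearest mean
        # (squared Euclidean distance is an assumption of this sketch).
        new_assignment = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, means[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the means are stable
        assignment = new_assignment
        # Update step: recompute each mean from its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old mean if a cluster became empty
                means[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return means, assignment


# Hypothetical toy data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
means, labels = kmeans(points, initial_means=[(0.0, 0.0), (6.0, 6.0)])
```

The result depends on the initial means; in practice the procedure is often restarted from several initializations and the partition with the smallest within-cluster distance is kept.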
Unsupervised learning is often used to provide labeled data for supervised learning:
Many types of learning in current machine learning research are distinguished by the implicit assumption of a "label" that correctly categorizes each data item. Compared to concept learning in the human mind, this assumption leaves out many factors, such as the dependency on motivation and context, and the necessity of multiple inheritance.
Some artificial neural networks are used to discover patterns in data and can be used as associative memory:
2. Reinforcement learning
The study of reinforcement started in psychology. A (human or animal) behavior can often be strengthened or weakened by reward and punishment, respectively. Example: classical conditioning.
In AI, reinforcement learning is defined as finding a state-action mapping that maximizes the expected total reward in a Markov decision process where initially only the set of possible states and the set of possible actions are known.
Conceptual issues:
Temporal difference and Q-learning: incrementally update the value function, moving each estimate toward the received reward plus the discounted estimated value of the next state.
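As a rough illustration of the update, here is a minimal tabular Q-learning sketch. The five-state chain task, the function names step and q_learning, and all parameter values (alpha, gamma, epsilon) are assumptions made for this example, not part of any particular system.

```python
import random

N_STATES = 5        # states 0..4; state 4 is the terminal goal (hypothetical toy task)
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right


def step(state, action):
    """Deterministic chain environment: reward 1 only on reaching the goal."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Epsilon-greedy behavior: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice([a for a in range(2) if q[state][a] == best])
            next_state, reward = step(state, action)
            # Temporal-difference update toward reward + discounted best next value.
            # The terminal row of q is never updated, so its max stays 0.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q = q_learning()
# Greedy state-action mapping read off the learned values.
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

The learner is given only the states and actions; the rewards and transitions are discovered by acting, which matches the problem definition above.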
Application: AlphaGo and AlphaZero.
There are still different opinions on the potential and limitations of reinforcement learning.
3. Evolutionary learning
Evolutionary computation: optimization via natural selection in a "species" of solutions for a given problem.
Genetic algorithm:
- Each generation is a population of a constant number of solutions.
- Each solution is coded by a sequence of "genes" (a binary string).
- There is a fitness function that evaluates candidates.
- Candidates with higher fitness have a higher probability of producing solutions for the next generation.
- Reproduction happens via crossover between a pair of selected solutions.
- Mutation may occur randomly during reproduction.
- Some high-fitness solutions are retained in the next generation.
- Repeat reproduction until the fitness converges or reaches a certain threshold, provided that this happens within the affordable time.
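The steps listed above can be put together in a short sketch. Everything specific here is an assumption chosen for illustration: the "OneMax" fitness (count of 1 bits), single-point crossover, fitness-proportional selection, and the parameter values are one common set of choices, not the only way to build a genetic algorithm.

```python
import random

GENOME_LENGTH = 20  # hypothetical problem size


def fitness(genome):
    """OneMax toy fitness: the number of 1 bits in the binary string."""
    return sum(genome)


def evolve(pop_size=30, mutation_rate=0.02, elite=2, max_generations=200, seed=0):
    rng = random.Random(seed)
    # Constant-size population of random binary strings.
    population = [[rng.randint(0, 1) for _ in range(GENOME_LENGTH)]
                  for _ in range(pop_size)]
    for _ in range(max_generations):  # the "affordable time" budget
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == GENOME_LENGTH:
            break  # fitness threshold reached
        # Elitism: some high-fitness solutions are retained unchanged.
        next_population = [genome[:] for genome in population[:elite]]
        # Fitness-proportional selection; the +1 avoids all-zero weights.
        weights = [fitness(g) + 1 for g in population]
        while len(next_population) < pop_size:
            mother, father = rng.choices(population, weights=weights, k=2)
            cut = rng.randrange(1, GENOME_LENGTH)  # single-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            next_population.append(child)
        population = next_population
    best = max(population, key=fitness)
    return best, fitness(best)


best, score = evolve()
```

Because the elite are copied unchanged, the best fitness in the population never decreases from one generation to the next; whether the threshold is reached within the budget depends on the parameters and the random seed.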
Examples: walkers, eaters
Limitations: it is not easy to represent a solution as a binary string with meaningful substrings; the learning time tends to be very long.
Genetic Programming: to approximate a function using automatically generated programs, usually represented as trees.
Related field: Artificial Life
Theoretical issue: the relationship among intelligence, evolution, and life.
Readings
- Poole and Mackworth: Sections 4.8, 10.3, 13.1-5
- Russell and Norvig: Sections 20.3, 21.7, Chapter 22
- Luger: Sections 10.6-7, 11.5-6, 12.1-3