Learning as Optimization
Several learning techniques treat learning as the optimization of an evaluation function, so that the quality of the results improves according to a certain standard.
1. Unsupervised learning
Unsupervised learning carries out classification or clustering from unlabeled instances. The objective of the learning algorithm is usually to find a partition of the dataset with minimum within-cluster distances and maximum between-cluster distances, though sometimes the requirements for a partition are relaxed by allowing overlapping or probabilistic classification.
Some representative approaches come from statistics:
- K-means clustering: repeatedly assigning each instance to whichever of the k clusters has the nearest mean, then recomputing the means.
- Expectation-maximization (EM): iteratively estimating parameters to increase the predictive power of a probabilistic model for "soft clustering".
Other approaches come from artificial neural networks.
Unsupervised learning is often used to provide labeled data for supervised learning.
Many types of learning in current machine learning research are distinguished by how they use a "label", under the implicit assumption that there is a label that correctly categorizes each data item. Compared to concept learning in the human mind, this assumption leaves out many factors, such as the dependency on motivation and context, and the necessity of multiple inheritance.
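As a minimal sketch of the K-means procedure mentioned in this section, the following Python code alternates between assigning each instance to the nearest mean and recomputing the means. The function name kmeans, its parameters, the squared Euclidean distance, and the choice of k random data points as initial means are all assumptions made for illustration; they are not taken from the notes.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Cluster points (tuples of numbers) into k groups.

    Returns (means, assignments). Illustrative sketch only: initial means
    are k randomly chosen data points, distance is squared Euclidean.
    """
    rng = random.Random(seed)
    means = rng.sample(points, k)
    assignments = None
    for _ in range(iterations):
        # Assignment step: put each instance into the cluster with the nearest mean.
        new_assignments = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, means[c])))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no instance changed cluster, so the partition is stable
        assignments = new_assignments
        # Update step: recompute each mean from the current members of its cluster.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous mean
                means[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return means, assignments

if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    means, labels = kmeans(data, k=2)
    print(means, labels)
```

The result depends on the initial means, so in practice the procedure is often restarted several times and the partition with the smallest within-cluster distances is kept.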
2. Reinforcement learning
The study of reinforcement started in psychology. A (human or animal) behavior can often be strengthened or weakened by reward and punishment, respectively. Example: classical conditioning.
In AI, reinforcement learning is defined as finding a state-action mapping that maximizes the expected total reward in a Markov decision process where initially only the set of possible states and the set of possible actions are known.
Temporal-difference learning and Q-learning: gradually update the value function, moving each estimate toward the observed reward plus the discounted value estimated for the next state.
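The update can be illustrated with a small tabular Q-learning sketch. The environment here is a hypothetical chain of states invented for this example (move left or right, reward 1 on reaching the last state); the function name q_learning and the values of alpha (learning rate), gamma (discount factor), and epsilon (exploration rate) are likewise assumptions, not part of the notes.

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical chain of states 0..n_states-1.

    Actions: 0 = left, 1 = right. Entering the last state yields reward 1
    and ends the episode; every other step yields reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action choice; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            reward = 1.0 if s_next == goal else 0.0
            # Target: reward plus discounted best value of the next state
            # (the goal state is terminal, so it contributes no future value).
            future = 0.0 if s_next == goal else gamma * max(q[s_next])
            q[s][a] += alpha * (reward + future - q[s][a])
            s = s_next
    return q

if __name__ == "__main__":
    for state, values in enumerate(q_learning()):
        print(state, values)
```

After training, the greedy policy (pick the action with the larger Q-value in each state) is the learned state-action mapping; in this chain it should prefer moving right.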
Applications: AlphaGo and AlphaZero.
There are still different opinions on the potential and limitations of reinforcement learning.
3. Evolutionary learning
Genetic algorithm: optimization via natural selection in a "species" of solutions for a given problem.
- Each generation is a population of a constant number of solutions.
- Each solution is coded by a sequence of "genes" (a binary string).
- There is a fitness function that evaluates candidates.
- Candidates with higher fitness have a higher probability of producing solutions for the next generation.
- Reproduction happens via crossover between a pair of selected solutions.
- Mutation may occur randomly during reproduction.
- Some high-fitness solutions are retained in the next generation.
- Repeat reproduction until the fitness converges or reaches a certain threshold, provided this happens within the affordable time.
Limitations: it is not easy to represent a solution as a binary string with meaningful substrings; the learning time tends to be very long.
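The steps listed above can be sketched in Python as follows. This is only one possible reading of them: the function name genetic_algorithm, the fitness-proportional selection, the single-point crossover, the parameter values, and the toy task of maximizing the number of 1s in the string are all assumptions made for illustration. The sketch also assumes the fitness function never returns a negative value.

```python
import random

def genetic_algorithm(fitness, n_genes=20, pop_size=30, generations=200,
                      mutation_rate=0.02, n_elite=2, target=None, seed=0):
    """Evolve binary strings (lists of 0/1) toward higher fitness.

    Illustrative sketch: assumes fitness(individual) >= 0 for every individual.
    """
    rng = random.Random(seed)
    # Each generation is a population of a constant number of binary strings.
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        if target is not None and fitness(population[0]) >= target:
            break  # fitness reached the threshold
        # Some high-fitness solutions are retained in the next generation.
        next_gen = [ind[:] for ind in population[:n_elite]]
        # Higher fitness gives a higher probability of being selected as a parent
        # (the small constant keeps the weights valid if all fitness values are 0).
        weights = [fitness(ind) + 1e-9 for ind in population]
        while len(next_gen) < pop_size:
            mom, dad = rng.choices(population, weights=weights, k=2)
            # Single-point crossover between the pair of selected solutions.
            cut = rng.randrange(1, n_genes)
            child = mom[:cut] + dad[cut:]
            # Each gene may flip at random during reproduction.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)

if __name__ == "__main__":
    # Toy problem: fitness is the number of 1s in the string.
    best = genetic_algorithm(sum, target=20)
    print(best, sum(best))
```

The generations limit plays the role of the "affordable time" in the last step: the loop stops either when the threshold is reached or when the budget runs out.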
Genetic programming: approximating a function using automatically generated programs, usually represented as trees.
Related field: Artificial Life
- Poole and Mackworth: Sections 4.8, 10.3, 13.1-5
- Russell and Norvig: Sections 20.3, 21.7, Chapter 22
- Luger: Sections 10.6-7, 11.5-6, 12.1-3