CIS 5603. Artificial Intelligence

Perceiving and Acting

 

Some AI systems can directly interact with the outside (either physical or virtual) world without human users or other computer systems in between. Such a sensorimotor mechanism is also a necessary front-end of a language interface.

1. Perception

Roughly speaking, sensors convert external (physical, chemical, biological, etc.) stimuli into sensations represented within AI systems; these sensations are usually not identical to human sensations, though they may be similar to them to varying degrees.

Perception forms various levels of abstraction from sensation and integrates them with the system's knowledge, so as to guide the system's actions toward its goals in a changing environment.

AI research on perception has focused on vision and sound. The major challenge is to choose proper features for each level of abstraction. Initially, the features were selected by the designers for each level. In computer vision, an influential approach was proposed by David Marr, who treated vision as transforming a 2D projective image on the retina into a 3D model of objects and events in the world.

Deep learning fundamentally changed this into the approach of "feature learning", where the features are generated and selected by a learning algorithm according to their contribution to the overall task. In a Convolutional Neural Network (CNN), convolution kernels are applied to generate feature maps, which are further abstracted by subsequent layers. Trained end-to-end using backpropagation, CNNs recognize patterns well on data sets such as ImageNet.
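To make the CNN structure concrete, here is a minimal sketch assuming the PyTorch library; the layer sizes, the 32x32 RGB input, and the 10-class output are arbitrary illustrative choices rather than anything specified in these notes.

    # A minimal CNN sketch in PyTorch (assumed library); all sizes are illustrative.
    import torch
    import torch.nn as nn

    class TinyCNN(nn.Module):
        def __init__(self, num_classes=10):
            super().__init__()
            # Convolution kernels produce feature maps; pooling abstracts them
            # before they are passed to the next layer.
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d(2),
            )
            self.classifier = nn.Linear(32 * 8 * 8, num_classes)

        def forward(self, x):
            x = self.features(x)       # stacked feature maps
            x = x.flatten(1)           # flatten for the linear classifier
            return self.classifier(x)  # class scores

    model = TinyCNN()
    images = torch.randn(4, 3, 32, 32)   # a toy batch of 32x32 RGB images
    labels = torch.tensor([0, 1, 2, 3])  # toy class labels
    loss = nn.CrossEntropyLoss()(model(images), labels)
    loss.backward()                      # end-to-end training via backpropagation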

A brain-inspired approach to vision is Hierarchical Temporal Memory (HTM), which combines ideas including sparse distributed memory, Bayesian networks, and spatio-temporal clustering.
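HTM itself is a full architecture; as a hedged illustration of just one of the ideas it builds on, the sketch below shows sparse distributed representations compared by overlap. The idea of "similarity as shared active bits" is standard in SDR examples, but the specific numbers here are arbitrary.

    # A minimal sketch of sparse distributed representations (SDRs): each pattern
    # is a small set of active bits in a large space, and similarity is measured
    # by the number of shared active bits. All sizes are arbitrary illustrations.
    import random

    N_BITS = 2048    # total bits in the representation space
    N_ACTIVE = 40    # about 2% of the bits are active in any one pattern

    def random_sdr(rng):
        """Return an SDR as a set of active bit indices."""
        return set(rng.sample(range(N_BITS), N_ACTIVE))

    def overlap(a, b):
        """Similarity of two SDRs = number of shared active bits."""
        return len(a & b)

    rng = random.Random(0)
    a = random_sdr(rng)
    b = random_sdr(rng)
    noisy_a = set(list(a)[:-5]) | set(rng.sample(range(N_BITS), 5))  # a slightly corrupted copy of a

    print(overlap(a, b))        # unrelated SDRs share very few bits
    print(overlap(a, noisy_a))  # a noisy copy still overlaps strongly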

In the processing of spoken language, deep learning has also greatly improved the quality of speech-text mapping in both directions (speech recognition and speech synthesis), which adds one more stage to NLP pipelines.
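As a hedged sketch of the acoustic front-end that typically feeds such a network, the code below turns a waveform into a log-magnitude spectrogram using NumPy and SciPy; real systems usually use mel filterbank features, and the synthetic sine wave here is only a stand-in for recorded speech.

    # Compute a simple time-frequency representation of an audio signal, the kind
    # of feature matrix a deep speech-recognition model would consume.
    import numpy as np
    from scipy.signal import stft

    fs = 16000                           # 16 kHz sampling rate, common for speech
    t = np.arange(fs) / fs               # one second of audio
    wave = np.sin(2 * np.pi * 440 * t)   # synthetic stand-in for a speech recording

    f, frames, Z = stft(wave, fs=fs, nperseg=400, noverlap=240)  # 25 ms windows, 10 ms hop
    log_spec = np.log(np.abs(Z) + 1e-8)  # log-magnitude spectrogram

    print(log_spec.shape)  # (frequency bins, time frames)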

AI has also been applied experimentally to music and art, in both perception and composition/creation. Recently, AI-Generated Content (AIGC) has attracted wide interest and attention.

Other ideas:

 

2. Robots

Robots directly interact with the environment through sensors and actuators. Some clarifications: robots are often equipped with special hardware, which is mostly studied in electrical and mechanical engineering, while robot software is closer to AI and has some special needs. Robots provide a common platform for many AI techniques, though the current focus is on action control and body control. Robots have been designed with different control paradigms (a minimal reactive control loop is sketched below). Examples:
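As a hedged illustration of one simple control paradigm, the sketch below implements a purely reactive sense-act loop; the Robot class and its sensor/actuator methods are hypothetical placeholders rather than any real robot API.

    # A minimal reactive (sense-act) control loop. The Robot class and its
    # methods are hypothetical placeholders, not a real robot interface.
    import random

    class Robot:
        """A toy simulated robot with one distance sensor and two actuators."""
        def read_distance(self):
            return random.uniform(0.0, 2.0)   # simulated meters to the nearest obstacle
        def drive_forward(self):
            print("driving forward")
        def turn_left(self):
            print("turning left to avoid an obstacle")

    def reactive_step(robot, safe_distance=0.5):
        """Map the current sensation directly to an action, with no internal model."""
        if robot.read_distance() < safe_distance:
            robot.turn_left()
        else:
            robot.drive_forward()

    robot = Robot()
    for _ in range(5):        # the control loop: sense, then act, repeatedly
        reactive_step(robot)

A deliberative paradigm would instead plan over an internal world model before acting, and hybrid paradigms combine the two.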

From AI's perspective, the key issue in robots and agents is not their programmed or controlled behavior, but their learned behavior, especially in changing environments. Various ideas:
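As one hedged illustration of learned (rather than programmed) behavior, the sketch below runs tabular Q-learning in a made-up two-state environment whose reward structure changes partway through; the environment, actions, and constants are all arbitrary illustrative choices.

    # Tabular Q-learning in a toy environment that changes halfway through,
    # so the agent has to adapt its learned behavior. Everything is illustrative.
    import random
    from collections import defaultdict

    ACTIONS = ["left", "right"]

    def reward(action, flipped):
        # Before the change, "right" is rewarded; afterwards, "left" is.
        good = "left" if flipped else "right"
        return 1.0 if action == good else 0.0

    Q = defaultdict(float)       # Q[(state, action)] -> estimated value
    alpha, gamma, epsilon = 0.1, 0.9, 0.1
    state = 0

    for step in range(2000):
        flipped = step >= 1000   # the environment changes at step 1000
        if random.random() < epsilon:                         # explore
            action = random.choice(ACTIONS)
        else:                                                 # exploit
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        r = reward(action, flipped)
        next_state = 1 - state   # toy transition between the two states
        target = r + gamma * max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (target - Q[(state, action)])  # Q-learning update
        state = next_state

    print({k: round(v, 2) for k, v in Q.items()})  # values reflect the changed rewards

After the change, continued updates gradually shift the value estimates toward the new rewards, which is the sense in which the behavior is learned rather than fixed.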

The public image of "intelligent robots" is often far from what AI researchers are actually doing. Examples:

 

3. Theoretical issues


Readings