On the Redundancy in Deep Learning

Huang Gao
Postdoctoral Researcher, Department of Computer Science
Cornell University
SERC 306
Wednesday, February 21, 2018 - 11:00
As one of the most representative techniques in artificial intelligence, deep learning has been widely used in computer vision, natural language processing, robotics, and other fields. On specific tasks, it has matched or even surpassed human-level performance. However, the strong generalization ability of deep learning is not well understood by the community, as traditional learning theory does not apply to modern neural networks, whose number of learnable parameters far exceeds the size of the training set. Why can over-parameterized neural networks still generalize well? A principled answer to this question would not only deepen our understanding of deep learning, but also lead to more efficient and elegant model architecture design.

In this talk, I will show that deep networks have considerable parameter redundancy: although they contain millions of parameters, they may not use them all effectively. I will first introduce a simple algorithm that reveals this parameter redundancy and show that redundancy indeed helps improve generalization. From a practical viewpoint, however, redundancy is problematic because it increases computational cost. I will then present a novel network architecture that has high efficiency and strong generalization ability by design. Finally, I will introduce an adaptive evaluation method that reduces redundant computation for each individual sample.
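To make the notion of parameter redundancy concrete, here is a minimal toy sketch (not the speaker's algorithm) in the spirit of magnitude-based pruning: a linear layer whose useful signal lives in a small block of weights can have the vast majority of its parameters zeroed while its output changes only modestly. The layer sizes, the 5% keep fraction, and the "important block" structure are all illustrative assumptions.

```python
# Toy illustration of parameter redundancy via magnitude pruning.
# NOT the speaker's method -- a hypothetical example showing that zeroing
# most small-magnitude weights can leave a layer's output largely intact.
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical over-parameterized linear layer: mostly near-zero weights,
# with a small block of "important" large-magnitude weights.
W = rng.normal(scale=0.01, size=(256, 256))
W[:16, :16] = rng.normal(scale=1.0, size=(16, 16))
x = rng.normal(size=256)

def prune_by_magnitude(W, keep_fraction):
    """Zero out all but the largest-magnitude `keep_fraction` of weights."""
    k = int(W.size * keep_fraction)
    threshold = np.sort(np.abs(W), axis=None)[-k]  # k-th largest magnitude
    return np.where(np.abs(W) >= threshold, W, 0.0)

W_pruned = prune_by_magnitude(W, keep_fraction=0.05)  # keep only 5% of weights
rel_error = np.linalg.norm(W @ x - W_pruned @ x) / np.linalg.norm(W @ x)
print(f"nonzero weights kept: {np.count_nonzero(W_pruned) / W.size:.1%}")
print(f"relative change in layer output: {rel_error:.3f}")
```

In this toy setting, 95% of the weights can be removed with only a modest relative change in the layer's output, because the retained 5% includes all of the large-magnitude weights that dominate the computation.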