Title:
Neural Nets as (mostly) Unsupervised Feature Extractors: Paradox and Opportunities
Abstract:
Neural networks are function approximators that map the input features in a vector x (e.g., image pixels) to an output vector y (for example, a label per pixel, or a single label for the entire image). This mapping can be seen as a parameterized conditional distribution p(y|x). The network is optimized end-to-end so as to maximize the likelihood, or some other matching score, of the output y. The expectation is, therefore, that the learning algorithm will focus all of the network's parametric capacity on learning which part of the variation in x is predictive of the variation in y, and ignore the spurious input features. Previous supervised models with less capacity and less nonlinearity did in fact behave this way.
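To make the objective concrete, here is a minimal sketch of the standard maximum-likelihood training criterion described above, where theta denotes the network parameters and (x_i, y_i) the training pairs (the notation is an illustration added for clarity, not taken from the talk):

\theta^\star = \arg\max_\theta \sum_i \log p_\theta(y_i \mid x_i)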
Deep neural nets trained with gradient descent and a variety of speed-up tricks, however, spend a good fraction of their parameters on capturing the statistics of the input x regardless of the output y, i.e., the characteristics of p(x) rather than p(y|x). This creates interesting problems, but also opportunities, in using neural nets. I will describe some of these apparent paradoxes, as well as positive consequences in applications such as egocentric stream classification with very few examples, and label super-resolution.
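For reference, the contrast between the two distributions can be written via the usual factorization of the joint (a standard identity, included here for clarity):

p(x, y) = p(y \mid x)\, p(x)

A purely discriminative learner needs only the first factor; the observation above is that deep nets nonetheless devote capacity to modeling the second.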
Bio:
Nebojsa Jojic received his B.S. in Electrical Engineering from the University of Belgrade, Belgrade, Yugoslavia, in June 1995, and his M.S. in Electrical and Computer Engineering from the University of Illinois, Urbana-Champaign, IL, in October 1997 (thesis title: "Computer Modeling, Analysis and Synthesis of Dressed Humans"; advisor: Prof. Thomas S. Huang). His Ph.D. in Electrical and Computer Engineering at the University of Illinois, Urbana-Champaign, is expected in 2000.