University of Minnesota
Computer Science & Engineering

From Relative Entropies to Bregman Divergences, to the Design of Convex and Tempered Non-convex Losses

Thursday, November 29, 2012

Presenter: Manfred K. Warmuth
Affiliation: UC Santa Cruz
Website: http://users.soe.ucsc.edu/~manfred/
Time: 11:15 - 12:15
Location: Keller Hall 4-192A
Host: Arindam Banerjee

Abstract:
Bregman divergences are a natural generalization of the relative entropy and the squared Euclidean distance. They are used as regularizers in machine learning and in the design of loss functions. For a single neuron with an increasing transfer function, the "matching loss" is a certain integral of the transfer function, which is always convex since it is a Bregman divergence. For example, if the transfer function is the identity, then the matching loss becomes the square loss, and for the sigmoid transfer function the matching loss is the logistic loss. Convex losses are convenient to use. However, they are inherently non-robust to outliers. By bending down the "wings" of a convex loss, we get non-convex losses. An example of such a "mismatched loss" occurs when the square loss is used for a sigmoided neuron. This loss is non-convex in the weight vector and can lead to exponentially many local minima. Even though mismatched losses are troublesome, they are needed because they are robust to outliers.
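
As a concrete illustration of this construction (a minimal sketch, not part of the talk; the function names and the use of SciPy's numerical quadrature are illustrative assumptions), the matching loss of an increasing transfer function f can be evaluated as the integral of f(z) - y between f^{-1}(y) and f^{-1}(y_hat). The two special cases mentioned above then fall out: the identity gives half the squared error, and the sigmoid gives the binary relative entropy, which reduces to the logistic loss when the label y is 0 or 1.

import numpy as np
from scipy.integrate import quad

def matching_loss(f, f_inv, y, y_hat):
    # Matching loss of an increasing transfer function f:
    # integral of (f(z) - y) dz from f_inv(y) to f_inv(y_hat).
    value, _ = quad(lambda z: f(z) - y, f_inv(y), f_inv(y_hat))
    return value

identity = lambda z: z
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
logit = lambda p: np.log(p / (1.0 - p))

y, y_hat = 0.9, 0.4

# Identity transfer function: half the squared error.
print(matching_loss(identity, identity, y, y_hat), 0.5 * (y_hat - y) ** 2)

# Sigmoid transfer function: the binary relative entropy between y and y_hat,
# which becomes the logistic loss at the 0/1 label endpoints.
rel_ent = y * np.log(y / y_hat) + (1 - y) * np.log((1 - y) / (1 - y_hat))
print(matching_loss(sigmoid, logit, y, y_hat), rel_ent)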

We introduce a temperature into the logistic loss by starting with the matching loss corresponding to the t-entropy. By letting the temperatures of the label and the estimator veer apart, we gradually introduce non-convexity into the matching loss function. In experiments, we achieve robustness to outliers while still avoiding local minima.
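
To give a rough sense of how a temperature enters (a sketch under assumptions, not the exact loss from the talk: the "tempered sigmoid", the two-temperature cross-entropy form, and the parameter values below are illustrative choices), the t-logarithm and t-exponential deform log and exp, and letting a label temperature t1 and an estimator temperature t2 move away from 1 bends down the convex wings of the logistic loss.

import numpy as np

def log_t(x, t):
    # t-logarithm; equals the natural log at t = 1 and is bounded near 0 for t < 1.
    if abs(t - 1.0) < 1e-12:
        return np.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)

def exp_t(x, t):
    # t-exponential; equals exp at t = 1.
    if abs(t - 1.0) < 1e-12:
        return np.exp(x)
    return np.maximum(1.0 + (1.0 - t) * x, 0.0) ** (1.0 / (1.0 - t))

def tempered_logistic(a, y, t1, t2):
    # Illustrative two-temperature loss (an assumption, not the talk's exact loss):
    # the estimator is a tempered sigmoid with temperature t2, and the label side
    # uses the t1-logarithm.  At t1 = t2 = 1 this is the ordinary logistic loss;
    # for t1 < 1 the t-logarithm is bounded near 0, so the loss saturates instead
    # of growing without bound on badly misclassified points.
    p = 1.0 / (1.0 + exp_t(-a, t2))
    return -(y * log_t(p, t1) + (1.0 - y) * log_t(1.0 - p, t1))

a = np.linspace(-10.0, 0.0, 3)                # margins of a positive example (y = 1)
print(tempered_logistic(a, 1.0, 1.0, 1.0))    # logistic loss: grows linearly in the margin
print(tempered_logistic(a, 1.0, 0.7, 0.9))    # tempered loss: saturates on outliers

With t1 = t2 = 1 the first line reproduces the ordinary logistic loss; with the illustrative pair (0.7, 0.9), the loss on a 0/1-labeled point is bounded by 1/(1 - t1), so extreme outliers stop dominating the fit.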

Joint work with Nan Ding and S. V. N. Vishwanathan.
