Thursday, November 29, 2012
| Presenter: | Manfred K. Warmuth |
|---|---|
| Affiliation: | UC Santa Cruz |
| Website: | http://users.soe.ucsc.edu/~manfred/ |
| Time: | 11:15 - 12:15 |
| Location: | Keller Hall 4-192A |
| Host: | Arindam Banerjee |
Abstract:
Bregman divergences are a natural generalization of the relative entropy and the squared Euclidean distance. They are used as regularizers in machine learning and in the design of loss functions. For a single neuron with an increasing transfer function, the "matching loss" is a certain integral of the transfer function, which is always convex since it is a Bregman divergence. For example, if the transfer function is the identity function, then the matching loss becomes the square loss, and for the sigmoid transfer function the matching loss is the logistic loss. Convex losses are convenient to use. However, they are inherently non-robust to outliers. By bending down the "wings" of a convex loss, we obtain non-convex losses. An example of such a "mismatched loss" occurs when the square loss is used for a sigmoided neuron. This loss is non-convex in the weight vector and can lead to exponentially many local minima. Even though mismatched losses are troublesome, they are needed because they are robust to outliers.
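As a concrete illustration, the Python sketch below numerically checks the two examples mentioned above, assuming one standard form of the matching loss, M_f(yhat, y) = integral from f^{-1}(y) to f^{-1}(yhat) of (f(z) - y) dz; the particular labels, predictions, and helper names are illustrative assumptions, not taken from the talk.

```python
# Minimal numerical sketch of the matching-loss construction, assuming
#   M_f(yhat, y) = integral from f^{-1}(y) to f^{-1}(yhat) of (f(z) - y) dz.
# The example labels/predictions are arbitrary illustrative values.
import numpy as np
from scipy.integrate import quad

def matching_loss(f, f_inv, yhat, y):
    """Matching loss of prediction yhat against label y for transfer function f."""
    value, _ = quad(lambda z: f(z) - y, f_inv(y), f_inv(yhat))
    return value

# Identity transfer function -> (half the) square loss.
identity = lambda z: z
yhat, y = 2.0, 0.5
print(matching_loss(identity, identity, yhat, y))   # 1.125
print(0.5 * (yhat - y) ** 2)                        # 1.125

# Sigmoid transfer function with a soft label in (0, 1) -> logistic loss,
# i.e. the binary relative entropy between label and prediction.
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
logit = lambda p: np.log(p / (1.0 - p))
yhat, y = 0.3, 0.9
print(matching_loss(sigmoid, logit, yhat, y))                         # ~0.794
print(y * np.log(y / yhat) + (1 - y) * np.log((1 - y) / (1 - yhat)))  # ~0.794
```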
We introduce a temperature into the logistic loss by starting with the matching loss corresponding to the t-entropy. By letting the temperatures of the label and the estimator veer apart, we gradually introduce non-convexity into the matching loss function. In experiments, we achieve robustness to outliers while still avoiding local minima.
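One common way to introduce such a temperature is through the Tsallis t-logarithm and t-exponential. The sketch below only illustrates that machinery; the simplified "tempered sigmoid" and the parameter values are assumptions, not the exact construction from the talk. With both temperatures at 1 it reduces to the ordinary logistic loss, and lowering the label temperature below 1 bounds the loss, bending down its wings.

```python
# Hedged sketch: a temperature in the logistic loss via the Tsallis
# t-logarithm / t-exponential.  Illustration only, not the talk's method.
import numpy as np

def log_t(x, t):
    """Tempered logarithm; reduces to log(x) as t -> 1."""
    if t == 1.0:
        return np.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)

def exp_t(x, t):
    """Tempered exponential, the inverse of log_t; reduces to exp(x) as t -> 1."""
    if t == 1.0:
        return np.exp(x)
    return np.maximum(1.0 + (1.0 - t) * x, 0.0) ** (1.0 / (1.0 - t))

def tempered_logistic_loss(margin, t_label, t_est):
    """-log_{t_label} of a simplified tempered sigmoid with temperature t_est.

    With t_label = t_est = 1 this is exactly the logistic loss
    log(1 + exp(-margin)).  With t_label < 1 the loss is bounded above by
    1 / (1 - t_label), so the "wings" of the convex logistic loss are bent
    down, at the price of non-convexity in the margin."""
    p = 1.0 / (1.0 + exp_t(-margin, t_est))   # simplified tempered sigmoid
    return -log_t(p, t_label)

margins = np.linspace(-6.0, 6.0, 7)
print(tempered_logistic_loss(margins, 1.0, 1.0))   # ordinary (convex) logistic loss
print(tempered_logistic_loss(margins, 0.5, 1.0))   # bounded by 2: wings bent down
```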
Joint work with Nan Ding and S. V. N. Vishwanathan.