Outobox - Machine Learning Reading Group


This is an interdisciplinary group of faculty (Arindam Banerjee, Dan Boley, Maria Gini, Rui Kuang, Paul Schrater, and William Schuler) and students who meet every week to discuss research ideas and to share relevant literature. The group is called Outobox because we like to think... out of the box. There is a mailing list associated with this group. If you would like to be added, please go to the sign-up page.

There is a brand new discussion forum associated with this group. Please contribute to the discussion.


Spring 2008

This semester the meetings will be on Wednesday from 3:30 to 5:00 in EE/CS 6-212.

Fall 2007

This semester the meetings will be on Wednesday from 3:00 to 5:00 in EE/CS 5-212.

Spring 2007

This semester the meetings will be on Thursday from 2:15 to 4:00 in Walter 405. The first meeting, on January 25, will be in room 404.

Fall 2006

This semester the meetings will be on Wednesday from 3:15 to 5:00 in Walter 405. First meeting on September 13.

Past year meetings

  • June 29 -- Tim Miller presented a demo of the new speech recognition system developed in the NLP lab and talked about its architecture.
  • June 15 -- Arindam presented his ICML paper.
  • June 1 -- Wolf Ketter did a dry run of his job interview talk.
  • 27 April 2006 - Dongwei Cao talked about his work.
    On Approximate Support Vector Machine and its Generalization Performance

    We study the algorithms that speed up the training process of support vector machines by using an approximate solution. We focus on algorithms that reduce the size of the optimization problem by extracting from the original training data set a small number of representatives and using these representatives to train an approximate SVM. Using the approach of algorithmic stability, we prove a PAC-style generalization bound for the approximate SVM, which includes as a special case the generalization bound for the exact SVM, which is trained using the original training data set.

  • 20 April 2006 - no meeting
  • 13 April 2006 - no meeting
  • 6 April 2006 - Evangelos Theodorou talked about the newer work on inverse reinforcement learning by Andrew Ng. Paper
  • 30 March 2006 - Evangelos Theodorou talked about inverse reinforcement learning. Paper
  • 23 March 2006 - Tim Miller talked about hierarchical Hidden Markov Models (HHMMs), their relation to Dynamic Bayes Nets (DBNs), and how they are used in the NLP lab. Paper - Slides
  • 16 March 2006 - Spring break - no meeting
  • 9 March 2006 - William Schuler talked about DBNs and the speech recognition odyssey of the NLP lab.
  • 2 March 2006 - Steve Damer will be talking about his recent work.
  • 23 February 2006 - ... and now we move to MRFs. The paper is http://www.cs.cornell.edu/~rdz/Papers/BVZ-cvpr98.pdf. Paul will lead the discussion.
  • 16 February 2006 - Now that we have learned everything about CRFs in theory, we will look at an application to computer vision. The paper we will look at is Using the Forest to See the Trees: A Graphical Model Relating Features, Objects, and Scenes, by Murphy, Torralba, and Freeman. Nate Bird will be presenting the paper.
  • 9 February 2006 - We will move forward to Conditional Random Fields (CRFs) by reading the paper Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data by Lafferty, McCallum, and Pereira. Other resources are two tutorials: (1) by Wallach and (2) by Sutton and McCallum.
  • 2 February 2006 - We will read and discuss a paper on Maximum Entropy models, and if time permits we will study their relevance to predicting how many more weeks of winter we will have. There is a web tutorial on Adam Berger's webpage, as well as a paper on applying maxent models to NLP, which is similar to the tutorial through the first 4 sections.
  • 26 January 2006 - For our first meeting this semester, Dongwei Cao will be talking about the work he will be presenting at the SIAM data mining conference, "On Approximate Solutions to Support Vector Machines."
  • 8 December 2005 - This week we will be looking at the paper "Factor Graphs and the Sum-Product Algorithm" by Kschischang, Frey, and Loeliger. Link
  • 1 December 2005 - We decided to do a review on Hidden Markov Models (HMMs). We covered them last year, but this will help get new people up to speed and the old people to clear out the cognitive cobwebs. This is all in anticipation of learning more about undirected graphical models like Markov Random Fields, and then possibly moving to Conditional Random Fields, and after that we should know everything about machine learning. Link to Rabiner paper
  • 24 November 2005 - No meeting - Thanksgiving holiday. Remember to give thanks for the outobox webmaster.
  • 17 November 2005 - Continuation of last week's topic of graphical models.
  • 10 November 2005 - Towards the end of last week's session we started talking about the application of Markov Random Fields to Miguel's problem of detecting proteins from electron microscopy images. We had so much fun we decided to look a little bit at MRFs and graphical models in general. So, for next week, we will be reading a tutorial by Kevin Murphy which you can find at this location. There is also a tech report by Wainwright and Jordan at this location (postscript format) which is quite long, but fear not! The intro is a nice overview of applications of graphical models and is quite readable, I'm told.
  • 3 November 2005 - Miguel, a visiting graduate student from Spain, will talk about his work done there which he is continuing here.
  • 27 October 2005 - We will again not be having a meeting, in observance of Theodore Roosevelt's birthday. If you would like something to mull over during the two-week break, think about how you would build an agent to wash dishes by hand. In order to get a feel for the scope of the problem, you are welcome to visit my apartment and try the task out for yourself.
  • 20 October 2005 - We looked at a paper by Friedman, Hastie, and Tibshirani - "Additive Logistic Regression: A statistical view of boosting." Citeseer link is here. The diligent reader may also be interested in this paper "Why the logistic function? A tutorial discussion on probabilities and neural networks" by Michael I. Jordan.
  • 13 October 2005 - No meeting because of the Navy Day holiday.
  • 6 October 2005 - We read an introductory paper on boosting, which helped to motivate and clarify the paper we read last week. Here is a link to a pdf version of the paper "A Short Introduction to Boosting," and here is a link to the original AdaBoost paper, "A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting."
  • 29 September 2005 - Discussed the paper "Logistic Regression, AdaBoost, and Bregman Distances" by Collins et al. (2001). Citeseer link
  • 22 September 2005 - Harini Veeraraghavan talked about her research regarding event detection in videos of automobile traffic.
  • 15 September 2005 - New faculty member Arindam Banerjee talked about his research on Bregman Divergences.
    Slides

  • 2004-2005

    Some people have asked for a list of papers the group looked at last year. If anyone has any additions, let me know.