Machine Learning for Vision @ Center for Computational Learning Systems
Vision Projects

Welcome to the Machine Learning for Vision Group @ CCLS.

The Center is currently researching scene decomposition. We develop cutting-edge algorithms and software for real-time tracking and apply them to car and pedestrian detection. The videos we work with are captured by a camera mounted on a moving car.

*** We are looking for talented students ***

The research involves simultaneous innovation in several areas:

  1. Learning visual features for pedestrian detection.
    1. Creating a labeling framework for the learning tools.
    2. Performance analysis.
    3. Researching fast visual detectors.
    4. Boosted/Cascaded detector creation.

  2. Integrating detections over time to track pedestrians.
    1. Extending particle filters to allow real time tracking.
    2. Solving the data association problem.

Learning Visual Features

To generate training and testing datasets, we created an easy-to-use software package called VideoApprentice. VideoApprentice is a software framework built on ImageJ, the NIH open source project, and is implemented as an ImageJ plugin.

Figure 1. VideoApprentice screenshot.

VideoApprentice can output ViPER-GT annotation files, so any annotations made with it can be ported to the ViPER-GT toolkit for performance evaluation.

Visual Features

Our current visual features framework is based on the Seville project. Seville uses a cascaded detector built on control point visual features to quickly detect pedestrians and cars.


Figure 2. An example of the control point features being applied to an image.
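
To make the idea of a cascaded detector over control point features concrete, here is a minimal sketch in Python. It is not Seville's code. It assumes a control point feature compares two small sets of pixel positions inside the detection window and fires when every pixel in the first set is brighter than every pixel in the second. It also assumes each cascade stage is a weighted vote of such features checked against a threshold. The names ControlPointFeature, Stage, and cascade_accepts, and the margin parameter, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[int, int]  # (row, col) position inside the detection window


@dataclass
class ControlPointFeature:
    """Hypothetical feature: fires when all 'brighter' points exceed all 'darker' points."""
    brighter: List[Point]
    darker: List[Point]
    margin: int = 0  # assumed minimum intensity gap between the two sets

    def fires(self, window: Sequence[Sequence[int]]) -> bool:
        # Comparing min of one set with max of the other checks every pair at once.
        lowest_bright = min(window[r][c] for r, c in self.brighter)
        highest_dark = max(window[r][c] for r, c in self.darker)
        return lowest_bright - highest_dark > self.margin


@dataclass
class Stage:
    """One cascade stage: a weighted vote of features against a threshold."""
    features: List[ControlPointFeature]
    weights: List[float]
    threshold: float

    def passes(self, window: Sequence[Sequence[int]]) -> bool:
        score = sum(w for f, w in zip(self.features, self.weights) if f.fires(window))
        return score >= self.threshold


def cascade_accepts(stages: List[Stage], window: Sequence[Sequence[int]]) -> bool:
    # all() over a generator stops at the first failing stage, so most
    # background windows are rejected after evaluating only a few features.
    return all(stage.passes(window) for stage in stages)
```

The speed of a cascade comes from that early rejection: cheap stages run on every window, and later stages run only on the few windows that survive. In a boosted detector the features, weights, and thresholds would be learned from labeled data rather than set by hand.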


Tracking Pedestrians

*** Under Construction ***

Figure 3. Tracking Pedestrians in a cluttered urban scene from a moving camera.
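
As background for the particle filter work listed in the research outline, the following is a minimal sketch of one step of a standard bootstrap particle filter for a single 2D position. It is not the group's real-time extension and it does not address data association. It assumes one detection per frame, a random-walk motion model, and Gaussian measurement noise. The function names and the motion_std and meas_std parameters are hypothetical.

```python
import math
import random
from typing import List, Tuple

Particle = Tuple[float, float]  # hypothesized (x, y) position of one pedestrian


def particle_filter_step(
    particles: List[Particle],
    measurement: Tuple[float, float],
    rng: random.Random,
    motion_std: float = 2.0,  # assumed random-walk motion noise, in pixels
    meas_std: float = 5.0,    # assumed detector localization noise, in pixels
) -> List[Particle]:
    """Run one predict / weight / resample cycle and return the new particle set."""
    # Predict: diffuse each particle with the random-walk motion model.
    predicted = [
        (x + rng.gauss(0.0, motion_std), y + rng.gauss(0.0, motion_std))
        for x, y in particles
    ]

    # Weight: Gaussian likelihood of the detection given each particle.
    mx, my = measurement
    weights = [
        math.exp(-((x - mx) ** 2 + (y - my) ** 2) / (2.0 * meas_std ** 2))
        for x, y in predicted
    ]
    if sum(weights) == 0.0:
        # Every particle is far from the detection: fall back to uniform weights.
        weights = [1.0] * len(predicted)

    # Resample: draw particles with probability proportional to their weight.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles: List[Particle]) -> Particle:
    """Return the mean particle position as the track estimate."""
    n = len(particles)
    return (sum(x for x, _ in particles) / n, sum(y for _, y in particles) / n)
```

In a tracker, the measurement would come from the pedestrian detector on each frame. With several pedestrians in view, deciding which detection belongs to which track is the data association problem, which this sketch leaves out.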


Open Projects (possibly for class credit)

We are currently looking for very talented Java programmers to continue the work on the following projects:

  1. Extending the VideoApprentice (Involves: work with OpenGL and GUI design.)
  2. Rewriting WinSeville in Java and improving the integration with the VideoApprentice (Involves: image processing and machine learning algorithms.)
These are very large tasks and are expected to last for at least one semester.

** This is a non-paying position **

Contact: pelossof at cs.columbia.edu



Updated: Jan 18th 2006