Research Topics

These are topics I worked on during my PhD, along with some related publications. A full list of publications (including local conferences etc.) can be found here or on Google Scholar.

My thesis is available here.

Datum-Wise Feature Selection

A big part of my PhD research has focused on sequential models for 'standard' prediction (i.e. classification) tasks. Transforming the 'atomic' classifier into a sequential process allows for much more expressivity and adaptability during classification. More specifically, I have worked on using Reinforcement Learning and Markov Decision Processes to model sequential feature selection tasks and to find good policies for them. The sequential aspect concerns how the classifier functions when given a new data point: the classifier begins with no information about the newly arriving data point (datum), and sequentially chooses features until it has enough information to classify it. As a result, the number and set of features used can differ for every element being classified. This allows 'obvious' data to be classified with little information, and more 'difficult' data to be described more richly to the classifier.
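As a rough illustration of the datum-wise process described above, here is a minimal Python sketch. It is not the implementation from the publications: the names `toy_policy` and `classify_datum`, the action encoding, and the hand-written decision rule are all assumptions made for the example. In the actual work the policy would be learned with Reinforcement Learning rather than written by hand.

```python
from typing import Dict, List, Sequence, Tuple

# An action is ("acquire", feature_index) or ("classify", label).
# This encoding is an assumption made for the sketch.
Action = Tuple[str, int]


def toy_policy(observed: Dict[int, float]) -> Action:
    """Hand-written stand-in for a learned policy (hypothetical rule)."""
    if 0 not in observed:
        return ("acquire", 0)
    # 'Obvious' datum: feature 0 alone is decisive.
    if abs(observed[0]) > 1.0:
        return ("classify", int(observed[0] > 0))
    # 'Difficult' datum: ask for one more feature before deciding.
    if 1 not in observed:
        return ("acquire", 1)
    return ("classify", int(observed[0] + observed[1] > 0))


def classify_datum(x: Sequence[float], policy=toy_policy) -> Tuple[int, List[int]]:
    """Start with no information, acquire features until the policy classifies."""
    observed: Dict[int, float] = {}
    # At most len(x) acquisitions plus one final classification step.
    for _ in range(len(x) + 1):
        kind, arg = policy(observed)
        if kind == "classify":
            return arg, sorted(observed)
        observed[arg] = x[arg]
    raise RuntimeError("policy never issued a classify action")


easy = classify_datum([2.0, -5.0, 0.3])   # decided after one feature
hard = classify_datum([0.2, -1.0, 0.3])   # needs two features
```

Each call returns the predicted label together with the indices of the features that were actually acquired, so the two data points above end up using different feature subsets.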

I co-organized the Predicting with Sequential Models workshop at ICML '13 (Atlanta), which brought together a budding community working on similar sequential approaches to prediction.

This work has resulted in the following publications:

International Journals

International Conferences

Fast Reinforcement Learning Algorithms

The use of Reinforcement Learning in real-world scenarios such as datum-wise classifiers is strongly limited by issues of scale. Most RL algorithms are unable to deal with problems composed of hundreds, or sometimes even dozens, of possible actions, and therefore cannot be applied to many real-world problems. We consider the RL problem in the Rollout Classification Policy Iteration (RCPI) framework, where the optimal policy is obtained through a multiclass classifier whose classes are the actions of the problem. We introduce error-correcting output codes (ECOCs) in this setting and propose two new methods for reducing complexity when using rollout-based approaches.
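To make the role of output codes more concrete, here is a small Python sketch of the general idea: each action is given a binary codeword, one binary classifier is needed per bit rather than one class per action, and a predicted bit vector is decoded to the action with the nearest codeword. The function names `make_codebook` and `decode` are hypothetical, and the codebook is a minimal-length binary code chosen for simplicity. It carries no redundancy, so it is not necessarily the kind of code used in the publications, where longer codewords would provide actual error correction.

```python
from typing import List, Sequence


def make_codebook(n_actions: int) -> List[List[int]]:
    """Give each action a distinct codeword of about log2(n_actions) bits."""
    n_bits = max(1, (n_actions - 1).bit_length())
    # Bit b of action a's index, least significant bit first.
    return [[(a >> b) & 1 for b in range(n_bits)] for a in range(n_actions)]


def decode(bits: Sequence[int], codebook: List[List[int]]) -> int:
    """Return the action whose codeword is closest in Hamming distance."""
    def hamming(code: Sequence[int]) -> int:
        return sum(p != c for p, c in zip(bits, code))
    # Ties are broken in favour of the lowest action index.
    return min(range(len(codebook)), key=lambda a: hamming(codebook[a]))


codebook = make_codebook(100)
n_binary_problems = len(codebook[0])          # one binary classifier per bit
recovered = decode(codebook[42], codebook)    # exact codeword maps back to its action
```

With 100 actions this codebook needs only 7 binary problems, which is where the reduction in complexity comes from in this simplified setting.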