| Speaker | Yuan (Alan) Qi | 
|---|---|
| Affiliation | Purdue University | 
| Date | Monday, 02 July 2012 | 
| Time | 12:30-14:00 | 
| Location | Zoom | 
| Link | Darwin B15 Biochemistry LT | 
| Event series | Jump Trading/ELLIS CSML Seminar Series | 
| Abstract | Title: Bayesian learning with big data: virtual vector machines and Gaussian processes with sparse eigenvalues Abstract: In this talk I will cover two topics that have become increasingly important given big data: online learning and sparse Gaussian process models. First, in a typical online learning scenario, a learner is required to process a large data stream using a small memory buffer. Such a requirement usually conflicts with the learner’s primary goal of prediction accuracy. To address this dilemma, we introduce a novel Bayesian online classification algorithm called the Virtual Vector Machine. The Virtual Vector Machine lets you smoothly trade off prediction accuracy against memory size. It summarizes the information contained in the preceding data stream by a Gaussian distribution over the classification weights plus a constant number of virtual data points. The extra information provided by the virtual points leads to improved predictive accuracy over previous online classification algorithms. Second, we propose a sparse Gaussian process model, EigenGP, based on Karhunen-Loève (KL) expansions of a GP prior. We use the Nyström approximation to obtain eigenfunctions of the covariance function and use an empirical Bayesian approach to select these eigenfunctions. By selecting eigenfunctions of Gaussian kernels that are associated with data clusters, EigenGP is also suitable for semi-supervised learning. Our experimental results demonstrate improved predictive performance of EigenGP over several state-of-the-art sparse GP and semi-supervised learning methods for regression, classification, and semi-supervised classification. Slides for the talk: PDF | 
| Biography |