Systems Seminar

Activized Learning: Transforming Passive to Active with Improved Label Complexity

Steve Hanneke
Machine Learning Department
School of Computer Science
Carnegie Mellon University

Abstract

In active learning, a learning algorithm is given access to a large pool of unlabeled examples and is allowed to interactively request the labels of particular examples in that pool. In empirically driven research, one of the most common techniques for designing new active learning algorithms is to use an existing passive learning algorithm as a subroutine, actively constructing its training set by carefully choosing informative examples to label. The resulting active learning algorithms inherit the tried-and-true learning bias of the underlying passive algorithm, while often requiring significantly fewer labels to reach a given accuracy than random sampling would.
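
For concreteness, here is a minimal sketch of the pool-based pattern just described: uncertainty sampling wrapped around scikit-learn's LogisticRegression as the passive subroutine. This is an illustrative assumption, not the construction from the talk; the synthetic pool, the seed set, and the label budget are all hypothetical choices.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical unlabeled pool: 2-D points. The oracle's labels come from a
# linear threshold and stay hidden until the learner requests them.
X_pool = rng.normal(size=(1000, 2))
y_oracle = (X_pool[:, 0] + X_pool[:, 1] > 0).astype(int)

# Seed with one example of each class so the passive subroutine can be fit.
labeled = [int(np.argmin(y_oracle)), int(np.argmax(y_oracle))]
unlabeled = [i for i in range(len(X_pool)) if i not in labeled]

passive = LogisticRegression()  # the passive learning algorithm, used as a subroutine

for _ in range(40):  # label budget
    passive.fit(X_pool[labeled], y_oracle[labeled])
    # Uncertainty sampling: request the label of the pool point the
    # current hypothesis is least confident about.
    probs = passive.predict_proba(X_pool[unlabeled])[:, 1]
    query = unlabeled[int(np.argmin(np.abs(probs - 0.5)))]
    labeled.append(query)   # one interactive label request to the oracle
    unlabeled.remove(query)

print("labels requested:", len(labeled))

Note that the passive learner's inductive bias is left untouched; only the training set it is fitted on is chosen adaptively, one label request at a time.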

This naturally raises the theoretical question of whether every passive learning algorithm can be "activized", that is, transformed into an active learning algorithm that achieves a given accuracy using fewer labels. In this talk, I will address precisely this question. In particular, I will explain how to use any passive learning algorithm as a subroutine to construct an active learning algorithm that provably achieves a strictly superior asymptotic label complexity. Along the way, I will also describe some of the recently developed mathematical tools for the formal study of active learning in general.
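
As background, one standard formalization of label complexity from this literature (paraphrased here, not quoted from the talk): an algorithm achieves label complexity $\Lambda$ if, for every distribution $P$ and every $\varepsilon > 0$, its output hypothesis $\hat{h}_n$ after $n$ label requests satisfies

    \mathbb{E}\big[\mathrm{er}(\hat{h}_n)\big] \le \varepsilon \quad \text{for all } n \ge \Lambda(\varepsilon, P), \qquad \text{where } \mathrm{er}(h) = P\big(h(X) \ne Y\big).

"Strictly superior asymptotic label complexity" then says, roughly, that the activized algorithm's $\Lambda$ grows strictly more slowly as $\varepsilon \to 0$ than that of the passive algorithm it wraps.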

Time and Place: Fri., Nov. 14, at 2:30 in 4610 Engr. Hall.       *** NOTE SPECIAL DAY and TIME ***

SYSTEMS SEMINAR WEB PAGE: http://homepages.cae.wisc.edu/~gubner/seminar/schedule.html

File "hanneke.shtml" last modified Tue 15 Oct 2019, 01:45 PM, CDT
Web Page Contact: John (dot) Gubner (at) wisc (dot) edu