A common problem with current search engines is that they often return too many low-quality matches for each query, rendering the search results useless. One way to improve search precision is to ask users for feedback indicating whether a sample of retrieved web documents is relevant to the query they posted. Based on the answers, the search engine can refine the initial query and thereby conduct a more precise search. This approach is known as "relevance feedback". However, relevance feedback must be used judiciously: asking too many questions can easily frustrate users and cause them to abort the search prematurely. Active learning can thus be applied so that the search engine (the student) asks the user (the teacher) only the "most critical" questions, learning the user's real intent efficiently.
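As a rough illustration of such a loop (not taken from the talk), the Python sketch below runs a toy relevance-feedback cycle; the document-selection and query-refinement rules are simple placeholders, not the method presented in the seminar.

    # Toy sketch of a relevance-feedback loop with an active-learning
    # flavor (illustrative only; selection and refinement rules below
    # are placeholders, not the proposed method).

    def search(docs, query):
        # Rank documents by term overlap with the current query.
        return sorted(docs, key=lambda d: len(d & query), reverse=True)

    def select_most_critical(ranked):
        # Placeholder criterion: ask about the mid-ranked document,
        # where the engine is presumably least certain about relevance.
        return ranked[len(ranked) // 2]

    def refine_query(query, doc, relevant):
        # Placeholder refinement: expand the query with terms from a
        # relevant document; leave it unchanged on negative feedback.
        return query | doc if relevant else query

    docs = [{"python", "snake"}, {"python", "code"}, {"java", "code"}]
    intent = {"python", "code"}        # the user's hidden real intent
    query = {"python"}
    for _ in range(2):                 # ask only a few questions
        doc = select_most_critical(search(docs, query))
        query = refine_query(query, doc, len(doc & intent) >= 2)
    print(search(docs, query)[0])      # best match after feedback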
In this presentation, we propose a novel approach to active learning based on a minimax criterion. Specifically, we derive an upper bound (in the 1D case) on a quantity called the "excessive classification error (ECE)", which gauges how close the performance of a practical classifier is to that of the optimal Bayes classifier. We then propose to choose the next sampling point so as to minimize the maximum ECE. Promising preliminary simulation results will also be discussed.
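As a back-of-the-envelope illustration (again not from the talk), consider the 1D setting with a threshold classifier. The talk's ECE bound is not reproduced here; if the width of the remaining uncertainty interval is used as a stand-in for it, the minimax rule of querying the point that minimizes the worst-case remaining error reduces to bisection:

    import numpy as np

    # Minimax sampling sketch for a 1D threshold classifier
    # (illustrative; the interval width below is a stand-in for the
    # talk's ECE upper bound, not the actual derived quantity).

    def worst_case_ece(lo, hi, x):
        # Querying x inside (lo, hi) shrinks the interval to either
        # (lo, x) or (x, hi); the adversary keeps the larger piece.
        return max(x - lo, hi - x)

    def minimax_query(lo, hi, candidates):
        # Choose the candidate minimizing the worst-case remaining ECE.
        return min(candidates, key=lambda x: worst_case_ece(lo, hi, x))

    # Toy run: hidden threshold t; the "teacher" labels 1[x >= t].
    rng = np.random.default_rng(0)
    t = rng.uniform(0.0, 1.0)
    lo, hi = 0.0, 1.0                  # current uncertainty interval
    for step in range(8):
        candidates = rng.uniform(lo, hi, size=50)
        x = minimax_query(lo, hi, candidates)
        if x >= t:
            hi = x
        else:
            lo = x
    print(f"threshold in [{lo:.4f}, {hi:.4f}], true value {t:.4f}")

With this stand-in bound, the minimax choice is simply the candidate closest to the interval midpoint, so each question roughly halves the remaining uncertainty.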
Time and Place: Wed., Oct. 25, 3:30-4:30 pm in 4610 Engr. Hall.
SYSTEMS SEMINAR WEB PAGE: http://www.cae.wisc.edu/~gubner/seminar/