Dataset Shift in Machine Learning

by Joaquin Quiñonero-Candela; Masashi Sugiyama; Anton Schwaighofer; Neil D. Lawrence
Format: Hardcover
Pub. Date: 2008-12-12
Publisher(s): The MIT Press
  • This Item Qualifies for Free Shipping!*

    *Excludes marketplace orders.

List Price: $48.00

Buy New

Usually Ships in 5-7 Business Days
$47.95

Rent Textbook

Select for Price

Used Textbook

We're Sorry
Sold Out

eTextbook

We're Sorry
Not Available

How Marketplace Works:

  • This item is offered by an independent seller and is not shipped from our warehouse.
  • Item details like edition and cover design may differ from our description; see the seller's comments before ordering.
  • Sellers must confirm and ship within two business days; otherwise, the order will be cancelled and refunded.
  • Marketplace purchases cannot be returned to eCampus.com. Contact the seller directly for inquiries; if there is no response within two days, contact customer service.
  • Additional shipping costs apply to Marketplace purchases. Review shipping costs at checkout.

Summary

Dataset shift is a common problem in predictive modeling that occurs when the joint distribution of inputs and outputs differs between training and test stages. Covariate shift, a particular case of dataset shift, occurs when only the input distribution changes. Dataset shift is present in most practical applications, for reasons ranging from the bias introduced by experimental design to the irreproducibility of the testing conditions at training time. (An example is email spam filtering, which may fail to recognize spam that differs in form from the spam the automatic filter was built on.) Despite this, and despite the attention given to the apparently similar problems of semi-supervised learning and active learning, dataset shift has received relatively little attention in the machine learning community until recently. This volume offers an overview of current efforts to deal with dataset and covariate shift. The chapters offer a mathematical and philosophical introduction to the problem, place dataset shift in relation to transfer learning, transduction, local learning, active learning, and semi-supervised learning, provide theoretical views of dataset and covariate shift (including decision-theoretic and Bayesian perspectives), and present algorithms for covariate shift.

Contributors: Shai Ben-David, Steffen Bickel, Karsten Borgwardt, Michael Brückner, David Corfield, Amir Globerson, Arthur Gretton, Lars Kai Hansen, Matthias Hein, Jiayuan Huang, Choon Hui Teo, Takafumi Kanamori, Klaus-Robert Müller, Sam Roweis, Neil Rubens, Tobias Scheffer, Marcel Schmittfull, Bernhard Schölkopf, Hidetoshi Shimodaira, Alex Smola, Amos Storkey, Masashi Sugiyama

Neural Information Processing series
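As a concrete, hedged illustration (not taken from the book's chapters), the short Python sketch below corrects a regression estimate under covariate shift by importance weighting: a logistic regression classifier is trained to tell training inputs from test inputs, its odds serve as an estimate of the density ratio p_test(x)/p_train(x), and the training points are reweighted accordingly. The toy data and variable names are assumptions for illustration only; the book's chapters develop more principled estimators of this ratio, such as kernel mean matching.

    import numpy as np
    from sklearn.linear_model import LogisticRegression, LinearRegression

    rng = np.random.default_rng(0)

    # Toy regression problem: the relationship y = x^3 - x + noise is the same
    # everywhere, but the training and test input distributions differ.
    x_train = rng.normal(loc=0.5, scale=0.5, size=200)
    x_test = rng.normal(loc=-0.5, scale=0.3, size=200)
    y_train = x_train ** 3 - x_train + 0.1 * rng.standard_normal(200)

    # Estimate the density ratio w(x) = p_test(x) / p_train(x) with a classifier
    # that separates training inputs (label 0) from test inputs (label 1):
    # w(x) is proportional to P(test | x) / P(train | x).
    X = np.concatenate([x_train, x_test]).reshape(-1, 1)
    s = np.concatenate([np.zeros(200), np.ones(200)])
    clf = LogisticRegression().fit(X, s)
    p = clf.predict_proba(x_train.reshape(-1, 1))[:, 1]
    weights = p / (1.0 - p)

    # Weighted least squares emphasizes training points that resemble the test
    # distribution, reducing the bias introduced by the shifted inputs.
    unweighted = LinearRegression().fit(x_train.reshape(-1, 1), y_train)
    weighted = LinearRegression().fit(x_train.reshape(-1, 1), y_train,
                                      sample_weight=weights)
    print("slope without weighting:", unweighted.coef_[0])
    print("slope with importance weighting:", weighted.coef_[0])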

Author Biography

Joaquin Quiñonero-Candela is a Researcher in the Online Services and Advertising Group at Microsoft Research Cambridge, U.K.

Masashi Sugiyama is Associate Professor in the Department of Computer Science at Tokyo Institute of Technology.

Anton Schwaighofer is an Applied Researcher in the Online Services and Advertising Group at Microsoft Research, Cambridge, U.K.

Neil D. Lawrence is Senior Lecturer and Member of the Machine Learning and Optimisation Research Group in the School of Computer Science at the University of Manchester.

Klaus-Robert Müller is Head of the Intelligent Data Analysis group at the Fraunhofer Institute and Professor in the Department of Computer Science at the Technical University of Berlin.

Alexander J. Smola is Senior Principal Researcher and Machine Learning Program Leader at National ICT Australia/Australian National University, Canberra.

Bernhard Schölkopf is Director at the Max Planck Institute for Intelligent Systems in Tübingen, Germany. He is coauthor of Learning with Kernels (2002) and is a coeditor of Advances in Kernel Methods: Support Vector Learning (1998), Advances in Large-Margin Classifiers (2000), and Kernel Methods in Computational Biology (2004), all published by the MIT Press.

Table of Contents

Introduction to dataset shift, p. 1
When training and test sets are different: characterizing learning transfer, p. 3
Projection and projectability, p. 29
Theoretical views on dataset and covariate shift, p. 39
Binary classification under sample selection bias, p. 41
On Bayesian transduction: implications for the covariate shift problem, p. 65
On the training/test distributions gap: a data representation learning framework, p. 73
Algorithms for covariate shift, p. 85
Geometry of covariate shift with applications to active learning, p. 87
A conditional expectation approach to model selection and active learning under covariate shift, p. 107
Covariate shift by kernel mean matching, p. 131
Discriminative learning under covariate shift with a single optimization problem, p. 161
An adversarial view of covariate shift and a minimax approach, p. 179
Discussion, p. 199
Author comments, p. 201
References, p. 207
Notation and symbols, p. 219
Contributors, p. 223
Index, p. 227
Table of Contents provided by Blackwell. All Rights Reserved.

An electronic version of this book is available through VitalSource.

This book is viewable on PC, Mac, iPhone, iPad, iPod Touch, and most smartphones.

By purchasing, you will be able to view this book online, as well as download it, for the chosen number of days.

Digital License

You are licensing a digital product for a set duration. Durations are set forth in the product description, with "Lifetime" typically meaning five (5) years of online access and permanent download to a supported device. All licenses are non-transferable.

A downloadable version of this book is available through the eCampus Reader or compatible Adobe readers.

Applications are available on iOS, Android, PC, Mac, and Windows Mobile platforms.

Please view the compatibility matrix prior to purchase.