Interactive Information Retrieval with Bandit Feedback

Abstract

Information retrieval (IR) in nature is a process of sequential decision making. Distinct from traditional IR solutions that rigidly execute an offline trained policy, interactive information retrieval emphasizes online policy learning with bandit feedback. In this tutorial, we will first motivate the need for online policy learning in interactive IR, by highlighting its importance in several real-world IR problems where online sequential decision making is necessary, such as web search and recommendations. We will carefully address the new challenges that arose in such a solution paradigm, including sample complexity, costly and even outdated feedback, and ethical considerations in online learning (such as fairness and privacy) in interactive IR. We will prepare the technical discussions by first introducing several classical interactive learning strategies from machine learning literature, and then fully dive into the recent research developments for addressing the aforementioned fundamental challenges in interactive IR.

Note that the concurrent SIGIR 2021 tutorial on "Interactive Information Retrieval: Models, Algorithms, and Evaluation" will provide a broad overview on the general conceptual framework and formal models in interactive IR, while this tutorial covers the online policy learning solutions for interactive IR with bandit feedback.