Research Seminar: The Principle of Optimism for Sequential Decision-Making Under Uncertainty

Speaker: Prof. Peter Auer, University of Leoben
Event date: Wednesday, 27 September 2017, 11:15 - 12:00
Location: Room E004, JFK Building,
29 Avenue J.F. Kennedy
L-1855 Kirchberg

Abstract: The principle of optimism in the face of uncertainty is a general approach to approximately optimal decision making when model parameters are unknown and have to be learned. An optimistic algorithm behaves as if the model parameters take their best possible values consistent with the observations so far. While such optimism can be used as an exploration heuristic over the model space, it often also allows for a rigorous theoretical analysis. In my talk I will review some fundamental scenarios for sequential decision making: the multi-armed bandit problem, contextual bandits, and reinforcement learning. Besides theoretical results, I will also show applications to image recommendation and search, and to intrinsically motivated exploration. Finally, I will discuss Thompson sampling as an alternative to the principle of optimism and present ongoing research on reinforcement learning in continuous state spaces.
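
As a rough illustration of the optimistic rule described in the abstract, here is a minimal Python sketch for the multi-armed bandit setting. It is not taken from the talk: the function name `ucb1`, the Bernoulli reward model, and the confidence bonus `sqrt(2 ln t / n)` are assumptions chosen for the example. Each arm is scored by its empirical mean plus a bonus that shrinks as the arm is played more often, so the algorithm acts as if every arm were as good as the data still allow.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Simulate an optimistic index policy on Bernoulli arms.

    arm_means: hypothetical true success probabilities (unknown to the learner).
    horizon:   number of rounds to play.
    Returns the number of times each arm was played.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # how often each arm was played
    sums = [0.0] * n_arms   # total reward collected per arm

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Play every arm once so that each empirical mean is defined.
            arm = t - 1
        else:
            # Optimistic index: empirical mean plus a confidence bonus.
            def index(a):
                return sums[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a])

            arm = max(range(n_arms), key=index)

        # Simulated Bernoulli reward (assumption for this sketch).
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    return counts


if __name__ == "__main__":
    print(ucb1([0.1, 0.9], horizon=5000))
```

In this sketch, an arm that has been tried only a few times keeps a large bonus and is therefore revisited, while a well-explored arm is played only if its empirical mean justifies it. That is one way in which optimism can serve as an exploration heuristic.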

Peter Auer received a PhD in mathematics from the Vienna University of Technology in 1992. In 1995 and 1996 he was a research scholar at the University of California, Santa Cruz. In 1997 he was appointed associate professor at the Graz University of Technology, and he accepted a full professorship for Information Technology at the Montanuniversitaet Leoben in 2003. He has been working in Machine Learning for 25 years and is currently interested in sequential decision making with an exploration/exploitation trade-off, including multi-armed bandit problems and reinforcement learning. He has authored more than 75 refereed publications in the areas of probability theory, symbolic computation, and machine learning, and he is an action editor of the Journal of Machine Learning Research and a member of the editorial board of the Machine Learning journal. He has been a principal investigator in a number of European research projects such as CompLACS, PinView, PASCAL, and LAVA.