Arrow Research search

Author name cluster

Manuel Lopes

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers

Possible papers


AAMAS 2021 · Conference Paper

Teaching Unknown Learners to Classify via Feature Importance

  • Carla Guerra
  • Francisco S. Melo
  • Manuel Lopes

In this work we introduce an interactive machine teaching approach that teaches a classification task to the learner. Our adaptive approach, Feature Importance Teaching (FIT), does not assume perfect knowledge about the learner, as most machine teaching approaches do. It chooses, online, which sample to show next, updating its model of the learner based on the student's feedback about the weights attributed to the features. We present simulated results where the student has prior knowledge different from that assumed by the teacher. The results show that our teaching approach can mitigate this mismatch and lead to a significantly faster learning curve than those obtained when the teacher selects samples at random or disregards the student's feedback.
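
The feedback-driven selection loop the abstract describes can be sketched in a few lines. This is a toy illustration under assumed details (a perceptron-style student, perfect weight feedback, invented function names), not the FIT algorithm itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def teach_with_feedback(X, y, rounds=500, lr=1.0):
    """Interactive teaching sketch: the teacher picks, online, the sample
    that the student's *reported* feature weights handle worst, shows it,
    and re-estimates the student from the feedback."""
    w_student = rng.normal(size=X.shape[1])   # student's prior, unknown to the teacher
    w_est = np.zeros(X.shape[1])              # teacher's running estimate of the student
    for _ in range(rounds):
        margins = y * (X @ w_est)             # how the estimated student scores each sample
        i = int(np.argmin(margins))           # most problematic sample for that student
        if y[i] * (X[i] @ w_student) <= 0:    # student: perceptron-style update
            w_student = w_student + lr * y[i] * X[i]
        w_est = w_student.copy()              # feedback: student reports its weights
    return w_student

# toy task: linearly separable labels with a safety margin around the boundary
w_true = np.array([2.0, -1.0])
X = rng.normal(size=(300, 2))
X = X[np.abs(X @ w_true) > 1.0]
y = np.sign(X @ w_true)
w = teach_with_feedback(X, y)
accuracy = float(np.mean(np.sign(X @ w) == y))
```

Because the teacher re-syncs its estimate every round, sample selection always targets the student's actual weakest point, which is the mechanism the abstract credits for the faster learning curve.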

IJCAI 2018 · Conference Paper

Interactive Optimal Teaching with Unknown Learners

  • Francisco S. Melo
  • Carla Guerra
  • Manuel Lopes

This paper introduces a new approach for machine teaching that partly addresses the (unavoidable) mismatch between what the teacher assumes about the learning process of the student and the actual process. We analyze several situations in which such mismatch takes place, including when the student's learning algorithm is known but the corresponding parameters are not, and when the learning algorithm itself is not known. Our analysis is focused on the case of a Bayesian Gaussian learner, and we show that, even in this simple case, the lack of knowledge regarding the student's learning process significantly deteriorates the performance of machine teaching: while perfect knowledge of the student ensures that the target is learned after a finite number of samples, lack of knowledge thereof implies that the student will only learn asymptotically (i.e., after an infinite number of samples). We introduce interactivity as a means to mitigate the impact of imperfect knowledge and show that, by using interactivity, we are able to recover finite learning time, in the best case, or significantly faster convergence, in the worst case. Finally, we discuss the extension of our analysis to a classification problem using linear discriminant analysis, and discuss the implications of our results in single- and multi-student settings.
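
The finite-versus-asymptotic contrast can be illustrated concretely. The sketch below assumes a conjugate Gaussian learner with unit precisions (a simplification; function names and constants are hypothetical): with the student's prior known, one sample hits the target; with a wrong assumed prior, the error shrinks every round but never reaches zero:

```python
def student_update(mu, n, x):
    # conjugate Gaussian update with unit precisions: the posterior mean
    # is a weighted average of the prior mean (weight n) and the sample
    return (n * mu + x) / (n + 1), n + 1

def teaching_sample(mu_believed, n, target):
    # the one sample that puts the posterior mean exactly on the target,
    # *if* the teacher's belief (mu_believed, n) about the student is right
    return (n + 1) * target - n * mu_believed

target = 5.0

# perfect knowledge of the student's prior: the target is hit in one step
mu_exact, _ = student_update(0.0, 1, teaching_sample(0.0, 1, target))

# mismatch: the teacher believes the prior mean is 1.0, the student holds 0.0;
# the student's error shrinks each round (like 1/n) but never vanishes
mu_true, mu_believed, n = 0.0, 1.0, 1
for _ in range(40):
    x = teaching_sample(mu_believed, n, target)
    mu_believed, _ = student_update(mu_believed, n, x)
    mu_true, n = student_update(mu_true, n, x)
gap = abs(mu_true - target)   # roughly 1/41 after 40 rounds: asymptotic only
```

Interactivity, in the paper's sense, would let the teacher query the student and correct `mu_believed`, restoring the one-step behavior of the first case.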

IJCAI 2017 · Conference Paper

No Learner Left Behind: On the Complexity of Teaching Multiple Learners Simultaneously

  • Xiaojin Zhu
  • Ji Liu
  • Manuel Lopes

We present a theoretical study of algorithmic teaching in the setting where the teacher must use the same training set to teach multiple learners. This problem is a theoretical abstraction of the real-world classroom setting in which the teacher delivers the same lecture to academically diverse students. We define a minimax teaching criterion to guarantee the performance of the worst learner in the class. We prove that the teaching dimension increases with class diversity in general. For the classes of conjugate Bayesian learners and linear regression learners, respectively, we exhibit the corresponding minimax teaching sets. We then propose a method to enhance teaching by partitioning the class into sections. We present cases where the optimal partition minimizes overall teaching dimension while maintaining the guarantee on all learners. Interestingly, we show personalized education (one learner per section) is not necessarily the optimal partition. Our results generalize algorithmic teaching to multiple learners and offer insight on how to teach large classes.
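
A brute-force version of the minimax criterion is easy to state in code. The sketch below assumes a class of conjugate Bayesian (Gaussian) learners that differ only in their prior means; the pool, priors, and set size are made-up toy values, not the paper's construction:

```python
from itertools import combinations

def learner_estimate(prior_mean, samples, prior_weight=1.0):
    # conjugate Gaussian learner: the posterior mean blends the prior
    # mean (weight 1) with the taught samples
    return (prior_weight * prior_mean + sum(samples)) / (prior_weight + len(samples))

def minimax_teaching_set(pool, priors, target, set_size):
    # among all candidate sets, pick the one whose WORST learner in the
    # diverse class ends up closest to the target concept
    best_set, best_worst = None, float("inf")
    for cand in combinations(pool, set_size):
        worst = max(abs(learner_estimate(p, cand) - target) for p in priors)
        if worst < best_worst:
            best_set, best_worst = cand, worst
    return best_set, best_worst

pool = [i / 2 for i in range(21)]     # candidate samples 0.0 .. 10.0
priors = [0.0, 2.0, 6.0]              # an academically diverse class
ts, worst_err = minimax_teaching_set(pool, priors, target=4.0, set_size=3)
```

The residual `worst_err` grows with the spread of the priors, mirroring the paper's result that teaching dimension increases with class diversity.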

IJCAI 2015 · Conference Paper

Inverse Reinforcement Learning in Relational Domains

  • Thibaut Munzer
  • Bilal Piot
  • Matthieu Geist
  • Olivier Pietquin
  • Manuel Lopes

In this work, we introduce the first approach to the Inverse Reinforcement Learning (IRL) problem in relational domains. IRL has been used to recover a more compact representation of the expert policy, leading to better generalization performance across different contexts. On the other hand, relational learning allows representing problems with a varying (potentially infinite) number of objects, and thus provides more generalizable representations of problems and skills. We show how these different formalisms allow one to create a new IRL algorithm for relational domains that can efficiently recover, from expert data, rewards with strong generalization and transfer properties. We evaluate our algorithm on representative tasks and study the impact of diverse experimental conditions, such as the number of demonstrations, knowledge about the dynamics, transfer across varying dimensions of a problem, and changing dynamics.
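
Why relational representations transfer across problem sizes can be seen in miniature: a reward defined over counts of ground predicates scores states with any number of objects using the same weights. This is an illustrative sketch, not the paper's algorithm, and the predicates and weights are invented:

```python
def relational_features(state):
    # state: list of (block, support) pairs; features count the ground
    # predicates on(x, table) and on(x, y), regardless of how many
    # blocks the particular state contains
    on_table = sum(1 for _, support in state if support == "table")
    stacked = sum(1 for _, support in state if support != "table")
    return (on_table, stacked)

def reward(state, w):
    # linear reward over relational feature counts
    f = relational_features(state)
    return w[0] * f[0] + w[1] * f[1]

small = [("a", "table"), ("b", "a")]                              # 2 blocks
large = [("a", "table"), ("b", "a"), ("c", "b"), ("d", "table")]  # 4 blocks
w = (-1.0, 2.0)   # hypothetical recovered weights: prefer stacking
```

A reward recovered in the small world applies unchanged to the large one, which is the transfer property the abstract attributes to the relational formulation.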

AAAI 2014 · Conference Paper

Calibration-Free BCI Based Control

  • Jonathan Grizou
  • Iñaki Iturrate
  • Luis Montesano
  • Pierre-Yves Oudeyer
  • Manuel Lopes

Recent works have explored the use of brain signals to directly control virtual and robotic agents in sequential tasks. So far in such brain-computer interfaces (BCI), an explicit calibration phase was required to build a decoder that translates raw electroencephalography (EEG) signals from the brain of each user into meaningful instructions. This paper proposes a method that removes the calibration phase, and allows a user to control an agent to solve a sequential task. The proposed method assumes a distribution of possible tasks, and infers the interpretation of EEG signals and the task by selecting the hypothesis which best explains the history of interaction. We introduce a measure of uncertainty on the task and on the EEG signal interpretation to act as an exploratory bonus for a planning strategy. This speeds up learning by guiding the system to regions that better disambiguate among task hypotheses. We report experiments where four users use a BCI to control an agent in a virtual world to reach a target without any previous calibration process.
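
The core inference step, selecting the task hypothesis that best explains the interaction history, can be sketched as a likelihood comparison. The signal values, noise rate `p_correct`, and the two hypotheses below are all hypothetical stand-ins for the EEG decoding problem:

```python
import math

def hypothesis_loglik(signals, labels_under_h, p_correct=0.7):
    # log-likelihood of the raw signal history if hypothesis h were the
    # true task: each signal is a noisy yes/no about the agent's last action
    return sum(math.log(p_correct if s == lab else 1.0 - p_correct)
               for s, lab in zip(signals, labels_under_h))

def best_hypothesis(signals, hypotheses):
    # pick the task whose implied labels best explain the history
    return max(hypotheses, key=lambda h: hypothesis_loglik(signals, hypotheses[h]))

# two candidate targets imply opposite correct/incorrect labelings of the
# same action history, so a few noisy signals already separate them
signals = [1, 1, 0, 1, 1]
hypotheses = {"target_A": [1, 1, 0, 1, 1], "target_B": [0, 0, 1, 0, 0]}
best = best_hypothesis(signals, hypotheses)
```

The paper's exploratory bonus would then steer the agent toward actions whose signals most sharpen the gap between these hypothesis likelihoods.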

IJCAI 2013 · Conference Paper

Active Learning for Teaching a Robot Grounded Relational Symbols

  • Johannes Kulick
  • Marc Toussaint
  • Tobias Lang
  • Manuel Lopes

We investigate an interactive teaching scenario, where a human teaches a robot symbols which abstract the geometric properties of objects. There are multiple motivations for this scenario: First, state-of-the-art methods for relational reinforcement learning demonstrate that we can learn and employ strongly generalizing abstract models with great success for goal-directed object manipulation. However, these methods rely on given grounded action and state symbols and raise the classical question: Where do the symbols come from? Second, existing research on learning from human-robot interaction has focused mostly on the motion level (e.g., imitation learning). However, if the goal of teaching is to enable the robot to autonomously solve sequential manipulation tasks in a goal-directed manner, the human should have the possibility to teach the relevant abstractions to describe the task and let the robot eventually leverage powerful relational RL methods. In this paper we formalize human-robot teaching of grounded symbols as an active learning problem, where the robot actively generates pick-and-place geometric situations that maximize its information gain about the symbol to be learned. We demonstrate that the learned symbols can be used by a robot in a relational RL framework to learn probabilistic relational rules and use them to solve object manipulation tasks in a goal-directed manner.
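
The active-generation idea can be approximated with uncertainty sampling: among candidate geometric situations, query the one the current symbol model is least sure about. The logistic model, threshold, and candidate angles below are invented stand-ins, not the paper's actual classifier:

```python
import math

def entropy(p):
    # binary entropy in bits; zero at fully certain predictions
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def predict_upright(angle, threshold=0.35, slope=10.0):
    # hypothetical learned model: probability that the "upright" symbol
    # holds for an object tilted by `angle` radians
    return 1.0 / (1.0 + math.exp(slope * (angle - threshold)))

def next_query(candidate_angles):
    # actively generate the situation with maximal predictive uncertainty
    # (a simple proxy for maximizing information gain about the symbol)
    return max(candidate_angles, key=lambda a: entropy(predict_upright(a)))

candidate_angles = [0.0, 0.2, 0.35, 0.6, 0.9]
q = next_query(candidate_angles)
```

The selected query sits on the model's current decision boundary, where a label from the human teacher is most informative.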

AAAI 2012 · Conference Paper

Algorithmic and Human Teaching of Sequential Decision Tasks

  • Maya Cakmak
  • Manuel Lopes

A helpful teacher can significantly improve the learning rate of a learning agent. Teaching algorithms have been formally studied within the field of Algorithmic Teaching. These give important insights into how a teacher can select the most informative examples while teaching a new concept. However, the field has so far focused purely on classification tasks. In this paper we introduce a novel method for optimally teaching sequential decision tasks. We present an algorithm that automatically selects the set of most informative demonstrations and evaluate it on several navigation tasks. Next, we explore the idea of using this algorithm to produce instructions for humans on how to choose examples when teaching sequential decision tasks. We present a user study that demonstrates the utility of such instructions.
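
Selecting informative demonstrations can be sketched in a 1-D toy world: a reward hypothesis survives only if every demonstrated action is optimal under it, and the teacher adds demonstrations until a single hypothesis remains. Everything below (the line world, the start states, the stopping rule) is an invented illustration, not the paper's algorithm:

```python
def optimal_action(state, goal):
    # in a 1-D line world the optimal move is simply toward the goal
    return (goal > state) - (goal < state)

def consistent(goal, demo):
    # a reward hypothesis survives iff every demonstrated (state, action)
    # pair is optimal under it -- the test a learner-from-demonstration applies
    return all(optimal_action(s, goal) == a for s, a in demo)

def informative_demos(true_goal, starts, hypotheses):
    # add demonstrations until only the true goal remains consistent
    remaining, chosen = set(hypotheses), []
    for start in starts:
        if remaining == {true_goal}:
            break
        demo, x = [], start
        while x != true_goal:
            a = optimal_action(x, true_goal)
            demo.append((x, a))
            x += a
        chosen.append(demo)
        remaining = {h for h in remaining if consistent(h, demo)}
    return chosen, remaining

# goals 0..4 on a line; two well-chosen demonstrations pin down goal 3
demos, remaining = informative_demos(3, starts=[0, 4], hypotheses=range(5))
```

Demonstrating from both sides of the goal is what rules out every rival hypothesis, which is the flavor of "most informative demonstration set" the abstract refers to.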

NeurIPS 2012 · Conference Paper

Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress

  • Manuel Lopes
  • Tobias Lang
  • Marc Toussaint
  • Pierre-Yves Oudeyer

Formal exploration approaches in model-based reinforcement learning estimate the accuracy of the currently learned model without consideration of the empirical prediction error. For example, PAC-MDP approaches such as Rmax base their model certainty on the amount of collected data, while Bayesian approaches assume a prior over the transition dynamics. We propose extensions to such approaches which drive exploration solely based on empirical estimates of the learner's accuracy and learning progress. We provide a "sanity check" theoretical analysis, discussing the behavior of our extensions in the standard stationary finite state-action case. We then provide experimental studies demonstrating the robustness of these exploration measures in cases of non-stationary environments or where original approaches are misled by wrong domain assumptions.
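
An empirical learning-progress signal of this kind is simple to compute: compare the model's mean prediction error over the most recent window with the window before it, and explore where the drop is largest. The error histories below are fabricated to show the contrast with error-based bonuses (state C stays noisy but yields no progress, so it attracts no exploration):

```python
def learning_progress(errors, window=3):
    # empirical learning progress: drop in mean prediction error between
    # the previous window and the most recent one
    if len(errors) < 2 * window:
        return float("inf")   # too little data: maximal exploration bonus
    prev = sum(errors[-2 * window:-window]) / window
    recent = sum(errors[-window:]) / window
    return prev - recent

# A: model already converged; B: still improving; C: irreducibly noisy
history = {
    "A": [0.05, 0.05, 0.05, 0.05, 0.05, 0.05],
    "B": [0.90, 0.80, 0.70, 0.50, 0.40, 0.30],
    "C": [0.60, 0.50, 0.60, 0.60, 0.50, 0.60],
}
next_state = max(history, key=lambda s: learning_progress(history[s]))
```

Because the bonus tracks the *change* in error rather than the error itself, it stays robust when some transitions are inherently unpredictable, the failure mode the abstract attributes to the original approaches.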