Arrow Research search

Author name cluster

Ece Kamar

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

34 papers
2 author rows

Possible papers

34

ICLR Conference 2024 Conference Paper

Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models

  • Mert Yüksekgönül
  • Varun Chandrasekaran
  • Erik Jones
  • Suriya Gunasekar
  • Ranjita Naik
  • Hamid Palangi
  • Ece Kamar
  • Besmira Nushi

We investigate the internal behavior of Transformer-based Large Language Models (LLMs) when they generate factually incorrect text. We propose modeling factual queries as constraint satisfaction problems and use this framework to investigate how the LLM interacts internally with factual constraints. We find a strong positive relationship between the LLM's attention to constraint tokens and the factual accuracy of generations. We curate a suite of 10 datasets containing over 40,000 prompts to study the task of predicting factual errors with the Llama-2 family across all scales (7B, 13B, 70B). We propose SAT Probe, a method that probes attention patterns to predict factual errors and fine-grained constraint satisfaction and to allow early error identification. The approach and findings take another step towards using the mechanistic understanding of LLMs to enhance their reliability.
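
As an illustration of the probing recipe described above, here is a minimal sketch (not the authors' code): it assumes per-prompt attention-mass features on constraint tokens have already been extracted from the LLM, and fits a logistic probe to predict factual errors. The feature matrix and labels are synthetic stand-ins.

```python
# Minimal sketch of an attention-based factuality probe (not the paper's code).
# Assumes a feature matrix X where X[i, j] is the attention mass that prompt i's
# generation places on its constraint tokens at layer/head j, plus binary labels
# y[i] = 1 if the generation was factually incorrect. Both are synthetic here.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Stand-in data: 1000 prompts, 32 layer/head attention features.
# Real features would come from the LLM's attention maps.
X = rng.uniform(0.0, 1.0, size=(1000, 32))
true_w = rng.normal(size=32)
y = (X @ true_w + rng.normal(scale=0.5, size=1000) < 0.0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# A linear probe over attention features, in the spirit of SAT Probe.
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("error-prediction AUC:", roc_auc_score(y_te, probe.predict_proba(X_te)[:, 1]))
```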

ICLR Conference 2024 Conference Paper

Teaching Language Models to Hallucinate Less with Synthetic Tasks

  • Erik Jones
  • Hamid Palangi
  • Clarisse Simões
  • Varun Chandrasekaran
  • Subhabrata Mukherjee
  • Arindam Mitra
  • Ahmed Hassan Awadallah
  • Ece Kamar

Large language models (LLMs) frequently hallucinate on abstractive summarization tasks such as document-based question-answering, meeting summarization, and clinical report generation, even though all necessary information is included in context. However, optimizing to make LLMs hallucinate less is challenging, as hallucination is hard to efficiently, cheaply, and reliably evaluate at each optimization step. In this work, we show that reducing hallucination on a _synthetic task_ can also reduce hallucination on real-world downstream tasks. Our method, SynTra, first designs a synthetic task where hallucinations are easy to elicit and measure. It next optimizes the LLM's system message via prefix tuning on the synthetic task, then uses the system message on realistic, hard-to-optimize tasks. Across three realistic abstractive summarization tasks, we reduce hallucination for two 13B-parameter LLMs using supervision signal from only a synthetic retrieval task. We also find that optimizing the system message rather than the model weights can be critical; fine-tuning the entire model on the synthetic task can counterintuitively _increase_ hallucination. Overall, SynTra demonstrates that the extra flexibility of working with synthetic data can help mitigate undesired behaviors in practice.
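
A faithful reproduction of SynTra requires an LLM and prefix tuning; the toy sketch below substitutes a grid search over discrete system messages, scored by hallucination rate on a synthetic key-value retrieval task. The `summarize` function and its error probabilities are hypothetical stand-ins for model behavior.

```python
# Toy stand-in for SynTra's pipeline (the paper prefix-tunes a soft system
# message; here we grid-search discrete messages, which is a simplification).
import random

random.seed(0)

def make_synthetic_task():
    """Synthetic retrieval task: the 'document' is a key-value list; the model
    must answer with a value that actually appears in context."""
    kv = {f"k{i}": f"v{i}" for i in range(5)}
    query = random.choice(list(kv))
    return kv, query

def summarize(system_msg, kv, query):
    """Hypothetical model: more 'grounded' system messages lower the chance of
    inventing a value not present in the context."""
    p_hallucinate = {"": 0.4, "Answer only from the context.": 0.1,
                     "Be creative.": 0.6}[system_msg]
    return kv[query] if random.random() > p_hallucinate else "made-up"

def hallucination_rate(system_msg, trials=2000):
    errors = 0
    for _ in range(trials):
        kv, query = make_synthetic_task()
        errors += summarize(system_msg, kv, query) != kv[query]
    return errors / trials

candidates = ["", "Answer only from the context.", "Be creative."]
best = min(candidates, key=hallucination_rate)   # optimize on the synthetic task
print("selected system message:", repr(best))    # ...then reuse it downstream
```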

JAIR Journal 2022 Journal Article

Avoiding Negative Side Effects of Autonomous Systems in the Open World

  • Sandhya Saisubramanian
  • Ece Kamar
  • Shlomo Zilberstein

Autonomous systems that operate in the open world often use incomplete models of their environment. Model incompleteness is inevitable due to the practical limitations in precise model specification and data collection about open-world environments. Due to the limited fidelity of the model, agent actions may produce negative side effects (NSEs) when deployed. Negative side effects are undesirable, unmodeled effects of agent actions on the environment. NSEs are inherently challenging to identify at design time and may affect the reliability, usability and safety of the system. We present two complementary approaches to mitigating NSEs: (1) learning from feedback, and (2) environment shaping. The solution approaches target settings with different assumptions and agent responsibilities. In learning from feedback, the agent learns a penalty function associated with an NSE. We investigate the efficiency of different feedback mechanisms, including human feedback and autonomous exploration. The problem is formulated as a multi-objective Markov decision process such that optimizing the agent’s assigned task is prioritized over mitigating NSE. A slack parameter denotes the maximum allowed deviation from the optimal expected reward for the agent’s task in order to mitigate NSE. In environment shaping, we examine how a human can assist an agent, beyond providing feedback, and utilize their broader scope of knowledge to mitigate the impacts of NSE. We formulate the problem as a human-agent collaboration with decoupled objectives. The agent optimizes its assigned task and may produce NSE during its operation. The human assists the agent by performing modest reconfigurations of the environment so as to mitigate the impacts of NSE, without affecting the agent’s ability to complete its assigned task. We present an algorithm for shaping and analyze its properties. Empirical evaluations demonstrate the trade-offs in the performance of different approaches in mitigating NSE in different settings.
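
The learning-from-feedback formulation (restated in the IJCAI 2020 entry below) reduces to lexicographic planning with slack, which a short tabular sketch can make concrete. Everything here is a toy stand-in with assumed transition, reward, and penalty tables, a generic lexicographic-with-slack scheme in the spirit of the formulation rather than the paper's implementation.

```python
# Minimal tabular sketch of lexicographic planning with slack (toy numbers).
# Step 1: optimal primary values. Step 2: restrict each state to actions whose
# primary Q is within `slack` of optimal. Step 3: minimize expected NSE penalty
# over the restricted action sets.
import numpy as np

n_states, n_actions, gamma, slack = 4, 2, 0.9, 0.5
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # transitions
R = rng.uniform(0, 1, (n_states, n_actions))        # primary task reward
C = rng.uniform(0, 1, (n_states, n_actions))        # NSE penalty (to minimize)

def value_iteration(reward, allowed, maximize=True, iters=500):
    V = np.zeros(n_states)
    for _ in range(iters):
        Q = reward + gamma * P @ V                  # Q[s, a]
        Q = np.where(allowed, Q, -np.inf if maximize else np.inf)
        V = Q.max(axis=1) if maximize else Q.min(axis=1)
    return Q, V

all_actions = np.ones((n_states, n_actions), dtype=bool)
Q1, V1 = value_iteration(R, all_actions)            # primary objective
within_slack = Q1 >= (V1[:, None] - slack)          # slack-feasible actions
Q2, V2 = value_iteration(C, within_slack, maximize=False)  # minimize NSE
policy = np.where(within_slack, Q2, np.inf).argmin(axis=1)
print("policy:", policy, "expected NSE penalty:", V2.round(3))
```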

AAAI Conference 2022 Conference Paper

Investigations of Performance and Bias in Human-AI Teamwork in Hiring

  • Andi Peng
  • Besmira Nushi
  • Emre Kiciman
  • Kori Inkpen
  • Ece Kamar

In AI-assisted decision-making, effective hybrid (human-AI) teamwork does not depend on AI performance alone, but also on its impact on human decision-making. While prior work studies the effects of model accuracy on humans, we endeavour here to investigate the complex dynamics of how both a model’s predictive performance and bias may transfer to humans in a recommendation-aided decision task. We consider the domain of ML-assisted hiring, where humans—operating in a constrained selection setting—can choose whether they wish to utilize a trained model’s inferences to help select candidates from written biographies. We conduct a large-scale user study leveraging a re-created dataset of real bios from prior work, where humans predict the ground truth occupation of given candidates with and without the help of three different NLP classifiers (random, bag-of-words, and deep neural network). Our results demonstrate that while high-performance models significantly improve human performance in a hybrid setting, some models mitigate hybrid bias while others accentuate it. We examine these findings through the lens of decision conformity and observe that our model architecture choices have an impact on human-AI conformity and bias, motivating the explicit need to assess these complex dynamics prior to deployment.

AAAI Conference 2021 Conference Paper

Improving the Performance-Compatibility Tradeoff with Personalized Objective Functions

  • Jonathan Martinez
  • Kobi Gal
  • Ece Kamar
  • Levi H. S. Lelis

AI-systems that model and interact with their users can update their models over time to reflect new information and changes in the environment. Although these updates may improve the overall performance of the AI-system, they may actually hurt the performance with respect to individual users. Prior work has studied the tradeoff between improving the system’s performance following an update and the compatibility of the updated system with prior user experience. The more the model is forced to be compatible with a prior version, the higher the loss in performance it will incur. This paper challenges this assumption by showing that by personalizing the loss function to specific users, it is possible to increase the prediction performance of the AI-system while sacrificing less compatibility for these users. Our approach updates the sample weights to reflect their contribution to the compatibility of the model for a particular user following the update. We construct a portfolio of different models that vary in how they personalize the loss function for a user. We select the best model to use for a target user based on a validation set. We apply this approach to three supervised learning tasks commonly used in the human-computer decision-making literature. We show that using our approach leads to significant improvements in the performance-compatibility tradeoff over the non-personalized approach of Bansal et al., achieving up to 300% improvement for certain users. We present several use cases that illustrate the difference between the personalized and non-personalized approach for two of our domains.
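
A rough sketch of the personalization idea, under assumed data and hypothetical user ids: samples the pre-update model got right for a target user are upweighted, several candidates varying the weighting strength form the portfolio, and the best is picked on the user's validation data.

```python
# Sketch of per-user compatibility weighting and model-portfolio selection
# (illustrative, not the paper's code). Samples the old model h1 classified
# correctly, especially for the target user, are upweighted so the update
# keeps them correct for that user.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(600, 5)); y = (X[:, 0] + X[:, 1] > 0).astype(int)
user = rng.integers(0, 3, size=600)                 # hypothetical user ids
X_tr, y_tr, u_tr = X[:400], y[:400], user[:400]
X_val, y_val, u_val = X[400:], y[400:], user[400:]

h1 = LogisticRegression(max_iter=1000).fit(X_tr[:200], y_tr[:200])  # old model
h1_correct = (h1.predict(X_tr) == y_tr)

def train_candidate(target_user, strength):
    """Upweight examples h1 got right, extra-weighting the target user's."""
    w = np.ones(len(X_tr))
    w[h1_correct] += strength * (1 + (u_tr[h1_correct] == target_user))
    return LogisticRegression(max_iter=1000).fit(X_tr, y_tr, sample_weight=w)

def pick_for_user(target_user):
    """Portfolio selection: best validation accuracy on the user's own data
    (the paper's selection criterion is richer; this is a simplification)."""
    mask = u_val == target_user
    portfolio = [train_candidate(target_user, s) for s in (0.0, 1.0, 4.0)]
    return max(portfolio,
               key=lambda m: (m.predict(X_val[mask]) == y_val[mask]).mean())

h2 = pick_for_user(target_user=0)
print("post-update accuracy for user 0:",
      (h2.predict(X_val[u_val == 0]) == y_val[u_val == 0]).mean())
```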

AAAI Conference 2021 Conference Paper

Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork

  • Gagan Bansal
  • Besmira Nushi
  • Ece Kamar
  • Eric Horvitz
  • Daniel S. Weld

AI practitioners typically strive to develop the most accurate systems, making an implicit assumption that the AI system will function autonomously. However, in practice, AI systems often are used to provide advice to people in domains ranging from criminal justice and finance to healthcare. In such AI-advised decision making, humans and machines form a team, where the human is responsible for making final decisions. But is the most accurate AI the best teammate? We argue “not necessarily” — predictable performance may be worth a slight sacrifice in AI accuracy. Instead, we argue that AI systems should be trained in a human-centered manner, directly optimized for team performance. We study this proposal for a specific type of human-AI teaming, where the human overseer chooses to either accept the AI recommendation or solve the task themselves. To optimize the team performance for this setting we maximize the team’s expected utility, expressed in terms of the quality of the final decision, cost of verifying, and individual accuracies of people and machines. Our experiments with linear and non-linear models on real-world, high-stakes datasets show that the most accurate AI may not lead to the highest team performance and show the benefit of modeling teamwork during training through improvements in expected team utility across datasets, considering parameters such as human skill and the cost of mistakes. We discuss the shortcomings of current optimization approaches beyond well-studied loss functions such as log-loss, and encourage future work on AI optimization problems motivated by human-AI collaboration.
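
The team utility for the accept-or-solve setting can be written out directly; the worked toy below (with made-up utilities, not the paper's exact objective) shows how a slightly less accurate but more predictable AI can yield higher expected team utility.

```python
# Worked toy version of expected team utility in the accept-or-solve setting
# (illustrative parameterization). The human either accepts the AI's
# recommendation or pays a cost to solve the task themselves.
def expected_team_utility(p_ai_correct, p_accept_given_correct,
                          p_accept_given_wrong, p_human_correct,
                          reward=1.0, penalty=-2.0, solve_cost=-0.3):
    u_solve = p_human_correct * reward + (1 - p_human_correct) * penalty + solve_cost
    u_when_ai_correct = (p_accept_given_correct * reward
                         + (1 - p_accept_given_correct) * u_solve)
    u_when_ai_wrong = (p_accept_given_wrong * penalty
                       + (1 - p_accept_given_wrong) * u_solve)
    return p_ai_correct * u_when_ai_correct + (1 - p_ai_correct) * u_when_ai_wrong

# A 95%-accurate AI with unpredictable errors (always accepted) scores lower
# than a 90%-accurate AI whose errors the human reliably detects and solves.
print(expected_team_utility(0.95, 1.0, 1.0, 0.85))  # -> 0.85
print(expected_team_utility(0.90, 1.0, 0.0, 0.85))  # -> 0.925
```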

IJCAI Conference 2020 Conference Paper

A Multi-Objective Approach to Mitigate Negative Side Effects

  • Sandhya Saisubramanian
  • Ece Kamar
  • Shlomo Zilberstein

Agents operating in unstructured environments often create negative side effects (NSE) that may not be easy to identify at design time. We examine how various forms of human feedback or autonomous exploration can be used to learn a penalty function associated with NSE during system deployment. We formulate the problem of mitigating the impact of NSE as a multi-objective Markov decision process with lexicographic reward preferences and slack. The slack denotes the maximum deviation from an optimal policy with respect to the agent's primary objective allowed in order to mitigate NSE as a secondary objective. Empirical evaluation of our approach shows that the proposed framework can successfully mitigate NSE and that different feedback mechanisms introduce different biases, which influence the identification of NSE.

JAIR Journal 2020 Journal Article

Blind Spot Detection for Safe Sim-to-Real Transfer

  • Ramya Ramakrishnan
  • Ece Kamar
  • Debadeepta Dey
  • Eric Horvitz
  • Julie Shah

Agents trained in simulation may make errors when performing actions in the real world due to mismatches between training and execution environments. These mistakes can be dangerous and difficult for the agent to discover because the agent is unable to predict them a priori. In this work, we propose the use of oracle feedback to learn a predictive model of these blind spots in order to reduce costly errors in real-world applications. We focus on blind spots in reinforcement learning (RL) that occur due to incomplete state representation: when the agent lacks necessary features to represent the true state of the world, and thus cannot distinguish between numerous states. We formalize the problem of discovering blind spots in RL as a noisy supervised learning problem with class imbalance. Our system learns models for predicting blind spots within unseen regions of the state space by combining techniques for label aggregation, calibration, and supervised learning. These models take into consideration noise emerging from different forms of oracle feedback, including demonstrations and corrections. We evaluate our approach across two domains and demonstrate that it achieves higher predictive performance than baseline methods, and also that the learned model can be used to selectively query an oracle at execution time to prevent errors. We also empirically analyze the biases of various feedback types and how these biases influence the discovery of blind spots. Further, we include analyses of our approach that incorporate relaxed initial optimality assumptions. (Interestingly, relaxing the assumptions of an optimal oracle and an optimal simulator policy helped our models to perform better.) We also propose extensions to our method that are intended to improve performance when using corrections and demonstrations data.
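
A minimal sketch of the pipeline, with standard components standing in for the models in this and the AAMAS 2018 paper below: noisy oracle judgments are aggregated by majority vote, an imbalance-aware classifier is trained, and its probabilities are calibrated so the execution-time query threshold is meaningful. Data and noise rates are invented.

```python
# Sketch of the blind-spot prediction pipeline (standard components standing
# in for the paper's models): aggregate noisy oracle labels per state by
# majority vote, train an imbalance-aware classifier, calibrate probabilities.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.calibration import CalibratedClassifierCV

rng = np.random.default_rng(3)
states = rng.normal(size=(300, 4))                        # state features
is_blind_spot = (states[:, 0] > 1.2)                      # rare true label

# Each state gets several noisy oracle judgments (demonstrations and
# corrections would be converted to labels like this); majority vote.
noisy = np.stack([is_blind_spot ^ (rng.random(300) < 0.2) for _ in range(5)])
y = (noisy.mean(axis=0) > 0.5).astype(int)

base = RandomForestClassifier(class_weight="balanced", random_state=0)
model = CalibratedClassifierCV(base, cv=3).fit(states, y)  # calibrated probs

# At execution time, query the oracle only where blind-spot risk is high.
risk = model.predict_proba(states[:5])[:, 1]
print("query oracle?", risk > 0.5)
```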

IJCAI Conference 2020 Conference Paper

Learning to Complement Humans

  • Bryan Wilder
  • Eric Horvitz
  • Ece Kamar

A rising vision for AI in the open world centers on the development of systems that can complement humans for perceptual, diagnostic, and reasoning tasks. To date, systems aimed at complementing the skills of people have employed models trained to be as accurate as possible in isolation. We demonstrate how an end-to-end learning strategy can be harnessed to optimize the combined performance of human-machine teams by considering the distinct abilities of people and machines. The goal is to focus machine learning on problem instances that are difficult for humans, while recognizing instances that are difficult for the machine and seeking human input on them. We demonstrate in two real-world domains (scientific discovery and medical diagnosis) that human-machine teams built via these methods outperform the individual performance of machines and people. We then analyze conditions under which this complementarity is strongest, and which training methods amplify it. Taken together, our work provides the first systematic investigation of how machine learning systems can be trained to complement human reasoning.
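
The sketch below is a simplification of the end-to-end objective: instances where a (simulated) human is likely to err get higher training weight, and a deferral rule routes instances between the two. `p_human_error` and the human responses are hypothetical stand-ins.

```python
# Minimal sketch of complement-aware training (a simplification of the paper's
# end-to-end objective): instances the human is likely to get wrong receive
# higher training weight, so the machine specializes where it adds value.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
p_human_error = 1 / (1 + np.exp(-2 * X[:, 5]))    # human struggles on a slice
human = np.where(rng.random(500) < p_human_error, 1 - y, y)   # simulated human

solo = LogisticRegression(max_iter=1000).fit(X, y)            # accuracy-only
team = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=p_human_error)

def team_predict(model, X, p_err, threshold=0.5):
    # Route an instance to the machine only where the human is likely to err.
    return np.where(p_err > threshold, model.predict(X), human)

print("solo-trained team accuracy:",
      (team_predict(solo, X, p_human_error) == y).mean())
print("complement-trained team accuracy:",
      (team_predict(team, X, p_human_error) == y).mean())
```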

AAAI Conference 2020 Short Paper

Supervised Discovery of Unknown Unknowns through Test Sample Mining (Student Abstract)

  • Zheng Wang
  • Bruno Abrahao
  • Ece Kamar

Given a fixed hypothesis space, defined to model class structure in a particular domain of application, unknown unknowns (u.u.s) are data examples that form classes in the feature space whose structure is not represented in a trained model. Accordingly, this leads to incorrect class prediction with high confidence, which represents one of the major sources of blind spots in machine learning. Our method seeks to reduce the structural mismatch between the training model and that of the target space in a supervised way. We illuminate further structure through cross-validation on a modified training model, set up to mine and trap u.u.s in a marginal training class created from examples of a random sample of the test set. Contrary to previous approaches, our method simplifies the solution, as it does not rely on budgeted queries to an oracle whose outcomes inform adjustments to training. In addition, our empirical results exhibit consistent performance improvements over baselines, on both synthetic and real-world data sets.
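
One way to picture the trap-class construction (illustrative, not the authors' code): a random sample of unlabeled test points is added to training as a marginal class, and test points later assigned to that class are flagged as candidate u.u.s.

```python
# Sketch of the trap-class idea on invented data: a random sample of unlabeled
# test points is added as a marginal class; test examples later assigned to
# that class are candidate unknown unknowns (u.u.s).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
# Training data covers two known classes; the test set also contains a third,
# unrepresented cluster (the source of unknown unknowns).
X_train = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y_train = np.array([0] * 100 + [1] * 100)
X_test = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2)),
                    rng.normal([0, 5], 1, (50, 2))])   # third, unseen cluster

TRAP = 2                                              # marginal class label
sample = X_test[rng.choice(len(X_test), 30, replace=False)]
X_aug = np.vstack([X_train, sample])
y_aug = np.concatenate([y_train, np.full(30, TRAP)])

model = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
flagged = model.predict(X_test) == TRAP               # candidate u.u.s
print("flagged as possible unknown unknowns:", flagged.sum(), "of", len(X_test))
```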

AAAI Conference 2019 Conference Paper

Overcoming Blind Spots in the Real World: Leveraging Complementary Abilities for Joint Execution

  • Ramya Ramakrishnan
  • Ece Kamar
  • Besmira Nushi
  • Debadeepta Dey
  • Julie Shah
  • Eric Horvitz

Simulators are being increasingly used to train agents before deploying them in real-world environments. While training in simulation provides a cost-effective way to learn, poorly modeled aspects of the simulator can lead to costly mistakes, or blind spots. While humans can help guide an agent towards identifying these error regions, humans themselves have blind spots and noise in execution. We study how learning about blind spots of both can be used to manage hand-off decisions when humans and agents jointly act in the real world, in which neither of them is trained or evaluated fully. The formulation assumes that agent blind spots result from representational limitations in the simulation world, which leads the agent to ignore important features that are relevant for acting in the open world. Our approach for blind spot discovery combines experiences collected in simulation with limited human demonstrations. The first step applies imitation learning to demonstration data to identify important features that the human is using but that the agent is missing. The second step uses noisy labels extracted from action mismatches between the agent and the human across simulation and demonstration data to train blind spot models. We show through experiments on two domains that our approach is able to learn a succinct representation that accurately captures blind spot regions and avoids dangerous errors in the real world through transfer of control between the agent and the human.

AAAI Conference 2019 Conference Paper

Updates in Human-AI Teams: Understanding and Addressing the Performance/Compatibility Tradeoff

  • Gagan Bansal
  • Besmira Nushi
  • Ece Kamar
  • Daniel S. Weld
  • Walter S. Lasecki
  • Eric Horvitz

AI systems are being deployed to support human decision making in high-stakes domains such as healthcare and criminal justice. In many cases, the human and AI form a team, in which the human makes decisions after reviewing the AI’s inferences. A successful partnership requires that the human develops insights into the performance of the AI system, including its failures. We study the influence of updates to an AI system in this setting. While updates can increase the AI’s predictive performance, they may also lead to behavioral changes that are at odds with the user’s prior experiences and confidence in the AI’s inferences. We show that updates that increase AI performance may actually hurt team performance. We introduce the notion of the compatibility of an AI update with prior user experience and present methods for studying the role of compatibility in human-AI teams. Empirical results on three high-stakes classification tasks show that current machine learning algorithms do not produce compatible updates. We propose a re-training objective to improve the compatibility of an update by penalizing new errors. The objective offers full leverage of the performance/compatibility tradeoff across different datasets, enabling more compatible yet accurate updates.
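
Compatibility and the penalized retraining objective reduce to short formulas; in the sketch below (toy data, with sample weighting as an approximation of the paper's dissonance penalty), sweeping the penalty strength traces the performance/compatibility tradeoff.

```python
# Sketch of update compatibility and the penalized retraining idea
# (sample weighting approximates the paper's penalty on new errors; toy data).
import numpy as np
from sklearn.linear_model import LogisticRegression

def compatibility(h1_pred, h2_pred, y):
    """Fraction of examples h1 got right that the update h2 still gets right."""
    h1_right = h1_pred == y
    return (h2_pred[h1_right] == y[h1_right]).mean()

rng = np.random.default_rng(6)
X = rng.normal(size=(800, 5)); y = (X[:, 0] - X[:, 1] > 0).astype(int)
h1 = LogisticRegression(max_iter=1000).fit(X[:200], y[:200])   # old model

for lam in (0.0, 2.0, 8.0):                               # penalty strength
    w = 1.0 + lam * (h1.predict(X) == y)                  # penalize new errors
    h2 = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=w)
    print(f"lambda={lam}: acc={(h2.predict(X) == y).mean():.3f}, "
          f"compat={compatibility(h1.predict(X), h2.predict(X), y):.3f}")
```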

AAMAS Conference 2018 Conference Paper

Discovering Blind Spots in Reinforcement Learning

  • Ramya Ramakrishnan
  • Ece Kamar
  • Debadeepta Dey
  • Julie Shah
  • Eric Horvitz

Agents trained in simulation may make errors in the real world due to mismatches between training and execution environments. These mistakes can be dangerous and difficult to discover because the agent cannot predict them a priori. We propose using oracle feedback to learn a predictive model of these blind spots to reduce costly errors in real-world applications. We focus on blind spots in reinforcement learning (RL) that occur due to incomplete state representation: The agent does not have the appropriate features to represent the true state of the world and thus cannot distinguish among numerous states. We formalize the problem of discovering blind spots in RL as a noisy supervised learning problem with class imbalance. We learn models to predict blind spots in unseen regions of the state space by combining techniques for label aggregation, calibration, and supervised learning. The models take into consideration noise emerging from different forms of oracle feedback, including demonstrations and corrections. We evaluate our approach on two domains and show that it achieves higher predictive performance than baseline methods, and that the learned model can be used to selectively query an oracle at execution time to prevent errors. We also empirically analyze the biases of various feedback types and how they influence the discovery of blind spots.

IJCAI Conference 2018 Conference Paper

Evaluating and Complementing Vision-to-Language Technology for People who are Blind with Conversational Crowdsourcing

  • Elliot Salisbury
  • Ece Kamar
  • Meredith Ringel Morris

We study how real-time crowdsourcing can be used both for evaluating the value provided by existing automated approaches and for enabling workflows that provide scalable and useful alt text to blind users. We show that the shortcomings of existing AI image captioning systems frequently hinder a user's understanding of an image they cannot see to a degree that even clarifying conversations with sighted assistants cannot correct. Based on analysis of clarifying conversations collected from our studies, we design experiences that can effectively assist users in a scalable way without the need for real-time interaction. Our results provide lessons and guidelines that the designers of future AI captioning systems can use to improve labeling of social media imagery for blind users.

AAAI Conference 2018 Conference Paper

Optimizing Interventions via Offline Policy Evaluation: Studies in Citizen Science

  • Avi Segal
  • Kobi Gal
  • Ece Kamar
  • Eric Horvitz
  • Grant Miller

Volunteers who help with online crowdsourcing such as citizen science tasks typically make only a few contributions before exiting. We propose a computational approach for increasing users’ engagement in such settings that is based on optimizing policies for displaying motivational messages to users. The approach, which we refer to as Trajectory Corrected Intervention (TCI), reasons about the tradeoff between the long-term influence of engagement messages on participants’ contributions and the potential risk of disrupting their current work. We combine model-based reinforcement learning with offline policy evaluation to generate intervention policies, without relying on a fixed representation of the domain. TCI works iteratively to learn the best representation from a set of random intervention trials and to generate candidate intervention policies. It is able to refine selected policies offline by exploiting the fact that users can only be interrupted once per session. We implemented TCI in the wild with Galaxy Zoo, one of the largest citizen science platforms on the web. We found that TCI was able to outperform the state-of-the-art intervention policy for this domain, and significantly increased the contributions of thousands of users. This work demonstrates the benefit of combining traditional AI planning with offline policy methods to generate intelligent intervention strategies.

RLDM Conference 2017 Conference Abstract

Directions in Hybrid Intelligence: Complementing AI Systems with Human Intelligence

  • Ece Kamar

Historically, a common goal for the development of AI systems has been exhibiting intelligent behaviors that humans excel at. Despite advances in AI, machines still have limitations in accomplishing tasks that come naturally to humans. In this talk, I will argue that hybrid systems that combine the strengths of machine and human intelligence are key to overcoming the limitations of AI algorithms and developing reliable systems. I will provide an overview of three projects, which investigate how to integrate human intelligence into the training, execution and troubleshooting of AI systems. I will highlight the novel decision-making challenges these projects introduce and discuss techniques for addressing them. I will conclude the talk by discussing opportunities for inter-disciplinary research in this space and future directions.

AAAI Conference 2017 Conference Paper

Identifying Unknown Unknowns in the Open World: Representations and Policies for Guided Exploration

  • Himabindu Lakkaraju
  • Ece Kamar
  • Rich Caruana
  • Eric Horvitz

Predictive models deployed in the real world may assign incorrect labels to instances with high confidence. Such errors or unknown unknowns are rooted in model incompleteness, and typically arise because of the mismatch between training data and the cases encountered at test time. As the models are blind to such errors, input from an oracle is needed to identify these failures. In this paper, we formulate and address the problem of informed discovery of unknown unknowns of any given predictive model where unknown unknowns occur due to systematic biases in the training data. We propose a model-agnostic methodology which uses feedback from an oracle to both identify unknown unknowns and to intelligently guide the discovery. We employ a two-phase approach which first organizes the data into multiple partitions based on the feature similarity of instances and the confidence scores assigned by the predictive model, and then utilizes an explore-exploit strategy for discovering unknown unknowns across these partitions. We demonstrate the efficacy of our framework by varying the underlying causes of unknown unknowns across various applications. To the best of our knowledge, this paper presents the first algorithmic approach to the problem of discovering unknown unknowns of predictive models.
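
The explore-exploit phase can be pictured as a bandit over partitions; below, UCB1 stands in for the paper's strategy, with assumed per-partition discovery rates playing the role of the oracle.

```python
# Sketch of the explore-exploit phase (UCB1 standing in for the paper's
# strategy): arms are data partitions; a pull queries the oracle on one
# instance and pays off when it uncovers a confident misprediction (a u.u.).
import math, random

random.seed(0)
# Hypothetical per-partition probability that a queried instance is an unknown
# unknown; in the real system these are unknown and induced by clustering.
partition_uu_rate = [0.02, 0.10, 0.30, 0.05]

def query_oracle(partition):
    return random.random() < partition_uu_rate[partition]  # found a u.u.?

pulls = [0] * 4
hits = [0] * 4
for t in range(1, 501):
    # UCB1: prefer partitions with high discovery rate or little exploration.
    scores = [(hits[a] / pulls[a]) + math.sqrt(2 * math.log(t) / pulls[a])
              if pulls[a] else float("inf") for a in range(4)]
    arm = scores.index(max(scores))
    pulls[arm] += 1
    hits[arm] += query_oracle(arm)

print("pulls per partition:", pulls)   # concentrates on error-rich partitions
print("unknown unknowns found:", sum(hits))
```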

AAAI Conference 2017 Conference Paper

On Human Intellect and Machine Failures: Troubleshooting Integrative Machine Learning Systems

  • Besmira Nushi
  • Ece Kamar
  • Eric Horvitz
  • Donald Kossmann

We study the problem of troubleshooting machine learning systems that rely on analytical pipelines of distinct components. Understanding and fixing errors that arise in such integrative systems is difficult as failures can occur at multiple points in the execution workflow. Moreover, errors can propagate, become amplified or be suppressed, making blame assignment difficult. We propose a human-in-the-loop methodology which leverages human intellect for troubleshooting system failures. The approach simulates potential component fixes through human computation tasks and measures the expected improvements in the holistic behavior of the system. The method provides guidance to designers about how they can best improve the system. We demonstrate the effectiveness of the approach on an automated image captioning system that has been pressed into real-world use.

IJCAI Conference 2016 Conference Paper

Directions in Hybrid Intelligence: Complementing AI Systems with Human Intelligence

  • Ece Kamar

Hybrid intelligence systems combine machine and human intelligence to overcome the shortcomings of existing AI systems. This paper reviews recent research efforts towards developing hybrid systems focusing on reasoning methods for optimizing access to human intelligence and on gaining comprehensive understanding of humans as helpers of AI systems. It concludes by discussing short- and long-term research directions.

IJCAI Conference 2016 Conference Paper

Interactive Teaching Strategies for Agent Training

  • Ofra Amir
  • Ece Kamar
  • Andrey Kolobov
  • Barbara J. Grosz

Agents learning how to act in new environments can benefit from input from more experienced agents or humans. This paper studies interactive teaching strategies for identifying when a student can benefit from teacher-advice in a reinforcement learning framework. In student-teacher learning, a teacher agent can advise the student on which action to take. Prior work has considered heuristics for the teacher to choose advising opportunities. While these approaches effectively accelerate agent training, they assume that the teacher constantly monitors the student. This assumption may not be satisfied with human teachers, as people incur cognitive costs of monitoring and might not always pay attention. We propose strategies for a teacher and a student to jointly identify advising opportunities so that the teacher is not required to constantly monitor the student. Experimental results show that these approaches reduce the amount of attention required from the teacher compared to teacher-initiated strategies, while maintaining similar learning gains. The empirical evaluation also investigates the effect of the information communicated to the teacher and the quality of the student's initial policy on teaching outcomes.
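
A sketch of a jointly initiated strategy under toy Q-tables: the student asks only when its top actions are nearly tied, and the queried teacher advises only when its own action gap marks the state as important. Thresholds and tables are invented, heuristics in the spirit of the paper rather than its exact strategies.

```python
# Sketch of jointly initiated advising with toy Q-tables: the student asks
# when its Q-values are nearly tied (uncertainty), and the queried teacher
# advises when its own Q-gap says the state is important; otherwise the
# student acts alone and the teacher's attention is spared.
import numpy as np

rng = np.random.default_rng(7)
q_student = rng.uniform(0, 1, (10, 4))       # hypothetical student Q-table
q_teacher = rng.uniform(0, 1, (10, 4))       # hypothetical teacher Q-table

ASK_EPS, IMPORTANCE = 0.15, 0.5              # invented thresholds

def act(state):
    qs = q_student[state]
    uncertain = qs.max() - np.sort(qs)[-2] < ASK_EPS     # near-tied top actions
    if uncertain:                                        # student asks...
        qt = q_teacher[state]
        if qt.max() - qt.min() > IMPORTANCE:             # ...teacher deems the
            return int(qt.argmax()), "advised"           #    state important
    return int(qs.argmax()), "alone"

for s in range(10):
    print(s, *act(s))
```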

IJCAI Conference 2016 Conference Paper

Intervention Strategies for Increasing Engagement in Crowdsourcing: Platform, Predictions, and Experiments

  • Avi Segal
  • Ya'akov (Kobi) Gal
  • Ece Kamar
  • Eric Horvitz
  • Alex Bowyer
  • Grant Miller

Volunteer-based crowdsourcing depend critically on maintaining the engagement of participants. We explore a methodology for extending engagement in citizen science by combining machine learning with intervention design. We first present a platform for using real-time predictions about forthcoming disengagement to guide interventions. Then we discuss a set of experiments with delivering different messages to users based on the proximity to the predicted time of disengagement. The messages address motivational factors that were found in prior studies to influence users' engagements. We evaluate this approach on Galaxy Zoo, one of the largest citizen science application on the web, where we traced the behavior and contributions of thousands of users who received intervention messages over a period of a few months. We found sensitivity of the amount of user contributions to both the timing and nature of the message. Specifically, we found that a message emphasizing the helpfulness of individual users significantly increased users' contributions when delivered ac- cording to predicted times of disengagement, but not when delivered at random times. The influence of the message on users' contributions was more pronounced as additional user data was collected and made available to the classifier.

IJCAI Conference 2015 Conference Paper

Metareasoning for Planning Under Uncertainty

  • Christopher H. Lin
  • Andrey Kolobov
  • Ece Kamar
  • Eric Horvitz

The conventional model for online planning under uncertainty assumes that an agent can stop and plan without incurring costs for the time spent planning. However, planning time is not free in most real-world settings. For example, an autonomous drone is subject to nature’s forces, like gravity, even while it thinks, and must either pay a price for counteracting these forces to stay in place, or grapple with the state change caused by acquiescing to them. Policy optimization in these settings requires metareasoning—a process that trades off the cost of planning and the potential policy improvement that can be achieved. We formalize and analyze the metareasoning problem for Markov Decision Processes (MDPs). Our work subsumes previously studied special cases of metareasoning and shows that in the general case, metareasoning is at most polynomially harder than solving MDPs with any given algorithm that disregards the cost of thinking. For reasons we discuss, optimal general metareasoning turns out to be impractical, motivating approximations. We present approximate metareasoning procedures which rely on special properties of the BRTDP planning algorithm and explore the effectiveness of our methods on a variety of problems.
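
Stripped to a myopic special case (much simpler than the paper's general formulation), the tradeoff becomes a one-line stopping rule: keep deliberating while the expected policy improvement of another step exceeds the cost of that step. The returns curve below is assumed.

```python
# Toy illustration of the metareasoning tradeoff (a myopic stopping rule, far
# simpler than the paper's MDP formulation): deliberation improves the policy
# with diminishing returns, but every thinking step costs the agent.
def policy_value(steps):
    return 10.0 * (1 - 0.8 ** steps)      # hypothetical diminishing returns

COST_PER_STEP = 0.4

steps = 0
while policy_value(steps + 1) - policy_value(steps) > COST_PER_STEP:
    steps += 1                             # keep thinking: improvement > cost

net = policy_value(steps) - COST_PER_STEP * steps
print(f"plan for {steps} steps; net utility {net:.2f}")
```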

AAAI Conference 2014 Conference Paper

Signals in the Silence: Models of Implicit Feedback in a Recommendation System for Crowdsourcing

  • Christopher Lin
  • Ece Kamar
  • Eric Horvitz

We exploit the absence of signals as informative observations in the context of providing task recommendations in crowdsourcing. Workers on crowdsourcing platforms do not provide explicit ratings about tasks. We present methods that enable a system to leverage implicit signals about task preferences. These signals include types of tasks that have been available and have been displayed, and the number of tasks workers select and complete. In contrast to previous work, we present a general model that can represent both positive and negative implicit signals. We introduce algorithms that can learn these models without exceeding the computational complexity of existing approaches. Finally, using data from a high-throughput crowdsourcing platform, we show that reasoning about both positive and negative implicit feedback can improve the quality of task recommendations.

AAAI Conference 2014 Conference Paper

Stochastic Privacy

  • Adish Singla
  • Eric Horvitz
  • Ece Kamar
  • Ryen White

Online services such as web search and e-commerce applications typically rely on the collection of data about users, including details of their activities on the web. Such personal data is used to maximize revenues via targeting of advertisements and longer engagements of users, and to enhance the quality of service via personalization of content. To date, service providers have largely followed the approach of either requiring or requesting consent for collecting user data. Users may be willing to share private information in return for incentives, enhanced services, or assurances about the nature and extent of the logged data. We introduce stochastic privacy, an approach to privacy centering on the simple concept of providing people with a guarantee that the probability that their personal data will be shared does not exceed a given bound. Such a probability, which we refer to as the privacy risk, can be given by users as a preference or communicated as a policy by a service provider. Service providers can work to personalize and to optimize revenues in accordance with preferences about privacy risk. We present procedures, proofs, and an overall system for maximizing the quality of services, while respecting bounds on privacy risk. We demonstrate the methodology with a case study and evaluation of the procedures applied to web search personalization. We show how we can achieve near-optimal utility of accessing information with provable guarantees on the probability of sharing data.
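
The core guarantee is easy to demonstrate: whatever the optimization does, user i's data may enter the shared pool only with probability at most their stated risk bound. A toy sampler (not the paper's procedures):

```python
# Toy sketch of the stochastic-privacy guarantee: user i's log data may be
# shared only with probability <= r_i, the privacy risk they agreed to,
# regardless of how valuable the data is to the service.
import random

random.seed(0)
users = [{"id": i,
          "risk": random.choice([0.01, 0.05, 0.2]),      # agreed privacy risk
          "utility": random.uniform(0, 1)}                # value to the service
         for i in range(10000)]

def select_for_sharing(users):
    shared = []
    for u in users:
        # Inclusion is randomized: even high-utility data is sampled with
        # probability at most u["risk"], so the per-user bound always holds.
        if random.random() < u["risk"]:
            shared.append(u)
    return shared

shared_ids = {u["id"] for u in select_for_sharing(users)}
for r in (0.01, 0.05, 0.2):
    group = [u for u in users if u["risk"] == r]
    rate = sum(u["id"] in shared_ids for u in group) / len(group)
    print(f"risk bound {r}: empirical share rate {rate:.3f}")
```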

IJCAI Conference 2013 Conference Paper

Lifelong Learning for Acquiring the Wisdom of the Crowd

  • Ece Kamar
  • Ashish Kapoor
  • Eric Horvitz

Predictive models play a key role for inference and decision making in crowdsourcing. We present methods that can be used to guide the collection of data for enhancing the competency of such predictive models while using the models to provide a base crowdsourcing service. We focus on the challenge of ideally balancing the goals of collecting data over time for learning and for improving task performance with the cost of workers’ contributions over the lifetime of the operation of a system. We introduce the use of distributions over a set of predictive models to represent uncertainty about the dynamics of the world. We employ a novel Monte Carlo algorithm to reason simultaneously about uncertainty about the world dynamics and the progression of task solution as workers are hired over time to optimize hiring decisions. We evaluate the methodology with experiments on a challenging citizen-science problem, demonstrating how it balances exploration and exploitation over the lifetime of a crowdsourcing system.

IJCAI Conference 2013 Conference Paper

Look versus Leap: Computing Value of Information with High-Dimensional Streaming Evidence

  • Stephanie Rosenthal
  • Dan Bohus
  • Ece Kamar
  • Eric Horvitz

A key decision facing autonomous systems with access to streams of sensory data is whether to act based on current evidence or to wait for additional information that might enhance the utility of taking an action. Computing the value of information is particularly difficult with streaming high-dimensional sensory evidence. We describe a belief projection approach to reasoning about information value in these settings, using models for inferring future beliefs over states given streaming evidence. These belief projection models can be learned from data or constructed via direct assessment of parameters and they fit naturally in modular, hierarchical state inference architectures. We describe principles of using belief projection and present results drawn from an implementation of the methodology within a conversational system.
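
A worked toy instance of the look-versus-leap comparison, with invented utilities and a two-outcome projection model (projected beliefs must average back to the current belief):

```python
# Worked toy instance of the look-versus-leap decision (numbers are made up):
# act now under the current belief, or wait for one more observation whose
# effect on the belief is given by a projection model.
import numpy as np

belief = np.array([0.6, 0.4])          # P(state) for two hidden states
U = np.array([[ 1.0, -2.0],            # U[action, state]
              [-0.5,  0.5]])
WAIT_COST = 0.05

eu_now = (U @ belief).max()            # best expected utility acting now

# Belief-projection model: predicted distribution over next beliefs; the
# projected beliefs average back to the current belief (4/7*0.9 + 3/7*0.2 = 0.6).
projected = [(4/7, np.array([0.9, 0.1])),    # (probability, future belief)
             (3/7, np.array([0.2, 0.8]))]

eu_wait = sum(p * (U @ b).max() for p, b in projected) - WAIT_COST
print(f"act now: {eu_now:.3f}  wait: {eu_wait:.3f}")
print("decision:", "wait for more evidence" if eu_wait > eu_now else "act now")
```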

AIJ Journal 2013 Journal Article

Modeling information exchange opportunities for effective human–computer teamwork

  • Ece Kamar
  • Ya'akov (Kobi) Gal
  • Barbara J. Grosz

This paper studies information exchange in collaborative group activities involving mixed networks of people and computer agents. It introduces the concept of “nearly decomposable” decision-making problems to address the complexity of information exchange decisions in such multi-agent settings. This class of decision-making problems arises in settings that have an action structure requiring agents to reason about only a subset of their partners’ actions – but otherwise allowing them to act independently. The paper presents a formal model of nearly decomposable decision-making problems, NED-MDPs, and defines an approximation algorithm, NED-DECOP, that computes efficient information exchange strategies. The paper shows that NED-DECOP is more efficient than prior collaborative planning algorithms for this class of problem. It presents an empirical study of the information exchange decisions made by the algorithm that investigates the extent to which people accept interruption requests from a computer agent. The context for the study is a game in which the agent can ask people for information that may benefit its individual performance and thus the group’s collaboration. This study revealed the key factors affecting people’s perception of the benefit of interruptions in this setting. The paper also describes the use of machine learning to predict the situations in which people deviate from the strategies generated by the algorithm, using a combination of domain features and features informed by the algorithm. The methodology followed in this work could form the basis for designing agents that effectively exchange information in collaborations with people.

AAMAS Conference 2012 Conference Paper

Combining Human and Machine Intelligence in Large-scale Crowdsourcing

  • Ece Kamar
  • Severin Hacker
  • Eric Horvitz

We show how machine learning and inference can be harnessed to leverage the complementary strengths of humans and computational agents to solve crowdsourcing tasks. We construct a set of Bayesian predictive models from data and describe how the models operate within an overall crowdsourcing architecture that combines the efforts of people and machine vision on the task of classifying celestial bodies defined within a citizen science project named Galaxy Zoo. We show how learned probabilistic models can be used to fuse human and machine contributions and to predict the behaviors of workers. We employ multiple inferences in concert to guide decisions on hiring and routing workers to tasks so as to maximize the efficiency of large-scale crowdsourcing processes based on expected utility.
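
The fusion step can be sketched as a naive-Bayes style posterior, a simplification of the paper's learned models: a machine-vision prior is multiplied by per-worker vote likelihoods taken from assumed confusion matrices.

```python
# Sketch of Bayesian label fusion (naive-Bayes style, a simplification of the
# paper's learned models): a machine-vision prior over the true class is
# combined with worker votes weighted by per-worker confusion matrices.
import numpy as np

machine_prior = np.array([0.7, 0.3])       # P(true class) from machine vision

# confusion[w][true, reported]: hypothetical per-worker reliability.
confusion = [np.array([[0.9, 0.1], [0.2, 0.8]]),
             np.array([[0.6, 0.4], [0.4, 0.6]])]    # a noisier worker
votes = [0, 1]                                      # each worker's report

posterior = machine_prior.copy()
for w, vote in zip(confusion, votes):
    posterior *= w[:, vote]                # likelihood of this vote per class
posterior /= posterior.sum()
print("P(true class | machine + crowd):", posterior.round(3))
```

Posteriors like this are then what feed the hiring and routing decisions: whether the expected utility of one more worker vote justifies its cost.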

AAMAS Conference 2011 Conference Paper

Jogger: Models for Context-Sensitive Reminding

  • Ece Kamar
  • Eric Horvitz

We describe research on principles of context-sensitive reminding that show promise for serving in systems that work to jog people's memories about information that they may forget. The methods center on the construction and use of a set of distinct probabilistic models that predict (1) items that may be forgotten, (2) the expected relevance of the items in a situation, and (3) the cost of interruption associated with alerting about a reminder. We describe the use of this set of models in the Jogger prototype that employs predictions and decision-theoretic optimization to compute the value of reminders about meetings.

IJCAI Conference 2009 Conference Paper

Collaboration and Shared Plans in the Open World: Studies of Ridesharing

  • Ece Kamar
  • Eric Horvitz

We develop and test computational methods for guiding collaboration that demonstrate how shared plans can be created in real-world settings, where agents can be expected to have diverse and varying goals, preferences, and availabilities. The methods are motivated and evaluated in the realm of ridesharing, using GPS logs of commuting data. We consider challenges with coordination among self-interested people aimed at minimizing the cost of transportation and the impact of travel on the environment. We present planning, optimization, and payment mechanisms that provide fair and efficient solutions to the rideshare collaboration challenge. We evaluate different VCG-based payment schemes in terms of their computational efficiency, budget balance, incentive compatibility, and strategy proofness. We present the behavior and analyses provided by the ABC ridesharing prototype system. The system learns about destinations and preferences from GPS traces and calendars, and considers time, fuel, environmental, and cognitive costs. We review how ABC generates rideshare plans from hundreds of real-life GPS traces collected from a community of commuters and reflect about the promise of employing the ABC methods to reduce the number of vehicles on the road, thus reducing CO2 emissions and fuel expenditures.
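
The payment logic can be illustrated with a minimal VCG computation over toy rider values and trip costs (brute force over allocations; not the ABC system's mechanism): each chosen rider pays the externality they impose on the others.

```python
# Minimal VCG sketch for a rideshare-style allocation (toy values and costs):
# choose the welfare-maximizing set of riders, then charge each chosen rider
# the externality they impose on everyone else.
from itertools import combinations

values = {"a": 6.0, "b": 4.0, "c": 2.0}      # hypothetical value of the ride
def trip_cost(riders):                        # cost grows with detours
    return 0.0 if not riders else 3.0 + 1.5 * len(riders)

def best_welfare(agents):
    best, best_set = 0.0, ()
    for r in range(len(agents) + 1):
        for s in combinations(agents, r):
            w = sum(values[i] for i in s) - trip_cost(s)
            if w > best:
                best, best_set = w, s
    return best, best_set

welfare, chosen = best_welfare(tuple(values))
for i in chosen:
    others = tuple(a for a in values if a != i)
    without_i, _ = best_welfare(others)
    with_i = welfare - values[i]              # others' welfare in the outcome
    print(f"rider {i} pays VCG payment {without_i - with_i:.2f}")
```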

AAMAS Conference 2009 Conference Paper

Incorporating Helpful Behavior into Collaborative Planning

  • Ece Kamar
  • Ya'akov Gal
  • Barbara J. Grosz

This paper considers the design of agent strategies for deciding whether to help other members of a group with whom an agent is engaged in a collaborative activity. Three characteristics of collaborative planning must be addressed by these decision-making strategies: agents may have only partial information about their partners’ plans for sub-tasks of the collaborative activity; the effectiveness of helping may not be known a priori; and, helping actions have some associated cost. The paper proposes a novel probabilistic representation of other agents’ beliefs about the recipes selected for their own or for the group activity, given partial information. This representation is compact, and thus makes reasoning about helpful behavior tractable. The paper presents a decision-theoretic mechanism that uses this representation to make decisions about two kinds of helpful actions: communicating information relevant to a partner’s plans for some sub-action, and adding domain actions that are helpful to other agent(s) into the collaborative plan. This mechanism includes a set of rules for reasoning about the utility of helpful actions and the cost incurred by doing them. It was tested using a multi-agent test-bed with configurations that varied agents’ uncertainty about the world, their uncertainty about each other’s capabilities or resources, and the cost of helpful behavior. In all cases, agents using the decision-theoretic mechanism to decide whether to help outperformed agents using purely axiomatic rules.

AAMAS Conference 2008 Conference Paper

Mobile Opportunistic Commerce: Mechanisms, Architecture, and Application

  • Ece Kamar
  • Eric Horvitz
  • Chris Meek

We present mechanisms, architectures, and an implementation addressing challenges with mobile opportunistic commerce centering on markets and mechanisms that support the procurement of goods and services in mobile settings. Our efforts seek to extend core concepts from research in electronic commerce to interactions between mobile buyers and brick and mortar businesses that have geographically situated retail offices. We focus on efficient mechanisms, infrastructure, and automation that can enable sellers and buyers to take joint advantage of the relationship of the locations of retail offices to the routes of mobile buyers who may have another primary destination. The methods promote automated vigilance about opportunities to buy and sell, and to support negotiations on the joint value to buyers and sellers including buyers’ costs of divergence from their original paths to acquire services and commodities. We extend prior work on auction mechanisms to personal procurement settings by analyzing the dynamics of the cost to buyers based on preexisting plans, location, and overall context. We present mechanisms for auctions in single item, combinatorial, and multiattribute settings that take into consideration personal inconvenience costs within timesensitive dynamic markets and challenges with privacy and fairness.