Arrow Research search

Author name cluster

Walter Lasecki

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
1 author row

Possible papers

RLDM Conference 2017 Conference Abstract

Connecting Instructors, Learning Scientists, and Reinforcement Learning Researchers via Collaborative Dynamic Personalized Experimentation

  • Joseph Williams
  • Anna Rafferty
  • Andrew Ang
  • Dustin Tingley
  • Walter Lasecki
  • Juho Kim

The shift to digital educational resources provides new opportunities to advance psychology and education research, in tandem with improving instruction using theory and data, by using reinforcement learning to conduct dynamic experiments and turn results into real-time improvements to online resources. To realize this potential, this paper explores how randomized experiments can support mutually beneficial instructor-researcher collaborations. We developed the Collaborative Dynamic Experimentation (CDE) framework to address two key tensions. To enable researchers to embed experiments in online lessons while maintaining instructors’ editorial control, collaborative experiment authoring is needed. To enable instructors to use data for rapid improvement while maintaining statistically valid data for researchers, we apply the Thompson Sampling algorithm for bandits. We worked with an on-campus instructor to implement a proof-of-concept CDE system to experiment within their online calculus quizzes. The qualitative results from this deployment provided insight into how the CDE framework can facilitate alignment of research and practice. To enable this approach to be applied beyond education to any online experiment, we present a software requirements specification for implementing digital experiments, which provides an abstraction for using reinforcement learning algorithms to adapt experiments in real time. This provides data structures and APIs that enable the policy for which experimental conditions are assigned to a user to be dynamically modified, in order to trade off exploration with exploitation (giving the best conditions, personalizing delivery of conditions). The conditions of an experiment correspond to an action space (which can be dynamically expanded via API, allowing algorithms for infinitely armed bandits), the dependent measures to reward functions, and the characteristics of users to contextual variables (bandits) or a state space (MDPs, POMDPs).
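As a rough illustration of the Thompson Sampling assignment policy this abstract mentions, the Python sketch below models each experimental condition as a bandit arm with a Beta posterior over a binary reward. It is not the CDE system's implementation: the class name `BetaThompsonExperiment`, its methods, the uniform Beta(1, 1) prior, and the binary reward are all assumptions made for the example.

```python
import random


class BetaThompsonExperiment:
    """Hypothetical sketch: Beta-Bernoulli Thompson Sampling over conditions."""

    def __init__(self, conditions, seed=None):
        self._rng = random.Random(seed)
        # Assumed uniform Beta(1, 1) prior per condition, stored as [alpha, beta].
        self._params = {c: [1.0, 1.0] for c in conditions}

    def add_condition(self, condition):
        """Expand the action space while the experiment is running."""
        self._params.setdefault(condition, [1.0, 1.0])

    def assign(self):
        """Draw one success rate per condition from its posterior; pick the largest."""
        draws = {
            c: self._rng.betavariate(a, b) for c, (a, b) in self._params.items()
        }
        return max(draws, key=draws.get)

    def record(self, condition, reward):
        """Update a condition's posterior with a binary reward (1 or 0)."""
        if reward not in (0, 1):
            raise ValueError("reward must be 0 or 1")
        # A success increments alpha (index 0); a failure increments beta (index 1).
        self._params[condition][0 if reward else 1] += 1.0

    def posterior_mean(self, condition):
        """Return the posterior mean success rate, alpha / (alpha + beta)."""
        a, b = self._params[condition]
        return a / (a + b)
```

A caller would alternate `assign()` (choose a condition for the next user) with `record(condition, reward)` (report the dependent measure). Conditions with better observed outcomes are assigned more often, while uncertain ones still receive occasional traffic, which is the exploration-exploitation trade-off the abstract describes. `add_condition` mirrors the idea of expanding the action space through an API call.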

AAAI Conference 2013 Conference Paper

Crowd Formalization of Action Conditions

  • Walter Lasecki
  • Leon Weingard
  • Jeffrey Bigham
  • George Ferguson

Training intelligent systems is a time-consuming and costly process that often limits their application to real-world problems. Prior work in crowdsourcing has attempted to compensate for this challenge by generating sets of labeled training data for machine learning algorithms. In this work, we seek to move beyond collecting just statistical data and explore how to gather structured, relational representations of a scenario using the crowd. We focus on activity recognition because of its broad applicability, high level of variation between individual instances, and difficulty of training systems a priori. We present ARchitect, a system that uses the crowd to ascertain pre- and postconditions for actions observed in a video and find relations between actions. Our ultimate goal is to identify multiple valid execution paths from a single set of observations, which suggests one-off learning from the crowd is possible.

AAAI Conference 2012 Conference Paper

Real-Time Collaborative Planning with the Crowd

  • Walter Lasecki
  • Jeffrey Bigham
  • James Allen
  • George Ferguson

Planning is vital to a wide range of domains, including robotics, military strategy, logistics, itinerary generation and more, that both humans and computers find difficult. Collaborative planning holds the promise of greatly improving performance on these tasks by leveraging the strengths of both humans and automated planners. However, this requires formalizing the problem domain and input, which must be done by hand, a priori, restricting its use in general real-world domains. We propose using a real-time crowd of workers to simultaneously solve the planning problem, formalize the domain, and train an automated system. As plans are developed, the system is able to learn the domain, and contribute larger segments of work.