Arrow Research search

Author name cluster

Oliver Kroemer

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

54 papers
2 author rows

Possible papers


ICRA Conference 2025 Conference Paper

Autonomous Sensor Exchange and Calibration for Cornstalk Nitrate Monitoring Robot

  • Janice Seungyeon Lee
  • Thomas Detlefsen
  • Shara Lawande
  • Saudamini Ghatge
  • Shrudhi Ramesh Shanthi
  • Sruthi Mukkamala
  • George Kantor
  • Oliver Kroemer

Interactive sensors are an important component of robotic systems but often require manual replacement due to wear and tear. Automating this process can enhance system autonomy and facilitate long-term deployment. We developed an autonomous sensor exchange and calibration system for an agriculture crop monitoring robot that inserts a nitrate sensor into cornstalks. A novel gripper and replacement mechanism, featuring a reliable funneling design, were developed to enable efficient and reliable sensor exchanges. To maintain consistent nitrate sensor measurement, an on-board sensor calibration station was integrated to provide in-field sensor cleaning and calibration. The system was deployed at the Ames Curtis Farm in June 2024, where it successfully inserted nitrate sensors with high accuracy into 30 cornstalks with a 77% success rate.

ICRA Conference 2025 Conference Paper

Cascaded Diffusion Models for Neural Motion Planning

  • Mohit Sharma 0001
  • Adam Fishman
  • Vikash Kumar
  • Chris Paxton 0001
  • Oliver Kroemer

Robots in the real world need to perceive and move to goals in complex environments without collisions. Avoiding collisions is especially difficult when relying on sensor perception and when goals are among clutter. Diffusion policies and other generative models have shown strong performance in solving local planning problems, but often struggle to avoid all of the subtle constraint violations that characterize truly challenging global motion planning problems. In this work, we propose an approach for learning global motion planning using diffusion policies, allowing the robot to generate full trajectories through complex scenes and reason about multiple obstacles along the path. Our approach uses cascaded hierarchical models which unify global prediction and local refinement, together with online plan repair, to ensure the trajectories are collision-free. Our method outperforms a wide variety of baselines by $\approx 5\%$ on challenging tasks in multiple domains, including navigation and manipulation.

IROS Conference 2025 Conference Paper

RecoveryChaining: Learning Local Recovery Policies for Robust Manipulation

  • Shivam Vats
  • Devesh K. Jha
  • Maxim Likhachev
  • Oliver Kroemer
  • Diego Romeres

Model-based planners and controllers are commonly used to solve complex manipulation problems as they can efficiently optimize diverse objectives and generalize to long horizon tasks. However, they often fail during deployment due to noisy actuation, partial observability, and imperfect models. To enable a robot to recover from such failures, we propose to use hierarchical reinforcement learning to learn a recovery policy. The recovery policy is triggered when a failure is detected based on sensory observations and seeks to take the robot to a state from which it can complete the task using the nominal model-based controllers. Our approach, called RecoveryChaining, uses a hybrid action space, where the model-based controllers are provided as additional nominal options, which allows the recovery policy to decide how to recover, when to switch to a nominal controller, and which controller to switch to, even with sparse rewards. We evaluate our approach in three multi-step manipulation tasks with sparse rewards, where it learns significantly more robust recovery policies than those learned by baselines. We successfully transfer recovery policies learned in simulation to a physical robot to demonstrate the feasibility of sim-to-real transfer with our method.

IROS Conference 2025 Conference Paper

Vibrotactile Sensing for Detecting Misalignments in Precision Manufacturing

  • Kevin Zhang 0002
  • Christopher Chang
  • Shobhit Aggarwal
  • Manuela Veloso
  • F. Zeynep Temel
  • Oliver Kroemer

Small and medium-sized enterprises (SMEs) often struggle with automating high-mix, low-volume (HMLV) manufacturing due to the inflexibility and high cost of traditional automation solutions. This paper presents a novel approach to robotic manipulation for HMLV environments that leverages vibrotactile sensing. We propose integrating vibrotactile sensors, which capture subtle vibrations and acoustic signals, to provide real-time feedback during manipulation tasks. This approach enables the robot to detect subtle misalignments, which can assist in refining vision-based policies and improving the robot’s overall manipulation skills. We demonstrate the effectiveness of this method in several representative insertion tasks, showing how vibrotactile feedback can be used to predict success or failure of an insertion task as well as predict initial contact between an object grasped in-hand and the placement location. Our results suggest that vibrotactile sensing offers a promising pathway towards more robust and adaptable robotic systems that can better empower SMEs to embrace automation.
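
As a rough illustration of the kind of signal processing such a misalignment classifier could sit on top of, the sketch below compresses a vibration trace into band-averaged spectral magnitudes. This is not the paper's pipeline; the band count and Hann window are assumptions.

```python
import numpy as np

def vibration_features(signal, n_bands=8):
    """Band-averaged magnitude spectrum of a vibrotactile trace: a compact
    feature vector a success/failure classifier could consume. The Hann
    window reduces spectral leakage between bands."""
    mag = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    return np.array([band.mean() for band in np.array_split(mag, n_bands)])
```

A clean insertion and a misaligned one would then differ in which bands carry energy, which is what a downstream classifier can pick up on.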

ICRA Conference 2024 Conference Paper

Estimating Material Properties of Interacting Objects Using Sum-GP-UCB

  • M. Yunus Seker
  • Oliver Kroemer

Robots need to estimate the material and dynamic properties of objects from observations in order to simulate them accurately. We present a Bayesian optimization approach to identifying the material property parameters of objects based on a set of observations. Our focus is on estimating these properties based on observations of scenes with different sets of interacting objects. We propose an approach that exploits the structure of the reward function by modeling the reward for each observation separately and using only the parameters of the objects in that scene as inputs. The resulting lower-dimensional models generalize better over the parameter space, which in turn results in a faster optimization. To speed up the optimization process further, and reduce the number of simulation runs needed to find good parameter values, we also propose partial evaluations of the reward function, wherein the selected parameters are only evaluated on a subset of real world evaluations. The approach was successfully evaluated on a set of scenes with a wide range of object interactions, and we showed that our method can effectively perform incremental learning without resetting the rewards of the gathered observations.
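
The summed-acquisition idea can be sketched with a minimal GP-UCB implementation: one small GP per observed scene, defined over only that scene's object parameters, with the next candidate chosen by maximizing the sum of the per-scene upper confidence bounds. The kernel, hyperparameters, and class names below are illustrative, not the paper's implementation.

```python
import numpy as np

def rbf(A, B, ls=0.5):
    # Squared-exponential kernel between the rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

class SceneGP:
    """Reward model for one observed scene, defined over only the
    material parameters of the objects appearing in that scene."""
    def __init__(self, param_idx, noise=1e-4):
        self.idx = list(param_idx)
        self.X = np.empty((0, len(self.idx)))
        self.y = np.empty(0)
        self.noise = noise

    def add(self, theta, reward):
        self.X = np.vstack([self.X, np.asarray(theta, float)[self.idx]])
        self.y = np.append(self.y, reward)

    def ucb(self, theta, beta=2.0):
        x = np.asarray(theta, float)[self.idx][None, :]
        if len(self.y) == 0:
            return beta  # zero prior mean, unit prior variance
        K = rbf(self.X, self.X) + self.noise * np.eye(len(self.y))
        k = rbf(x, self.X)
        Kinv = np.linalg.inv(K)
        mu = (k @ Kinv @ self.y).item()
        var = (1.0 - k @ Kinv @ k.T).item()
        return mu + beta * np.sqrt(max(var, 0.0))

def select_params(scenes, candidates, beta=2.0):
    # Next parameter vector to simulate: argmax of the summed per-scene UCBs.
    scores = [sum(s.ucb(c, beta) for s in scenes) for c in candidates]
    return candidates[int(np.argmax(scores))]
```

Because each `SceneGP` only sees the parameters relevant to its scene, the individual models stay low-dimensional even when the global parameter vector is large, which is the structural point the abstract makes.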

IROS Conference 2024 Conference Paper

Leveraging Simulation-Based Model Preconditions for Fast Action Parameter Optimization with Multiple Models

  • M. Yunus Seker
  • Oliver Kroemer

Optimizing robotic action parameters is a significant challenge for manipulation tasks that demand high levels of precision and generalization. Using a model-based approach, the robot must quickly reason about the outcomes of different actions using a predictive model to find a set of parameters that will have the desired effect. The model may need to capture the behaviors of rigid and deformable objects, as well as objects of various shapes and sizes. Predictive models often need to trade off speed for prediction accuracy and generalization. This paper proposes a framework that leverages the strengths of multiple predictive models, including analytical, learned, and simulation-based models, to enhance the efficiency and accuracy of action parameter optimization. Our approach uses Model Deviation Estimators (MDEs) to determine the most suitable predictive model for any given state-action parameters, allowing the robot to select models to make fast and precise predictions. We extend the MDE framework by not only learning sim-to-real MDEs, but also sim-to-sim MDEs. Our experiments show that these sim-to-sim MDEs provide significantly faster parameter optimization as well as a basis for efficiently learning sim-to-real MDEs through finetuning. The ease of collecting sim-to-sim training data also allows the robot to learn MDEs based directly on visual inputs and local material properties.
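
One plausible reading of the selection rule is: use the cheapest model whose estimated deviation is within tolerance, falling back to the most accurate model otherwise. The sketch below illustrates that rule; the `PredictiveModel` container, its fields, and the cost values are hypothetical, not the authors' API.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class PredictiveModel:
    name: str
    cost: float  # relative query time (analytical < learned < simulator)
    mde: Callable[[object, object], float]  # (state, action) -> estimated deviation

def choose_model(models: Sequence[PredictiveModel], state, action, tol: float):
    """Pick the cheapest model whose estimated deviation is within tolerance;
    if none qualifies, fall back to the model with the lowest deviation."""
    ok = [m for m in models if m.mde(state, action) <= tol]
    if ok:
        return min(ok, key=lambda m: m.cost)
    return min(models, key=lambda m: m.mde(state, action))
```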

ICRA Conference 2024 Conference Paper

Open X-Embodiment: Robotic Learning Datasets and RT-X Models: Open X-Embodiment Collaboration

  • Abby O'Neill
  • Abdul Rehman
  • Abhiram Maddukuri
  • Abhishek Gupta 0004
  • Abhishek Padalkar
  • Abraham Lee
  • Acorn Pooley
  • Agrim Gupta

Large, high-capacity models trained on diverse datasets have shown remarkable successes in efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a "generalist" X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. The project website is robotics-transformer-x.github.io.

ICRA Conference 2024 Conference Paper

Task-Oriented Active Learning of Model Preconditions for Inaccurate Dynamics Models

  • Alex LaGrassa
  • Moonyoung Lee
  • Oliver Kroemer

When planning with an inaccurate dynamics model, a practical strategy is to restrict planning to regions of state-action space where the model is accurate, also known as a model precondition. Empirical real-world trajectory data is valuable for defining data-driven model preconditions regardless of the model form (analytical, simulator, learned, etc.). However, real-world data is often expensive and dangerous to collect. In order to achieve data efficiency, this paper presents an algorithm for actively selecting trajectories to learn a model precondition for an inaccurate pre-specified dynamics model. Our proposed techniques address challenges arising from the sequential nature of trajectories and the potential benefit of prioritizing task-relevant data. The experimental analysis shows how algorithmic properties affect performance in three planning scenarios: icy gridworld, simulated plant watering, and real-world plant watering. Results demonstrate an improvement of approximately 80% after only four real-world trajectories when using our proposed techniques. More material can be found on our project website: https://sites.google.com/view/active-mde.

ICRA Conference 2024 Conference Paper

Towards Robotic Tree Manipulation: Leveraging Graph Representations

  • Chung Hee Kim
  • Moonyoung Lee
  • Oliver Kroemer
  • George Kantor

There is growing interest in automating agricultural tasks that require intricate and precise interaction with specialty crops, such as trees and vines. However, developing robotic solutions for crop manipulation remains a difficult challenge due to complexities involved in modeling their deformable behavior. In this study, we present a framework for learning the deformation behavior of tree-like crops under contact interaction. Our proposed method involves encoding the state of a spring-damper modeled tree crop as a graph. This representation allows us to employ graph networks to learn both a forward model for predicting resulting deformations, and a contact policy for inferring actions to manipulate tree crops. We conduct a comprehensive set of experiments in a simulated environment and demonstrate generalizability of our method on previously unseen trees. Videos can be found on the project website: https://kantor-lab.github.io/tree_gnn

ICRA Conference 2023 Conference Paper

Efficient Recovery Learning using Model Predictive Meta-Reasoning

  • Shivam Vats
  • Maxim Likhachev
  • Oliver Kroemer

Operating under real world conditions is challenging due to the possibility of a wide range of failures induced by execution errors and state uncertainty. In relatively benign settings, such failures can be overcome by retrying or executing one of a small number of hand-engineered recovery strategies. By contrast, contact-rich sequential manipulation tasks, like opening doors and assembling furniture, are not amenable to exhaustive hand-engineering. To address this issue, we present a general approach for robustifying manipulation strategies in a sample-efficient manner. Our approach incrementally improves robustness by first discovering the failure modes of the current strategy via exploration in simulation and then learning additional recovery skills to handle these failures. To ensure efficient learning, we propose an online algorithm called Meta-Reasoning for Skill Learning (MetaReSkill) that monitors the progress of all recovery policies during training and allocates training resources to recoveries that are likely to improve the task performance the most. We use our approach to learn recovery skills for door-opening and evaluate them both in simulation and on a real robot with little fine-tuning. Compared to open-loop execution, our experiments show that even a limited amount of recovery learning improves task success substantially from 71% to 92.4% in simulation and from 75% to 90% on a real robot.
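
The resource-allocation step can be caricatured with a simple greedy rule: give the next training round to the recovery whose failure mode is frequent and whose learning curve is still climbing. This is a simplified stand-in for the paper's monitoring criterion, with hypothetical field names.

```python
def allocate_round(recoveries):
    """Choose which recovery policy gets the next training round: weight how
    often its failure mode occurs by the recent slope of its learning curve,
    so frequent failures with still-improving recoveries are prioritized."""
    def expected_gain(r):
        hist = r["success_history"]
        slope = hist[-1] - hist[-2] if len(hist) >= 2 else 1.0  # optimistic init
        return r["failure_prob"] * max(slope, 0.0)
    return max(recoveries, key=expected_gain)
```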

ICRA Conference 2023 Conference Paper

Focused Adaptation of Dynamics Models for Deformable Object Manipulation

  • Peter Mitrano
  • Alex LaGrassa
  • Oliver Kroemer
  • Dmitry Berenson

In order to efficiently learn a dynamics model for a task in a new environment, one can adapt a model learned in a similar source environment. However, existing adaptation methods can fail when the target dataset contains transitions where the dynamics are very different from the source environment. For example, the source environment dynamics could be of a rope manipulated in free space, whereas the target dynamics could involve collisions and deformation on obstacles. Our key insight is to improve data efficiency by focusing model adaptation on only the regions where the source and target dynamics are similar. In the rope example, adapting the free-space dynamics requires significantly less data than adapting the free-space dynamics while also learning collision dynamics. We propose a new method for adaptation that is effective in adapting to regions of similar dynamics. Additionally, we combine this adaptation method with prior work on planning with unreliable dynamics to make a method for data-efficient online adaptation, called FOCUS. We first demonstrate that the proposed adaptation method achieves statistically significantly lower prediction error in regions of similar dynamics on simulated rope manipulation and plant watering tasks. We then show on a bimanual rope manipulation task that FOCUS achieves data-efficient online learning, in simulation and in the real world.

ICRA Conference 2023 Conference Paper

Linear Delta Arrays for Compliant Dexterous Distributed Manipulation

  • Sarvesh Patil
  • Tony Tao
  • Tess Lee Hellebrekers
  • Oliver Kroemer
  • F. Zeynep Temel

This paper presents a new type of distributed dexterous manipulator: delta arrays. Our delta array setup consists of 64 linearly-actuated delta robots with 3D-printed compliant linkages. Through the design of the individual delta robots, the modular array structure, and distributed communication and control, we study a wide range of in-plane and out-of-plane manipulations, as well as prehensile manipulations among subsets of neighboring delta robots. We also demonstrate dexterous manipulation capabilities of the delta array using reinforcement learning while leveraging compliance. Our evaluations show that the resulting 192 DoF compliant robot is capable of performing various coordinated distributed manipulations of a variety of objects, including translation, alignment, prehensile squeezing, lifting, and grasping.

IROS Conference 2022 Conference Paper

DeltaZ: An Accessible Compliant Delta Robot Manipulator for Research and Education

  • Sarvesh Patil
  • Samuel C. Alvares
  • Pragna Mannam
  • Oliver Kroemer
  • F. Zeynep Temel

This paper presents the DeltaZ robot, a centimeter-scale, low-cost, delta-style robot that allows for a broad range of capabilities and robust functionalities. The DeltaZ robot is 3D-printed from soft and rigid materials with a design that is easy to assemble and maintain, and lowers the barrier to use. Functionality of the robot stems from its three translational degrees of freedom and a closed-form kinematic solution, which makes manipulation problems more intuitive compared to many other manipulators. Moreover, the low cost of the robot presents an opportunity to democratize manipulators for research and education settings. We describe how the robot can be used as a reinforcement learning benchmark. Open-source 3D-printable designs and code for building and using the robot are available to the public.

IROS Conference 2022 Conference Paper

Dynamic Inference on Graphs using Structured Transition Models

  • Saumya Saxena
  • Oliver Kroemer

Enabling robots to perform complex dynamic tasks such as picking up an object in one sweeping motion or pushing off a wall to quickly turn a corner is a challenging problem. The dynamic interactions implicit in these tasks are critical towards the successful execution of such tasks. Graph neural networks (GNNs) provide a principled way of learning the dynamics of interactive systems but can suffer from scaling issues as the number of interactions increases. Furthermore, the problem of using learned GNN-based models for optimal control is insufficiently explored. In this work, we present a method for efficiently learning the dynamics of interacting systems by simultaneously learning a dynamic graph structure and a stable and locally linear forward model of the system. The dynamic graph structure encodes evolving contact modes along a trajectory by making probabilistic predictions over the edges of the graph. Additionally, we introduce a temporal dependence in the learned graph structure which allows us to incorporate contact measurement updates during execution thus enabling more accurate forward predictions. The learned stable and locally linear dynamics enable the use of optimal control algorithms such as iLQR for long-horizon planning and control for complex interactive tasks. Through experiments in simulation and in the real world, we evaluate the performance of our method by using the learned interaction dynamics for control and demonstrate generalization to more objects and interactions not seen during training. We introduce a control scheme that takes advantage of contact measurement updates and hence is robust to prediction inaccuracies during execution.

IROS Conference 2022 Conference Paper

Efficiently Learning Manipulations by Selecting Structured Skill Representations

  • Mohit Sharma 0001
  • Oliver Kroemer

A key challenge in learning to perform manipulation tasks is selecting a suitable skill representation. While specific skill representations are often easier to learn, they are often only suitable for a narrow set of tasks. In most prior works, roboticists manually provide the robot with a suitable skill representation to use, e.g., a neural network or DMPs. By contrast, we propose to allow the robot to select the most appropriate skill representation for the underlying task. Given the large space of skill representations, we utilize a single demonstration to select a small set of potential task-relevant representations. This set is then further refined using reinforcement learning to select the most suitable skill representation. Experiments in both simulation and real world show how our proposed approach leads to improved sample efficiency and enables directly learning on the real robot.

IROS Conference 2022 Conference Paper

Learning to Singulate Layers of Cloth using Tactile Feedback

  • Sashank Tirumala
  • Thomas Weng
  • Daniel Seita
  • Oliver Kroemer
  • F. Zeynep Temel
  • David Held

Robotic manipulation of cloth has applications ranging from fabrics manufacturing to handling blankets and laundry. Cloth manipulation is challenging for robots largely due to their high degrees of freedom, complex dynamics, and severe self-occlusions when in folded or crumpled configurations. Prior work on robotic manipulation of cloth relies primarily on vision sensors alone, which may pose challenges for fine-grained manipulation tasks such as grasping a desired number of cloth layers from a stack of cloth. In this paper, we propose to use tactile sensing for cloth manipulation; we attach a tactile sensor (ReSkin) to one of the two fingertips of a Franka robot and train a classifier to determine whether the robot is grasping a specific number of cloth layers. During test-time experiments, the robot uses this classifier as part of its policy to grasp one or two cloth layers using tactile feedback to determine suitable grasping points. Experimental results over 180 physical trials suggest that the proposed method outperforms baselines that do not use tactile feedback and has better generalization to unseen cloth compared to methods that use image classifiers. Code, data, and videos are available at https://sites.google.com/view/reskin-cloth.

ICRA Conference 2022 Conference Paper

Search-Based Task Planning with Learned Skill Effect Models for Lifelong Robotic Manipulation

  • Jacky Liang
  • Mohit Sharma 0001
  • Alex LaGrassa
  • Shivam Vats
  • Saumya Saxena
  • Oliver Kroemer

Robots deployed in many real-world settings need to be able to acquire new skills and solve new tasks over time. Prior works on planning with skills often make assumptions on the structure of skills and tasks, such as subgoal skills, shared skill implementations, or task-specific plan skeletons, which limit adaptation to new skills and tasks. By contrast, we propose doing task planning by jointly searching in the space of parameterized skills using high-level skill effect models learned in simulation. We use an iterative training procedure to efficiently generate relevant data to train such models. Our approach allows flexible skill parameterizations and task specifications to facilitate lifelong learning in general-purpose domains. Experiments demonstrate the ability of our planner to integrate new skills in a lifelong manner, finding new task strategies with lower costs in both train and test tasks. We additionally show that our method can transfer to the real world without further fine-tuning.

ICRA Conference 2022 Conference Paper

Synergistic Scheduling of Learning and Allocation of Tasks in Human-Robot Teams

  • Shivam Vats
  • Oliver Kroemer
  • Maxim Likhachev

We consider the problem of completing a set of $n$ tasks with a human-robot team using minimum effort. In many domains, teaching a robot to be fully autonomous can be counterproductive if there are finitely many tasks to be done. Rather, the optimal strategy is to weigh the cost of teaching a robot against its benefit: how many new tasks it allows the robot to solve autonomously. We formulate this as a planning problem where the goal is to decide what tasks the robot should do autonomously (act), what tasks should be delegated to a human (delegate) and what tasks the robot should be taught (learn) so as to complete all the given tasks with minimum effort. This planning problem results in a search tree that grows exponentially with $n$, making standard graph search algorithms intractable. We address this by converting the problem into a mixed integer program that can be solved efficiently using off-the-shelf solvers with bounds on solution quality. To predict the benefit of learning, we use an approximate simulation model of the tasks to train a precondition model that is parameterized by the training task. Finally, we evaluate our approach on peg insertion and Lego stacking tasks, both in simulation and in the real world, showing substantial savings in human effort.
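
For small $n$, the act/delegate/learn trade-off can be illustrated by brute force over which tasks to teach (the paper instead solves a mixed integer program to scale). The cost fields and the `enables` structure below are assumptions made for the sketch.

```python
from itertools import combinations

def plan(tasks, enables):
    """tasks[i]: {'act': c, 'delegate': c, 'learn': c}, with 'act' set to
    float('inf') for tasks the robot cannot yet do on its own.
    enables[i]: indices the robot can act on after being taught task i
    (teaching a task also completes it). Brute force over the taught set,
    which is fine for small n."""
    n = len(tasks)
    best = (float('inf'), None)
    for k in range(n + 1):
        for taught in combinations(range(n), k):
            can_act = set().union(*(enables[i] for i in taught)) if taught else set()
            cost = sum(tasks[i]['learn'] for i in taught)
            for i in range(n):
                if i not in taught:
                    act = tasks[i]['act'] if i in can_act else float('inf')
                    cost += min(act, tasks[i]['delegate'])
            if cost < best[0]:
                best = (cost, taught)
    return best
```

Even this toy version shows the core tension: teaching is only worthwhile when the learn cost is amortized across the tasks it unlocks.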

JMLR Journal 2021 Journal Article

A Review of Robot Learning for Manipulation: Challenges, Representations, and Algorithms

  • Oliver Kroemer
  • Scott Niekum
  • George Konidaris

A key challenge in intelligent robotics is creating robots that are capable of directly interacting with the world around them to achieve their goals. The last decade has seen substantial growth in research on the problem of robot manipulation, which aims to exploit the increasing availability of affordable robot arms and grippers to create robots capable of directly interacting with the world to achieve their goals. Learning will be central to such autonomous systems, as the real world contains too much variation for a robot to expect to have an accurate model of its environment, the objects in it, or the skills required to manipulate them, in advance. We aim to survey a representative subset of that research which uses machine learning for manipulation. We describe a formalization of the robot manipulation learning problem that synthesizes existing research into a single coherent framework and highlight the many remaining research opportunities and challenges.

ICRA Conference 2021 Conference Paper

Causal Reasoning in Simulation for Structure and Transfer Learning of Robot Manipulation Policies

  • Timothy E. Lee
  • Jialiang Zhao
  • Amrita S. Sawhney
  • Siddharth Girdhar
  • Oliver Kroemer

We present CREST, an approach for causal reasoning in simulation to learn the relevant state space for a robot manipulation policy. Our approach conducts interventions using internal models, which are simulations with approximate dynamics and simplified assumptions. These interventions elicit the structure between the state and action spaces, enabling construction of neural network policies with only relevant states as input. These policies are pretrained using the internal model with domain randomization over the relevant states. The policy network weights are then transferred to the target domain (e.g., the real world) for fine-tuning. We perform extensive policy transfer experiments in simulation for two representative manipulation tasks: block stacking and crate opening. Our policies are shown to be more robust to domain shifts, more sample efficient to learn, and scale to more complex settings with larger state spaces. We also show improved zero-shot sim-to-real transfer of our policies for the block stacking task.

ICRA Conference 2021 Conference Paper

Contact Localization for Robot Arms in Motion without Torque Sensing

  • Jacky Liang
  • Oliver Kroemer

Detecting and localizing contacts is essential for robot manipulators to perform contact-rich tasks in unstructured environments. While robot skins can localize contacts on the surface of robot arms, these sensors are not yet robust or easily accessible. As such, prior works have explored using proprioceptive observations, such as joint velocities and torques, to perform contact localization. Many past approaches assume the robot is static during contact incident, a single contact is made at a time, or having access to accurate dynamics models and joint torque sensing. In this work, we relax these assumptions and propose using Domain Randomization to train a neural network to localize contacts of robot arms in motion without joint torque observations. Our method uses a novel cylindrical projection encoding of the robot arm surface, which allows the network to use convolution layers to process input features and transposed convolution layers to predict contacts. The trained network achieves a contact detection accuracy of 91.5% and a mean contact localization error of 3.0 cm. We further demonstrate an application of the contact localization model in an obstacle mapping task, evaluated in both simulation and the real world.
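
The idea of a cylindrical projection encoding can be sketched as binning 3D points expressed in a link's frame onto an (angle, height) grid, so that surface contacts become image-like input for convolution layers. The grid resolution and frame convention here are assumptions, not the paper's exact encoding.

```python
import numpy as np

def cylindrical_projection(points, length, n_theta=32, n_z=32):
    """Bin 3D points (in the link frame, z along the link axis) into an
    (angle, height) grid, turning contacts on the arm surface into an
    image-like array that convolution layers can process."""
    grid = np.zeros((n_theta, n_z))
    for x, y, z in points:
        theta = np.arctan2(y, x)  # angle around the link axis
        ti = int((theta + np.pi) / (2 * np.pi) * n_theta) % n_theta
        zi = int(np.clip(z / length, 0.0, 1.0 - 1e-9) * n_z)
        grid[ti, zi] += 1.0
    return grid
```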

ICRA Conference 2021 Conference Paper

Generalizing Object-Centric Task-Axes Controllers using Keypoints

  • Mohit Sharma 0001
  • Oliver Kroemer

To perform manipulation tasks in the real world, robots need to operate on objects of various shapes and sizes, often without access to geometric models. It is often infeasible to train monolithic neural network policies across such large variations in object properties. Towards this generalization challenge, we propose to learn modular task policies which compose object-centric task-axes controllers. These task-axes controllers are parameterized by properties associated with underlying objects in the scene. We infer these controller parameters directly from visual input using multi-view dense correspondence learning. Our overall approach provides a simple and yet powerful framework for learning manipulation tasks. We empirically evaluate our approach on 3 different manipulation tasks and show its ability to generalize to large variations in object size, shape, and geometry.

ICRA Conference 2021 Conference Paper

Learning Reactive and Predictive Differentiable Controllers for Switching Linear Dynamical Models

  • Saumya Saxena
  • Alex LaGrassa
  • Oliver Kroemer

Humans leverage the dynamics of the environment and their own bodies to accomplish challenging tasks such as grasping an object while walking past it or pushing off a wall to turn a corner. Such tasks often involve switching dynamics as the robot makes and breaks contact. Learning these dynamics is a challenging problem and prone to model inaccuracies, especially near contact regions. In this work, we present a framework for learning composite dynamical behaviors from expert demonstrations. We learn a switching linear dynamical model with contacts encoded in switching conditions as a close approximation of our system dynamics. We then use discrete-time LQR as the differentiable policy class for data-efficient learning of control to develop a control strategy that operates over multiple dynamical modes and takes into account discontinuities due to contact. In addition to predicting interactions with the environment, our policy effectively reacts to inaccurate predictions such as unanticipated contacts. Through simulation and real world experiments, we demonstrate generalization of learned behaviors to different scenarios and robustness to model inaccuracies during execution.

ICRA Conference 2021 Conference Paper

Towards Robust Planar Translations using Delta-manipulator Arrays

  • Skye Thompson
  • Pragna Mannam
  • F. Zeynep Temel
  • Oliver Kroemer

Distributed manipulators - consisting of a set of actuators or robots working cooperatively to achieve a manipulation task - are robust and flexible tools for performing a range of planar manipulation skills. One novel example is the delta array, a distributed manipulator composed of a grid of delta robots, capable of performing dexterous manipulation tasks using strategies incorporating both dynamic and static contact. Hand-designing effective distributed control policies for such a manipulator can be complex and time-consuming, given the high-dimensional action space and unfamiliar system dynamics. In this paper, we examine the principles guiding development and control of such a delta array for a planar translation task. We explore policy learning as a robust cooperative control approach, allowing for smooth manipulation of a range of objects, showing improved accuracy and efficiency over baseline human-designed policies.

IROS Conference 2021 Conference Paper

Visual Identification of Articulated Object Parts

  • Vicky Zeng
  • Timothy E. Lee
  • Jacky Liang
  • Oliver Kroemer

As autonomous robots interact and navigate around real-world environments such as homes, it is useful to reliably identify and manipulate articulated objects, such as doors and cabinets. Many prior works in object articulation identification require manipulation of the object, either by the robot or a human. While recent works have addressed predicting articulation types from visual observations alone, they often assume prior knowledge of category-level kinematic motion models or a sequence of observations where the articulated parts are moving according to their kinematic constraints. In this work, we propose FormNet, a neural network that identifies the articulation mechanisms between pairs of object parts from a single frame of an RGB-D image and segmentation masks. The network is trained on 100k synthetic images of 149 articulated objects from 6 categories. Synthetic images are rendered via a photorealistic simulator with domain randomization. Our proposed model predicts motion residual flows of object parts, and these flows are used to determine the articulation type and parameters. The network achieves an articulation type classification accuracy of 82.5% on novel object instances in trained categories. Experiments also show how this method enables generalization to novel categories and can be applied to real-world images without fine-tuning.

ICRA Conference 2020 Conference Paper

Camera-to-Robot Pose Estimation from a Single Image

  • Timothy E. Lee
  • Jonathan Tremblay
  • Thang To
  • Jia Cheng
  • Terry Mosier
  • Oliver Kroemer
  • Dieter Fox
  • Stan Birchfield

We present an approach for estimating the pose of an external camera with respect to a robot using a single RGB image of the robot. The image is processed by a deep neural network to detect 2D projections of keypoints (such as joints) associated with the robot. The network is trained entirely on simulated data using domain randomization to bridge the reality gap. Perspective-n-point (PnP) is then used to recover the camera extrinsics, assuming that the camera intrinsics and joint configuration of the robot manipulator are known. Unlike classic hand-eye calibration systems, our method does not require an off-line calibration step. Rather, it is capable of computing the camera extrinsics from a single frame, thus opening the possibility of on-line calibration. We show experimental results for three different robots and camera sensors, demonstrating that our approach is able to achieve accuracy with a single frame that is comparable to that of classic off-line hand-eye calibration using multiple frames. With additional frames from a static pose, accuracy improves even further. Code, datasets, and pretrained models for three widely-used robot manipulators are made available.
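The PnP step described above, recovering camera extrinsics from known 2D-3D keypoint correspondences and intrinsics, can be sketched with a minimal direct linear transform (DLT) solver. Production systems typically use robust PnP variants (e.g., with RANSAC); this noise-free sketch and its names are illustrative, not the paper's code:

```python
import numpy as np

def dlt_pose(X, uv, K):
    """Recover camera rotation R and translation t from n >= 6 3D points
    X (n,3) and their pixel projections uv (n,2), given intrinsics K.
    Minimal DLT sketch: no noise handling, outlier rejection, or RANSAC."""
    n = X.shape[0]
    A = np.zeros((2 * n, 12))
    for i in range(n):
        Xh = np.append(X[i], 1.0)        # homogeneous 3D point
        u, v = uv[i]
        A[2 * i, 0:4] = Xh               # p1 . Xh - u (p3 . Xh) = 0
        A[2 * i, 8:12] = -u * Xh
        A[2 * i + 1, 4:8] = Xh           # p2 . Xh - v (p3 . Xh) = 0
        A[2 * i + 1, 8:12] = -v * Xh
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)             # projection matrix, up to scale/sign
    M = np.linalg.inv(K) @ P             # remove intrinsics: M ~ [R | t]
    M *= 1.0 / np.linalg.norm(M[2, :3])  # rotation rows have unit norm
    if (M[:, :3] @ X[0] + M[:, 3])[2] < 0:  # points must lie in front of camera
        M = -M
    U, _, Vt2 = np.linalg.svd(M[:, :3])  # project onto the nearest rotation
    return U @ Vt2, M[:, 3]
```

Given clean correspondences this recovers the pose essentially exactly; the paper's contribution is producing the 2D keypoints with a network trained on domain-randomized simulation.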

ICRA Conference 2020 Conference Paper

In-Hand Object Pose Tracking via Contact Feedback and GPU-Accelerated Robotic Simulation

  • Jacky Liang
  • Ankur Handa
  • Karl Van Wyk
  • Viktor Makoviychuk
  • Oliver Kroemer
  • Dieter Fox

Tracking the pose of an object while it is being held and manipulated by a robot hand is difficult for vision-based methods due to significant occlusions. Prior works have explored using contact feedback and particle filters to localize in-hand objects. However, they have mostly focused on the static grasp setting and not when the object is in motion, as doing so requires modeling of complex contact dynamics. In this work, we propose using GPU-accelerated parallel robot simulations and derivative-free, sample-based optimizers to track in-hand object poses with contact feedback during manipulation. We use physics simulation as the forward model for robot-object interactions, and the algorithm jointly optimizes for the state and the parameters of the simulations, so they better match with those of the real world. Our method runs in real-time (30Hz) on a single GPU, and it achieves an average point cloud distance error of 6mm in simulation experiments and 13mm in the real-world ones.
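A derivative-free, sample-based optimizer of the kind referenced above can be sketched with the cross-entropy method. Here a toy quadratic cost stands in for the paper's simulation-based forward model, and the function names are illustrative:

```python
import numpy as np

def cem_minimize(cost, dim, iters=50, pop=100, elite=10, seed=0):
    """Cross-entropy method: sample candidates from a Gaussian, refit the
    Gaussian to the lowest-cost elites, and repeat. Derivative-free, so the
    cost may be a black box such as a physics simulation."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    for _ in range(iters):
        samples = mean + std * rng.standard_normal((pop, dim))
        costs = np.array([cost(s) for s in samples])
        elites = samples[np.argsort(costs)[:elite]]
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6  # floor keeps exploration alive
    return mean
```

In the tracking setting, each sample would parameterize a candidate object pose (and simulation parameters), and the cost would measure disagreement between simulated and observed contact feedback.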

IROS Conference 2020 Conference Paper

Learning Skills to Patch Plans Based on Inaccurate Models

  • Alex LaGrassa
  • Steven Lee
  • Oliver Kroemer

Planners using accurate models can be effective for accomplishing manipulation tasks in the real world, but are typically highly specialized and require significant fine-tuning to be reliable. Meanwhile, learning is useful for adaptation, but can require a substantial amount of data collection. In this paper, we propose a method that improves the efficiency of sub-optimal planners with approximate but simple and fast models by switching to a model-free policy when unexpected transitions are observed. Unlike previous work, our method specifically addresses when the planner fails due to transition model error by patching with a local policy only where needed. First, we use a sub-optimal model-based planner to perform a task until model failure is detected. Next, we learn a local model-free policy from expert demonstrations to complete the task in regions where the model failed. To show the efficacy of our method, we perform experiments with a shape insertion puzzle and compare our results to both pure planning and imitation learning approaches. We then apply our method to a door opening task. Our experiments demonstrate that our patch-enhanced planner performs more reliably than pure planning and with lower overall sample complexity than pure imitation learning.

IROS Conference 2020 Conference Paper

Localization and Force-Feedback with Soft Magnetic Stickers for Precise Robot Manipulation

  • Tess Lee Hellebrekers
  • Kevin Zhang 0002
  • Manuela Veloso
  • Oliver Kroemer
  • Carmel Majidi

Tactile sensors are used in robot manipulation to reduce uncertainty regarding hand-object pose estimation. However, existing sensor technologies tend to be bulky and provide signals that are difficult to interpret into actionable changes. Here, we achieve wireless tactile sensing with soft and conformable magnetic stickers that can be easily placed on objects within the robot's workspace. We embed a small magnetometer within the robot's fingertip that can localize to a magnetic sticker with sub-mm accuracy and enable the robot to pick up objects in the same place, in the same way, every time. In addition, we utilize the soft magnets' ability to exhibit magnetic field changes upon contact forces. We demonstrate the localization and force-feedback features with a 7-DOF Franka arm on deformable tool use and a key insertion task for applications in home, medical, and food robotics. By increasing the reliability of interaction with common tools, this approach to object localization and force sensing can improve robot manipulation performance for delicate, high-precision tasks.

IROS Conference 2019 Conference Paper

Homography-Based Deep Visual Servoing Methods for Planar Grasps

  • Austin S. Wang
  • Wuming Zhang
  • Daniel Troniak
  • Jacky Liang
  • Oliver Kroemer

We propose a visual servoing framework for learning to improve grasps of objects. RGB and depth images from grasp attempts are collected using an automated data collection process. The data is then used to train a Grasp Quality Network (GQN) that predicts the outcome of grasps from visual information. A grasp optimization pipeline uses homography models with the trained network to optimize the grasp success rate. We evaluate and compare several algorithms for adjusting the current gripper pose based on the current observation from a gripper-mounted camera to perform visual servoing. Evaluations in both simulated and hardware environments show considerable improvement in grasp robustness with models trained using less than 30K grasp trials. Success rates for grasping novel objects unseen during training increased from 18.5% to 81.0% in simulation, and from 17.8% to 78.0% in the real world.

ICRA Conference 2019 Conference Paper

Learning Robust Manipulation Strategies with Multimodal State Transition Models and Recovery Heuristics

  • Austin S. Wang
  • Oliver Kroemer

Robots are prone to making mistakes when performing manipulation tasks in unstructured environments. Robust policies are thus needed to not only avoid mistakes but also to recover from them. We propose a framework for increasing the robustness of contact-based manipulations by modeling the task structure and optimizing a policy for selecting skills and recovery skills. A multimodal state transition model is acquired based on the contact dynamics of the task and the observed transitions. A policy is then learned from the model using reinforcement learning. The policy is incrementally improved by expanding the action space by generating recovery skills with a heuristic. Evaluations on three simulated manipulation tasks demonstrate the effectiveness of the framework. The robot was able to complete the tasks despite multiple contact state changes and errors encountered, increasing the success rate averaged across the tasks from 70.0% to 95.3%.

IROS Conference 2019 Conference Paper

Predicting Grasp Success with a Soft Sensing Skin and Shape-Memory Actuated Gripper

  • Julian Zimmer
  • Tess Lee Hellebrekers
  • Tamim Asfour
  • Carmel Majidi
  • Oliver Kroemer

Tactile sensors have been increasingly used to support rigid robot grippers in object grasping and manipulation. However, rigid grippers are often limited in their ability to handle compliant, delicate, or irregularly shaped objects. In recent years, grippers made from soft and flexible materials have become increasingly popular for certain manipulation tasks, e.g., grasping, due to their ability to conform to the object shape without the need for precise control. Although promising, such soft robot grippers currently suffer from the lack of available sensing modalities. In this work, we introduce a soft and stretchable sensing skin and incorporate it into the two fingers of a shape-memory actuated soft gripper. The onboard sensing skin includes a 9-axis inertial measurement unit (IMU) and five discrete pressure sensors per finger. We use this sensorized soft gripper to study grasp success and stability of over 2585 grasps with various objects using several machine learning methods. Our experiments show that LSTMs were the most accurate predictors of grasp success and stability, compared to SVMs, FFNNs, and ST-HMP. We also evaluated the effects on performance of each sensor's data, and the success rates for individual objects. The results show that the accelerometer data of the IMUs has the largest contribution to the overall grasp prediction, which we attribute to its ability to detect precise movements of the gripper during grasping.

ICRA Conference 2018 Conference Paper

Learning Manipulation Graphs from Demonstrations Using Multimodal Sensory Signals

  • Zhe Su
  • Oliver Kroemer
  • Gerald E. Loeb
  • Gaurav S. Sukhatme
  • Stefan Schaal

Complex contact manipulation tasks can be decomposed into sequences of motor primitives. Individual primitives often end with a distinct contact state, such as inserting a screwdriver tip into a screw head or loosening it through twisting. To achieve robust execution, the robot should be able to verify that the primitive's goal has been reached as well as disambiguate it from erroneous contact states. In this paper, we introduce and evaluate a framework to autonomously construct manipulation graphs from manipulation demonstrations. Our manipulation graphs include sequences of motor primitives for performing a manipulation task as well as corresponding contact state information. The sensory models for the contact states allow the robot to verify the goal of each motor primitive as well as detect erroneous contact changes. The proposed framework was experimentally evaluated on grasping, unscrewing, and insertion tasks on a Barrett arm and hand equipped with two BioTacs. The results of our experiments indicate that the learned manipulation graphs achieve more robust manipulation executions by confirming sensory goals as well as discovering and detecting novel failure modes.

ICRA Conference 2018 Conference Paper

Pilot Surveys for Adaptive Informative Sampling

  • Stephanie Kemna
  • Oliver Kroemer
  • Gaurav S. Sukhatme

Adaptive sampling has been shown to be an effective method for modeling environmental fields, such as algae concentrations in the ocean. In adaptive sampling, a robot adapts its sampling trajectory based on data that it is collecting. This data is often aggregated into models, using techniques such as Gaussian Process (GP) regression. The (hyper-)parameters for these models need to be manually set or, ideally, estimated from data. For GP regression, hyperparameters are typically estimated using prior data. This paper addresses the case where initial hyperparameters need to be estimated, but no prior data is available. Without prior data or accurately pre-defined hyperparameters, adaptive sampling techniques may fail, because there is no good model to base path planning decisions on. One method of gathering data is to perform a pilot survey. This survey needs to select informative samples for initiating the model, but without having a model to determine where best to sample. In this work, we evaluate four pilot surveys, which use a softmax function on the distance between waypoints and previously sampled data for waypoint selection. Simulation results show that pilot surveys that maximize waypoint spread over randomization lead to more stable estimation of GP hyperparameters, and create accurate models more quickly.
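The softmax-over-distances waypoint selection described above can be sketched in a few lines. This is a simplified illustration (2-D waypoints, a single temperature parameter `beta`), not the paper's exact survey designs:

```python
import math

def softmax_waypoint_probs(candidates, visited, beta=1.0):
    """Score each candidate waypoint by its distance to the nearest
    previously sampled location, then softmax the scores into selection
    probabilities. beta > 0 favors spread-out waypoints; beta = 0 reduces
    to uniform random selection."""
    def nearest_dist(c):
        return min(math.dist(c, v) for v in visited)
    scores = [beta * nearest_dist(c) for c in candidates]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # subtract max for numerical stability
    z = sum(exps)
    return [e / z for e in exps]
```

Increasing `beta` biases the pilot survey toward waypoints far from existing samples, the "maximize waypoint spread" behavior the abstract found most effective for stable GP hyperparameter estimation.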

ICRA Conference 2017 Conference Paper

Feature selection for learning versatile manipulation skills based on observed and desired trajectories

  • Oliver Kroemer
  • Gaurav S. Sukhatme

For a manipulation skill to be applicable to a wide range of scenarios, it must generalize between different objects and object configurations. Robots should therefore learn skills that adapt to features describing the objects being manipulated. Most of these object features will however be irrelevant for generalizing the skill and, hence, the robot should select a small set of relevant features for adapting the skill. We use a framework for learning versatile manipulation skills that adapt to a sparse set of object features. Skills are initially learned from demonstrations and subsequently improved using reinforcement learning. The robot also learns a meta prior over the features' relevances to guide the feature selection process. In this paper, we explore using either desired trajectories or observed trajectories for selecting the relevant features. The framework was evaluated on placing, tilting, and wiping tasks. The evaluations showed that using the desired trajectories to select the relevant features led to better skill learning performance.

IROS Conference 2016 Conference Paper

Contact localization on grasped objects using tactile sensing

  • Artem Molchanov
  • Oliver Kroemer
  • Zhe Su
  • Gaurav S. Sukhatme

Manipulation tasks often require robots to make contact between a grasped tool and another object in the robot's environment. The ability to detect and estimate the positions and directions of these contact points is crucial for monitoring the progress of the task, and detecting failures. In this paper, we present a data-driven approach for detecting and localizing contacts between a grasped object and the environment using tactile sensing. We explore framing the contact localization as both a regression and a classification problem and train neural networks accordingly to estimate the contact parameters. We also compare the neural networks with Gaussian process regression and support vector machine classification with spatio-temporal hierarchical matching pursuit feature learning. We evaluate the presented approach using hundreds of contact events on eighteen objects with different shapes, sizes and material properties. The experiments show that the neural network approach can learn to localize contact events for individual objects with a mean absolute error of less than 2.5 cm for the positions and less than 10° for the directions.

JMLR Journal 2016 Journal Article

Hierarchical Relative Entropy Policy Search

  • Christian Daniel
  • Gerhard Neumann
  • Oliver Kroemer
  • Jan Peters

Many reinforcement learning (RL) tasks, especially in robotics, consist of multiple sub-tasks that are strongly structured. Such task structures can be exploited by incorporating hierarchical policies that consist of gating networks and sub-policies. However, this concept has only been partially explored for real world settings and complete methods, derived from first principles, are needed. Real world settings are challenging due to large and continuous state-action spaces that are prohibitive for exhaustive sampling methods. We define the problem of learning sub-policies in continuous state action spaces as finding a hierarchical policy that is composed of a high-level gating policy to select the low-level sub-policies for execution by the agent. In order to efficiently share experience with all sub-policies, also called inter-policy learning, we treat these sub-policies as latent variables which allows for distribution of the update information between the sub-policies. We present three different variants of our algorithm, designed to be suitable for a wide variety of real world robot learning tasks and evaluate our algorithms in two real robot learning scenarios as well as several simulations and comparisons.

ICRA Conference 2015 Conference Paper

Towards learning hierarchical skills for multi-phase manipulation tasks

  • Oliver Kroemer
  • Christian G. Daniel
  • Gerhard Neumann
  • Herke van Hoof
  • Jan Peters 0001

Most manipulation tasks can be decomposed into a sequence of phases, where the robot's actions have different effects in each phase. The robot can perform actions to transition between phases and, thus, alter the effects of its actions, e.g., grasp an object in order to then lift it. The robot can thus reach a phase that affords the desired manipulation. In this paper, we present an approach for exploiting the phase structure of tasks in order to learn manipulation skills more efficiently. Starting with human demonstrations, the robot learns a probabilistic model of the phases and the phase transitions. The robot then employs model-based reinforcement learning to create a library of motor primitives for transitioning between phases. The learned motor primitives generalize to new situations and tasks. Given this library, the robot uses a value function approach to learn a high-level policy for sequencing the motor primitives. The proposed method was successfully evaluated on a real robot performing a bimanual grasping task.

ICRA Conference 2014 Conference Paper

Interaction primitives for human-robot cooperation tasks

  • Heni Ben Amor
  • Gerhard Neumann
  • Sanket Kamthe
  • Oliver Kroemer
  • Jan Peters 0001

To engage in cooperative activities with human partners, robots have to possess basic interactive abilities and skills. However, programming such interactive skills is a challenging task, as each interaction partner can have different timing or an alternative way of executing movements. In this paper, we propose to learn interaction skills by observing how two humans engage in a similar task. To this end, we introduce a new representation called Interaction Primitives. Interaction primitives build on the framework of dynamic motor primitives (DMPs) by maintaining a distribution over the parameters of the DMP. With this distribution, we can learn the inherent correlations of cooperative activities which allow us to infer the behavior of the partner and to participate in the cooperation. We will provide algorithms for synchronizing and adapting the behavior of humans and robots during joint physical activities.

IROS Conference 2014 Conference Paper

Learning robot tactile sensing for object manipulation

  • Yevgen Chebotar
  • Oliver Kroemer
  • Jan Peters 0001

Tactile sensing is a fundamental component of object manipulation and tool handling skills. With robots entering unstructured environments, tactile feedback also becomes an important ability for robot manipulation. In this work, we explore how a robot can learn to use tactile sensing in object manipulation tasks. We first address the problem of in-hand object localization and adapt three pose estimation algorithms from computer vision. Second, we employ dynamic motor primitives to learn robot movements from human demonstrations and record desired tactile signal trajectories. Then, we add tactile feedback to the control loop and apply relative entropy policy search to learn the parameters of the tactile coupling. Additionally, we show how the learning of tactile feedback can be performed more efficiently by reducing the dimensionality of the tactile information through spectral clustering and principal component analysis. Our approach is implemented on a real robot, which learns to perform a scraping task with a spatula in an altered environment.

ICRA Conference 2014 Conference Paper

Learning to predict phases of manipulation tasks as hidden states

  • Oliver Kroemer
  • Herke van Hoof
  • Gerhard Neumann
  • Jan Peters 0001

Phase transitions in manipulation tasks often occur when contacts between objects are made or broken. A switch of the phase can result in the robot's actions suddenly influencing different aspects of its environment. Therefore, the boundaries between phases often correspond to constraints or subgoals of the manipulation task. In this paper, we investigate how the phases of manipulation tasks can be learned from data. The task is modeled as an autoregressive hidden Markov model, wherein the hidden phase transitions depend on the observed states. The model is learned from data using the expectation-maximization algorithm. We demonstrate the proposed method on both a pushing task and a pepper mill turning task. The proposed approach was compared to a standard autoregressive hidden Markov model. The experiments show that the learned models can accurately predict the transitions in phases during the manipulation tasks.
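The core idea of per-phase autoregressive dynamics can be illustrated with a heavily simplified sketch: each mode predicts the next state with its own linear model, and the observed transition is attributed to the best-fitting mode. This is an argmax over a 1-D state, not the paper's full hidden Markov model inference with expectation-maximization, and all names are illustrative:

```python
import math

def most_likely_phase(x_prev, x_curr, modes, sigma=0.1):
    """Each phase/mode m predicts x_curr ~ a_m * x_prev + c_m with Gaussian
    noise of scale sigma; return the mode whose prediction best explains
    the observed transition."""
    best, best_ll = None, -math.inf
    for name, (a, c) in modes.items():
        resid = x_curr - (a * x_prev + c)
        ll = -0.5 * (resid / sigma) ** 2  # log-likelihood up to a constant
        if ll > best_ll:
            best, best_ll = name, ll
    return best
```

The full model additionally couples consecutive phase labels through transition probabilities that depend on the observed state, which is what lets it predict upcoming phase boundaries rather than only classify them after the fact.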

IROS Conference 2014 Conference Paper

Predicting object interactions from contact distributions

  • Oliver Kroemer
  • Jan Peters 0001

Contacts between objects play an important role in manipulation tasks. Depending on the locations of contacts, different manipulations or interactions can be performed with the object. By observing the contacts between two objects, a robot can learn to detect potential interactions between them. Rather than defining a set of features for modeling the contact distributions, we propose a kernel-based approach. The contact points are first modeled using a Gaussian distribution. The similarity between these distributions is computed using a kernel function. The contact distributions are then classified using kernel logistic regression. The proposed approach was used to predict stable grasps of an elongated object, as well as to construct towers out of assorted toy blocks.
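The kernel between Gaussian contact distributions can be sketched with the Bhattacharyya kernel, one standard choice of similarity between Gaussians (the paper's exact kernel may differ; names here are illustrative):

```python
import numpy as np

def bhattacharyya_kernel(mu1, cov1, mu2, cov2):
    """Similarity between two Gaussian contact-point distributions via the
    Bhattacharyya coefficient: equals 1 for identical distributions and
    decays as the means separate or the covariances diverge."""
    cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2
    d = 0.125 * diff @ np.linalg.solve(cov, diff)       # mean term
    d += 0.5 * np.log(np.linalg.det(cov) /               # covariance term
                      np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return np.exp(-d)
```

Such a kernel can be plugged directly into kernel logistic regression to classify contact distributions, as in the stable-grasp and block-stacking predictions described above.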

ICRA Conference 2013 Conference Paper

Learning sequential motor tasks

  • Christian G. Daniel
  • Gerhard Neumann
  • Oliver Kroemer
  • Jan Peters 0001

Many real robot applications require the sequential use of multiple distinct motor primitives. This requirement implies the need to learn the individual primitives as well as a strategy to select the primitives sequentially. Such hierarchical learning problems are commonly either treated as one complex monolithic problem which is hard to learn, or as separate tasks learned in isolation. However, there exists a strong link between the robot's strategy and its motor primitives. Consequently, a consistent framework is needed that can learn jointly on the level of the individual primitives and the robot's strategy. We present a hierarchical learning method which improves individual motor primitives and, simultaneously, learns how to combine these motor primitives sequentially to solve complex motor tasks. We evaluate our method on the game of robot hockey, which is both difficult to learn in terms of the required motor primitives as well as its strategic elements.

ICRA Conference 2012 Conference Paper

A kernel-based approach to direct action perception

  • Oliver Kroemer
  • Emre Ugur
  • Erhan Öztop
  • Jan Peters 0001

The direct perception of actions allows a robot to predict the afforded actions of observed objects. In this paper, we present a non-parametric approach to representing the affordance-bearing subparts of objects. This representation forms the basis of a kernel function for computing the similarity between different subparts. Using this kernel function, together with motor primitive actions, the robot can learn the required mappings to perform direct action perception. The proposed approach was successfully implemented on a real robot, which could then quickly learn to generalize grasping and pouring actions to novel objects.

NeurIPS Conference 2012 Conference Paper

Algorithms for Learning Markov Field Policies

  • Abdeslam Boularias
  • Jan Peters
  • Oliver Kroemer

We present a new graph-based approach for incorporating domain knowledge in reinforcement learning applications. The domain knowledge is given as a weighted graph, or a kernel matrix, that loosely indicates which states should have similar optimal actions. We first introduce a bias into the policy search process by deriving a distribution on policies such that policies that disagree with the provided graph have low probabilities. This distribution corresponds to a Markov Random Field. We then present reinforcement and apprenticeship learning algorithms for finding such policy distributions. We also illustrate the advantage of the proposed approach on three problems: swing-up cart-balancing with nonuniform and smooth frictions, gridworlds, and teaching a robot to grasp new objects.

IROS Conference 2012 Conference Paper

Generalization of human grasping for multi-fingered robot hands

  • Heni Ben Amor
  • Oliver Kroemer
  • Ulrich Hillenbrand
  • Gerhard Neumann
  • Jan Peters 0001

Multi-fingered robot grasping is a challenging problem that is difficult to tackle using hand-coded programs. In this paper we present an imitation learning approach for learning and generalizing grasping skills based on human demonstrations. To this end, we split the task of synthesizing a grasping motion into three parts: (1) learning efficient grasp representations from human demonstrations, (2) warping contact points onto new objects, and (3) optimizing and executing the reach-and-grasp movements. We learn low-dimensional latent grasp spaces for different grasp types, which form the basis for a novel extension to dynamic motor primitives. These latent-space dynamic motor primitives are used to synthesize entire reach-and-grasp movements. We evaluated our method on a real humanoid robot. The results of the experiment demonstrate the robustness and versatility of our approach.

IROS Conference 2012 Conference Paper

Maximally informative interaction learning for scene exploration

  • Herke van Hoof
  • Oliver Kroemer
  • Heni Ben Amor
  • Jan Peters 0001

Creating robots that can act autonomously in dynamic, unstructured environments is a major challenge. In such environments, learning to recognize and manipulate novel objects is an important capability. A truly autonomous robot acquires knowledge through interaction with its environment without using heuristics or prior information encoding human domain insights. Static images often provide insufficient information for inferring the relevant properties of the objects in a scene. Hence, a robot needs to explore these objects by interacting with them. However, there may be many exploratory actions possible, and a large portion of these actions may be non-informative. To learn quickly and efficiently, a robot must select actions that are expected to have the most informative outcomes. In the proposed bottom-up approach, the robot achieves this goal by quantifying the expected informativeness of its own actions. We use this approach to segment a scene into its constituent objects as a first step in learning the properties and affordances of objects. Evaluations showed that the proposed information-theoretic approach allows a robot to efficiently infer the composite structure of its environment.
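Selecting the action with the most informative expected outcome can be sketched as maximizing expected information gain, i.e., the expected reduction in entropy over competing hypotheses. This toy discrete version (illustrative names, not the paper's scene-segmentation model) captures the principle:

```python
import math

def entropy(p):
    """Shannon entropy of a discrete distribution (in nats)."""
    return -sum(q * math.log(q) for q in p if q > 0)

def expected_info_gain(prior, likelihoods):
    """Expected entropy reduction over hypotheses from one action, where
    likelihoods[h][o] = P(outcome o | hypothesis h)."""
    n_out = len(likelihoods[0])
    gain = entropy(prior)
    for o in range(n_out):
        p_o = sum(prior[h] * likelihoods[h][o] for h in range(len(prior)))
        if p_o == 0:
            continue
        post = [prior[h] * likelihoods[h][o] / p_o for h in range(len(prior))]
        gain -= p_o * entropy(post)  # expected posterior entropy
    return gain
```

An action whose predicted outcomes discriminate between hypotheses (e.g., whether two regions move together when pushed) scores high; an action whose outcomes are the same under every hypothesis has zero expected gain and is skipped.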

ECAI Conference 2012 Conference Paper

Robot Skill Learning

  • Jan Peters 0001
  • Katharina Muelling
  • Jens Kober
  • Duy Nguyen-Tuong
  • Oliver Kroemer

Learning robots that can acquire new motor skills and refine existing ones have been a long-standing vision of robotics, artificial intelligence, and the cognitive sciences. Early steps towards this goal in the 1980s made clear that reasoning and human insights will not suffice. Instead, new hope has been offered by the rise of modern machine learning approaches. However, it has become increasingly clear that off-the-shelf machine learning approaches will not be adequate for robot skill learning as these methods often do not scale into the high-dimensional domains of manipulator and humanoid robotics, nor do they fulfill the real-time requirement of the domain. As an alternative, we propose to divide the generic skill learning problem into parts that can be well-understood from a robotics point of view. After designing appropriate learning approaches for these basic components, these will serve as the ingredients of a general approach to robot skill learning. In this paper, we discuss our recent and current progress in this direction. As such, we present our work on learning to control, learning elementary movements, as well as our steps towards the learning of complex tasks. We show several evaluations using both real robots as well as physically realistic simulations.

ICRA Conference 2011 Conference Paper

A flexible hybrid framework for modeling complex manipulation tasks

  • Oliver Kroemer
  • Jan Peters 0001

Future service robots will need to perform a wide range of tasks using various objects. In order to perform complex tasks, robots require a suitable internal representation of the task. We propose a hybrid framework for representing manipulation tasks, which combines continuous motion planning and discrete task-level planning. In addition, we use a mid-level planner to optimize individual actions according to the plan. The proposed framework incorporates biologically-inspired concepts, such as affordances and motor primitives, in order to efficiently plan for manipulation tasks. The final framework is modular, can generalize well to different situations, and is straightforward to expand. Our demonstrations also show how the use of affordances and mid-level planning can lead to improved performance.

NeurIPS Conference 2011 Conference Paper

A Non-Parametric Approach to Dynamic Programming

  • Oliver Kroemer
  • Jan Peters

In this paper, we consider the problem of policy evaluation for continuous-state systems. We present a non-parametric approach to policy evaluation, which uses kernel density estimation to represent the system. The true form of the value function for this model can be determined, and can be computed using Galerkin's method. Furthermore, we also present a unified view of several well-known policy evaluation methods. In particular, we show that the same Galerkin method can be used to derive Least-Squares Temporal Difference learning, Kernelized Temporal Difference learning, and a discrete-state Dynamic Programming solution, as well as our proposed method. In a numerical evaluation of these algorithms, the proposed approach performed better than the other methods.
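The abstract notes that the same Galerkin view recovers Least-Squares Temporal Difference (LSTD) learning. A minimal batch LSTD(0) sketch makes the connection concrete (illustrative names; terminal successors are encoded as all-zero feature vectors):

```python
import numpy as np

def lstd(transitions, gamma=1.0):
    """Least-Squares Temporal Difference learning from a batch of
    (features, reward, next_features) samples. Solves A w = b, where
    A = sum phi (phi - gamma * phi')^T and b = sum r * phi, for the
    linear value-function weights w."""
    dim = len(transitions[0][0])
    A = np.zeros((dim, dim))
    b = np.zeros(dim)
    for phi, r, phi_next in transitions:
        phi, phi_next = np.asarray(phi, float), np.asarray(phi_next, float)
        A += np.outer(phi, phi - gamma * phi_next)
        b += r * phi
    return np.linalg.solve(A, b)
```

With one-hot (tabular) features this recovers the exact values of a deterministic chain; the paper's kernel-density formulation generalizes the same projection idea to continuous states.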

IROS Conference 2011 Conference Paper

Learning robot grasping from 3-D images with Markov Random Fields

  • Abdeslam Boularias
  • Oliver Kroemer
  • Jan Peters 0001

Learning to grasp novel objects is an essential skill for robots operating in unstructured environments. We therefore propose a probabilistic approach for learning to grasp. In particular, we learn a function that predicts the success probability of grasps performed on surface points of a given object. Our approach is based on Markov Random Fields (MRF), and motivated by the fact that points that are geometrically close to each other tend to have similar grasp success probabilities. The MRF approach is successfully tested in simulation, and on a real robot using 3-D scans of various types of objects. The empirical results show a significant improvement over methods that do not utilize the smoothness assumption and classify each point separately from the others.
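The smoothness assumption behind the MRF can be illustrated with a mean-field-style relaxation over a k-nearest-neighbour graph: each surface point's grasp probability is pulled toward the average of its neighbours. This is only a sketch of the intuition, not the paper's MRF inference; the function name, the kNN graph, and the mixing weight `alpha` are illustrative choices:

```python
import numpy as np

def smooth_grasp_probabilities(points, unary_probs, n_iters=20, alpha=0.5, k=3):
    # Build a k-nearest-neighbour graph over the surface points.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nbrs = np.argsort(d, axis=1)[:, :k]
    # Iteratively blend each point's own (unary) probability with the
    # mean of its neighbours, enforcing spatial smoothness.
    p = unary_probs.copy()
    for _ in range(n_iters):
        p = (1.0 - alpha) * unary_probs + alpha * p[nbrs].mean(axis=1)
    return p
```

Each update is a convex combination of valid probabilities, so the result stays in [0, 1]; an isolated outlier prediction is pulled toward its neighbours, which is exactly the effect the smoothness prior is meant to have.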

IROS Conference 2010 Conference Paper

Learning probabilistic discriminative models of grasp affordances under limited supervision

  • Ayse Erkan
  • Oliver Kroemer
  • Renaud Detry
  • Yasemin Altun
  • Justus H. Piater
  • Jan Peters 0001

This paper addresses the problem of learning and efficiently representing discriminative probabilistic models of object-specific grasp affordances, particularly when the number of labeled grasps is extremely limited. The proposed method does not require an explicit 3D model but rather learns an implicit manifold on which it defines a probability distribution over grasp affordances. We obtain hypothetical grasp configurations from visual descriptors that are associated with the contours of an object. While these hypothetical configurations are abundant, labeled configurations are very scarce as these are acquired via time-costly experiments carried out by the robot. Kernel logistic regression (KLR) via joint kernel maps is trained to map the hypothesis space of grasps into continuous class-conditional probability values indicating their achievability. We propose a soft-supervised extension of KLR and a framework to combine the merits of semi-supervised and active learning approaches to tackle the scarcity of labeled grasps. Experimental evaluation shows that combining active and semi-supervised learning is favorable when an oracle is available. Furthermore, semi-supervised learning outperforms supervised learning, particularly when the labeled data is very limited.
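Kernel logistic regression itself can be sketched in a few lines. The following is a plain, fully supervised KLR with an RBF kernel, fit by gradient descent; the soft-supervised extension and joint kernel maps from the paper are not reproduced here, and all function names and hyperparameters are illustrative:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Squared-exponential kernel between two sets of row vectors.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def train_klr(X, y, gamma=1.0, lam=1e-3, lr=0.5, n_iters=500):
    # Minimize regularized log-loss over dual coefficients a,
    # where the decision value is f(x) = sum_i a_i k(x, x_i).
    K = rbf_kernel(X, X, gamma)
    a = np.zeros(len(X))
    for _ in range(n_iters):
        p = 1.0 / (1.0 + np.exp(-K @ a))
        grad = K @ (p - y) / len(X) + lam * (K @ a)
        a -= lr * grad
    return a

def predict_klr(a, X_train, X_new, gamma=1.0):
    # Class-conditional probability of grasp success for new candidates.
    return 1.0 / (1.0 + np.exp(-rbf_kernel(X_new, X_train, gamma) @ a))
```

The output is a continuous probability rather than a hard label, which is what makes KLR a natural fit for ranking grasp hypotheses by achievability.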

ICRA Conference 2010 Conference Paper

Movement templates for learning of hitting and batting

  • Jens Kober
  • Katharina Muelling
  • Oliver Kroemer
  • Christoph H. Lampert
  • Bernhard Schölkopf
  • Jan Peters 0001

Hitting and batting tasks, such as tennis forehands, ping-pong strokes, or baseball batting, depend on predictions of where the ball can be intercepted and how it can properly be returned to the opponent. These predictions get more accurate over time, hence the behaviors need to be continuously modified. As a result, movement templates with a learned global shape need to be adapted during the execution so that the racket reaches a target position and velocity that will return the ball over to the other side of the net or court. It requires altering learned movements to hit a varying target with the necessary velocity at a specific instant in time. Such a task cannot be incorporated straightforwardly in most movement representations suitable for learning. For example, the standard formulation of the dynamical system based motor primitives (introduced by Ijspeert et al. [1]) does not satisfy this property despite their flexibility, which has allowed learning tasks ranging from locomotion to kendama. In order to fulfill this requirement, we reformulate the Ijspeert framework to incorporate the possibility of specifying a desired hitting point and a desired hitting velocity while maintaining all advantages of the original formulation. We show that the proposed movement template formulation works well in two scenarios, i.e., for hitting a ball on a string with a table tennis racket at a specified velocity and for returning balls launched by a ball gun successfully over the net using forehand movements. All experiments were carried out on a Barrett WAM using a four camera vision system.
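The boundary conditions that motivate the reformulation can be illustrated with a quintic polynomial that reaches a desired hitting point with a desired hitting velocity at a fixed time T. This is not the modified motor-primitive formulation itself (the paper adapts the Ijspeert dynamical system); it is only a minimal sketch of the constraint being satisfied, with a hypothetical function name:

```python
import numpy as np

def quintic_hitting_trajectory(x0, v0, xT, vT, T, n=100):
    # Quintic polynomial x(t) = sum_i c_i t^i with boundary conditions:
    # x(0)=x0, x'(0)=v0, x''(0)=0,  x(T)=xT, x'(T)=vT, x''(T)=0.
    A = np.array([
        [1, 0, 0,    0,       0,        0],
        [0, 1, 0,    0,       0,        0],
        [0, 0, 2,    0,       0,        0],
        [1, T, T**2, T**3,    T**4,     T**5],
        [0, 1, 2*T,  3*T**2,  4*T**3,   5*T**4],
        [0, 0, 2,    6*T,     12*T**2,  20*T**3],
    ], dtype=float)
    b = np.array([x0, v0, 0.0, xT, vT, 0.0])
    c = np.linalg.solve(A, b)
    t = np.linspace(0.0, T, n)
    x = sum(c[i] * t**i for i in range(6))
    v = sum(i * c[i] * t**(i - 1) for i in range(1, 6))
    return t, x, v
```

The key property, shared with the paper's reformulated primitives, is that both the final position and the final velocity are free parameters of the movement rather than fixed by the representation.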

IROS Conference 2009 Conference Paper

Active learning using mean shift optimization for robot grasping

  • Oliver Kroemer
  • Renaud Detry
  • Justus H. Piater
  • Jan Peters 0001

When children learn to grasp a new object, they often know several possible grasping points from observing a parent's demonstration and subsequently learn better grasps by trial and error. From a machine learning point of view, this process is an active learning approach. In this paper, we present a new robot learning framework for reproducing this ability in robot grasping. For doing so, we chose a straightforward approach: first, the robot observes a few good grasps by demonstration and learns a value function for these grasps using Gaussian process regression. Subsequently, it chooses grasps which are optimal with respect to this value function using a mean-shift optimization approach, and tries them out on the real system. Upon every completed trial, the value function is updated, and in the following trials it is more likely to choose even better grasping points. This method exhibits fast learning due to the data-efficiency of the Gaussian process regression framework and the fact that the mean-shift method provides maxima of this value function. Experiments were repeatedly carried out successfully on a real robot system. After less than sixty trials, our system has adapted its grasping policy to consistently exhibit successful grasps.
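The learn-then-optimize step can be sketched with kernel ridge regression standing in for the GP posterior mean, and a mean-shift-style fixed-point iteration standing in for the paper's mean-shift optimization. Everything here (in particular, weighting training points by their clipped dual coefficients) is an illustrative simplification, not the authors' method:

```python
import numpy as np

def gp_fit(X, y, gamma=2.0, noise=1e-2):
    # Kernel ridge regression; alpha are the dual coefficients of the mean.
    K = np.exp(-gamma * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    return np.linalg.solve(K + noise * np.eye(len(X)), y)

def gp_mean(x, X, alpha, gamma=2.0):
    # Predicted grasp value at a query configuration x.
    k = np.exp(-gamma * ((x - X) ** 2).sum(-1))
    return k @ alpha

def mean_shift_on_gp(x, X, alpha, gamma=2.0, n_iters=50):
    # Mean-shift-style fixed point: move the query toward the kernel-weighted
    # average of training grasps, so high-value grasps attract the query.
    w = np.clip(alpha, 0.0, None)  # keep only positively weighted points
    for _ in range(n_iters):
        k = np.exp(-gamma * ((x - X) ** 2).sum(-1)) * w
        x = (k[:, None] * X).sum(0) / k.sum()
    return x
```

In the full loop, the grasp returned by the optimizer would be executed on the robot, its outcome appended to (X, y), and the model refit before the next trial.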