Arrow Research

Author name cluster

Emre Ugur

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

18 papers
2 author rows

Possible papers (18)

NeurIPS 2025 Conference Paper

Forecasting in Offline Reinforcement Learning for Non-stationary Environments

  • Suzan Ece Ada
  • Georg Martius
  • Emre Ugur
  • Erhan Oztop

Offline Reinforcement Learning (RL) provides a promising avenue for training policies from pre-collected datasets when gathering additional interaction data is infeasible. However, existing offline RL methods often assume stationarity or only consider synthetic perturbations at test time, assumptions that often fail in real-world scenarios characterized by abrupt, time-varying offsets. These offsets can lead to partial observability, causing agents to misperceive their true state and degrade performance. To overcome this challenge, we introduce Forecasting in Non-stationary Offline RL (FORL), a framework that unifies (i) conditional diffusion-based candidate state generation, trained without presupposing any specific pattern of future non-stationarity, and (ii) zero-shot time-series foundation models. FORL targets environments prone to unexpected, potentially non-Markovian offsets, requiring robust agent performance from the onset of each episode. Empirical evaluations on offline RL benchmarks, augmented with real-world time-series data to simulate realistic non-stationarity, demonstrate that FORL consistently improves performance compared to competitive baselines. By integrating zero-shot forecasting with the agent's experience, we aim to bridge the gap between offline RL and the complexities of real-world, non-stationary environments.
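
The framework itself is not reproduced here, but the core idea of pairing a zero-shot forecaster with the agent's observations can be illustrated with a minimal sketch. Everything below is hypothetical: `naive_forecast` is a linear-extrapolation stand-in for a time-series foundation model, and the scalar state and offset are toy quantities.

```python
def naive_forecast(offset_history, horizon=1):
    """Stand-in for a zero-shot time-series foundation model: the offset
    is simply extrapolated linearly from its last two observed values."""
    if len(offset_history) < 2:
        return 0.0
    last, prev = offset_history[-1], offset_history[-2]
    return last + horizon * (last - prev)

def corrected_observation(raw_obs, offset_history):
    """Subtract the forecasted non-stationary offset from the raw
    observation to estimate the true underlying state."""
    return raw_obs - naive_forecast(offset_history)

# Toy episode: the environment adds a linearly drifting offset to a 1-D state.
true_state, history = 1.0, []
for t in range(5):
    offset = 0.1 * t
    raw = true_state + offset
    print(f"t={t} raw={raw:.2f} corrected={corrected_observation(raw, history):.2f}")
    history.append(offset)  # known here for illustration; inferred in practice
```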

ICRA 2023 Conference Paper

Bimanual Rope Manipulation Skill Synthesis through Context Dependent Correction Policy Learning from Human Demonstration

  • T. Baturhan Akbulut
  • Gülsüm Tuba Çibuk Girgin
  • Arash Mehrabi
  • Minoru Asada
  • Emre Ugur
  • Erhan Öztop

Learning from demonstration (LfD) with behavior cloning is attractive for its simplicity; however, compounding errors in long and complex skills can be a hindrance. Considering a target skill as a sequence of motor primitives is helpful in this respect. Then, the requirement that a motor primitive end in a state that allows the successful execution of the subsequent primitive must be met. In this study, we focus on this problem by proposing to learn an explicit correction policy for when the expected transition state between primitives is not achieved. The correction policy is learned via behavior cloning using Conditional Neural Movement Primitives (CNMPs), which can generate correction trajectories in a context-dependent way. The advantage of the proposed system over learning the complete task as a single action is shown with a table-top setup in simulation, where an object has to be pushed through a corridor in two steps. Then, the applicability of the proposed method to bimanual knotting in the real world is shown by equipping an upper-body humanoid robot with the skill of making knots over a bar in 3D space.
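
As a rough illustration of the correction-policy idea (not the paper's CNMP implementation), the sketch below triggers a correction whenever a primitive ends outside a tolerance of the expected transition state; a straight-line interpolation stands in for the learned context-dependent trajectory generator, and all names and shapes are invented.

```python
import numpy as np

def correction_trajectory(current, target, steps=10):
    """Stand-in for the context-dependent generator: a straight-line
    correction from the achieved state to the expected transition state."""
    return np.linspace(current, target, steps)

def execute_with_correction(primitives, expected_transitions, tol=0.05):
    state = np.zeros(2)
    for primitive, expected in zip(primitives, expected_transitions):
        state = primitive(state)                    # run the motor primitive
        if np.linalg.norm(state - expected) > tol:  # transition state missed
            for waypoint in correction_trajectory(state, expected):
                state = waypoint                    # track the correction
    return state

# Toy two-step task: the first primitive overshoots and gets corrected.
primitives = [lambda s: s + np.array([1.0, 0.2]),
              lambda s: s + np.array([0.0, 1.0])]
expected = [np.array([1.0, 0.0]), np.array([1.0, 1.0])]
print(execute_with_correction(primitives, expected))
```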

JAIR 2022 Journal Article

DeepSym: Deep Symbol Generation and Rule Learning for Planning from Unsupervised Robot Interaction

  • Alper Ahmetoglu
  • M. Yunus Seker
  • Justus Piater
  • Erhan Oztop
  • Emre Ugur

Symbolic planning and reasoning are powerful tools for robots tackling complex tasks. However, the need to manually design the symbols restricts their applicability, especially for robots that are expected to act in open-ended environments. Therefore, symbol formation and rule extraction should be considered part of robot learning, which, when done properly, will offer scalability, flexibility, and robustness. Towards this goal, we propose a novel general method that finds action-grounded, discrete object and effect categories and builds probabilistic rules over them for non-trivial action planning. Our robot interacts with objects using an initial action repertoire that is assumed to be acquired earlier and observes the effects it can create in the environment. To form action-grounded object, effect, and relational categories, we employ a binary bottleneck layer in a predictive, deep encoder-decoder network that takes the image of the scene and the action applied as input, and generates the resulting effects in the scene in pixel coordinates. After learning, the binary latent vector represents action-driven object categories based on the interaction experience of the robot. To distill the knowledge represented by the neural network into rules useful for symbolic reasoning, a decision tree is trained to reproduce its decoder function. Probabilistic rules are extracted from the decision paths of the tree and are represented in the Probabilistic Planning Domain Definition Language (PPDDL), allowing off-the-shelf planners to operate on the knowledge extracted from the sensorimotor experience of the robot. The deployment of the proposed approach for a simulated robotic manipulator enabled the discovery of discrete representations of object properties such as ‘rollable’ and ‘insertable’. In turn, the use of these representations as symbols allowed the generation of effective plans for achieving goals, such as building towers of the desired height, demonstrating the effectiveness of the approach for multi-step object manipulation. Finally, we demonstrate that the system is not restricted to the robotics domain by assessing its applicability to the MNIST 8-puzzle domain, in which learned symbols allow for the generation of plans that move the empty tile into any given position.
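
The rule-distillation step lends itself to a compact sketch. The code below is not the paper's pipeline: it fits a scikit-learn decision tree on synthetic binary latent codes and walks its decision paths to print rule-like text, in the spirit of the PPDDL extraction described above.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
codes = rng.integers(0, 2, size=(200, 4))  # binary latent codes from the bottleneck
effects = ((codes[:, 0] == 1) & (codes[:, 1] == 0)).astype(int)  # toy effect category

tree = DecisionTreeClassifier(max_depth=3).fit(codes, effects)

def print_rules(tree, node=0, conditions=()):
    """Walk the decision paths and print each leaf as a rule-like string."""
    t = tree.tree_
    if t.children_left[node] == -1:             # leaf: emit one rule
        effect = int(np.argmax(t.value[node]))
        lhs = " and ".join(conditions) or "true"
        print(f"if {lhs} then effect-{effect}")
        return
    feat = t.feature[node]                      # binary feature, split at 0.5
    print_rules(tree, t.children_left[node], conditions + (f"(not z{feat})",))
    print_rules(tree, t.children_right[node], conditions + (f"z{feat}",))

print_rules(tree)
```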

ICRA 2021 Conference Paper

Reward Conditioned Neural Movement Primitives for Population-Based Variational Policy Optimization

  • Tuluhan Akbulut
  • Utku Bozdogan
  • Ahmet Ercan Tekden
  • Emre Ugur

This paper studies the reward-based policy exploration problem within a supervised learning framework, enabling robots to form complex movement trajectories in challenging reward settings and search spaces. For this, the experience of the robot, which can be bootstrapped from demonstrated trajectories, is used to train a novel Neural Processes-based deep network that samples from its latent space and generates the required trajectories given desired rewards. Our framework can generate progressively improved trajectories by sampling them from high-reward landscapes, increasing the reward gradually. Variational inference is used to create a stochastic latent space from which varied trajectories are sampled when generating a population of trajectories given target rewards. We benefit from Evolutionary Strategies and propose a novel crossover operation, which is applied in the self-organized latent space of the individual policies, allowing blending of individuals that might address different factors in the reward function. Using a number of tasks that require sequential reaching to multiple points or passing through gaps between objects, we show that our method provides stable learning progress and significantly higher sample efficiency compared to a number of state-of-the-art robotic reinforcement learning methods. Finally, we show the real-world suitability of our method through real robot execution involving obstacle avoidance.
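
A minimal sketch of the latent-space crossover idea, with a toy hand-written decoder in place of the learned Neural Processes network; the blending weights and latent dimensionality are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def decode(z, steps=50):
    """Stand-in for the learned decoder: maps a latent code to a trajectory."""
    t = np.linspace(0.0, 1.0, steps)
    return z[0] * np.sin(2 * np.pi * t) + z[1] * t  # toy 1-D trajectory

def latent_crossover(z_a, z_b):
    """Blend two parent latents with random per-dimension weights."""
    alpha = rng.uniform(size=z_a.shape)
    return alpha * z_a + (1.0 - alpha) * z_b

# Two high-reward "parents" produce a blended offspring trajectory.
parent_a, parent_b = rng.normal(size=2), rng.normal(size=2)
child = latent_crossover(parent_a, parent_b)
print("child latent:", child, "-> trajectory endpoint:", decode(child)[-1])
```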

ICRA 2020 Conference Paper

Belief Regulated Dual Propagation Nets for Learning Action Effects on Groups of Articulated Objects

  • Ahmet Ercan Tekden
  • Aykut Erdem
  • Erkut Erdem
  • Mert Imre
  • M. Yunus Seker
  • Emre Ugur

In recent years, graph neural networks have been successfully applied for learning the dynamics of complex and partially observable physical systems. However, their use in the robotics domain is, to date, still limited. In this paper, we introduce Belief Regulated Dual Propagation Networks (BRDPN), a general-purpose learnable physics engine, which enables a robot to predict the effects of its actions in scenes containing groups of articulated multi-part objects. Specifically, our framework extends recently proposed propagation networks (PropNets) and consists of two complementary components, a physics predictor and a belief regulator. While the former predicts the future states of the object(s) manipulated by the robot, the latter constantly corrects the robot’s knowledge regarding the objects and their relations. Our results showed that after training in a simulator, the robot can reliably predict the consequences of its actions at the object-trajectory level and exploit its own interaction experience to correct its belief about the state of the environment, enabling better predictions in partially observable environments. Furthermore, the trained model was transferred to the real world and verified in predicting trajectories of pushed interacting objects whose joint relations were initially unknown. We compared BRDPN against PropNets and showed that BRDPN performs consistently well. Moreover, BRDPN can adapt its physics predictions, since the relations can be predicted online.
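
The interplay of the two components can be sketched with toy scalar dynamics. This is purely conceptual: a linear model stands in for the learned physics predictor, and a prediction-error update stands in for the belief regulator.

```python
def physics_predictor(state, belief, action):
    """Stand-in for the learned engine: motion scaled by a believed
    relation parameter (e.g., how rigidly two parts are attached)."""
    return state + belief * action

def belief_regulator(belief, predicted, observed, action, lr=0.5):
    """Correct the relation belief from the prediction error."""
    if abs(action) > 1e-8:
        belief += lr * (observed - predicted) / action
    return belief

state, belief, true_relation = 0.0, 0.2, 1.0    # belief starts out wrong
for step in range(5):
    action = 1.0
    predicted = physics_predictor(state, belief, action)
    observed = state + true_relation * action   # what the world actually does
    belief = belief_regulator(belief, predicted, observed, action)
    state = observed
    print(f"step={step} belief={belief:.3f}")
```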

IROS 2018 Conference Paper

Associative Skill Memory Models

  • Hakan Girgin
  • Emre Ugur

Associative Skill Memories (ASMs) were formulated to encode stereotypical movements along with their stereotypical sensory events to increase the robustness of the underlying dynamic movement primitives (DMPs) against noisy perception and perturbations. In ASMs, the stored sensory trajectories, such as haptic and tactile measurements, are used to compute how much a perturbed movement deviates from the desired one, and to correct the movement if possible. In our work, we extend ASMs: rather than using stored single sensory trajectory instances, our system generates sensory event models and exploits those models to correct the perturbed movements during execution, with the aim of generalizing to novel configurations. In particular, the measured force and torque trajectories are modelled using Parametric Hidden Markov Models and then reproduced by Gaussian Mixture Regression. With a Baxter robot, we demonstrate that our proposed force feedback model can be used to correct a trajectory while pushing an object with a mass never experienced before, which would otherwise slip away from the gripper because of noise. In the end, we discuss how far this skill can be generalized using the force model, and possible future improvements.
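
A hedged sketch of the correction mechanism: a Gaussian tube (mean and standard deviation per timestep) stands in for the PHMM/GMR force model, and a correction is emitted when the measured force leaves that tube. The thresholds and gains are made up.

```python
import numpy as np

steps = np.arange(100)
expected_force = np.sin(steps / 20.0)           # learned mean force profile
force_std = 0.1 * np.ones_like(expected_force)  # learned variability

def correction(measured, t, threshold=2.0, gain=0.05):
    """Emit a corrective adjustment when the measured force leaves the
    modelled tube expected_force +/- threshold * std."""
    z = (measured - expected_force[t]) / force_std[t]
    return -gain * z if abs(z) > threshold else 0.0

# Toy perturbation: the object starts slipping at t = 40.
for t in range(35, 45):
    measured = expected_force[t] + (0.5 if t >= 40 else 0.0)
    print(f"t={t} correction={correction(measured, t):+.3f}")
```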

IROS 2016 Conference Paper

Robotic playing for hierarchical complex skill learning

  • Simon Hangl
  • Emre Ugur
  • Sándor Szedmák
  • Justus H. Piater

In complex manipulation scenarios (e.g., tasks requiring complex interaction of two hands or in-hand manipulation), generalization is a hard problem. Current methods still either require a substantial amount of (supervised) training data and/or strong assumptions on both the environment and the task. In this paradigm, controllers solving these tasks tend to be complex. We propose a paradigm of maintaining simpler controllers solving the task in a small number of specific situations. In order to generalize to novel situations, the robot transforms a novel situation into one in which the solution of the task is already known. Our solution to this problem is to play with objects and use previously trained skills (basis skills). These skills can either be used for estimating or for changing the current state of the environment and are organized in skill hierarchies. The approach is evaluated in complex pick-and-place scenarios that involve complex manipulation. We further show that these skills can be learned by autonomous playing.

ICRA 2015 Conference Paper

Bottom-up learning of object categories, action effects and logical rules: From continuous manipulative exploration to symbolic planning

  • Emre Ugur
  • Justus H. Piater

This work aims for bottom-up and autonomous development of symbolic planning operators from continuous interaction experience of a manipulator robot that explores the environment using its action repertoire. Development of the symbolic knowledge is achieved in two stages. In the first stage, the robot explores the environment by executing actions on single objects, forms effect and object categories, and gains the ability to predict the object/effect categories from the visual properties of the objects by learning the nonlinear and complex relations among them. In the next stage, with further interactions that involve stacking actions on pairs of objects, the system learns logical high-level rules that return a stacking-effect category given the categories of the involved objects and the discrete relations between them. Finally, these categories and rules are encoded in Planning Domain Definition Language (PDDL), enabling symbolic planning. We realized our method by learning the categories and rules in a physics-based simulator. The learned symbols and operators are verified by generating and executing non-trivial symbolic plans on the real robot in a tower building task.
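
The final encoding step can be illustrated with a small string-generation sketch; the rule below is hand-written for illustration, not one of the learned rules, and the predicate names are invented.

```python
def stack_operator(below_cat, above_cat, effect_cat):
    """Render one learned stacking rule as a PDDL-style action string."""
    return (
        f"(:action stack-{above_cat}-on-{below_cat}\n"
        f"  :parameters (?below ?above)\n"
        f"  :precondition (and (category-{below_cat} ?below)\n"
        f"                     (category-{above_cat} ?above)\n"
        f"                     (clear ?below) (holding ?above))\n"
        f"  :effect (and (effect-{effect_cat} ?below ?above)\n"
        f"               (on ?above ?below) (not (holding ?above))))"
    )

# Example: a hand-written rule standing in for a learned one.
print(stack_operator("box", "cylinder", "stable-stack"))
```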

IROS 2014 Conference Paper

Knowledge propagation and relation learning for predicting action effects

  • Sándor Szedmák
  • Emre Ugur
  • Justus H. Piater

Learning to predict the effects of actions applied to pairs of objects is a difficult task that requires learning complex relations with sparse, incomplete and noisy information. Our Knowledge Propagation approach propagates affordance predictions by exploiting similarities among object properties, action parameters and resulting effects. The knowledge is propagated in a graph where a missing edge, corresponding to an unknown interaction between two objects (nodes), is predicted via the superposition of all paths connecting those objects in the graph. The high complexity of affordance representation is addressed through the use of Maximum Margin Multi-Valued Regression (MMMVR), which scales well to complex problems of multiple layers. With increased diversity and size of object databases and the addition of other parametric combinatory actions, we expect to achieve complex systems that leverage learned structure for subsequent learning, achieving structural bootstrapping over lifelong development and learning. In this paper, we extend MMMVR for learning of paired-object affordances, i.e., for predicting the effects of actions applied to pairs of objects. In our experiments, we evaluated this method on a dataset composed of 83 objects and 83×83 interactions. We compared the prediction performance with standard classifiers that predict the effect category given the object pair's low-level features or single-object affordances. The experiments show that our proposed method achieves significantly higher prediction performance, especially when supported with Active Learning.
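
In the spirit of the propagation scheme (though not MMMVR itself), a missing edge can be scored by superposing the weights of all short paths between two nodes. The sketch below does this on a toy four-object graph with invented interaction strengths.

```python
def path_products(graph, src, dst, max_len=3, path=None):
    """Yield the product of edge weights along each simple path src -> dst."""
    path = path or [src]
    if src == dst and len(path) > 1:
        yield 1.0
        return
    if len(path) > max_len:
        return
    for nxt, w in graph.get(src, {}).items():
        if nxt not in path:
            for rest in path_products(graph, nxt, dst, max_len, path + [nxt]):
                yield w * rest

def predict_edge(graph, a, b):
    """Score the missing a-b edge as the mean over all path products."""
    products = list(path_products(graph, a, b))
    return sum(products) / len(products) if products else None

# Objects A..D with known pairwise interaction strengths; A-D is unknown.
graph = {"A": {"B": 0.9, "C": 0.4}, "B": {"A": 0.9, "D": 0.8},
         "C": {"A": 0.4, "D": 0.7}, "D": {"B": 0.8, "C": 0.7}}
print(f"predicted A-D interaction: {predict_edge(graph, 'A', 'D'):.2f}")
```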

ICRA 2012 Conference Paper

A kernel-based approach to direct action perception

  • Oliver Kroemer
  • Emre Ugur
  • Erhan Öztop
  • Jan Peters 0001

The direct perception of actions allows a robot to predict the afforded actions of observed objects. In this paper, we present a non-parametric approach to representing the affordance-bearing subparts of objects. This representation forms the basis of a kernel function for computing the similarity between different subparts. Using this kernel function, together with motor primitive actions, the robot can learn the required mappings to perform direct action perception. The proposed approach was successfully implemented on a real robot, which could then quickly learn to generalize grasping and pouring actions to novel objects.
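
A minimal sketch of the kernel idea: an RBF kernel over made-up subpart feature vectors, with a Nadaraya-Watson estimate predicting a grasp-affordance score for a novel subpart. The features, labels and bandwidth are all assumptions, not the paper's representation.

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """Similarity between two subpart feature vectors."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def predict_affordance(query, subparts, scores):
    """Nadaraya-Watson estimate: kernel-weighted average of known scores."""
    weights = np.array([rbf_kernel(query, s) for s in subparts])
    return float(weights @ scores / weights.sum())

# Stored subpart features (handle-like vs. rim-like) with grasp scores.
subparts = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0]])
scores = np.array([1.0, 0.9, 0.1])
novel = np.array([0.8, 0.3])
print(f"grasp score for novel subpart: {predict_affordance(novel, subparts, scores):.2f}")
```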

ICRA 2012 Conference Paper

Closed-loop primitives: A method to generate and recognize reaching actions from demonstration

  • Mustafa Parlaktuna
  • Doruk Tunaoglu
  • Erol Sahin
  • Emre Ugur

Studies on mirror neurons observed in monkeys indicate that recognition of others' actions activates neural circuits that are also responsible for generating the very same actions in the animal. The mirror neuron hypothesis argues that such an overlap between action generation and recognition can provide a shared worldview among individuals and be a key pillar for communication. Inspired by these findings, this paper extends a learning-by-demonstration method for online recognition of observed actions. The proposed method is shown to recognize and generate different reaching actions demonstrated by a human on a humanoid robot platform. Experiments show that the proposed method is robust to both occlusions during the observed actions and variances in the speed of the observed actions. The results are successfully demonstrated in an interactive game with the iCub humanoid robot platform.
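
Purely as an illustration of online recognition (not the paper's closed-loop primitives), the sketch below scores a partially observed reaching trajectory against a small bank of generated candidate trajectories; resampling to a common length lets trajectories of different durations be compared.

```python
import numpy as np

def resample(traj, n=20):
    """Resample a trajectory to n points so trajectories of different
    lengths can be compared directly."""
    idx = np.linspace(0, len(traj) - 1, n)
    return np.interp(idx, np.arange(len(traj)), traj)

def recognize(observed, candidates):
    """Return the index of the candidate whose prefix best matches the
    observed prefix (assumed to span the same fraction of each candidate)."""
    obs = resample(observed)
    errors = [np.mean((obs - resample(c[:len(observed)])) ** 2)
              for c in candidates]
    return int(np.argmin(errors))

# Bank of generated reaching trajectories toward three targets (1-D toy).
t = np.linspace(0.0, 1.0, 100)
candidates = [0.5 * t, 1.0 * t, 2.0 * t]
observed = 1.0 * t[:40] + 0.01  # partial, slightly noisy observation
print("recognized action:", recognize(observed, candidates))
```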

IROS 2012 Conference Paper

Self-discovery of motor primitives and learning grasp affordances

  • Emre Ugur
  • Erol Sahin
  • Erhan Öztop

Human infants practice their initial, seemingly random arm movements, transforming them into voluntary reaching and grasping actions. As their perceptual abilities develop, infants further explore their environment using the behavior repertoire they have developed, and learn causality relations in the form of affordances, which they use for goal satisfaction and motor planning. This study proposes and implements a developmental progression on a robotic system mimicking the aforementioned infant development stages: an anthropomorphic robot hand, with one basic action of swing-hand and the palmar reflex (i.e., the enclosure of the fingers upon contact) at its disposal, executes the swing-hand action targeted at a salient object with different hand speeds. During the executions, it monitors the changes in its sensors, automatically forming behavior primitives such as ‘grasp’, ‘hit’, ‘carry-object’ and ‘drop’ by segmenting and differentiating the initial swing-hand action. The study then focuses on one of these behaviors, namely grasping, and shows how further practice allows the robot to learn affordances of more complex objects, which can then be used to make plans to achieve desired goals using the discovered behavior repertoire.
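
Loosely following the self-discovery idea, the sketch below segments a single execution into primitives at contact-event boundaries of a synthetic touch signal; the sensor trace and the segment labels are invented for illustration.

```python
import numpy as np

touch = np.array([0, 0, 0, 1, 1, 1, 1, 0, 0])     # palmar contact sensor
events = np.flatnonzero(np.diff(touch) != 0) + 1  # timesteps where contact flips

segments = np.split(np.arange(len(touch)), events)
labels = ["reach", "carry-object", "drop"]        # hypothetical primitive names
for label, seg in zip(labels, segments):
    print(f"{label}: timesteps {seg[0]}..{seg[-1]}")
```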

ICRA 2011 Conference Paper

Going beyond the perception of affordances: Learning how to actualize them through behavioral parameters

  • Emre Ugur
  • Erhan Öztop
  • Erol Sahin

In this paper, we propose a method that enables a robot to learn, in an unsupervised way, not only the existence of affordances provided by objects, but also the behavioral parameters required to actualize them and the effects they generate on the objects. In a previous study, it was shown that through self-interaction and self-observation, analogous to an infant, an anthropomorphic robot can learn object affordances in a completely unsupervised way and use this knowledge to make plans in its perceptual space. This paper extends the affordance model proposed in that study by using parametric behaviors and including the behavior parameters in affordance learning and goal-oriented plan generation. Furthermore, to handle complex behaviors and complex objects (such as executing a precision grasp on a mug), the perceptual processing is improved by using a combination of local and global features. Finally, a hierarchical clustering algorithm is used to discover the affordances in the non-homogeneous feature space. In short, object affordances for object manipulation are discovered together with behavior parameters based on the monitored effects.
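
The clustering step can be sketched with synthetic effect features: hierarchical clustering over toy two-channel effect vectors, with a per-channel rescaling as a crude stand-in for handling the non-homogeneous feature space.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(3)
# Toy effects of a parametric action: [position change, contact change].
effects = np.vstack([
    rng.normal([0.0, 0.0], 0.05, size=(20, 2)),  # no change -> "miss"
    rng.normal([0.0, 1.0], 0.05, size=(20, 2)),  # contact   -> "grasp"
    rng.normal([1.0, 0.0], 0.05, size=(20, 2)),  # moved     -> "push"
])
effects /= effects.std(axis=0)                   # rescale mixed channels
labels = AgglomerativeClustering(n_clusters=3).fit_predict(effects)
print("discovered effect category sizes:", np.bincount(labels))
```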

ICRA 2011 Conference Paper

Unsupervised learning of object affordances for planning in a mobile manipulation platform

  • Emre Ugur
  • Erol Sahin
  • Erhan Öztop

In this paper, we use the notion of affordances, proposed in cognitive science, as a framework for a developmental method that enables a robot to ground symbolic planning mechanisms in its continuous sensory-motor experience. We propose a method that allows a robot to learn the symbolic relations that pertain to its interactions with the world and show that they can be used in planning. Specifically, the robot interacts with the objects in its environment using a pre-coded repertoire of behaviors and records each interaction as a triple that consists of the initial percept of the object, the behavior applied, and its effect, defined as the difference between the initial and the final percept. The method allows the robot to learn object affordance relations which can be used to predict the change in the percept of the object when a certain behavior is applied. These relations can then be used to develop plans using forward chaining. The method is implemented and evaluated on a mobile robot system with limited object manipulation capabilities. We show that the robot is able to learn the physical affordances of objects from range images and use them to build symbols and relations that can be used in making multi-step predictions about the affordances of objects and in achieving complex goals.
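
A minimal sketch of planning by forward chaining over learned (percept, behavior, effect) relations; here a hand-coded effect model over a one-dimensional toy percept replaces the learned affordance relations.

```python
def plan(start, goal, effect_model, behaviors, max_depth=5):
    """Breadth-first forward chaining: apply each behavior's predicted
    effect to the current percept until the goal percept is reached."""
    frontier = [(start, [])]
    for _ in range(max_depth):
        next_frontier = []
        for percept, steps in frontier:
            if percept == goal:
                return steps
            for b in behaviors:
                next_frontier.append((effect_model(percept, b), steps + [b]))
        frontier = next_frontier
    return None

# Toy percept: object distance in coarse bins; behaviors shift the bin.
shifts = {"approach": -1, "retreat": +1}
model = lambda p, b: max(0, p + shifts[b])
print(plan(start=3, goal=0, effect_model=model, behaviors=list(shifts)))
```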

ICRA 2008 Conference Paper

Using learned affordances for robotic behavior development

  • Mehmet Remzi Dogar
  • Emre Ugur
  • Erol Sahin
  • Maya Cakmak

"Developmental robotics" proposes that, instead of trying to build a robot that shows intelligence once and for all, what one must do is to build robots that can develop. These robots should be equipped with behaviors that are simple but enough to bootstrap the system. Then, as the robot interacts with its environment, it should display increasingly complex behaviors. In this paper, we propose such a development scheme for a mobile robot. J. J. Gibson's concept of "affordances" and a formalization of this concept provides the basis of this development scheme. We show that an autonomous robot can start with pre-coded primitive behaviors, and as it executes its behaviors randomly in an environment, it can learn the affordance relations between the environment and its behaviors. We then present two ways of using these learned structures, in achieving more complex, intentional behaviors. In the first case, the sequencing of these primitive behaviors are such that new more complex behaviors emerge. In the second case, the robot makes a "blending" of its pre-coded primitive behaviors to create new behaviors.

IROS 2007 Conference Paper

From primitive behaviors to goal-directed behavior using affordances

  • Mehmet Remzi Dogar
  • Maya Cakmak
  • Emre Ugur
  • Erol Sahin

In this paper, we studied how a mobile robot equipped with a 3D laser scanner can start from primitive behaviors and learn to use them to achieve goal-directed behaviors. For this purpose, we propose a learning scheme based on the concept of "affordances", where the robot first learns about the different kinds of effects it can create in the environment and then links these effects with the perception of the initial environment and the executed primitive behavior. It uses these learned relations to create certain effects in the environment and achieve more complex behaviors.

ICRA 2007 Conference Paper

The learning and use of traversability affordance using range images on a mobile robot

  • Emre Ugur
  • Mehmet Remzi Dogar
  • Maya Cakmak
  • Erol Sahin

We are interested in how the concept of affordances can affect our view of autonomous robot control, and how the results obtained from autonomous robotics can be reflected back upon the discussion and studies on the concept of affordances. In this paper, we studied how a mobile robot, equipped with a 3D laser scanner, can learn to perceive the traversability affordance and use it to wander in a room filled with spheres, cylinders and boxes. The results showed that after learning, the robot can wander around avoiding contact with non-traversable objects (i.e., boxes, upright cylinders, or lying cylinders in certain orientations), but moving over traversable objects (such as spheres, and lying cylinders in a rollable orientation with respect to the robot), rolling them out of its way. We have shown that for each action approximately 1% of the perceptual features were relevant to determine whether the action is afforded or not, and that these relevant features are positioned in certain regions of the range image. The experiments are conducted both using a physics-based simulator and on a real robot.
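
The relevance finding can be illustrated (though this is not the paper's feature-selection procedure) with a quick sketch: a linear SVM is trained on synthetic "range image" features and its largest-magnitude weights are read off as the relevant features. The data and dimensionality are invented.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 200))            # 200 synthetic range-image features
y = (X[:, 10] - X[:, 55] > 0).astype(int)  # only two features actually matter

clf = LinearSVC(C=0.1, max_iter=5000).fit(X, y)
weights = np.abs(clf.coef_.ravel())
top = np.argsort(weights)[::-1][:5]
print("most relevant feature indices:", top)
```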