Arrow Research search

Author name cluster

Nutan Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
2 author rows

Possible papers (7)

ICRA 2025 · Conference Paper

LIMT: Language-Informed Multi-Task Visual World Models

  • Elie Aljalbout
  • Nikolaos Sotirakis
  • Patrick van der Smagt
  • Maximilian Karl
  • Nutan Chen

Most recent successes in robot reinforcement learning involve learning a specialized single-task agent. However, robots capable of performing multiple tasks can be much more valuable in real-world applications. Multi-task reinforcement learning can be very challenging due to the increased sample complexity and the potentially conflicting task objectives. Previous work on this topic is dominated by model-free approaches. The latter can be very sample inefficient even when learning specialized single-task agents. In this work, we focus on model-based multi-task reinforcement learning. We propose a method for learning multi-task visual world models, leveraging pre-trained language models to extract semantically meaningful task representations. These representations are used by the world model and policy to reason about task similarity in dynamics and behavior. Our results highlight the benefits of using language-driven task representations for world models and a clear advantage of model-based multi-task learning over the more common model-free paradigm.
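The core mechanism described above, conditioning a shared world model on a frozen language-model embedding of the task description, can be sketched as follows. This is a minimal illustration, not the paper's implementation; the task strings, embedding size, and lookup are all hypothetical stand-ins for a real pre-trained language model.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical frozen language-model embeddings, one per task description.
# In LIMT these would come from a pre-trained language model, not random vectors.
task_embeddings = {
    "pick up the red block": rng.normal(size=8),
    "open the drawer": rng.normal(size=8),
}

def world_model_input(state, task):
    """Condition the shared world model on the task via its language embedding."""
    return np.concatenate([state, task_embeddings[task]])

state = rng.normal(size=4)                      # toy 4-D environment state
x = world_model_input(state, "open the drawer")
print(x.shape)                                  # state dims + embedding dims
```

Because semantically similar task descriptions receive nearby embeddings, the world model and policy can share dynamics and behavior across related tasks.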

ICML 2020 · Conference Paper

Learning Flat Latent Manifolds with VAEs

  • Nutan Chen
  • Alexej Klushyn
  • Francesco Ferroni
  • Justin Bayer
  • Patrick van der Smagt

Measuring the similarity between data points often requires domain knowledge, which can in parts be compensated by relying on unsupervised methods such as latent-variable models, where similarity/distance is estimated in a more compact latent space. Prevalent is the use of the Euclidean metric, which has the drawback of ignoring information about similarity of data stored in the decoder, as captured by the framework of Riemannian geometry. We propose an extension to the framework of variational auto-encoders that allows learning flat latent manifolds, where the Euclidean metric is a proxy for the similarity between data points. This is achieved by defining the latent space as a Riemannian manifold and by regularising the metric tensor to be a scaled identity matrix. Additionally, we replace the compact prior typically used in variational auto-encoders with a recently presented, more expressive hierarchical one, and formulate the learning problem as a constrained optimisation problem. We evaluate our method on a range of data-sets, including a video-tracking benchmark, where the performance of our unsupervised approach nears that of state-of-the-art supervised approaches, while retaining the computational efficiency of straight-line-based approaches.
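The flatness idea in the abstract can be sketched in standard notation; the symbols below are assumptions for illustration, not quoted from the paper:

```latex
% Pulled-back (Riemannian) metric of the decoder f at latent code z:
G(\mathbf{z}) = J_f(\mathbf{z})^{\top} J_f(\mathbf{z}),
\qquad J_f(\mathbf{z}) = \frac{\partial f(\mathbf{z})}{\partial \mathbf{z}}
% Flatness: regularise G(z) towards a scaled identity, so that Euclidean
% distances in latent space approximate geodesic distances on the manifold:
\mathcal{L}_{\mathrm{flat}} =
\mathbb{E}_{\mathbf{z}}\left[ \left\| G(\mathbf{z}) - c^{2} I \right\|_{F}^{2} \right]
```

When the metric tensor is (approximately) a scaled identity everywhere, straight lines in latent space are (approximately) geodesics, which is what makes the cheap Euclidean distance a faithful similarity measure.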

NeurIPS 2019 · Conference Paper

Learning Hierarchical Priors in VAEs

  • Alexej Klushyn
  • Nutan Chen
  • Richard Kurle
  • Botond Cseke
  • Patrick van der Smagt

We propose to learn a hierarchical prior in the context of variational autoencoders to avoid the over-regularisation resulting from a standard normal prior distribution. To incentivise an informative latent representation of the data, we formulate the learning problem as a constrained optimisation problem by extending the Taming VAEs framework to two-level hierarchical models. We introduce a graph-based interpolation method, which shows that the topology of the learned latent representation corresponds to the topology of the data manifold, and present several examples where desired properties of the latent representation, such as smoothness and simple explanatory factors, are learned by the prior.
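One common reading of the constrained formulation mentioned above (in the style of Taming VAEs/GECO) is the following; the exact constraint and symbols are assumptions, not quoted from the paper:

```latex
% Minimise the KL term subject to a reconstruction-quality constraint kappa,
% handled with a Lagrange multiplier lambda >= 0:
\max_{\lambda \ge 0}\; \min_{\theta,\phi}\;
\mathrm{KL}\!\left( q_{\phi}(\mathbf{z} \mid \mathbf{x}) \,\|\, p_{\theta}(\mathbf{z}) \right)
+ \lambda \left( \mathbb{E}_{q_{\phi}}\!\left[ -\log p_{\theta}(\mathbf{x} \mid \mathbf{z}) \right] - \kappa \right)
```

Enforcing reconstruction quality as a hard constraint, rather than a fixed weight in the loss, is what counteracts the over-regularisation a standard normal prior tends to cause.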

IROS 2018 · Conference Paper

Active Learning based on Data Uncertainty and Model Sensitivity

  • Nutan Chen
  • Alexej Klushyn
  • Alexandros Paraschos
  • Djalel Benbouzid
  • Patrick van der Smagt

Robots can rapidly acquire new skills from demonstrations. However, during generalisation of skills or transitioning across fundamentally different skills, it is unclear whether the robot has the necessary knowledge to perform the task. Failing to detect missing information often leads to abrupt movements or to collisions with the environment. Active learning can quantify the uncertainty of performing the task and, in general, locate regions of missing information. We introduce a novel algorithm for active learning and demonstrate its utility for generating smooth trajectories. Our approach is based on deep generative models and metric learning in latent spaces. It relies on the Jacobian of the likelihood to detect non-smooth transitions in the latent space, i.e., transitions that lead to abrupt changes in the movement of the robot. When non-smooth transitions are detected, our algorithm asks for an additional demonstration from that specific region. The newly acquired knowledge modifies the data manifold and allows for learning a latent representation for generating smooth movements. We demonstrate the efficacy of our approach on generalising elementary skills, transitioning across different skills, and implicitly avoiding collisions with the environment. For our experiments, we use a simulated pendulum where we observe its motion from images and a 7-DoF anthropomorphic arm.
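The query criterion described above, flag latent regions where the decoder's Jacobian blows up, can be sketched on a toy one-dimensional latent space. The decoder, threshold, and path below are illustrative assumptions, not the paper's models:

```python
import numpy as np

def decoder(z):
    """Toy decoder: smooth everywhere except near z = 0.5,
    where the output changes abruptly (a non-smooth transition)."""
    return np.array([np.tanh(20.0 * (z - 0.5)), z ** 2])

def jacobian_norm(f, z, eps=1e-5):
    """Central finite-difference estimate of ||df/dz|| at a scalar latent z."""
    return np.linalg.norm((f(z + eps) - f(z - eps)) / (2 * eps))

def flag_nonsmooth(f, path, threshold=5.0):
    """Return latent points on the path whose Jacobian norm exceeds the
    threshold; these are the regions where the active learner would ask
    for an additional demonstration."""
    return [z for z in path if jacobian_norm(f, z) > threshold]

path = np.linspace(0.0, 1.0, 101)   # straight-line interpolation in latent space
queries = flag_nonsmooth(decoder, path)
print(f"{len(queries)} query points, centred near {np.mean(queries):.2f}")
```

The flagged points cluster around z = 0.5, exactly where a straight-line latent interpolation would produce an abrupt robot movement.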

IROS 2016 · Conference Paper

Stable reinforcement learning with autoencoders for tactile and visual data

  • Herke van Hoof
  • Nutan Chen
  • Maximilian Karl
  • Patrick van der Smagt
  • Jan Peters 0001

For many tasks, tactile or visual feedback is helpful or even crucial. However, designing controllers that take such high-dimensional feedback into account is non-trivial. Therefore, robots should be able to learn tactile skills through trial and error by using reinforcement learning algorithms. The input domain for such tasks, however, might include strongly correlated or irrelevant dimensions, making it hard to specify a suitable metric on such domains. Auto-encoders specialize in finding compact representations, where defining such a metric is likely to be easier. Therefore, we propose a reinforcement learning algorithm that can learn non-linear policies in continuous state spaces, which leverages representations learned using auto-encoders. We first evaluate this method on a simulated toy task with visual input. Then, we validate our approach on a real-robot tactile stabilization task.
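The representation-learning step this abstract relies on can be illustrated with a linear autoencoder (equivalent to PCA via the SVD) standing in for the paper's non-linear auto-encoder; the data, dimensions, and noise level below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# High-dimensional observations driven by a 2-D underlying state, with
# strongly correlated dimensions mixed in -- the setting the abstract describes.
state = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 50))
obs = state @ mixing + 0.01 * rng.normal(size=(500, 50))

# Linear autoencoder via SVD (a stand-in for a non-linear auto-encoder):
# the top right-singular vectors give an encoder, and their transpose a decoder,
# minimising reconstruction error.
mean = obs.mean(axis=0)
_, _, vt = np.linalg.svd(obs - mean, full_matrices=False)
encode = lambda x: (x - mean) @ vt[:2].T   # 50-D observation -> 2-D code
decode = lambda z: z @ vt[:2] + mean

codes = encode(obs)
err = np.mean((decode(codes) - obs) ** 2)
print(f"reconstruction MSE in the 2-D latent space: {err:.6f}")
# A policy (or a distance metric inside an RL algorithm) would now operate
# on `codes` instead of the raw 50-D observations.
```

Because the 2-D codes discard the redundant, correlated directions, a Euclidean metric on them is far more meaningful than one on the raw observations, which is the property the RL algorithm exploits.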

IROS 2015 · Conference Paper

Measuring fingertip forces from camera images for random finger poses

  • Nutan Chen
  • Sebastian Urban
  • Justin Bayer
  • Patrick van der Smagt

Robust fingertip force detection from fingernail images is a critical capability with applications in many areas. However, prior research fixed many of the variables that influence the color change of the finger. This paper analyzes the effect of the finger joints on force detection in order to relax the constrained finger-pose setting. A force estimator is designed: a model is trained to predict the fingertip force from the finger joints, measured with 2D cameras and 3 rectangular markers, in combination with the fingernail images. The error caused by the color changes of joint bending can then be avoided. This is a significant step forward from force estimators that require a tedious fixed finger-joint setup. The approach is evaluated experimentally; the results show that it increases force-estimation accuracy by over 10% when the finger joints move freely. The estimator is demonstrated by lifting and replacing objects of various weights.

ICRA 2014 · Conference Paper

Estimating finger grip force from an image of the hand using Convolutional Neural Networks and Gaussian processes

  • Nutan Chen
  • Sebastian Urban
  • Christian Osendorfer
  • Justin Bayer
  • Patrick van der Smagt

Estimating human fingertip forces is required to understand force distribution in grasping and manipulation. Human grasping behavior can then be used to develop force- and impedance-based grasping and manipulation strategies for robotic hands. However, estimating human grip force naturally is only possible with instrumented objects or unnatural gloves, thus greatly limiting the type of objects used. In this paper we describe an approach which uses images of the human fingertip to reconstruct grip force and torque at the finger. Our approach does not use finger-mounted equipment, but instead a steady camera observing the fingers of the hand from a distance. This allows for finger force estimation without any physical interference with the hand or object itself, and is therefore universally applicable. We construct a 3-dimensional finger model from 2D images. Convolutional Neural Networks (CNNs) are used to predict the transformation matrix from the 2D image to the 3D model. Two CNN variants are designed, producing orientation and position as separate or combined outputs. After learning, our system shows an alignment accuracy of over 98% on unknown data. In the final step, a Gaussian process estimates finger force and torque from the aligned images, based on color changes and deformations of the nail and its surrounding skin. Experimental results show an accuracy of about 95% for force estimation and 90% for torque estimation.
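The final Gaussian-process stage of the pipeline can be sketched with a minimal NumPy-only GP regressor. The image-derived features, the stand-in force function, and the kernel hyperparameters below are all hypothetical; the paper's real inputs are color-change and deformation statistics from aligned fingertip images:

```python
import numpy as np

def rbf_kernel(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel between two sets of feature vectors."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / length ** 2)

def gp_predict(x_train, y_train, x_test, noise=1e-4):
    """Posterior mean of a Gaussian-process regressor."""
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_star = rbf_kernel(x_test, x_train)
    return k_star @ np.linalg.solve(k, y_train)

rng = np.random.default_rng(1)
# Hypothetical 3-D features extracted from aligned fingertip images
# (e.g. colour-change statistics of the nail and surrounding skin).
features = rng.uniform(-1, 1, size=(40, 3))
w = np.array([1.0, 0.5, -0.8])
force = np.sin(features @ w)        # stand-in for the measured fingertip force

test_feats = rng.uniform(-1, 1, size=(5, 3))
pred = gp_predict(features, force, test_feats)
mae = np.abs(pred - np.sin(test_feats @ w)).mean()
print("mean absolute error:", mae)
```

A GP is a natural fit here: with only tens of calibration grasps, it gives smooth force predictions plus (via the posterior variance, omitted above) an uncertainty estimate.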