Arrow Research

Author name cluster

Arnau Ramisa

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
2 author rows

Possible papers (7)

IJCAI 2017 Conference Paper

Multimodal News Article Analysis

  • Arnau Ramisa

The intersection of Computer Vision and Natural Language Processing has been a hot topic of research in recent years, with results that were unthinkable only a few years ago. In view of this progress, we want to highlight online news articles as a potential next step for this area of research. The rich interrelations of text, tags, images or videos, as well as a vast corpus of general knowledge, are an exciting benchmark for high-capacity models such as deep neural networks. In this paper we present a series of tasks and baseline approaches to leverage corpora such as the BreakingNews dataset.

ICRA 2015 Conference Paper

Efficient monocular pose estimation for complex 3D models

  • Antonio Rubio 0001
  • Michael Villamizar
  • Luis Ferraz
  • Adrián Peñate Sánchez
  • Arnau Ramisa
  • Edgar Simo-Serra
  • Alberto Sanfeliu
  • Francesc Moreno-Noguer

We propose a robust and efficient method to estimate the pose of a camera with respect to complex 3D textured models of the environment that can potentially contain more than 100,000 points. To tackle this problem we follow a top-down approach where we combine high-level deep network classifiers with low-level geometric approaches to come up with a solution that is fast, robust and accurate. Given an input image, we initially use a pre-trained deep network to compute a rough estimation of the camera pose. This initial estimate constrains the number of 3D model points that can be seen from the camera viewpoint. We then establish 3D-to-2D correspondences between these potentially visible points of the model and the 2D features detected in the image. Accurate pose estimation is finally obtained from these correspondences using a novel PnP algorithm that rejects outliers without the need for a RANSAC strategy, and which is between 10 and 100 times faster than other methods that use it. Two real experiments dealing with very large and complex 3D models demonstrate the effectiveness of the approach.
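
As an illustration of the visibility-filtering step described above, the minimal sketch below projects model points with a rough camera pose and keeps those that fall inside the image, i.e. the candidates for the subsequent 3D-to-2D matching. This is a hedged sketch, not the paper's implementation: the function name, the pinhole projection with intrinsics K, and the synthetic point cloud are assumptions; the pre-trained pose network and the RANSAC-free PnP solver are not shown.

```python
import numpy as np

def visible_points(points_3d, K, R, t, width, height):
    """Hypothetical helper: keep model points that project inside the image
    for a rough pose (R, t), e.g. the estimate from a pose classifier."""
    # Transform points into the camera frame.
    cam = points_3d @ R.T + t
    in_front = cam[:, 2] > 0
    # Pinhole projection x = K [R|t] X.
    proj = cam @ K.T
    uv = proj[:, :2] / proj[:, 2:3]
    in_image = (
        (uv[:, 0] >= 0) & (uv[:, 0] < width) &
        (uv[:, 1] >= 0) & (uv[:, 1] < height)
    )
    keep = in_front & in_image
    return points_3d[keep], uv[keep]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.uniform(-2, 2, size=(100_000, 3)) + np.array([0, 0, 5.0])
    K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
    kept, _ = visible_points(pts, K, np.eye(3), np.zeros(3), 640, 480)
    print(f"{len(kept)} of {len(pts)} model points potentially visible")
```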

IROS 2013 Conference Paper

FINDDD: A fast 3D descriptor to characterize textiles for robot manipulation

  • Arnau Ramisa
  • Guillem Alenyà
  • Francesc Moreno-Noguer
  • Carme Torras

Most current depth sensors provide 2.5D range images in which depth values are assigned to a rectangular 2D array. In this paper we take advantage of this structured information to build an efficient shape descriptor which is about two orders of magnitude faster than competing approaches, while showing similar performance in several tasks involving deformable object recognition. Given a 2D patch surrounding a point and its associated depth values, we build the descriptor for that point based on the cumulative distances between the normals in the patch and a discrete set of normal directions. This processing is made very efficient using integral images, even allowing descriptors to be computed for every range image pixel in a few seconds. The discriminative power of our descriptor, dubbed FINDDD, is evaluated in three different scenarios: recognition of specific cloth wrinkles, instance recognition from geometry alone, and detection of reliable and informed grasping points.
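
The efficiency claim in the abstract rests on integral images over per-bin normal votes, so a whole-patch histogram costs only four lookups per bin. Below is a rough, hedged sketch of that idea in NumPy; the cosine-similarity voting, the binning, and the lack of normalisation or sub-cell structure are illustrative assumptions rather than the exact FINDDD formulation.

```python
import numpy as np

def finddd_like_descriptors(normals, bin_dirs, half_size):
    """Sketch of the integral-image trick behind FINDDD-style descriptors.

    normals  : (H, W, 3) unit surface normals from a 2.5D range image.
    bin_dirs : (B, 3) discrete set of normal directions (assumed binning).
    Returns a (H, W, B) array: per-pixel, per-bin accumulated similarity
    over the surrounding (2 * half_size + 1)^2 patch.
    """
    H, W, _ = normals.shape
    # Per-pixel contribution to each bin (cosine similarity, clipped at 0).
    votes = np.clip(normals.reshape(-1, 3) @ bin_dirs.T, 0, None).reshape(H, W, -1)
    # Integral image per bin: patch sums become four lookups instead of a loop.
    integral = np.zeros((H + 1, W + 1, votes.shape[2]))
    integral[1:, 1:] = votes.cumsum(0).cumsum(1)
    ys = np.clip(np.arange(H)[:, None] + np.array([-half_size, half_size + 1]), 0, H)
    xs = np.clip(np.arange(W)[:, None] + np.array([-half_size, half_size + 1]), 0, W)
    desc = (integral[ys[:, 1]][:, xs[:, 1]] - integral[ys[:, 0]][:, xs[:, 1]]
            - integral[ys[:, 1]][:, xs[:, 0]] + integral[ys[:, 0]][:, xs[:, 0]])
    return desc

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = rng.normal(size=(120, 160, 3))
    n /= np.linalg.norm(n, axis=2, keepdims=True)
    dirs = np.eye(3)  # three axis-aligned bin directions, purely illustrative
    print(finddd_like_descriptors(n, dirs, half_size=8).shape)  # (120, 160, 3)
```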

ICRA 2012 Conference Paper

Using depth and appearance features for informed robot grasping of highly wrinkled clothes

  • Arnau Ramisa
  • Guillem Alenyà
  • Francesc Moreno-Noguer
  • Carme Torras

Detecting grasping points is a key problem in cloth manipulation. Most current approaches follow a multiple re-grasp strategy for this purpose, in which clothes are sequentially grasped from different points until one of them yields a desired configuration. In this paper, by contrast, we circumvent the need for multiple re-graspings by building a robust detector that identifies the grasping points, generally in one single step, even when clothes are highly wrinkled. In order to handle the large variability a deformed cloth may have, we build a Bag of Features based detector that combines appearance and 3D geometry features. An image is scanned using a sliding window with a linear classifier, and the candidate windows are refined using a non-linear SVM and a “grasp goodness” criterion to select the best grasping point. We demonstrate our approach by detecting collars in deformed polo shirts using a Kinect camera. Experimental results show good performance of the proposed method, not only in identifying the same trained textile object part under severe deformations and occlusions, but also in detecting the corresponding part in other clothes, exhibiting a degree of generalization.
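
A minimal sketch of the first, linear stage of such a detector is shown below: bag-of-features histograms are pooled over sliding windows and scored with a linear classifier to produce candidates. All names and the pooling scheme are illustrative assumptions; the non-linear SVM refinement and the “grasp goodness” criterion from the paper are only noted in comments.

```python
import numpy as np

def scan_candidates(bof_maps, w, b, window, stride, top_k=20):
    """Hedged sketch of a sliding-window linear scoring stage.

    bof_maps : (H, W, V) per-cell bag-of-features counts (appearance and
               depth words pooled into V visual-word bins); illustrative.
    w, b     : weights and bias of a linear classifier trained on grasp-point
               windows (training not shown here).
    Returns the top_k window positions by linear score; in the paper these
    candidates are then re-scored with a non-linear SVM and a grasp-goodness
    criterion, which this sketch omits.
    """
    H, W, _ = bof_maps.shape
    scores, positions = [], []
    for y in range(0, H - window + 1, stride):
        for x in range(0, W - window + 1, stride):
            h = bof_maps[y:y + window, x:x + window].sum(axis=(0, 1))
            h = h / max(h.sum(), 1e-9)  # L1-normalised window histogram
            scores.append(float(h @ w + b))
            positions.append((y, x))
    order = np.argsort(scores)[::-1][:top_k]
    return [(positions[i], scores[i]) for i in order]
```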

ECAI 2008 Conference Paper

Learning to Select Object Recognition Methods for Autonomous Mobile Robots

  • Reinaldo A. C. Bianchi
  • Arnau Ramisa
  • Ramón López de Mántaras

Selecting which algorithms should be used by a mobile robot computer vision system is a decision usually made a priori by the system developer, based on past experience and intuition, without systematically taking into account information that can be found in the images and in the visual process itself to learn, at execution time, which algorithm should be used. This paper presents a method that uses Reinforcement Learning to decide which algorithm should be used to recognize objects seen by a mobile robot in an indoor environment, based on simple attributes extracted on-line from the images, such as mean intensity and intensity deviation. Two state-of-the-art object recognition algorithms can be selected: the constellation method proposed by Lowe, together with its interest point detector and descriptor (the Scale-Invariant Feature Transform), and a bag-of-features approach. A set of empirical evaluations was conducted using a household mobile robot image database, and the results obtained show that the approach adopted here is very promising.
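
To make the selection mechanism concrete, here is a hedged sketch of a tabular learner that chooses between the two recognisers from discretised image statistics. The state binning, the single-step (bandit-like) update, and the 0/1 reward are assumptions for illustration, not the paper's exact reinforcement learning setup.

```python
import numpy as np

# Hedged sketch: a tabular learner that picks a recogniser from coarse image
# statistics. Binning, reward, and epsilon are illustrative assumptions.
ACTIONS = ["lowe_sift_constellation", "bag_of_features"]
N_BINS = 8
Q = np.zeros((N_BINS, N_BINS, len(ACTIONS)))

def state_from_image(gray):
    """Discretise mean intensity and intensity deviation (8-bit image assumed)."""
    mean_bin = min(int(gray.mean() / 256 * N_BINS), N_BINS - 1)
    std_bin = min(int(gray.std() / 128 * N_BINS), N_BINS - 1)
    return mean_bin, std_bin

def choose_action(state, epsilon=0.1, rng=np.random.default_rng()):
    # Epsilon-greedy choice between the two recognition algorithms.
    if rng.random() < epsilon:
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(Q[state]))

def update(state, action, reward, alpha=0.1):
    # Single-step update: reward is assumed 1 if the chosen recogniser
    # found the object in the image, 0 otherwise.
    Q[state][action] += alpha * (reward - Q[state][action])
```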

ICRA 2008 Conference Paper

Mobile robot localization using panoramic vision and combinations of feature region detectors

  • Arnau Ramisa
  • Adriana Tapus
  • Ramón López de Mántaras
  • Ricardo Toledo

This paper presents a vision-based approach for mobile robot localization. The environmental model is topological. The new approach uses a constellation of different types of affine covariant regions to characterize a place. This type of representation permits reliable and distinctive environment modeling. The performance of the proposed approach is evaluated using a database of panoramic images from different rooms. Additionally, we compare different combinations of complementary feature region detectors to find the one that achieves the best results. Our experiments show promising results for this new localization method. Moreover, similarly to what happens with single detectors, different combinations exhibit different strengths and weaknesses depending on the situation, suggesting that a context-aware method to combine the different detectors would improve the localization results.
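
A minimal sketch of the place-recognition step is given below: descriptors from the query panorama are matched against each room's reference descriptors and the room with the most accepted matches wins. The nearest-neighbour matching with a ratio test, and the function and variable names, are assumptions for illustration; the constellation (geometric) verification and the specific affine covariant region detectors are not modelled.

```python
import numpy as np

def locate_room(query_desc, room_descs, ratio=0.8):
    """Sketch of topological localization by local-feature matching.

    query_desc : (N, D) descriptors extracted from the query panorama.
    room_descs : dict mapping room name -> (M, D) reference descriptors
                 (M >= 2 assumed, so the ratio test is well defined).
    Returns the room with the most accepted matches and the vote counts.
    """
    votes = {}
    for room, ref in room_descs.items():
        # Pairwise distances between query and reference descriptors.
        d = np.linalg.norm(query_desc[:, None, :] - ref[None, :, :], axis=2)
        two_best = np.sort(d, axis=1)[:, :2]
        # Lowe-style ratio test: accept a match only if the best distance is
        # clearly smaller than the second best.
        accepted = two_best[:, 0] < ratio * two_best[:, 1]
        votes[room] = int(accepted.sum())
    return max(votes, key=votes.get), votes
```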