Arrow Research search

Author name cluster

Patrick Mania

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers

Possible papers (6)

ICRA 2025 · Conference Paper

Towards Autonomous Verification: Integrating Cognitive AI and Semantic Digital Twins in Medical Robotics

  • Patrick Mania
  • Michael Neumann
  • Franklin Kenghagho Kenfack
  • Michael Beetz

In medical laboratory environments, where precision and safety are critical, the deployment of autonomous robots requires not only accurate object manipulation but also the ability to verify task success to comply with regulatory requirements. This paper introduces a novel imagination-enabled perception framework that integrates cognitive AI with semantic digital twins to allow medical robots to simulate task outcomes, compare them with real-world results, and autonomously verify the success of their actions. Our approach addresses challenges related to handling small and transparent objects commonly found in sterility testing kits and other related consumables. By enhancing the RoboKudo perception system with parthood-based reasoning, we enable more accurate task verification through focused attention on object subparts. Experiments show that our system significantly improves performance compared to traditional object-centric methods, increasing accuracy in complex environments without the need for extensive retraining. This work demonstrates a novel concept in making robotic systems more adaptable and reliable for critical tasks in medical laboratories.
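The verification loop the abstract sketches (imagine the expected outcome, observe the real scene, compare the two at the level of object subparts) can be illustrated with a minimal Python sketch. All names below, such as SubPart and verify_task, and the 1 cm tolerance are hypothetical stand-ins, not the paper's actual framework:

# Minimal sketch of verify-by-imagination: a task counts as verified
# only if every expected subpart is observed close to where the
# simulated outcome placed it. Names and tolerance are illustrative.
from dataclasses import dataclass
import math

@dataclass
class SubPart:
    name: str
    position: tuple  # (x, y, z) in meters

def pose_error(expected: SubPart, observed: SubPart) -> float:
    """Euclidean distance between expected and observed positions."""
    return math.dist(expected.position, observed.position)

def verify_task(expected_parts, observed_parts, tol=0.01):
    observed_by_name = {p.name: p for p in observed_parts}
    for exp in expected_parts:
        obs = observed_by_name.get(exp.name)
        if obs is None or pose_error(exp, obs) > tol:
            return False, exp.name  # which subpart failed verification
    return True, None

# Example: checking that a cap ended up on a sterility-test vial.
expected = [SubPart("vial_body", (0.50, 0.20, 0.100)),
            SubPart("vial_cap", (0.50, 0.20, 0.145))]
observed = [SubPart("vial_body", (0.501, 0.199, 0.100)),
            SubPart("vial_cap", (0.500, 0.201, 0.146))]
ok, failed = verify_task(expected, observed)
print("verified" if ok else f"failed at subpart: {failed}")

Comparing at the subpart level rather than per whole object mirrors the parthood-based reasoning the abstract credits for the accuracy gain.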

ICRA 2024 · Conference Paper

An Open and Flexible Robot Perception Framework for Mobile Manipulation Tasks

  • Patrick Mania
  • Simon Stelter
  • Gayane Kazhoyan
  • Michael Beetz

In recent years, powerful methods for solving specific perception problems such as object detection, pose estimation, or scene understanding have been developed. While performing mobile manipulation actions, a robot's perception framework needs to execute a series of these methods in a specific sequence each time it receives a new perception task. Generating proficient combinations of vision methods to solve individual perception tasks remains a challenge, as the combination depends on the requirements of the task and the capabilities of the robot's hardware. In this paper, we propose RoboKudo, an open-source knowledge-enabled perception framework that leverages the strengths of the Unstructured Information Management (UIM) principle and the flexibility of Behavior Trees to model task-specific perception processes. The framework can combine state-of-the-art computer vision methods to satisfy the requirements of each perception task and scales to different robot platforms. The generality and effectiveness of the framework are evaluated in real-world experiments where it solves various perception tasks in the context of mobile manipulation actions in a household domain. Code and additional material are available at https://robokudo.ai.uni-bremen.de/rkop.
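A rough idea of how a behavior tree can chain vision methods over a shared result structure, as the abstract outlines: the sketch below is a toy, and Annotator, Sequence, and the cas blackboard are invented stand-ins rather than RoboKudo's real API.

# Toy behavior-tree Sequence over perception "annotators" that read and
# write a shared blackboard (here a plain dict), echoing the UIM idea.
class Annotator:
    def __init__(self, name, fn):
        self.name, self.fn = name, fn
    def tick(self, cas):
        return self.fn(cas)  # True = SUCCESS, False = FAILURE

class Sequence:
    """Ticks children in order and fails fast on the first failure."""
    def __init__(self, children):
        self.children = children
    def tick(self, cas):
        return all(child.tick(cas) for child in self.children)

def detect(cas):
    # Stand-in for any state-of-the-art detector.
    cas["detections"] = [{"label": "cup", "bbox": (10, 20, 64, 64)}]
    return True

def estimate_pose(cas):
    for d in cas["detections"]:
        d["pose"] = (0.4, 0.1, 0.8)  # stand-in pose result
    return bool(cas["detections"])

cas = {"task": "find a graspable cup"}  # shared analysis structure
pipeline = Sequence([Annotator("detector", detect),
                     Annotator("pose", estimate_pose)])
print("pipeline ok:", pipeline.tick(cas))
print(cas["detections"])

Swapping annotators in and out of such a tree is what lets one framework serve different tasks and robot platforms.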

ICRA 2024 · Conference Paper

Perception through Cognitive Emulation: "A Second Iteration of NaivPhys4RP for Learningless and Safe Recognition and 6D-Pose Estimation of (Transparent) Objects"

  • Franklin Kenghagho Kenfack
  • Michael Neumann
  • Patrick Mania
  • Michael Beetz

In our previous work, we designed NaivPhys4RP, a human-like, white-box, and causal generative model of perception based essentially on cognitive emulation, to understand the past, present, and future state of complex worlds from poor observations. In this paper, as recommended in that previous work, we first refine the theoretical model of NaivPhys4RP in terms of the integration of variables and the perceptual inference tasks to solve. Intuitively, the system is closed under the injection, update, and dependency of variables. We then present a first implementation of NaivPhys4RP that demonstrates learningless and safe recognition and 6D-pose estimation of objects from poor sensor data (e.g., occlusion, transparency, poor depth, in-hand). This not only makes a substantial step forward compared to classical perception systems in perceiving objects in these scenarios, but also escapes the burden of data-intensive learning and operates safely (transparency and causality: we fit sensor data into mentally constructed, meaningful worlds). In the spirit of ChatGPT's ambitions, the system can imagine physico-realistic socio-physical scenes from text and demonstrate understanding of that text, all without data- and resource-intensive learning.
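The analysis-by-synthesis flavor of the approach (fitting sensor data into mentally constructed worlds instead of learning a discriminative model) could be caricatured as a render-and-compare loop. Everything below, including the stand-in render function and the random hypothesis search, is an invented simplification of the idea, not the NaivPhys4RP implementation:

# Learningless pose estimation by hypothesize-render-compare: keep the
# world-state hypothesis whose predicted observation best explains the
# sensor data. All functions are illustrative stand-ins.
import random

def render(pose):
    """Stand-in generative model: predicts an 'observation' from a pose."""
    x, y = pose
    return [x + 0.1, y - 0.1, x * y]

def discrepancy(predicted, observed):
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))

def estimate_pose(observed, n_hypotheses=500, seed=0):
    rng = random.Random(seed)
    best, best_err = None, float("inf")
    for _ in range(n_hypotheses):
        hypothesis = (rng.uniform(0, 1), rng.uniform(0, 1))
        err = discrepancy(render(hypothesis), observed)
        if err < best_err:
            best, best_err = hypothesis, err
    return best, best_err

true_pose = (0.3, 0.7)
observed = render(true_pose)  # pretend sensor reading
pose, err = estimate_pose(observed)
print(f"estimated pose {pose}, residual {err:.4f}")

Because the estimate is obtained through an explicit generative model rather than a trained black box, it stays transparent and causal in the sense the abstract emphasizes.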

IROS 2021 · Conference Paper

Imagination-enabled Robot Perception

  • Patrick Mania
  • Franklin Kenghagho Kenfack
  • Michael Neumann
  • Michael Beetz

Many of today's robot perception systems aim at accomplishing perception tasks that are at once too simplistic and too hard. They are too simplistic because they do not require the perception systems to provide all the information needed to accomplish manipulation tasks. Typically, the perception results do not include information about the part structure of objects, articulation mechanisms, and other attributes needed for adapting manipulation behavior. On the other hand, the perception problems stated are also too hard because, unlike humans, the perception systems cannot leverage expectations about what they will see to their full potential. Therefore, we investigate a variation of robot perception tasks suitable for robots accomplishing everyday manipulation tasks, such as household robots or a robot in a retail store. In such settings it is reasonable to assume that robots know most objects and have detailed models of them. We propose a perception system that maintains its beliefs about its environment as a scene graph with physics simulation and visual rendering. When detecting objects, the perception system retrieves the model of the object and places it at the corresponding place in a VR-based environment model. The physics simulation ensures that object detections that are physically not possible are rejected, and scenes can be rendered to generate expectations at the image level. The result is a perception system that can provide useful information for manipulation tasks.
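The physics-based rejection step can be shown in miniature: a detection enters the scene-graph belief only if a plausibility check passes. The support rule and every name below are illustrative assumptions, not the paper's system:

# Toy belief update: reject detections a physics check deems impossible
# (here, an object hovering above the table with nothing under it).
from dataclasses import dataclass, field

@dataclass
class Detection:
    label: str
    z: float          # height above the table plane, in meters
    supported: bool   # is anything underneath it?

@dataclass
class SceneGraph:
    objects: dict = field(default_factory=dict)

    def physically_plausible(self, det: Detection) -> bool:
        return det.z <= 0.005 or det.supported  # resting or supported

    def update(self, det: Detection) -> bool:
        if not self.physically_plausible(det):
            return False  # detection rejected, belief unchanged
        self.objects[det.label] = det
        return True

scene = SceneGraph()
print(scene.update(Detection("mug", z=0.0, supported=False)))    # True
print(scene.update(Detection("spoon", z=0.30, supported=False))) # False

In the full system a physics engine plays the role of this one-line rule, and the maintained scene can additionally be rendered to predict what the camera should see.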

ICRA 2019 · Conference Paper

A Framework for Self-Training Perceptual Agents in Simulated Photorealistic Environments

  • Patrick Mania
  • Michael Beetz

The development of high-performance perception for mobile robotic agents is still challenging. Learning appropriate perception models usually requires extensive amounts of labeled training data that ideally follows the same distribution as the data an agent will encounter in its target task. Recent developments in the gaming industry have led to game engines able to generate photorealistic environments in real time, which can be used to realistically simulate the sensory input of an agent. We propose a novel framework that allows the definition of different learning scenarios and instantiates them in a high-quality game engine in which a perceptual agent can act and learn. The scenarios are specified in a newly developed scenario description language that allows the parametrization of the virtual environment and the perceptual agent. New scenarios can be sampled from a task-specific object distribution, which allows the automatic generation of extensive amounts of different learning environments for the perceptual agent. We demonstrate the plausibility of the framework by conducting object recognition experiments on a real robotic system that has been trained within our framework.
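A scenario description language feeding a sampler, as the abstract describes, might be approximated like this. The schema and sampler below are invented for illustration and are not the paper's actual language:

# Invented declarative scenario spec, instantiated many times by drawing
# object counts and placements from a task-specific distribution.
import random

scenario_spec = {
    "environment": "kitchen_v1",
    "lighting_lux": (200, 800),  # range sampled per scenario
    "objects": {
        "mug": {"count": (1, 3), "surface": "counter"},
        "cereal_box": {"count": (0, 2), "surface": "table"},
    },
}

def sample_scenario(spec, rng):
    scenario = {"environment": spec["environment"],
                "lighting_lux": rng.uniform(*spec["lighting_lux"]),
                "objects": []}
    for label, params in spec["objects"].items():
        for _ in range(rng.randint(*params["count"])):
            scenario["objects"].append({
                "label": label,
                "surface": params["surface"],
                "xy": (rng.random(), rng.random()),  # normalized placement
            })
    return scenario

rng = random.Random(42)
for s in (sample_scenario(scenario_spec, rng) for _ in range(3)):
    print(len(s["objects"]), "objects at", f'{s["lighting_lux"]:.0f} lux')

Each sampled scenario would then be handed to the game engine for rendering, so labeled training images come essentially for free.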

ICRA 2016 · Conference Paper

Scaling perception towards autonomous object manipulation - in knowledge lies the power

  • Ferenc Balint-Benczedi
  • Patrick Mania
  • Michael Beetz

Mobile robots operating in a human environment face the challenge of recognizing objects that possess a multitude of different visual characteristics and affordances, and that are found in visually challenging scenes. Because of this, the perceptual capabilities of such robots need to go beyond the detection or categorization of objects, and be able to answer queries not only about where certain objects are located based on their class label, but also about their functional properties. To achieve optimal performance, robots need to be aware of their environment, the task they are to execute, and their perceptual capabilities. Given this knowledge, robotic agents need adequate mechanisms that apply the right method at the right time, in the right situation. In this paper we present a self-adaptive robotic perception system that acts as a planner for task-aware robot manipulation and enables querying over a broad domain. This is done by extending our existing perception framework, ROBOSHERLOCK, with the capability to adapt its perception pipelines based on the query, using knowledge-based reasoning. We demonstrate the success of the approach by presenting challenging queries in which the benefits of integrating knowledge processing into perception systems are shown.
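Query-driven pipeline adaptation can be sketched as mapping the attributes a query asks about to the methods that must run. The capability table below is a hand-written stand-in for the knowledge-based reasoning the abstract describes:

# Toy pipeline planner: the attributes mentioned in a query determine
# which annotators are scheduled. The table is an invented stand-in for
# knowledge-based reasoning over robot and method capabilities.
CAPABILITIES = {
    "class": ["detector"],
    "color": ["detector", "color_annotator"],
    "pose": ["detector", "pose_estimator"],
    "graspable": ["detector", "pose_estimator", "affordance_reasoner"],
}

def plan_pipeline(query: dict) -> list:
    """Ordered, de-duplicated annotator list covering all query attributes."""
    pipeline = []
    for attribute in query:
        for annotator in CAPABILITIES.get(attribute, []):
            if annotator not in pipeline:
                pipeline.append(annotator)
    return pipeline

# "Find a red, graspable object" needs detection, color, and affordances.
print(plan_pipeline({"color": "red", "graspable": True}))

In the real system the table would come from a knowledge base that also reflects the robot's hardware, so the same query can yield different pipelines on different platforms.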