Arrow Research search

Author name cluster

Benjamin Alt

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers
2 author rows

Possible papers

ICRA Conference 2025 Conference Paper

QueryCAD: Grounded Question Answering for CAD Models

  • Claudius Kienle
  • Benjamin Alt
  • Darko Katic
  • Rainer Jäkel
  • Jan Peters 0001

CAD models are widely used in industry and are essential for robotic automation processes. However, these models are rarely considered in novel AI-based approaches, such as the automatic synthesis of robot programs, as there are no readily available methods that would allow CAD models to be incorporated for the analysis, interpretation, or extraction of information. To address these limitations, we propose QueryCAD, the first system designed for CAD question answering, enabling the extraction of precise information from CAD models using natural language queries. QueryCAD incorporates SegCAD, an open-vocabulary instance segmentation model we developed to identify and select specific parts of the CAD model based on part descriptions. We further propose a CAD question answering benchmark to evaluate QueryCAD and establish a foundation for future research. Lastly, we integrate QueryCAD within an automatic robot program synthesis framework, validating its ability to enhance deep-learning solutions for robotics by enabling them to process CAD models. https://claudius-kienle.github.com/querycad.

ICRA Conference 2025 Conference Paper

Shadow Program Inversion with Differentiable Planning: A Framework for Unified Robot Program Parameter and Trajectory Optimization

  • Benjamin Alt
  • Claudius Kienle
  • Darko Katic
  • Rainer Jäkel
  • Michael Beetz

This paper presents Shadow Program Inversion with Differentiable Planning (SPI-DP), a novel first-order optimizer capable of optimizing robot programs with respect to both high-level task objectives and motion-level constraints. To that end, we introduce Differentiable Gaussian Process Motion Planning for N-DoF Manipulators (dGPMP2-ND), a differentiable collision-free motion planner for serial N-DoF kinematics, and integrate it into an iterative, gradient-based optimization approach for generic, parameterized robot program representations. SPI-DP allows first-order optimization of planned trajectories and program parameters with respect to objectives such as cycle time or smoothness subject to, e.g., collision constraints, while enabling humans to understand, modify or even certify the optimized programs. We provide a comprehensive evaluation on two practical household and industrial applications.

IROS Conference 2024 Conference Paper

MuTT: A Multimodal Trajectory Transformer for Robot Skills

  • Claudius Kienle
  • Benjamin Alt
  • Onur Celik
  • Philipp Becker
  • Darko Katic
  • Rainer Jäkel
  • Gerhard Neumann

High-level robot skills represent an increasingly popular paradigm in robot programming. However, configuring the skills’ parameters for a specific task remains a manual and time-consuming endeavor. Existing approaches for learning or optimizing these parameters often require numerous real-world executions or do not work in dynamic environments. To address these challenges, we propose Multimodal Trajectory Transformer (MuTT), a novel encoder-decoder transformer architecture designed to predict environment-aware executions of robot skills by integrating vision, trajectory, and robot skill parameters. Notably, we pioneer the fusion of vision and trajectory, introducing a novel trajectory projection. Furthermore, we illustrate MuTT’s efficacy as a predictor when combined with a model-based robot skill optimizer. This approach facilitates the optimization of robot skill parameters for the current environment, without the need for real-world executions during optimization. Designed for compatibility with any representation of robot skills, MuTT demonstrates its versatility across three comprehensive experiments, showcasing superior performance across two different skill representations.

ICRA Conference 2024 Conference Paper

RoboGrind: Intuitive and Interactive Surface Treatment with Industrial Robots

  • Benjamin Alt
  • Florian Stöckl
  • Silvan Müller
  • Christopher Braun
  • Julian Raible
  • Saad Alhasan
  • Oliver Rettig
  • Lukas Ringle

Surface treatment tasks such as grinding, sanding or polishing are a vital step of the value chain in many industries, but are notoriously challenging to automate. We present RoboGrind, an integrated system for the intuitive, interactive automation of surface treatment tasks with industrial robots. It combines a sophisticated 3D perception pipeline for surface scanning and automatic defect identification, an interactive voice-controlled wizard system for the AI-assisted bootstrapping and parameterization of robot programs, and an automatic planning and execution pipeline for force-controlled robotic surface treatment. RoboGrind is evaluated both under laboratory and real-world conditions in the context of refabricating fiberglass wind turbine blades.

KR Conference 2023 Conference Paper

Knowledge-Driven Robot Program Synthesis from Human VR Demonstrations

  • Benjamin Alt
  • Franklin Kenghagho Kenfack
  • Andrei Haidu
  • Darko Katic
  • Rainer Jäkel
  • Michael Beetz

Aging societies, labor shortages and increasing wage costs call for assistance robots capable of autonomously performing a wide array of real-world tasks. Such open-ended robotic manipulation requires not only powerful knowledge representations and reasoning (KR&R) algorithms, but also methods for humans to instruct robots what tasks to perform and how to perform them. In this paper, we present a system for automatically generating executable robot control programs from human task demonstrations in virtual reality (VR). We leverage common-sense knowledge and game engine-based physics to semantically interpret human VR demonstrations, as well as an expressive and general task representation and automatic path planning and code generation, embedded into a state-of-the-art cognitive architecture. We demonstrate our approach in the context of force-sensitive fetch-and-place for a robotic shopping assistant. The source code is available at https://github.com/ease-crc/vr-program-synthesis.

IROS Conference 2022 Conference Paper

Heuristic-free Optimization of Force-Controlled Robot Search Strategies in Stochastic Environments

  • Benjamin Alt
  • Darko Katic
  • Rainer Jäkel
  • Michael Beetz

In both industrial and service domains, a central benefit of the use of robots is their ability to quickly and reliably execute repetitive tasks. However, even relatively simple peg-in-hole tasks are typically subject to stochastic variations, requiring search motions to find relevant features such as holes. While search improves robustness, it comes at the cost of increased runtime: more exhaustive search will maximize the probability of successfully executing a given task, but will significantly delay any downstream tasks. This trade-off is typically resolved by human experts according to simple heuristics, which are rarely optimal. This paper introduces an automatic, data-driven and heuristic-free approach to optimize robot search strategies. By training a neural model of the search strategy on a large set of simulated stochastic environments, conditioning it on few real-world examples and inverting the model, we can infer search strategies which adapt to the time-variant characteristics of the underlying probability distributions, while requiring very few real-world measurements. We evaluate our approach on two different industrial robots in the context of spiral and probe search for THT electronics assembly. See github.com/benjaminalt/dpse for code and data.

IROS Conference 2022 Conference Paper

LapSeg3D: Weakly Supervised Semantic Segmentation of Point Clouds Representing Laparoscopic Scenes

  • Benjamin Alt
  • Christian Kunz
  • Darko Katic
  • Rayan Younis
  • Rainer Jäkel
  • Beat Peter Müller-Stich
  • Martin Wagner 0001
  • Franziska Mathis-Ullrich

The semantic segmentation of surgical scenes is a prerequisite for task automation in robot-assisted interventions. We propose LapSeg3D, a novel DNN-based approach for the voxel-wise annotation of point clouds representing surgical scenes. As the manual annotation of training data is highly time-consuming, we introduce a semi-autonomous clustering-based pipeline for the annotation of the gallbladder, which is used to generate segmented labels for the DNN. When evaluated against manually annotated data, LapSeg3D achieves an F1 score of 0.94 for gallbladder segmentation on various datasets of ex-vivo porcine livers. We show LapSeg3D to generalize accurately across different gallbladders and datasets recorded with different RGB-D camera systems.

ICRA Conference 2021 Conference Paper

Robot Program Parameter Inference via Differentiable Shadow Program Inversion

  • Benjamin Alt
  • Darko Katic
  • Rainer Jäkel
  • Asil Kaan Bozcuoglu
  • Michael Beetz

Challenging manipulation tasks can be solved effectively by combining individual robot skills, which must be parameterized for the concrete physical environment and task at hand. This is time-consuming and difficult for human programmers, particularly for force-controlled skills. To this end, we present Shadow Program Inversion (SPI), a novel approach to infer optimal skill parameters directly from data. SPI leverages unsupervised learning to train an auxiliary differentiable program representation ("shadow program") and realizes parameter inference via gradient-based model inversion. Our method enables the use of efficient first-order optimizers to infer optimal parameters for originally non-differentiable skills, including many skill variants currently used in production. SPI zero-shot generalizes across task objectives, meaning that shadow programs do not need to be retrained to infer parameters for different task variants. We evaluate our methods on three different robots and skill frameworks in industrial and household scenarios. Code and examples are available at https://innolab.artiminds.com/icra2021.