Arrow Research

Author name cluster

Michael Jenkin

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

25 papers
2 author rows

Possible papers

25

ICRA Conference 2025 Conference Paper

Diver to Robot Communication Underwater

  • Robert Codd-Downey
  • Michael Jenkin

Gesture-based communication is a standard underwater communication strategy that is taught to divers as part of their regular training, and it would seem a natural mechanism to leverage for diver-to-robot communication underwater. Enabling an unmanned underwater vehicle (UUV) to understand such sequences involves having the robot learn the large set of gestures that divers use and the ways in which they are combined. As perfect transcription of gestures is unlikely, the communication process also requires an error-correcting framework to ensure that communication is clear and correct. Here we describe an interactive process that provides this infrastructure. A weakly supervised transfer learning approach is used to recognize standard SCUBA gestures in individual video frames and, within a Sim2Real process, to train an LSTM to recognize gesture sequences. This is embedded within a per-gesture and per-sequence interaction process that assists and confirms the recognition of individual gestures and confirms entire gesture sequences. Individual aspects of this process and complete end-to-end operation are demonstrated using an unmanned underwater vehicle.
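The interactive confirmation loop the abstract describes can be sketched as simple control flow: recognize each gesture and, when confidence is low, ask the diver to confirm before committing the token to the sequence. The function names, the 0.8 threshold, and the retry policy below are hypothetical illustrations, not the paper's implementation.

```python
# Hypothetical sketch of the per-gesture interaction loop. The recognizer,
# the confidence threshold, and the retry policy are illustrative only.

CONFIRM, REJECT = "confirm", "reject"

def confirm_sequence(frames, recognize, ask_diver, max_retries=2):
    """Build a gesture sequence, asking the diver to confirm whenever the
    recognizer's confidence is low."""
    sequence = []
    for frame in frames:
        label, confidence = recognize(frame)
        if confidence < 0.8:  # low confidence: fall back to the diver
            for _ in range(max_retries):
                if ask_diver(label) == CONFIRM:
                    break
                label, confidence = recognize(frame)  # retry after a rejection
        sequence.append(label)
    return sequence
```

A confirmed sequence can then be checked as a whole before execution, mirroring the paper's per-sequence confirmation stage.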

IROS Conference 2024 Conference Paper

Towards Enhanced Fairness and Sample Efficiency in Traffic Signal Control

  • Xingshuai Huang
  • Di Wu 0044
  • Michael Jenkin
  • Benoit Boulet

Traffic signal control (TSC) has seen substantial advancements through the application of reinforcement learning (RL) algorithms, which have shown remarkable potential in enhancing traffic flow efficiency. These RL-based approaches often surpass traditional rule-based methods, particularly in dynamic traffic environments. However, current RL solutions for TSC predominantly rely on model-free methods, necessitating extensive environmental interactions during training. This requirement can be prohibitively expensive or unfeasible in real-world implementations. Furthermore, existing methods have frequently neglected the issue of fairness in multi-intersection control, resulting in unbalanced congestion across different intersections. To address these challenges, we present FM2Light, a fairness-aware model-based multi-agent RL framework for TSC. Our approach leverages an ensemble of global world models for generating synthetic samples to enhance sample efficiency, thereby mitigating the data-intensive nature of the training process. Additionally, FM2Light incorporates a refined reward structure to promote fairness and improve coordination across multiple intersections. Extensive evaluations conducted in diverse real-world scenarios demonstrate that FM2Light achieves performance comparable to or exceeding that of model-free RL (MFRL) methods, while significantly reducing sample requirements and ensuring more equitable control among multiple agents.

ICRA Conference 2023 Conference Paper

ANSEL Photobot: A Robot Event Photographer with Semantic Intelligence

  • Dmitriy Rivkin
  • Gregory Dudek
  • Nikhil Kakodkar
  • David Meger
  • Oliver Limoyo
  • Michael Jenkin
  • Xue Liu 0004
  • Francois Robert Hogan

Our work examines the way in which large language models can be used for robotic planning and sampling in the context of automated photographic documentation. Specifically, we illustrate how to produce a photo-taking robot with an exceptional level of semantic awareness by leveraging recent advances in general-purpose language (LM) and vision-language (VLM) models. Given a high-level description of an event, we use an LM to generate a natural-language list of photo descriptions that one would expect a photographer to capture at the event. We then use a VLM to identify the best matches to these descriptions in the robot's video stream. The photo portfolios generated by our method are consistently rated as more appropriate to the event by human evaluators than those generated by existing methods.
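The two-stage LM/VLM pipeline can be sketched in a few lines. Here `lm` and `vlm_score` are stand-ins for real model calls, and the structure is our simplification of the idea rather than ANSEL's actual code.

```python
# Sketch of the LM -> VLM pipeline: the LM drafts a shot list for the event,
# then the VLM scores every frame against each description and the best-scoring
# frame per description is kept. `lm` and `vlm_score` are hypothetical stand-ins
# for real model calls.

def build_portfolio(event_description, frames, lm, vlm_score):
    """Return a dict mapping each LM-proposed photo description to the frame
    the VLM rates as its best match."""
    shot_list = lm(f"List photos a photographer would take at: {event_description}")
    portfolio = {}
    for desc in shot_list:
        portfolio[desc] = max(frames, key=lambda f: vlm_score(f, desc))
    return portfolio
```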

IROS Conference 2022 Conference Paper

SESNO: Sample Efficient Social Navigation from Observation

  • Bobak H. Baghi
  • Abhisek Konar
  • Francois Robert Hogan
  • Michael Jenkin
  • Gregory Dudek

In this paper, we present the Sample Efficient Social Navigation from Observation (SESNO) algorithm, which efficiently learns socially compliant navigation policies from observations of human trajectories. SESNO is an inverse reinforcement learning (IRL)-based algorithm that learns from human trajectory observations without knowledge of their actions. We improve sample efficiency over previous IRL-based methods by introducing a shared experience replay buffer that allows past trajectory experiences to be reused when estimating both the policy and the reward. We evaluate SESNO using publicly available pedestrian motion datasets and compare its performance to related baseline methods in the literature. We show that SESNO yields performance superior to existing baselines while dramatically improving sample complexity, using as little as one-hundredth of the samples required by existing baselines.
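The shared experience replay buffer at the heart of the sample-efficiency claim can be illustrated with a minimal sketch: one buffer whose transitions are sampled by both the policy update and the reward (IRL) update, so past trajectories are reused rather than discarded. The class and method names are our own, not SESNO's API.

```python
import random

# Illustrative shared replay buffer: the same stored transitions feed both the
# policy estimator and the reward (IRL) estimator. Names are ours, not SESNO's.

class SharedReplayBuffer:
    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self.storage = []

    def add(self, state, action, next_state):
        if len(self.storage) >= self.capacity:
            self.storage.pop(0)  # evict the oldest transition
        self.storage.append((state, action, next_state))

    def sample(self, batch_size):
        return random.sample(self.storage, min(batch_size, len(self.storage)))

# Both learners draw from the same buffer:
# policy_batch = buffer.sample(64)   # for the policy update
# reward_batch = buffer.sample(64)   # for the reward (IRL) update
```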

ICRA Conference 2022 Conference Paper

Visuotactile-RL: Learning Multimodal Manipulation Policies with Deep Reinforcement Learning

  • Johanna Hansen
  • Francois Robert Hogan
  • Dmitriy Rivkin
  • David Meger
  • Michael Jenkin
  • Gregory Dudek

Manipulating objects with dexterity requires timely feedback that simultaneously leverages the senses of vision and touch. In this paper, we focus on the problem setting where both visual and tactile sensors provide pixel-level feedback for visuotactile reinforcement learning agents. We investigate the challenges associated with multimodal learning and propose several improvements to existing RL methods, including tactile gating, tactile data augmentation, and visual degradation. When compared with visual-only and tactile-only baselines, our Visuotactile-RL agents showcase (1) significant improvements in contact-rich tasks; (2) improved robustness to visual changes (lighting/camera view) in the workspace; and (3) resilience to physical changes in the task environment (weight/friction of objects).

AAAI Conference 2021 Conference Paper

Learning Intuitive Physics with Multimodal Generative Models

  • Sahand Rezaei-Shoshtari
  • Francois R. Hogan
  • Michael Jenkin
  • David Meger
  • Gregory Dudek

Predicting the future interaction of objects when they come into contact with their environment is key for autonomous agents to take intelligent and anticipatory actions. This paper presents a perception framework that fuses visual and tactile feedback to make predictions about the expected motion of objects in dynamic scenes. Visual information captures object properties such as 3D shape and location, while tactile information provides critical cues about interaction forces and resulting object motion when it makes contact with the environment. Utilizing a novel See-Through-your-Skin (STS) sensor that provides high resolution multimodal sensing of contact surfaces, our system captures both the visual appearance and the tactile properties of objects. We interpret the dual stream signals from the sensor using a Multimodal Variational Autoencoder (MVAE), allowing us to capture both modalities of contacting objects and to develop a mapping from visual to tactile interaction and vice versa. Additionally, the perceptual system can be used to infer the outcome of future physical interactions, which we validate through simulated and real-world experiments in which the resting state of an object is predicted from given initial conditions.
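Many multimodal VAE formulations fuse per-modality Gaussian posteriors with a product of experts (summed precisions, precision-weighted mean); a minimal sketch of that fusion step, assuming diagonal Gaussians and invented numbers, is below. This is a generic sketch of the common trick, not the paper's exact architecture.

```python
# Product-of-experts fusion of per-modality diagonal-Gaussian posteriors, the
# standard way many multimodal VAEs combine encoder outputs into one joint
# latent distribution. Generic sketch; numbers are illustrative.

def poe(mus, variances):
    """Fuse Gaussian experts elementwise: precisions add, and the fused mean
    is the precision-weighted average of the expert means."""
    fused_mu, fused_var = [], []
    for dims in zip(*[list(zip(m, v)) for m, v in zip(mus, variances)]):
        precision = sum(1.0 / v for _, v in dims)
        var = 1.0 / precision
        fused_var.append(var)
        fused_mu.append(var * sum(m / v for m, v in dims))
    return fused_mu, fused_var

# Visual encoder is confident in dim 0 only; tactile is confident in both:
mu, var = poe([[0.0, 1.0], [2.0, 1.0]],   # means: visual, tactile
              [[1.0, 4.0], [1.0, 1.0]])   # variances: visual, tactile
```

The fused posterior leans toward whichever modality is more certain in each latent dimension, which is what lets the model map between visual and tactile signals.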

ICRA Conference 2021 Conference Paper

Optimizing Cellular Networks via Continuously Moving Base Stations on Road Networks

  • Yogesh A. Girdhar
  • Dmitriy Rivkin
  • Di Wu 0044
  • Michael Jenkin
  • Xue Liu 0004
  • Gregory Dudek

Although existing cellular network base stations are typically immobile, the recent development of small form factor base stations and self-driving cars has enabled the possibility of deploying a team of continuously moving base stations that can reorganize the network infrastructure to adapt to changing network traffic usage patterns. Given such a system of mobile base stations (MBSes) that can freely move on the road, how should their paths be planned to optimize the experience of the users? This paper addresses this question by modeling the problem as a Markov Decision Process where actions correspond to the MBSes deciding which direction to go at traffic intersections; states correspond to the positions of the MBSes; and rewards correspond to the minimization of packet loss in the network. A Monte Carlo Tree Search (MCTS)-based anytime algorithm that produces path plans for multiple base stations while optimizing expected packet loss is proposed. Simulated experiments in the city of Verdun, QC, Canada with varying user equipment (UE) densities and random initial conditions show that the proposed approach consistently outperforms myopic planners and is able to achieve near-optimal performance.
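The MDP the abstract defines is concrete enough to sketch directly: states are the MBSes' current intersections, actions choose an outgoing road at each intersection, and the reward is the negative packet loss at the new positions. The toy road graph and loss values below are invented for illustration, and the random rollout stands in for the paper's MCTS planner.

```python
import random

# Toy version of the paper's MDP. Graph and loss values are invented; the
# real system plans over such an MDP with MCTS rather than random rollouts.

ROADS = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
PACKET_LOSS = {"A": 0.30, "B": 0.10, "C": 0.05}  # loss if an MBS parks here

def step(state, actions):
    """Advance every MBS one intersection; reward is -(total packet loss)."""
    next_state = tuple(ROADS[s][a] for s, a in zip(state, actions))
    reward = -sum(PACKET_LOSS[s] for s in next_state)
    return next_state, reward

def rollout_value(state, depth=5, rng=random):
    """Monte Carlo estimate of a state's value under a random policy -- the
    kind of leaf evaluation an MCTS planner would use."""
    total = 0.0
    for _ in range(depth):
        actions = [rng.randrange(len(ROADS[s])) for s in state]
        state, reward = step(state, actions)
        total += reward
    return total
```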

ICRA Conference 2019 Conference Paper

Finding divers with SCUBANet

  • Robert Codd-Downey
  • Michael Jenkin

Robot-diver communication underwater is complicated by the attenuation of RF signals, the complexities of the environment in terms of deploying interaction devices, and issues related to the cognitive loading of human operators. Humans operating underwater have developed a simple yet effective strategy for diver-diver communication based on the visual recognition of gestures. Can a similar approach be effective for diver-robot communication? Here we present experiments with SCUBANet, an underwater detection dataset of body parts associated with diver-robot communication. Given the nature of standard diver gestures, we concentrate on diver recognition, and in particular on diver body-head-hand localization, and examine the feasibility of using a CNN-based approach to address this problem. Such data-driven approaches typically require an appropriately annotated dataset. The SCUBANet dataset contains images of object classes commonly encountered during human-robot communication underwater. Object classes are labeled using per-instance bounding boxes. Annotations were created through crowdsourcing via a web-based interface to ease deployment. We provide baseline performance on diver and diver component recognition and localization using transfer learning on three widely available pre-trained models.

ICRA Conference 2017 Conference Paper

On the utility of additional sensors in aquatic simultaneous localization and mapping

  • Robert Codd-Downey
  • Michael Jenkin

Simultaneous Localization and Mapping (SLAM) is a key stepping stone on the road to truly autonomous robots. SLAM is of particular importance to robots with large motion estimation problems, such as robots operating on the surface of aquatic GPS-denied environments, where a paucity of local landmarks complicates SLAM and accurate navigation. Visual sensors have proven to be an effective tool for SLAM generally and have wide applicability, but is vision alone enough in this environment, and how much do other sensors, including a compass and water column depth, contribute to SLAM for an aquatic surface vehicle? Here we show that more sensors are almost always helpful in terms of improving SLAM performance in such a situation, but that a compass is a particularly useful sensor for autonomous surface vehicles, suggesting that a compass is a worthwhile investment for such a robot and that compass alternatives should be considered when operating an autonomous vehicle in environments that are both GPS- and compass-denied.

IROS Conference 2014 Conference Paper

Integrating multiple soft constraints for planning practical paths

  • Jing Yang
  • Patrick W. Dymond
  • Michael Jenkin

Sampling-based algorithms are a common approach to high-dimensional real-world path planning problems. Unfortunately, the solutions found using such planners are often not practical in that they do not take into account soft application-specific constraints. This paper formulates the practicality of paths based on the notion of soft constraints found in the Planning Domain Definition Language 3 (PDDL3) [21], and develops a range of optimization strategies targeted at user-preferred qualities by integrating soft constraints into the pre-processing, planning, and post-processing phases of sampling-based path planners. An auction-based resource allocation approach coordinates competing optimization strategies. This approach uses an adaptive bidding strategy for each optimizer, and in each round the optimizer with the best predicted performance is selected. This general coordination system allows for flexibility in both the number and types of optimizers used. Experimental validation demonstrates the effectiveness of the approach.
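The auction round can be sketched as follows, assuming a simple blended update for the adaptive bids; the names and the bid-update rule are our simplifications, not the paper's.

```python
# Sketch of one auction round: each path optimizer bids its predicted
# improvement, the best bidder runs, and its prediction is updated from the
# gain actually achieved. Names and the update rule are illustrative.

class Optimizer:
    def __init__(self, name, improve):
        self.name = name
        self.improve = improve        # callable: path -> (new_path, gain)
        self.predicted_gain = 1.0     # optimistic initial bid

    def bid(self):
        return self.predicted_gain

def auction_round(path, optimizers, alpha=0.5):
    """Run the optimizer with the best predicted performance, then blend its
    prediction toward the gain it actually delivered (adaptive bidding)."""
    winner = max(optimizers, key=lambda o: o.bid())
    new_path, gain = winner.improve(path)
    winner.predicted_gain = (1 - alpha) * winner.predicted_gain + alpha * gain
    return new_path
```

Because coordination happens only through bids, optimizers of any number and kind can be plugged in, which is the flexibility the abstract highlights.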

ICRA Conference 2011 Conference Paper

6DOF pose estimation using 3D sensors

  • Bart Verzijlenberg
  • Michael Jenkin

Pose estimation is an important capability for mobile agents. A wide variety of solutions have been proposed, but work in the literature has focused primarily on solutions for robots whose mobility is restricted to the ground plane. In this work we present a framework for 6DOF pose estimation. Normally the increased computational cost associated with this higher dimensional space makes pose estimation intractable. The approach presented here addresses the computational issues associated with the higher dimensional problem by decoupling orientation estimation from position estimation. Assuming that orientation can be estimated separately from position allows efficient methods to be used for the (unimodal) orientation estimate, while more sophisticated methods are used for the position estimate. Although similar to Rao-Blackwellization, the approach is essentially reversed. Results on real and simulated datasets and a comparison with a naive 6DOF filter are presented.

ICRA Conference 2011 Conference Paper

The relative power of immovable markers in topological mapping

  • Hui Wang 0068
  • Michael Jenkin
  • Patrick W. Dymond

The fundamental problem in robotic exploration and mapping of an unknown environment is answering the question 'have I been here before?', which involves disambiguating the robot's current location from previously visited or known locations. One approach to answering this question in embedded topological worlds is to resort to an external aid that can help the robot disambiguate places. Here we investigate the power of different marker-based aids in exploring undirected topological graphs. We demonstrate that for undirected graphs, certain marker aids are insufficient, while others are sufficiently powerful to support asymptotically optimal exploration algorithms.
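The marker idea can be made concrete with a toy simulator: vertices are perceptually identical, so the only way the robot can answer 'have I been here before?' is to drop a marker and check whether a candidate loop leads back to it. The world model below is ours, purely for illustration; the robot perceives only local edge choices and marker presence.

```python
# Toy simulator for a single immovable/movable marker aid. The robot never
# sees vertex names -- only the ordered edges at its position and whether the
# marker is present there. Entirely illustrative.

class UnlabeledWorld:
    def __init__(self, adj):
        self.adj = adj        # vertex -> ordered list of neighbour vertices
        self.marker = None    # vertex currently holding the single marker

    def drop_marker(self, v):
        self.marker = v

    def marker_here(self, v):
        return self.marker == v

def loop_closed(world, here, actions):
    """Drop the marker at the current vertex, follow a sequence of local edge
    choices (indices into the adjacency list), and look for the marker."""
    world.drop_marker(here)
    position = here
    for a in actions:
        position = world.adj[position][a]
    return world.marker_here(position)
```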

IROS Conference 2010 Conference Paper

Swimming with robots: Human robot communication at depth

  • Bart Verzijlenberg
  • Michael Jenkin

Human-robot communication is a complex problem even in the terrestrial domain. Failure to properly communicate instructions to a robot and receive appropriate feedback can at the very least hamper the ability of the robot to perform its task, and at worst prevent the task from being completed. The problem of providing effective communication between a robot and its operator becomes even more complex underwater. Many communication channels available in the terrestrial domain become unavailable, and communication between team members and task oversight become even more complex. This paper describes initial experiments with the AQUATablet - a robot interaction device designed to be operated by a diver tethered to, or in visual communication with, an underwater robot. The basic requirements of the device are described along with design considerations and results of initial experiments with the device conducted in the pool and in the open ocean.

IROS Conference 2010 Conference Paper

Using a string to map the world

  • Hui Wang 0068
  • Michael Jenkin
  • Patrick W. Dymond

Literature and folklore are rife with oracles that explorers have used to investigate unknown environments. But how effective are these various oracles? This paper considers the power of string and string-like oracles to map an unknown embedded topological environment. We demonstrate that for undirected graphs, even very short strings can be used to explore an unknown environment, but that significant performance improvements can be obtained when longer strings are available.

IROS Conference 2006 Conference Paper

Development of an Underwater Vision Sensor for 3D Reef Mapping

  • Andrew Hogue
  • Michael Jenkin

Coral reef health is an indicator of global climate change, and coral reefs themselves are important for sheltering fish and other aquatic life. Monitoring reefs is a time-consuming and potentially dangerous task, and as a consequence autonomous robotic mapping and surveillance is desired. This paper describes an underwater vision-based sensor to aid in this task. Underwater environments present many challenges for vision-based sensors and robotic vehicles. Lighting is highly variable, optical snow/particulate matter can confound traditional noise models, the environment lacks visual structure, and limited communication between autonomous agents, divers, and surface support exacerbates the potentially dangerous environment. We describe experiments with our multi-camera stereo reconstruction algorithm geared towards coral reef monitoring. The sensor is used to estimate volumetric scene structure while simultaneously estimating sensor ego-motion. Preliminary field trials indicate the utility of the sensor for 3D reef monitoring, and results of a land-based evaluation are presented to assess the accuracy of the system.

ICRA Conference 2006 Conference Paper

Underwater 3D SLAM through Entropy Minimization

  • Juan Manuel Sáez
  • Andrew Hogue
  • Francisco Escolano
  • Michael Jenkin

The aquatic realm is ideal for testing autonomous robotic technology. The challenges presented in this environment are numerous due to the highly dynamic nature of the medium. Applications for underwater robotics include the autonomous inspection of coral reefs, ships, and pipelines, as well as other environmental assessment programs. In this paper we present current results in using 6DOF entropy minimization SLAM (simultaneous localization and mapping) for creating dense 3D visual maps of underwater environments that are suitable for such applications. The proposed SLAM algorithm exploits dense information coming from a stereo system and performs robust egomotion estimation and global rectification following an optimization approach.

IROS Conference 2005 Conference Paper

A visually guided swimming robot

  • Gregory Dudek
  • Michael Jenkin
  • Chris Prahacs
  • Andrew Hogue
  • Junaed Sattar
  • Philippe Giguère
  • Andrew German
  • Hui Liu

We describe recent results obtained with AQUA, a mobile robot capable of swimming, walking and amphibious operation. Designed to rely primarily on visual sensors, the AQUA robot uses vision to navigate underwater using servo-based guidance, and also to obtain high-resolution range scans of its local environment. This paper describes some of the pragmatic and logistic obstacles encountered, and provides an overview of some of the basic capabilities of the vehicle and its associated sensors. Moreover, this paper presents the first ever amphibious transition from walking to swimming.

IROS Conference 2004 Conference Paper

AQUA: an aquatic walking robot

  • Christina Georgiades
  • Andrew German
  • Andrew Hogue
  • Hui Liu
  • Chris Prahacs
  • Arlene Ripsman
  • Robert Sim
  • Luz Abril Torres-Méndez

This paper describes an underwater walking robotic system being developed under the name AQUA, the goals of the AQUA project, the overall hardware and software design, the basic hardware and sensor packages that have been developed, and some initial experiments. The robot is based on the RHex hexapod robot and uses a suite of sensing technologies, primarily based on computer vision and INS, to allow it to navigate and map clear shallow-water environments. The sensor-based navigation and mapping algorithms are based on the use of both artificial floating visual and acoustic landmarks as well as on naturally occurring underwater landmarks and trinocular stereo.

IROS Conference 2000 Conference Paper

The paparazzi problem

  • Michael Jenkin
  • Gregory Dudek

Multiple mobile robots, or robot collectives, have been proposed as solutions to various tasks in which distributed sensing and action are required. Here we consider applying a collective of robots to the paparazzi problem - the problem of providing sensor coverage of a target robot. We demonstrate how the computational task of the collective can be formulated as a global energy minimization task over the entire collective, and show how individual members of the collective can solve the task in a distributed fashion so that the entire collective meets its goal. This result is then extended to consider unbounded communication delays between members and complete failure of individual members of the collective.
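The global-energy formulation can be sketched with an illustrative energy (stay close to the target, avoid crowding teammates) that each robot reduces through greedy local moves, so the collective descends the global energy in a distributed fashion. The specific energy terms and step rule below are our own invention, not the paper's formulation.

```python
# Illustrative sketch of the paparazzi formulation: a global energy over all
# robot positions that each robot reduces with local moves. Energy terms are
# invented for illustration.

def energy(robots, target):
    """Global energy: attraction to the target plus pairwise repulsion."""
    e = 0.0
    for i, (x, y) in enumerate(robots):
        e += (x - target[0]) ** 2 + (y - target[1]) ** 2  # stay in sensor range
        for j, (x2, y2) in enumerate(robots):
            if j != i:
                e += 4.0 / ((x - x2) ** 2 + (y - y2) ** 2 + 1e-6)  # spread out
    return e

def local_step(robots, i, target, step=0.1):
    """Robot i tries small moves and keeps whichever lowers the energy --
    each member acts locally, yet the collective's energy decreases."""
    best = list(robots)
    x, y = robots[i]
    for dx, dy in [(step, 0), (-step, 0), (0, step), (0, -step)]:
        cand = list(robots)
        cand[i] = (x + dx, y + dy)
        if energy(cand, target) < energy(best, target):
            best = cand
    return best
```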

IROS Conference 1998 Conference Paper

Computation of stereo disparity for space materials

  • Michael Jenkin
  • Piotr Jasiobedzki

One of the challenges facing computer vision systems used in space is the presence of specular surfaces. Such surfaces lead to several adverse effects, such as the creation of reflected "virtual" images of objects due to specular reflections. These effects may lead to incorrect measurements and loss of data in the case of sensor saturation, or to inadequate intensity of the returned laser beams in the case of an active illuminant. In addition, the instruments inside space structures such as satellites may be extremely sensitive to active illuminants such as laser beams or radar signals, and thus passive vision systems that rely on either natural or low-power projection systems are preferred over active sensing technologies. Here we consider the task of recovering the local surface structure of highly specular surfaces such as satellites using passive stereopsis without resorting to the introduction of additional light sources.
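As background, generic passive block-matching stereo (not the paper's algorithm) picks, for each left-image pixel, the horizontal shift that minimizes a sum of absolute differences along the same row; specular "virtual" images break the brightness-constancy assumption such matching relies on. A pure-Python toy:

```python
# Generic sum-of-absolute-differences (SAD) block matching on row-aligned
# images, shown only as background to the stereo problem the paper studies.

def sad(left, right, row, xl, xr, half=1):
    """SAD cost between a window at (row, xl) in the left image and
    (row, xr) in the right image."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[row + dy][xl + dx] - right[row + dy][xr + dx])
    return total

def disparity(left, right, row, xl, max_d=4, half=1):
    """Best disparity for pixel (row, xl): slide the window leftward along the
    right image's row and keep the shift with the lowest SAD cost."""
    best_d, best_cost = 0, float("inf")
    for d in range(0, max_d + 1):
        xr = xl - d
        if xr - half < 0:
            break  # window would fall off the image
        cost = sad(left, right, row, xl, xr, half)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```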

IROS Conference 1997 Conference Paper

A probability-based approach to model-based path planning

  • Iraj Mantegh
  • Michael Jenkin
  • Andrew A. Goldenberg

By capitalizing on the known properties of harmonic potential functions this work develops a new approach to probability-based path planning that is intuitive, free from local traps (local minima) and computationally less complex than many existing methods. Although the approach presented here is based on the hill-climbing method, it is still able to guarantee goal attainment. Furthermore the algorithm presented here is able to handle arbitrary-shaped geometries and does not require any geometrical or topological approximation at the environment representation level.
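A minimal harmonic-potential planner in the spirit of the abstract: relax Laplace's equation over free space with the goal clamped low and obstacles clamped high, then hill-climb down the potential. Because harmonic functions have no interior local minima, the descent cannot get trapped, which is the trap-freedom property the abstract capitalizes on. The grid values and iteration counts are illustrative.

```python
# Harmonic potential field on a grid via Jacobi relaxation, followed by
# steepest descent to the goal. Illustrative sketch, not the paper's method.

def plan(grid, start, goal, iters=500):
    """grid: 0 = free, 1 = obstacle/wall. Returns a path of (row, col) cells."""
    rows, cols = len(grid), len(grid[0])
    pot = [[1.0] * cols for _ in range(rows)]
    for _ in range(iters):  # Jacobi relaxation of Laplace's equation
        new = [row[:] for row in pot]
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                if grid[r][c] == 1 or (r, c) == goal:
                    continue  # obstacles stay high; goal stays clamped low
                new[r][c] = 0.25 * (pot[r - 1][c] + pot[r + 1][c]
                                    + pot[r][c - 1] + pot[r][c + 1])
        new[goal[0]][goal[1]] = 0.0
        pot = new
    # Descend the potential; harmonic => no interior local minima to trap us.
    path, cur = [start], start
    while cur != goal and len(path) < rows * cols:
        r, c = cur
        cur = min(((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)),
                  key=lambda p: pot[p[0]][p[1]])
        path.append(cur)
    return path
```

Obstacles and walls keep the initial high potential, so the descent step never selects them; the guard on path length is defensive only.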

IROS Conference 1995 Conference Paper

Experiments in sensing and communication for robot convoy navigation

  • Gregory Dudek
  • Michael Jenkin
  • Evangelos E. Milios
  • David Wilkes

This paper deals with coordinating behaviour in a multi-robot system. When two or more autonomous robots must interact in order to accomplish some common goal, communication between the robots is essential. Different inter-robot communication strategies give rise to different overall system performance and reliability. After a brief consideration of some theoretical approaches to collections of multiple robots, we present concrete implementations of different strategies for convoy-like behaviour. The convoy system is based around two RWI B12 mobile robots and uses only passive visual sensing for inter-robot communication. The issues related to different communication strategies are considered.

IROS Conference 1994 Conference Paper

ARK: autonomous mobile robot for an industrial environment

  • Michael Jenkin
  • N. Bains
  • J. Bruce
  • T. Campbell
  • Brian Down
  • Piotr Jasiobedzki
  • Allan D. Jepson
  • B. Majarais

This paper describes research on the ARK (Autonomous Mobile Robot in a Known Environment) project. The technical objective of the project is to build a robot that can navigate and carry out survey/inspection tasks in a complex but known industrial environment. Rather than altering the robot's environment by adding easily identifiable beacons, the robot relies on naturally occurring objects as visual landmarks for navigation. The robot is equipped with various sensors that are used to detect unmapped obstacles, landmarks, and objects. This paper describes the robot's industrial environment, its control architecture, and some results in processing the robot's range and vision sensor data for navigation.

IROS Conference 1993 Conference Paper

A taxonomy for swarm robots

  • Gregory Dudek
  • Michael Jenkin
  • Evangelos E. Milios
  • David Wilkes

In many cases several mobile robots (autonomous agents) can be used together to accomplish tasks that would be either more difficult or impossible for a robot acting alone. Many different models have been suggested for the makeup of such collections of robots. In this paper the authors present a taxonomy of the different ways in which such a collection of autonomous robotic agents can be structured. It is shown that certain swarms provide little or no advantage over having a single robot, while other swarms can obtain better than linear speedup over a single robot. There exist both trivial and non-trivial problems for which a swarm of robots can succeed where a single robot will fail. Swarms are more than just networks of independent processors - they are potentially reconfigurable networks of communicating agents capable of coordinated sensing and interaction with the environment.

IROS Conference 1993 Conference Paper

Global navigation for ARK

  • Michael Jenkin
  • Evangelos E. Milios
  • Piotr Jasiobedzki
  • N. Bains
  • K. Tran

ARK (Autonomous Robot for a Known environment) is a visually guided mobile robot being constructed as part of the Precarn project in mobile robotics. ARK operates in a previously mapped environment and navigates with respect to visual landmarks that have been previously located. While the robot moves, it uses an active vision sensor to register itself with respect to these landmarks. As landmarks may be scarce in certain regions of the environment, ARK plans paths that minimize both path length and path uncertainty. The global path planner assumes that the robot will use a Kalman filter to integrate landmark information with odometry data to correct path deviations as the robot moves, and then uses this information to choose a path that reduces the expected path deviation.