Author name cluster

Ben Upcroft

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

32 papers

1 author row

ICRA Conference 2017 Conference Paper

Action recognition: From static datasets to moving robots

Fahimeh Rezazadegan
Sareh Shirazi
Ben Upcroft
Michael Milford

Deep learning models have achieved state-of-the-art performance in recognizing human activities, but often rely on utilizing background cues present in typical computer vision datasets that predominantly have a stationary camera. If these models are to be employed by autonomous robots in real world environments, they must be adapted to perform independently of background cues and camera motion effects. To address these challenges, we propose a new method that firstly generates generic action region proposals with good potential to locate one human action in unconstrained videos regardless of camera motion and then uses action proposals to extract and classify effective shape and motion features by a ConvNet framework. In a range of experiments, we demonstrate that by actively proposing action regions during both training and testing, state-of-the-art or better performance is achieved on benchmarks. We show that our approach outperforms the state-of-the-art on two new datasets: one emphasizes irrelevant background, the other highlights camera motion. We also validate our action recognition method in an abnormal behavior detection scenario to improve workplace safety. The results verify a higher success rate for our method due to the ability of our system to recognize human actions regardless of environment and camera motion.

ICRA Conference 2017 Conference Paper

Deep learning features at scale for visual place recognition

Zetao Chen
Adam Jacobson
Niko Sünderhauf
Ben Upcroft
Lingqiao Liu
Chunhua Shen
Ian D. Reid 0001
Michael Milford

The success of deep learning techniques in the computer vision domain has triggered a range of initial investigations into their utility for visual place recognition, all using generic features from networks that were trained for other types of recognition tasks. In this paper, we train, at large scale, two CNN architectures for the specific place recognition task and employ a multi-scale feature encoding method to generate condition- and viewpoint-invariant features. To enable this training to occur, we have developed a massive Specific PlacEs Dataset (SPED) with hundreds of examples of place appearance change at thousands of different places, as opposed to the semantic place type datasets currently available. This new dataset enables us to set up a training regime that interprets place recognition as a classification problem. We comprehensively evaluate our trained networks on several challenging benchmark place recognition datasets and demonstrate that they achieve an average 10% increase in performance over other place recognition algorithms and pre-trained CNNs. By analyzing the network responses and their differences from pre-trained networks, we provide insights into what a network learns when training for place recognition, and what these results signify for future research in this area.

ICRA Conference 2017 Conference Paper

The ACRV picking benchmark: A robotic shelf picking benchmark to foster reproducible research

Jürgen Leitner
Adam W. Tow
Niko Sünderhauf
Jake E. Dean
Joseph W. Durham
Matthew Cooper 0005
Markus Eich
Chris Lehnert

Robotic challenges like the Amazon Picking Challenge (APC) or the DARPA Challenges are an established and important way to drive scientific progress. They make research comparable on a well-defined benchmark with equal test conditions for all participants. However, such challenge events occur only occasionally, are limited to a small number of contestants, and the test conditions are very difficult to replicate after the main event. We present a new physical benchmark challenge for robotic picking: the ACRV Picking Benchmark. Designed to be reproducible, it consists of a set of 42 common objects, a widely available shelf, and exact guidelines for object arrangement using stencils. A well-defined evaluation protocol enables the comparison of complete robotic systems - including perception and manipulation - instead of sub-systems only. Our paper also describes and reports results achieved by an open baseline system based on a Baxter robot.

ICRA Conference 2016 Conference Paper

Alextrac: Affinity learning by exploring temporal reinforcement within association chains

Alex Bewley
Lionel Ott
Fabio Ramos 0001
Ben Upcroft

This paper presents a self-supervised approach for learning to associate object detections in a video sequence as often required in tracking-by-detection systems. In this paper we focus on learning an affinity model to estimate the data association cost, which can adapt to different situations by exploiting the sequential nature of video data. We also propose a framework for gathering additional training samples at test time with high variation in visual appearance, naturally inherent in large temporal windows. Reinforcing the model with these difficult samples greatly improves the affinity model compared to standard similarity measures such as cosine similarity. We experimentally demonstrate the efficacy of the resulting affinity model on several multiple object tracking (MOT) benchmark sequences. Using the affinity model alone places this approach in the top 25 state-of-the-art trackers with an average rank of 21.3 across 11 test sequences and an overall multiple object tracking accuracy (MOTA) of 17%. This is considerable as our simple approach only uses the appearance of the detected regions in contrast to other techniques with global optimisation or complex motion models.

ICRA Conference 2016 Conference Paper

Find my office: Navigating real space from semantic descriptions

Ben Talbot
Obadiah Lam
Ruth Schulz
Feras Dayoub
Ben Upcroft
Gordon F. Wyeth

This paper shows that by using only symbolic language phrases, a mobile robot can purposefully navigate to specified rooms in previously unexplored environments. The robot intelligently organises a symbolic language description of the unseen environment and “imagines” a representative map, called the abstract map. The abstract map is an internal representation of the topological structure and spatial layout of symbolically defined locations. To perform goal-directed exploration, the abstract map creates a high-level semantic plan to reason about spaces beyond the robot's known world. While completing the plan, the robot uses the metric guidance provided by a spatial layout, and grounded observations of door labels, to efficiently guide its navigation. The system is shown to complete exploration in unexplored spaces by travelling only 13.3% further than the optimal path.

IROS Conference 2016 Conference Paper

High-fidelity simulation for evaluating robotic vision performance

John Skinner
Sourav Garg
Niko Sünderhauf
Peter Corke
Ben Upcroft
Michael Milford

Robotic vision, unlike computer vision, typically involves processing a stream of images from a camera with time varying pose operating in an environment with time varying lighting conditions and moving objects. Repeating robotic vision experiments under identical conditions is often impossible, making it difficult to compare different algorithms. For machine learning applications a critical bottleneck is the limited amount of real world image data that can be captured and labelled for both training and testing purposes. In this paper we investigate the use of a photo-realistic simulation tool to address these challenges, in three specific domains: robust place recognition, visual SLAM and object recognition. For the first two problems we generate images from a complex 3D environment with systematically varying camera paths, camera viewpoints and lighting conditions. For the first time we are able to systematically characterise the performance of these algorithms as paths and lighting conditions change. In particular, we are able to systematically generate varying camera viewpoint datasets that would be difficult or impossible to generate in the real world. We also compare algorithm results for a camera in a real environment and a simulated camera in a simulation model of that real environment. Finally, for the object recognition domain, we generate labelled image data and characterise the viewpoint dependency of a current convolutional neural network in performing object recognition. Together these results provide a multi-domain demonstration of the beneficial properties of using simulation to characterise and analyse a wide range of robotic vision algorithms.

ICRA Conference 2016 Conference Paper

Place categorization and semantic mapping on a mobile robot

Niko Sünderhauf
Feras Dayoub
Sean McMahon
Ben Talbot
Ruth Schulz
Peter Corke
Gordon F. Wyeth
Ben Upcroft

In this paper we focus on the challenging problem of place categorization and semantic mapping on a robot without environment-specific training. Motivated by their ongoing success in various visual recognition tasks, we build our system upon a state-of-the-art convolutional network. We overcome its closed-set limitations by complementing the network with a series of one-vs-all classifiers that can learn to recognize new semantic classes online. Prior domain knowledge is incorporated by embedding the classification system into a Bayesian filter framework that also ensures temporal coherence. We evaluate the classification accuracy of the system on a robot that maps a variety of places on our campus in real-time. We show how semantic information can boost robotic object detection performance and how the semantic map can be used to modulate the robot's behaviour during navigation tasks. The system is made available to the community as a ROS module.

ICRA Conference 2016 Conference Paper

Sweet pepper pose detection and grasping for automated crop harvesting

Chris Lehnert
Inkyu Sa
Chris McCool
Ben Upcroft
Tristan Perez

This paper presents a method for estimating the 6DOF pose of sweet-pepper (capsicum) crops for autonomous harvesting via a robotic manipulator. The method uses the Kinect Fusion algorithm to robustly fuse RGB-D data from an eye-in-hand camera combined with a colour segmentation and clustering step to extract an accurate representation of the crop. The 6DOF pose of the sweet peppers is then estimated via a nonlinear least squares optimisation by fitting a superellipsoid to the segmented sweet pepper. The performance of the method is demonstrated on a real 6DOF manipulator with a custom gripper. The method is shown to estimate the 6DOF pose successfully enabling the manipulator to grasp sweet peppers for a range of different orientations. The results obtained improve largely on the performance of grasping when compared to a naive approach, which does not estimate the orientation of the crop.

ICRA Conference 2016 Conference Paper

Visual detection of occluded crop: For automated harvesting

Chris McCool
Inkyu Sa
Feras Dayoub
Chris Lehnert
Tristan Perez
Ben Upcroft

This paper presents a novel crop detection system applied to the challenging task of field sweet pepper (capsicum) detection. The field-grown sweet pepper crop presents several challenges for robotic systems such as the high degree of occlusion and the fact that the crop can have a similar colour to the background (green on green). To overcome these issues, we propose a two-stage system that performs per-pixel segmentation followed by region detection. The output of the segmentation is used to search for highly probable regions and declares these to be sweet pepper. We propose the novel use of the local binary pattern (LBP) to perform crop segmentation. This feature improves the accuracy of crop segmentation from an AUC of 0.10, for previously proposed features, to 0.56. Using the LBP feature as the basis for our two-stage algorithm, we are able to detect 69.2% of field grown sweet peppers in three sites. This is an impressive result given that the average detection accuracy of people viewing the same colour imagery is 66.8%.

IROS Conference 2015 Conference Paper

Learning crop models for vision-based guidance of agricultural robots

Andrew English
Patrick Ross
David Ball
Ben Upcroft
Peter Corke

This paper describes a vision-based method of guiding autonomous vehicles within crop rows in agricultural fields where the crop rows are challenging to detect or their appearance is not known a-priori. The location of the crop rows is estimated with an SVM regression algorithm using colour, texture and 3D structure descriptors from a forward facing stereo camera pair. Our system rapidly learns a model online with minimal user input, and then uses this model to track crop rows. Results demonstrate our method is able to learn and track a wide variety of crops with an RMS error of less than 3cm. We also present online control results demonstrating our system autonomously steering a robot for 3km.

IROS Conference 2015 Conference Paper

On the performance of ConvNet features for place recognition

Niko Sünderhauf
Sareh Shirazi
Feras Dayoub
Ben Upcroft
Michael Milford

After the incredible success of deep learning in the computer vision domain, there has been much interest in applying Convolutional Network (ConvNet) features in robotic fields such as visual navigation and SLAM. Unfortunately, there are fundamental differences and challenges involved. Computer vision datasets are very different in character to robotic camera data, real-time performance is essential, and performance priorities can be different. This paper comprehensively evaluates and compares the utility of three state-of-the-art ConvNets on the problems of particular relevance to navigation for robots: viewpoint-invariance and condition-invariance, and for the first time enables real-time place recognition performance using ConvNets with large maps by integrating a variety of existing (locality-sensitive hashing) and novel (semantic search space partitioning) optimization techniques. We present extensive experiments on four real world datasets cultivated to evaluate each of the specific challenges in place recognition. The results demonstrate that speed-ups of two orders of magnitude can be achieved with minimal accuracy degradation, enabling real-time performance. We confirm that networks trained for semantic place categorization also perform better at (specific) place recognition when faced with severe appearance changes and provide a reference for which networks and layers are optimal for different aspects of the place recognition problem.

ICRA Conference 2015 Conference Paper

Online novelty-based visual obstacle detection for field robotics

Patrick Ross
Andrew English
David Ball
Ben Upcroft
Peter Corke

This paper presents a novel online unsupervised vision system for obstacle detection in field environments which detects many obstacles pathological to appearance- or structure-only obstacle detection systems. Robust obstacle detection in field environments is challenging as it is infeasible to train on all possible obstacles in all conditions, and many obstacles are camouflaged in their appearance or structure. The proposed system combines novelty in structure and appearance cues to detect obstacles, can adapt over time to changes in the environment, and is suitable for long-term operation over changing lighting conditions in various environments. After an initial learning period the method exhibits very few false positives, while successfully detecting most obstacles over both daytime and nighttime datasets including challenging obstacles such as a person lying down in grass.

ICRA Conference 2015 Conference Paper

Robot navigation using human cues: A robot navigation system for symbolic goal-directed exploration

Ruth Schulz
Ben Talbot
Obadiah Lam
Feras Dayoub
Peter Corke
Ben Upcroft
Gordon F. Wyeth

In this paper we present for the first time a complete symbolic navigation system that performs goal-directed exploration to unfamiliar environments on a physical robot. We introduce a novel construct called the abstract map to link provided symbolic spatial information with observed symbolic information and actual places in the real world. Symbolic information is observed using a text recognition system that has been developed specifically for the application of reading door labels. In the study described in this paper, the robot was provided with a floor plan and a destination. The destination was specified by a room number, used both in the floor plan and on the door to the room. The robot autonomously navigated to the destination using its text recognition, abstract map, mapping, and path planning systems. The robot used the symbolic navigation system to determine an efficient path to the destination, and reached the goal in two different real-world environments. Simulation results show that the system reduces the time required to navigate to a goal when compared to random exploration.

ICRA Conference 2015 Conference Paper

TriggerSync: A time synchronisation tool

Andrew English
Patrick Ross
David Ball
Ben Upcroft
Peter Corke

This paper presents a framework for synchronising multiple triggered sensors with respect to a local clock using standard computing hardware. Providing sensor measurements with accurate and meaningful timestamps is important for many sensor fusion, state estimation and control applications. Accurately synchronising sensor timestamps can be performed with specialised hardware; however, performing sensor synchronisation using standard computing hardware and non-real-time operating systems is difficult due to inaccurate and temperature sensitive clocks, variable communication delays and operating system scheduling delays. Results show the ability of our framework to estimate time offsets to sub-millisecond accuracy. We also demonstrate how synchronising timestamps with our framework results in a tenfold reduction in image stabilisation error for a vehicle driving on rough terrain. The source code will be released as an open source tool for time synchronisation in ROS.

ICRA Conference 2014 Conference Paper

Lighting invariant urban street classification

Ben Upcroft
Colin McManus
Winston Churchill
Will Maddern
Paul Newman 0001

In this paper we propose the hybrid use of illuminant invariant and RGB images to perform image classification of urban scenes despite challenging variation in lighting conditions. Coping with lighting change (and the shadows thereby invoked) is a non-negotiable requirement for long term autonomy using vision. One aspect of this is the ability to reliably classify scene components in the presence of marked and often sudden changes in lighting. This is the focus of this paper. Posed with the task of classifying all parts in a scene from a full colour image, we propose that lighting invariant transforms can reduce the variability of the scene, resulting in a more reliable classification. We leverage the ideas of “data transfer” for classification, beginning with full colour images for obtaining candidate scene-level matches using global image descriptors. This is commonly followed by superpixel-level matching with local features. However, we show that if the RGB images are subjected to an illuminant invariant transform before computing the superpixel-level features, classification is significantly more robust to scene illumination effects. The approach is evaluated using three datasets. The first being our own dataset and the second being the KITTI dataset using manually generated ground truth for quantitative analysis. We qualitatively evaluate the method on a third custom dataset over a 750m trajectory.

ICRA Conference 2014 Conference Paper

Multiple map hypotheses for planning and navigating in non-stationary environments

Timothy Morris
Feras Dayoub
Peter Corke
Gordon F. Wyeth
Ben Upcroft

This paper presents a method to enable a mobile robot working in non-stationary environments to plan its path and localize within multiple map hypotheses simultaneously. The maps are generated using a long-term and short-term memory mechanism that ensures only persistent configurations in the environment are selected to create the maps. In order to evaluate the proposed method, experimentation is conducted in an office environment. Compared to navigation systems that use only one map, our system produces superior path planning and navigation in a non-stationary environment where paths can be blocked periodically, a common scenario which poses significant challenges for typical planners.

ICRA Conference 2014 Conference Paper

Novelty-based visual obstacle detection in agriculture

Patrick Ross
Andrew English
David Ball
Ben Upcroft
Gordon F. Wyeth
Peter Corke

This paper describes a novel obstacle detection system for autonomous robots in agricultural field environments that uses a novelty detector to inform stereo matching. Stereo vision alone erroneously detects obstacles in environments with ambiguous appearance and ground plane such as in broad-acre crop fields with harvested crop residue. The novelty detector estimates the probability density in image descriptor space and incorporates image-space positional understanding to identify potential regions for obstacle detection using dense stereo matching. The results demonstrate that the system is able to detect obstacles typical to a farm at day and night. This system was successfully used as the sole means of obstacle detection for an autonomous robot performing a long term two hour coverage task travelling 8.5 km.

ICRA Conference 2014 Conference Paper

Online self-supervised multi-instance segmentation of dynamic objects

Alex Bewley
Vitor Guizilini
Fabio Ramos 0001
Ben Upcroft

This paper presents a method for the continuous segmentation of dynamic objects using only a vehicle mounted monocular camera without any prior knowledge of the object's appearance. Prior work in online static/dynamic segmentation [1] is extended to identify multiple instances of dynamic objects by introducing an unsupervised motion clustering step. These clusters are then used to update a multi-class classifier within a self-supervised framework. In contrast to many tracking-by-detection based methods, our system is able to detect dynamic objects without any prior knowledge of their visual appearance, shape or location. Furthermore, the classifier is used to propagate labels of the same object in previous frames, which facilitates the continuous tracking of individual objects based on motion. The proposed system is evaluated using recall and false alarm metrics in addition to a new multi-instance labelled dataset to measure the performance of segmenting multiple instances of objects.

IROS Conference 2014 Conference Paper

Simultaneous localization and planning on multiple map hypotheses

Timothy Morris
Feras Dayoub
Peter Corke
Ben Upcroft

This paper presents a novel method to rank map hypotheses by the quality of localization they afford. The highest ranked hypothesis at any moment becomes the active representation that is used to guide the robot to its goal location. A single static representation is insufficient for navigation in dynamic environments where paths can be blocked periodically, a common scenario which poses significant challenges for typical planners. In our approach we simultaneously rank multiple map hypotheses by the influence that localization in each of them has on locally accurate odometry. This is done online for the current locally accurate window by formulating a factor graph of odometry relaxed by localization constraints. Comparison of the resulting perturbed odometry of each hypothesis with the original odometry yields a score that can be used to rank map hypotheses by their utility. We deploy the proposed approach on a real robot navigating a structurally noisy office environment. The configuration of the environment is physically altered outside the robot's sensory horizon during navigation tasks to demonstrate the proposed approach of hypothesis selection.

IROS Conference 2013 Conference Paper

Nonparametric semantic segmentation for 3D street scenes

Hu He
Ben Upcroft

In this paper we propose a method to generate a large scale and accurate dense 3D semantic map of street scenes. A dense 3D semantic model of the environment can significantly improve a number of robotic applications such as autonomous driving, navigation or localisation. Instead of using offline trained classifiers for semantic segmentation, our approach employs a data-driven, nonparametric method to parse scenes which easily scale to a large environment and generalise to different scenes. We use stereo image pairs collected from cameras mounted on a moving car to produce dense depth maps which are combined into a global 3D reconstruction using camera poses from stereo visual odometry. Simultaneously, 2D automatic semantic segmentation using a nonparametric scene parsing method is fused into the 3D model. Furthermore, the resultant 3D semantic model is improved with the consideration of moving objects in the scene. We demonstrate our method on the publicly available KITTI dataset and evaluate the performance against manually generated ground truth.

ICRA Conference 2013 Conference Paper

Online calibration of stereo rigs for long-term autonomy

Michael Warren
David McKinnon
Ben Upcroft

Stereo-based visual odometry algorithms are heavily dependent on an accurate calibration of the rigidly fixed stereo pair. Even small shifts in the rigid transform between the cameras can impact on feature matching and 3D scene triangulation, adversely affecting pose estimates and applications dependent on long-term autonomy. In many field-based scenarios where vibration, knocks and pressure change affect a robotic vehicle, maintaining an accurate stereo calibration cannot be guaranteed over long periods. This paper presents a novel method of recalibrating overlapping stereo camera rigs from online visual data while simultaneously providing an up-to-date and up-to-scale pose estimate. The proposed technique implements a novel form of partitioned bundle adjustment that explicitly includes the homogeneous transform between a stereo camera pair to generate an optimal calibration. Pose estimates are computed in parallel to the calibration, providing online recalibration which seamlessly integrates into a stereo visual odometry framework. We present results demonstrating accurate performance of the algorithm on both simulated scenarios and real data gathered from a wide-baseline stereo pair on a ground vehicle traversing urban roads.

IROS Conference 2013 Conference Paper

Robust scale initialization for long-range stereo visual odometry

Michael Warren
Ben Upcroft

Achieving a robust, accurately scaled pose estimate in long-range stereo presents significant challenges. For large scene depths, triangulation from a single stereo pair is inadequate and noisy. Additionally, vibration and flexible rigs in airborne applications mean accurate calibrations are often compromised. This paper presents a technique for accurately initializing a long-range stereo VO algorithm at large scene depth, with accurate scale, without explicitly computing structure from rigidly fixed camera pairs. By performing a monocular pose estimate over a window of frames from a single camera, followed by adding the secondary camera frames in a modified bundle adjustment, an accurate, metrically scaled pose estimate can be found. To achieve this the scale of the stereo pair is included in the optimization as an additional parameter. Results are presented both on simulated and field gathered data from a fixed-wing UAV flying at significant altitude, where the epipolar geometry is inaccurate due to structural deformation and triangulation from a single pair is insufficient. Comparisons are made with more conventional VO techniques where the scale is not explicitly optimized, and demonstrated over repeated trials to indicate robustness.

IROS Conference 2013 Conference Paper

Vision-only autonomous navigation using topometric maps

Feras Dayoub
Timothy Morris
Ben Upcroft
Peter Corke

This paper presents a mapping and navigation system for a mobile robot, which uses vision as its sole sensor modality. The system enables the robot to navigate autonomously, plan paths and avoid obstacles using a vision based topometric map of its environment. The map consists of a globally-consistent pose-graph with a local 3D point cloud attached to each of its nodes. These point clouds are used for direction independent loop closure and to dynamically generate 2D metric maps for locally optimal path planning. Using this locally semi-continuous metric space, the robot performs shortest path planning instead of following the nodes of the graph — as is done with most other vision-only navigation approaches. The system exploits the local accuracy of visual odometry in creating local metric maps, and uses pose graph SLAM, visual appearance-based place recognition and point clouds registration to create the topometric map. The ability of the framework to sustain vision-only navigation is validated experimentally, and the system is provided as open-source software.

ICRA Conference 2012 Conference Paper

A semi-local method for iterative depth-map refinement

David McKinnon
Ryan N. Smith
Ben Upcroft

Building a photorealistic, 3D model of an object or a complete scene from image-based methods is a fundamental problem in computer vision, and has many applications in robotic perception, navigation, exploration and mapping. In this paper, we extend current state-of-the-art in the computation of depth maps by presenting an accurate and computationally efficient iterative hierarchical algorithm for multi-view stereo. The algorithm is designed to utilise all available contextual information to compute highly-accurate and robust depth maps by iteratively examining different image resolutions in an image-pyramid. The novelty in our approach is that we are able to incrementally improve the depth fidelity as the algorithm progresses through the image pyramid by utilising a local method. This is achieved in a computationally efficient manner by simultaneously enforcing the consistency of the depth-map by continual comparison with neighbouring depth-maps. We present a detailed description of the algorithm, and describe how each step is carried out. The proposed technique is used to analyse multi-view stereo data from two well-known, standard datasets, and presented results show a significant decrease in computation time, as well as an increase in overall accuracy of the computed depth maps.

ICRA Conference 2011 Conference Paper

Real-time volume estimation of a dragline payload

Alex Bewley
Rajiv Shekhar
Sam Leonard
Ben Upcroft
Paul Lever

This paper presents a method for measuring the in-bucket payload volume on a dragline excavator for the purpose of estimating the material's bulk density in real-time. Knowledge of the payload's bulk density can provide feedback to mine planning and scheduling to improve blasting and therefore provide a more uniform bulk density across the excavation site. This allows a single optimal bucket size to be used for maximum overburden removal per dig and in turn reduce costs and emissions in dragline operation and maintenance. The proposed solution uses a range bearing laser to locate and scan full buckets between the lift and dump stages of the dragline cycle. The bucket is segmented from the scene using cluster analysis, and the pose of the bucket is calculated using the Iterative Closest Point (ICP) algorithm. Payload points are identified using a known model and subsequently converted into a height grid for volume estimation. Results from both scaled and full scale implementations show that this method can achieve an accuracy of above 95%.

ICRA Conference 2007 Conference Paper

Building a Software Architecture for a Human-Robot Team Using the Orca Framework

Tobias Kaupp
Alex Brooks
Ben Upcroft
Alexei Makarenko

This paper considers the problem of building a software architecture for a human-robot team. The objective of the team is to build a multi-attribute map of the world by performing information fusion. A decentralized approach to information fusion is adopted to achieve the system properties of scalability and survivability. Decentralization imposes constraints on the design of the architecture and its implementation. We show how a component-based software engineering approach can address these constraints. The architecture is implemented using Orca, a component-based software framework for robotic systems. Experimental results from a deployed system comprised of an unmanned air vehicle, a ground vehicle, and two human operators are presented. A section on the lessons learned is included which may be applicable to other distributed systems with complex algorithms. We also compare Orca to the Player software framework in the context of distributed systems.

IROS Conference 2006 Conference Paper

A decentralised particle filtering algorithm for multi-target tracking across multiple flight vehicles

Lee-Ling S. Ong
Ben Upcroft
Tim Bailey
Matthew Ridley
Salah Sukkarieh
Hugh F. Durrant-Whyte

This paper presents a decentralised particle filtering algorithm that enables multiple vehicles to jointly track 3D features under limited communication bandwidth. This algorithm, applied within a decentralised data fusion (DDF) framework, deals with correlated estimation errors due to common past information when fusing two discrete particle sets. Our solution is to transform the particles into Gaussian mixture models (GMMs) for communication and fusion. Not only can decentralised fusion be approximated by GMMs, but this representation also provides summaries of the particle set. Less bandwidth per communication step is required to communicate a GMM than the particle set itself, so conversion to GMMs for communication is an advantage. Real airborne data is used to demonstrate the accuracy of our decentralised particle filtering algorithm for airborne tracking and mapping.
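
The bandwidth argument in this abstract can be made concrete with a rough count of the numbers sent per communication step. The sketch below is illustrative only: the particle count and the number of mixture components are assumed values, not figures from the paper, and it counts one weight per particle and a full symmetric covariance per component.

```python
def particle_set_floats(num_particles, dim=3):
    """Numbers needed to send a weighted particle set."""
    # Each particle carries dim coordinates plus one weight.
    return num_particles * (dim + 1)


def gmm_floats(num_components, dim=3):
    """Numbers needed to send a Gaussian mixture model."""
    # Each component carries one weight, a dim-vector mean, and the
    # dim * (dim + 1) / 2 unique entries of a symmetric covariance matrix.
    return num_components * (1 + dim + dim * (dim + 1) // 2)


# Assumed sizes for illustration: 1000 particles versus a 5-component GMM in 3D.
particles = particle_set_floats(1000)
mixture = gmm_floats(5)
print(particles, mixture, particles / mixture)
```

Under these assumed sizes the mixture needs 50 numbers against 4000 for the particle set, which is where the saving per communication step comes from.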

ICRA Conference 2006 Conference Paper

Gaussian Process Models for Sensor-centric Robot Localisation

Alex Brooks
Alexei Makarenko
Ben Upcroft

This paper presents an approach to building an observation likelihood function from a set of sparse, noisy training observations taken from known locations by a sensor with no obvious geometric model. The basic approach is to fit an interpolant to the training data, representing the expected observation, and to assume additive sensor noise. This paper takes a Bayesian view of the problem, maintaining a posterior over interpolants rather than simply the maximum-likelihood interpolant, giving a measure of uncertainty in the map at any point. This is done using a Gaussian process framework. To validate the approach experimentally, a model of an environment is built using observations from an omni-directional camera. After a model has been built from the training data, a particle filter is used to localise while traversing this environment.

IROS Conference 2006 Conference Paper

Hierarchical Environment Model for Fusing Information from Human Operators and Robots

Tobias Kaupp
Bertrand Douillard
Ben Upcroft
Alexei Makarenko

This paper considers the problem of building environment models by fusing information gathered by robotic platforms with human perceptual information. Rich environment models are required in real applications both for autonomous operation of robots and to support human decision making. Hierarchical models are well suited to representing complex environments because they offer multiple abstractions of the available information to support analysis and decision-making, and permit the incorporation of higher-level human observations. The contributions of this paper are two-fold: (1) development of a probabilistic three-level environment model for distributed information gathering, and (2) experimental demonstration of fully decentralized, cooperative human-robot information gathering using an outdoor sensor network comprised of an unmanned air vehicle, a ground vehicle, and two human operators. Several information exchange patterns are presented which qualitatively demonstrate human-robot information fusion.

IROS Conference 2006 Conference Paper

Recognising and Segmenting Objects in Natural Environments

Fabio Ramos 0001
Ben Upcroft
Suresh Kumar
Hugh F. Durrant-Whyte

This paper presents an algorithm for recognition and segmentation of natural features in unstructured environments. By providing a Bayesian solution for the density estimation problem, the algorithm needs significantly less training data than conventional techniques and is applicable to different environments. The algorithm is based on colour and wavelet convolution of image patches to model the information contained in natural features. Dimensionality reduction techniques are applied to map data points to a lower dimensional space where Bayesian density estimation is computed. Experiments were performed in underwater, aerial and terrestrial domains demonstrating the accuracy and generalisation properties of the algorithm for recognition and segmentation. Comparisons with conventional density estimation techniques are provided to illustrate the benefits of the new approach.

IROS Conference 2005 Conference Paper

A statistical framework for natural feature representation

Suresh Kumar
Fabio Ramos 0001
Ben Upcroft
Hugh F. Durrant-Whyte

This paper presents a robust stochastic framework for the incorporation of visual observations into conventional estimation, data fusion, navigation and control algorithms. The representation combines Isomap, a non-linear dimensionality reduction algorithm, with expectation maximization, a statistical learning scheme. The joint probability distribution of this representation is computed offline based on existing training data. The training phase of the algorithm results in a nonlinear and non-Gaussian likelihood model of natural features conditioned on the underlying visual states. This generative model can be used online to instantiate likelihoods corresponding to observed visual features in real time. The instantiated likelihoods are expressed as a Gaussian mixture model and are conveniently integrated within existing non-linear filtering algorithms. Example applications based on real visual data from heterogeneous, unstructured environments demonstrate the versatility of the generative models.
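
To show what a likelihood "expressed as a Gaussian mixture model" looks like when a filter evaluates it, here is a minimal one-dimensional sketch. The scalar state and the component values are assumptions for illustration; the paper's model is defined over a low-dimensional embedding and is learned with expectation maximization, which this sketch does not attempt.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_likelihood(x, components):
    """Evaluate a 1D Gaussian mixture at x.

    components: list of (weight, mean, variance); weights should sum to 1.
    """
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)


# Hypothetical two-component mixture, evaluated at two candidate states.
mixture = [(0.7, 0.0, 1.0), (0.3, 4.0, 0.5)]
for state in (0.0, 4.0):
    print(state, gmm_likelihood(state, mixture))
```

A particle filter, for instance, could use such a value directly as the weight update for a particle at that state.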

IROS Conference 2005 Conference Paper

Operators as information sources in sensor networks

Tobias Kaupp
Alexei Makarenko
Suresh Kumar
Ben Upcroft
Stefan B. Williams

This paper presents an approach to integrating human operators into a sensor network formed by a heterogeneous team of unmanned air and ground vehicles. Several objectives of human-network interaction are identified. The main focus of this work is on human-to-network information flow, i.e., human operators are regarded as information sources. It is argued that operators should make raw observations which are converted into the sensor network's common representation by a probabilistic model. The concepts are discussed in the context of an outdoor sensor network under development. Human operators contribute geometric feature information in the form of range and bearing observations. Visual feature properties are specified via meaningful class labels. A sensor model, represented as a Bayesian network, translates label observations into the system's representation. The model is also used to classify features as observed by robotic sensors.
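
The idea of translating an operator's label observation into a probabilistic representation can be sketched with the simplest possible case: a single application of Bayes' rule over feature classes. The class names, labels, and probabilities below are hypothetical, and the paper's Bayesian network is richer than this two-table model.

```python
def class_posterior(label, prior, label_given_class):
    """Posterior over classes after an operator reports a label.

    prior: dict mapping class -> P(class)
    label_given_class: dict mapping class -> {label: P(label | class)}
    """
    # Unnormalised posterior: P(label | class) * P(class).
    unnorm = {c: label_given_class[c].get(label, 0.0) * p for c, p in prior.items()}
    total = sum(unnorm.values())
    if total == 0.0:
        raise ValueError("label has zero probability under every class")
    # Normalise so the posterior sums to one.
    return {c: v / total for c, v in unnorm.items()}


# Hypothetical classes and operator labels.
prior = {"tree": 0.5, "shed": 0.5}
sensor_model = {
    "tree": {"green": 0.8, "brown": 0.2},
    "shed": {"green": 0.2, "brown": 0.8},
}
print(class_posterior("green", prior, sensor_model))
```

The same conditional table could in principle score observations from a robotic sensor, provided that sensor's output is mapped to the same label set.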