Arrow Research search

Author name cluster

Xiaowei Shao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

15 papers
2 author rows

Possible papers (15)

AAAI Conference 2021 Conference Paper

Social-DPF: Socially Acceptable Distribution Prediction of Futures

  • Xiaodan Shi
  • Xiaowei Shao
  • Guangming Wu
  • Haoran Zhang
  • Zhiling Guo
  • Renhe Jiang
  • Ryosuke Shibasaki

We consider long-term path forecasting problems in crowds, where future sequence trajectories are generated given a short observation. Recent methods for this problem have focused on modeling social interactions and predicting multi-modal futures. However, it is not easy for machines to successfully consider social interactions, such as avoiding collisions, while accounting for the uncertainty of futures in a highly interactive and dynamic scenario. In this paper, we propose a model that incorporates multiple interacting motion sequences jointly and predicts multi-modal socially acceptable distributions of futures. Specifically, we introduce a new aggregation mechanism for social interactions, which selectively models long-term inter-related dynamics between movements in a shared environment through a message passing mechanism. Moreover, we propose a loss function that not only assesses how accurate the estimated distributions of the futures are but also considers collision avoidance. We further utilize mixture density functions to describe the trajectories and learn the multi-modality of future paths. Extensive experiments over several trajectory prediction benchmarks demonstrate that our method is able to forecast socially acceptable distributions in complex scenarios.
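The message-passing aggregation the abstract describes can be sketched roughly as follows. This is an illustrative assumption about the general technique, not the paper's architecture; the update rule and weight are made up for the example:

```python
# Minimal message-passing sketch: each agent's hidden state is updated
# with an aggregated "social message" from the other agents in the scene.
import numpy as np

def message_passing_step(states, w_msg=0.5):
    """One aggregation round: every agent receives the mean of the other
    agents' hidden states and applies a residual update."""
    n = len(states)
    out = np.empty_like(states)
    for i in range(n):
        others = np.delete(states, i, axis=0)   # all agents except i
        message = others.mean(axis=0)           # aggregated social context
        out[i] = states[i] + w_msg * message    # residual update
    return out

# Three agents with 2-D hidden states.
h = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
h_new = message_passing_step(h)
```

In the paper this aggregation is selective over long-term inter-related dynamics; the sketch uses a plain mean over all other agents for brevity.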

AAAI Conference 2020 Conference Paper

Multimodal Interaction-Aware Trajectory Prediction in Crowded Space

  • Xiaodan Shi
  • Xiaowei Shao
  • Zipei Fan
  • Renhe Jiang
  • Haoran Zhang
  • Zhiling Guo
  • Guangming Wu
  • Wei Yuan

Accurate human path forecasting in complex and crowded scenarios is critical for collision avoidance in autonomous driving and social robot navigation. It remains a challenging problem because of dynamic human interaction and the intrinsic multimodality of human motion. Given an observation, there is a rich set of plausible ways for an agent to move through the scene. To address these issues, we propose a spatio-temporal model that can aggregate the information from socially interacting agents and capture the multimodality of the motion patterns. We use mixture density functions to describe the human path and predict the distribution of future paths with explicit density. To integrate more factors when modeling interacting people, we further introduce a coordinate transformation to represent the relative motion between people. Extensive experiments over several trajectory prediction benchmarks demonstrate that our method is able to forecast various plausible futures in complex scenarios and achieves state-of-the-art performance.
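The mixture-density idea above, predicting a distribution over future positions with explicit density, can be sketched like this. The component parameters here are illustrative assumptions (the papers use learned, correlated bivariate Gaussians):

```python
# Mixture of isotropic bivariate Gaussians over a 2-D future position:
# p(x, y) = sum_k pi_k * N(x, y | mu_k, sigma_k), with the pi_k summing to 1.
import math

def gaussian2d(x, y, mu, sigma):
    """Isotropic bivariate Gaussian density (no correlation, for brevity)."""
    dx, dy = x - mu[0], y - mu[1]
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    return norm * math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2))

def mixture_density(x, y, weights, mus, sigmas):
    """Explicit density of the mixture at point (x, y)."""
    return sum(w * gaussian2d(x, y, m, s)
               for w, m, s in zip(weights, mus, sigmas))

# Two plausible future modes: continue straight, or veer right.
p = mixture_density(1.0, 0.0,
                    weights=[0.7, 0.3],
                    mus=[(1.0, 0.0), (0.5, -0.5)],
                    sigmas=[0.2, 0.3])
```

Evaluating the density near the dominant mode yields a much larger value than at distant points, which is what lets such models express several distinct plausible futures at once.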

TIST Journal 2015 Journal Article

From RGB-D Images to RGB Images

  • Quanshi Zhang
  • Xuan Song
  • Xiaowei Shao
  • Huijing Zhao
  • Ryosuke Shibasaki

Mining object-level knowledge, that is, building a comprehensive category model base, from a large set of cluttered scenes presents a considerable challenge to the field of artificial intelligence. How to initiate model learning with the least human supervision (i.e., manual labeling) and how to encode the structural knowledge are two elements of this challenge, as they largely determine the scalability and applicability of any solution. In this article, we propose a model-learning method that starts from a single-labeled object for each category, and mines further model knowledge from a number of informally captured, cluttered scenes. However, in these scenes, target objects are relatively small and have large variations in texture, scale, and rotation. Thus, to reduce the model bias normally associated with less supervised learning methods, we use the robust 3D shape in RGB-D images to guide our model learning, then apply the properly trained category models to both object detection and recognition in more conventional RGB images. In addition to model training for their own categories, the knowledge extracted from the RGB-D images can also be transferred to guide model learning for a new category, in which only RGB images without depth information in the new category are provided for training. Preliminary testing shows that the proposed method performs as well as fully supervised learning methods.

ICRA Conference 2014 Conference Paper

Start from minimum labeling: Learning of 3D object models and point labeling from a large and complex environment

  • Quanshi Zhang
  • Xuan Song 0001
  • Xiaowei Shao
  • Huijing Zhao
  • Ryosuke Shibasaki

A large category model base can provide object-level knowledge for various perception tasks of the intelligent vehicle system. The automatic and efficient construction of such a model base is highly desirable but challenging. This paper presents a novel semi-supervised approach to discover possible prototype models of 3D object structures from the point cloud of a large and complex environment, given a limited number of seeds in an object category. Our method incrementally trains the models while simultaneously collecting object samples. Considering the bias problem of model learning caused by bias accumulation in a sample collection, we propose to gradually differentiate the standard category model into several sub-category models to represent different intra-category structural styles. Thus, new sub-categories are discovered and modeled, old models are improved, and redundant models for similar structures are deleted iteratively during the learning process. This multiple-model strategy provides several interactive options for the category boundary to deal with the bias problem. Experimental results demonstrate the effectiveness and high efficiency of our approach to model mining from “big point cloud data”.

TIST Journal 2013 Journal Article

A fully online and unsupervised system for large and high-density area surveillance

  • Xuan Song
  • Xiaowei Shao
  • Quanshi Zhang
  • Ryosuke Shibasaki
  • Huijing Zhao
  • Jinshi Cui
  • Hongbin Zha

For reasons of public security, an intelligent surveillance system that can cover a large, crowded public area has become an urgent need. In this article, we propose a novel laser-based system that can simultaneously perform tracking, semantic scene learning, and abnormality detection in a fully online and unsupervised way. Furthermore, these three tasks cooperate with each other in one framework to improve their respective performances. The proposed system has the following key advantages over previous ones: (1) It can cover quite a large area (more than 60×35m), and simultaneously perform robust tracking, semantic scene learning, and abnormality detection in a high-density situation. (2) The overall system can vary with time, incrementally learn the structure of the scene, and perform fully online abnormal activity detection and tracking. This feature makes our system suitable for real-time applications. (3) The surveillance tasks are carried out in a fully unsupervised manner, so that there is no need for manual labeling and the construction of huge training datasets. We successfully apply the proposed system to the JR subway station in Tokyo, and demonstrate that it can cover an area of 60×35m, robustly track more than 150 targets at the same time, and simultaneously perform online semantic scene learning and abnormality detection with no human intervention.

TIST Journal 2013 Journal Article

An online system for multiple interacting targets tracking

  • Xuan Song
  • Huijing Zhao
  • Jinshi Cui
  • Xiaowei Shao
  • Ryosuke Shibasaki
  • Hongbin Zha

Multitarget tracking becomes significantly more challenging when the targets are in close proximity or frequently interact with each other. This article presents a promising online system to deal with these problems. The novelty of this system is that laser and vision are integrated with tracking and online learning to complement each other in one framework: when the targets do not interact with each other, the laser-based independent trackers are employed and the visual information is extracted simultaneously to train some classifiers online for “possible interacting targets”. When the targets are in close proximity, the classifiers learned online are used alongside visual information to assist in tracking. Therefore, this mode of cooperation not only deals with various tough problems encountered in tracking, but also ensures that the entire process can be completely online and automatic. Experimental results demonstrate that laser and vision fully display their respective advantages in our system, and it is easy for us to obtain a good trade-off between tracking accuracy and the time-cost factor.
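The cooperation mode described above, independent laser trackers when targets are apart, classifier-assisted tracking when they come close, can be sketched as a simple proximity-based switch. The threshold and 1-D positions are illustrative assumptions, not values from the paper:

```python
# Mode switch for laser+vision tracking: use independent trackers while
# all targets are well separated; switch to the online-learned classifiers
# once any pair comes within a proximity threshold.
def tracking_mode(positions, proximity=1.0):
    """Return 'independent' or 'classifier-assisted' from the minimum
    pairwise distance between target positions (1-D here, in metres)."""
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            if abs(positions[i] - positions[j]) < proximity:
                return 'classifier-assisted'
    return 'independent'

mode_far  = tracking_mode([0.0, 5.0, 10.0])   # well separated
mode_near = tracking_mode([0.0, 0.4, 10.0])   # two targets interacting
```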

ICRA Conference 2013 Conference Paper

Unsupervised 3D category discovery and point labeling from a large urban environment

  • Quanshi Zhang
  • Xuan Song 0001
  • Xiaowei Shao
  • Huijing Zhao
  • Ryosuke Shibasaki

The building of an object-level knowledge base is the foundation of a new methodology for many perception tasks in artificial intelligence, and is an area that has received increasing attention in recent years. In this paper, we propose, for the first time, to mine category shape patterns directly from a large urban environment, thus constructing a category structure base. Conventionally, category patterns are learned from a large collection of object samples, but automatic object collection requires prior knowledge of category structures. To solve this chicken-and-egg problem, we learn shape patterns from raw segmentations, and then refine these segmentations based on the pattern knowledge. In the process, we solve two challenging problems of knowledge mining. First, as some categories have large intra-category structure variations, we design an entropy-based method to determine the structure variation for each category, in order to establish the correct range of sample collection. Second, because incorrect segmentation is unavoidable without prior knowledge, we propose a novel unsupervised method that uses a pattern competition strategy to identify and subtract shape patterns formed by incorrectly segmented objects. This ensures that shape patterns are meaningful at the object level. Experimental results demonstrated the effectiveness of the proposed method for category structure mining in a large urban environment.

ICRA Conference 2012 Conference Paper

Laser-based intelligent surveillance and abnormality detection in extremely crowded scenarios

  • Xuan Song 0001
  • Xiaowei Shao
  • Quanshi Zhang
  • Ryosuke Shibasaki
  • Huijing Zhao
  • Hongbin Zha

Abnormal activity detection plays a crucial role in surveillance applications, and a surveillance system that can perform robustly in extremely crowded areas has become an urgent need for public security. In this paper, we propose a novel laser-based system that can simultaneously perform tracking, semantic scene learning, and abnormality detection in a large and crowded environment. In our system, a novel abnormality detection model is proposed that considers and combines various factors influencing human activity. Moreover, this model intensively investigates the relationship between pedestrians' social behaviors and their walking scenarios. We successfully applied the proposed system to the JR subway station in Tokyo, where it covers a 60×35m area, robustly tracks more than 180 targets at the same time, and simultaneously performs online semantic scene learning and abnormality detection with no human intervention.

IROS Conference 2011 Conference Paper

3D crowd surveillance and analysis using laser range scanners

  • Xiaowei Shao
  • Huijing Zhao
  • Ryosuke Shibasaki
  • Yun Shi
  • Kiyoshi Sakamoto

In this study, we present a novel system for crowd surveillance and quantified analysis based on laser range scanners. By mounting a laser scanner on a swinging platform, the spatial information of passengers inside the area of interest can be reconstructed in the form of 3D points. Multiple laser scanners are integrated through semi-automatic calibration procedures. A background map is generated through histogram analysis of scan maps and is further applied for 3D moving object detection. An improved version of the mean-shift clustering algorithm is proposed to extract individual passengers efficiently. In addition, quantified crowdedness analysis is conducted from different aspects to indicate the situation inside the surveillance area according to the passenger extraction results. The proposed system was tested in a central subway station in Tokyo, and experimental results demonstrate its effectiveness.
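The mean-shift clustering step used to extract individual passengers can be sketched in its textbook form (a generic flat-kernel variant, not the paper's improved version; the positions and bandwidth are made-up 1-D values):

```python
# Flat-kernel mean shift: each point is repeatedly moved to the mean of
# its neighbours within a bandwidth, so points converge on cluster modes.
def mean_shift(points, bandwidth=1.0, iters=20):
    shifted = list(points)
    for _ in range(iters):
        new = []
        for p in shifted:
            neigh = [q for q in shifted if abs(q - p) <= bandwidth]
            new.append(sum(neigh) / len(neigh))
        shifted = new
    return shifted

# 1-D detections along a scan line from two separate pedestrians (metres).
pts = [0.0, 0.1, 0.2, 5.0, 5.1]
centers = mean_shift(pts, bandwidth=0.5)
```

After convergence the first three points collapse onto one mode and the last two onto another, giving one cluster per pedestrian.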

ICRA Conference 2011 Conference Paper

A novel laser-based system: Fully online detection of abnormal activity via an unsupervised method

  • Xuan Song 0001
  • Xiaowei Shao
  • Ryosuke Shibasaki
  • Huijing Zhao
  • Jinshi Cui
  • Hongbin Zha

Abnormal activity detection plays a crucial role in surveillance applications, and such systems have become an urgent need for public security. In this paper, we propose a novel laser-based system that can perform online detection of abnormal activity in an unsupervised way. The proposed system has the following key features that make it advantageous over previous ones: (1) It can cover quite a large and crowded area, such as a subway station, public square, or intersection. (2) The overall system can vary with time, incrementally learn the behavior patterns of pedestrians, and perform fully online detection of abnormal activity. This feature makes our system quite suitable for real-time applications. (3) The abnormal activity detection is carried out in a fully unsupervised way, so there is no need for manual labeling or the construction of huge training datasets. We successfully applied the proposed system to the JR subway station in Tokyo, where it covers a 60×35m area, tracks more than 150 targets at the same time, and simultaneously performs robust detection of abnormal activity with no human intervention.

ICRA Conference 2010 Conference Paper

Fusion of laser and vision for multiple targets tracking via on-line learning

  • Xuan Song 0001
  • Huijing Zhao
  • Jinshi Cui
  • Xiaowei Shao
  • Ryosuke Shibasaki
  • Hongbin Zha

Multi-target tracking becomes significantly more challenging when the targets are in close proximity or frequently interact with each other. This paper presents a promising tracking system to deal with these problems. The novelty of this system is that laser and vision, tracking and learning, are integrated and complement each other in one framework: when the targets do not interact with each other, the laser-based independent trackers are employed and the visual information is extracted simultaneously to train classifiers for the “possible interacting targets”. When the targets are in close proximity, the learned classifiers and visual information are used to assist in tracking. Therefore, this mode of cooperation not only deals with various tough problems encountered in tracking, but also ensures that the entire process can be completely on-line and automatic. Experimental results demonstrate that laser and vision fully display their respective advantages in our system, and that a good trade-off between tracking accuracy and time cost is easy to obtain.

ICRA Conference 2008 Conference Paper

SLAM in a dynamic large outdoor environment using a laser scanner

  • Huijing Zhao
  • Masaki Chiba
  • Ryosuke Shibasaki
  • Xiaowei Shao
  • Jinshi Cui
  • Hongbin Zha

In this research, we propose a method of SLAM in a dynamic, large outdoor environment using a laser scanner. Focus is placed on solving two major problems: 1) achieving global accuracy, especially in non-cyclical environments, and 2) handling a mixture of data from both dynamic and static objects. Algorithms are developed in which GPS data and control inputs are used to diagnose pose error and guide corrections toward global accuracy, and in which classification of laser points and objects is conducted not in an independent module but across the processing, in a framework of SLAM with moving object detection and tracking. Experiments are conducted using data from two test-bed vehicles, and the performance of the algorithms is demonstrated.
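The pose-error diagnosis idea, comparing dead-reckoned poses against GPS to detect accumulated drift, can be sketched minimally as below. The threshold and trajectories are illustrative assumptions, not the paper's algorithm:

```python
# Flag dead-reckoned poses whose distance to the corresponding GPS fix
# exceeds a threshold; flagged poses would then trigger a correction.
import math

def diagnose_drift(dead_reckoned, gps_fixes, threshold=2.0):
    """Return indices where the Euclidean pose error exceeds the
    threshold (all quantities in metres)."""
    flagged = []
    for i, ((x1, y1), (x2, y2)) in enumerate(zip(dead_reckoned, gps_fixes)):
        if math.hypot(x1 - x2, y1 - y2) > threshold:
            flagged.append(i)
    return flagged

# Drift grows with travelled distance; only the last pose exceeds 2 m.
dr  = [(0.0, 0.0), (10.0, 0.5), (20.0, 3.0)]
gps = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
bad = diagnose_drift(dr, gps)
```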

IROS Conference 2007 Conference Paper

Detection and tracking of multiple pedestrians by using laser range scanners

  • Xiaowei Shao
  • Huijing Zhao
  • Katsuyuki Nakamura
  • Kyoichiro Katabira
  • Ryosuke Shibasaki
  • Yuri Nakagawa

We propose a novel system for tracking multiple pedestrians in a crowded scene by exploiting single-row laser range scanners that measure the distances of surrounding objects. A walking model is built to describe the periodicity of foot movement in the spatial-temporal domain, and a mean-shift clustering technique in combination with spatial-temporal correlation analysis is applied to detect pedestrians. Based on the walking model, a particle filter is employed to track multiple pedestrians. Compared with camera-based methods, our system provides a novel technique for tracking multiple pedestrians over a relatively large area. The experiments, in which over 300 pedestrians were tracked in 5 minutes, show the validity of the proposed system.
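A particle filter of the kind mentioned above can be sketched in generic form. This is not the paper's walking model; the constant-velocity motion, noise level, and 1-D positions are simplifying assumptions for illustration:

```python
# Generic 1-D particle filter: predict with a motion model plus noise,
# weight particles by closeness to the laser observation, then resample.
import random
random.seed(0)

def pf_step(particles, velocity, observation, noise=0.1):
    # Predict: move each particle by the assumed velocity plus noise.
    moved = [p + velocity + random.gauss(0.0, noise) for p in particles]
    # Weight: particles near the observed foot position score higher.
    weights = [1.0 / (1e-6 + abs(p - observation)) for p in moved]
    # Resample in proportion to weight.
    return random.choices(moved, weights=weights, k=len(moved))

particles = [0.0] * 50
for obs in [0.5, 1.0, 1.5]:          # pedestrian advancing 0.5 m per step
    particles = pf_step(particles, 0.5, obs)
estimate = sum(particles) / len(particles)
```

The particle mean after three steps tracks the pedestrian to roughly 1.5 m; the paper's version replaces the constant-velocity step with the periodic walking model.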

ICRA Conference 2007 Conference Paper

Monitoring a populated environment using single-row laser range scanners from a mobile platform

  • Huijing Zhao
  • Yuzhong Chen
  • Xiaowei Shao
  • Kyoichiro Katabira
  • Ryosuke Shibasaki

In this research, we propose a system for detecting and monitoring pedestrians' motion trajectories in a populated, wide environment, such as an exhibition hall or supermarket, using horizontally profiling single-row laser range scanners on a mobile platform. A simplified walking model is defined to track the rhythmic swing of feet at ground level. Pedestrians are recognized by detecting this braided pattern, a typical appearance that distinguishes the data of moving feet from other mobile and motionless objects. Two experiments are conducted. The first, in a laboratory environment, examines the algorithm in detail. The second, in an exhibition hall, a populated and wide environment, examines whether the system can be applied to practical needs. Despite this challenge, the system performed well: pedestrians present in the exhibition hall at the moment of measurement are detected, and their motion trajectories are extracted and associated with the background map, which is built from the motionless objects and covers the whole exhibition hall.