Arrow Research search

Author name cluster

Bin He

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

30 papers
2 author rows

Possible papers (30)

AAAI Conference 2025 Conference Paper

Bridging Training and Execution via Dynamic Directed Graph-Based Communication in Cooperative Multi-Agent Systems

  • Zhuohui Zhang
  • Bin He
  • Bin Cheng
  • Gang Li

Multi-agent systems must learn to communicate and to understand interactions between agents in order to achieve cooperative goals in partially observed tasks. However, existing approaches lack a dynamic directed communication mechanism and rely on global states, which diminishes the role of communication in centralized training. To address this, we propose the Transformer-based graph coarsening network (TGCNet), a novel multi-agent reinforcement learning (MARL) algorithm. TGCNet learns the topological structure of a dynamic directed graph to represent the communication policy and integrates graph coarsening networks to approximate the representation of the global state during training. It also uses a Transformer decoder for feature extraction during execution. Experiments on multiple cooperative MARL benchmarks demonstrate state-of-the-art performance compared to popular MARL algorithms, and ablation studies validate the effectiveness of the dynamic directed graph communication mechanism and the graph coarsening networks.
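To make the communication mechanism concrete, here is a minimal sketch of dynamic directed-graph message passing between agents: each agent encodes its local observation, pairwise logits decide the directed edges, and messages flow only along those edges. This is an illustrative reconstruction under stated assumptions, not the authors' TGCNet code; the `DirectedCommLayer` name, the module sizes, and the hard-threshold edge rule are all assumptions.

```python
# Minimal sketch of dynamic directed-graph communication (not TGCNet itself).
import torch
import torch.nn as nn

class DirectedCommLayer(nn.Module):  # hypothetical name
    def __init__(self, obs_dim: int, hid_dim: int = 64):
        super().__init__()
        self.encode = nn.Linear(obs_dim, hid_dim)
        self.query = nn.Linear(hid_dim, hid_dim)
        self.key = nn.Linear(hid_dim, hid_dim)
        self.message = nn.Linear(hid_dim, hid_dim)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (n_agents, obs_dim) local observations
        h = torch.relu(self.encode(obs))
        # Pairwise logits; the graph is directed because logits[i, j] and
        # logits[j, i] generally differ.
        logits = self.query(h) @ self.key(h).T / h.shape[-1] ** 0.5
        adj = (logits > 0).float()        # hard directed edges (assumption)
        adj.fill_diagonal_(0)             # no self-messages
        # Average incoming messages along the learned directed edges.
        msg = adj @ self.message(h) / adj.sum(1, keepdim=True).clamp(min=1)
        return torch.cat([h, msg], dim=-1)  # per-agent feature for the policy

comm = DirectedCommLayer(obs_dim=16)
print(comm(torch.randn(4, 16)).shape)     # 4 agents -> torch.Size([4, 128])
```

In an actual MARL setup the hard threshold would be replaced by a differentiable relaxation (e.g., Gumbel-softmax) so the edge policy can be trained end to end; the sketch only shows the forward message-passing structure.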

AAAI Conference 2025 Conference Paper

Imagine: Image-Guided 3D Part Assembly with Structure Knowledge Graph

  • Weihao Wang
  • Yu Lan
  • Mingyu You
  • Bin He

3D part assembly is a promising task in 3D computer vision and robotics, focusing on assembling 3D parts by predicting their 6-DoF poses. Like most 3D shape understanding tasks, existing methods primarily address it by memorizing the poses of parts during training, leading to inaccuracies in complex assemblies and poor generalization to novel categories. To fundamentally improve performance, structure knowledge of the target assembly is indispensable before assembling: it abstracts the potential part composition and the structural relationships among parts. An image of the target assembly can serve as a common source for constructing this structure knowledge. Nevertheless, the image alone is not enough, as the knowledge it provides can be incomplete and ambiguous due to part occlusion and varying viewpoints. To tackle these issues, we propose Imagine, a novel image-guided 3D part assembly framework with a structure knowledge graph. As a novel assembly prior, the structure knowledge graph originates from the image and is refined as the 3D parts are understood. It encodes robust part-aware structural and semantic information about the assembly, guides the 3D parts from a coarse super-structure to a fine assembly, and co-evolves progressively throughout the assembly process. Extensive experiments demonstrate the state-of-the-art performance of our framework, along with strong generalization to novel images and categories.
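As a rough illustration of what a structure knowledge graph could look like as a data structure, the sketch below stores part nodes with semantic labels and coarse 6-DoF poses plus relation edges, and allows pose estimates to be refined as 3D parts are matched to nodes. All class and field names are hypothetical; the paper's actual graph construction from the image is not reproduced here.

```python
# Minimal sketch of a structure knowledge graph (illustrative, not Imagine).
from dataclasses import dataclass, field

@dataclass
class PartNode:
    name: str                                   # semantic label from the image
    pose: list = field(default_factory=lambda: [0.0] * 6)  # coarse 6-DoF pose

@dataclass
class StructureKnowledgeGraph:
    nodes: dict = field(default_factory=dict)   # name -> PartNode
    edges: list = field(default_factory=list)   # (src, dst, relation) triples

    def add_part(self, name: str) -> None:
        self.nodes[name] = PartNode(name)

    def relate(self, src: str, dst: str, relation: str) -> None:
        self.edges.append((src, dst, relation))

    def refine_pose(self, name: str, pose: list) -> None:
        # Co-evolution step: as parts are matched, coarse poses are replaced
        # by finer estimates.
        self.nodes[name].pose = pose

g = StructureKnowledgeGraph()
g.add_part("chair_seat"); g.add_part("chair_leg")
g.relate("chair_leg", "chair_seat", "supports")
g.refine_pose("chair_leg", [0.1, 0.0, -0.3, 0.0, 0.0, 0.0])
print(g.edges, g.nodes["chair_leg"].pose)
```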

IJCAI Conference 2025 Conference Paper

Rotation Invariant Spatial Networks for Single-View Point Cloud Classification

  • Feng Luan
  • Jiarui Hu
  • Changshi Zhou
  • Zhipeng Wang
  • Jiguang Yue
  • Yanmin Zhou
  • Bin He

Point cloud classification is critical for three-dimensional scene understanding. However, in real-world scenarios, depth cameras often capture partial, single-view point clouds of objects in different poses, making accurate classification a challenge. In this paper, we propose a novel point cloud classification network that captures the detailed spatial structure of objects by constructing tetrahedra, in contrast to point-wise operations. Specifically, we propose a RISpaNet block to extract rotation-invariant features. A rotation-invariant property generation module is designed in RISpaNet to construct rotation-invariant tetrahedron properties (RITPs). Meanwhile, a multi-scale pooling module and a hybrid encoder process the RITPs to generate integrated rotation-invariant features. Further, for single-view point clouds, a complete point cloud auxiliary branch and a part-whole correlation module are jointly employed to obtain complete point cloud features from partial point clouds. Experimental results on four public datasets show that this network outperforms other state-of-the-art methods: we achieve an overall accuracy of 94.7% (+2.0%) on ModelNet40, 93.4% (+5.9%) on MVP, 94.7% (+6.3%) on PCN, and 94.8% (+1.7%) on ScanObjectNN. Our project website is https://luxurylf.github.io/RISpaNet_project/.
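The idea of rotation-invariant tetrahedron properties can be illustrated with quantities that are provably unchanged by rotation: the six edge lengths and the volume of a tetrahedron. The sketch below is a toy version of this invariance, under the assumption that such geometric scalars stand in for the RITPs; it does not reproduce the paper's actual property set.

```python
# Toy rotation-invariant tetrahedron descriptors (illustrative, not RISpaNet).
import numpy as np

def tetra_properties(p: np.ndarray) -> np.ndarray:
    # p: (4, 3) tetrahedron vertices
    i, j = np.triu_indices(4, k=1)
    edges = np.linalg.norm(p[i] - p[j], axis=1)       # 6 edge lengths
    vol = np.abs(np.linalg.det(p[1:] - p[0])) / 6.0   # tetrahedron volume
    return np.concatenate([edges, [vol]])             # 7 invariant scalars

pts = np.random.rand(4, 3)
Q, _ = np.linalg.qr(np.random.randn(3, 3))            # random orthogonal matrix
assert np.allclose(tetra_properties(pts), tetra_properties(pts @ Q.T))
```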

ICRA Conference 2024 Conference Paper

Ultrafast capturing in-flight objects with reprogrammable working speed ranges

  • Yongkang Jiang
  • Xin Tong
  • Zhongqing Sun
  • Yanmin Zhou
  • Zhipeng Wang
  • Shuo Jiang
  • Zhen Yin
  • Yulong Ding

In-flight high-speed object capturing is crucial in nature for survival and adaptation to the environment, as in the predation of frogs, leopards, and eagles. Despite its ubiquity in nature, capturing fast-moving objects is extremely challenging in engineering implementations. In this paper, we report an ultrafast gripper based on tunable bistable structures. Unlike current designs, which are suited only to a fixed range of object speeds once fabricated, the working speed range of the proposed gripper can be reprogrammed by controlling the sensitivity of the structures. We present the design and fabrication of the proposed gripper in detail. A theoretical model is introduced to construct the energy landscape of the structures and the force response of the gripper when programmed to different states. The results show that in the original state, the gripper is capable of capturing a flying table tennis ball traveling at 15 m/s in only 6 ms. When the proposed gripper is switched to the ultra-sensitive state, a flying ball moving at only 1 m/s can also be captured. This work broadens the frontiers of in-flight capturing design, and we envision broader promising applications.
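One simple way to picture the tunable bistability is a double-well potential E(x) = a·x⁴ − b·x², whose barrier height b²/(4a) shrinks as b is lowered, making the structure easier to trigger. The coefficients below are illustrative placeholders, not values fitted to the real gripper.

```python
# Toy double-well energy landscape for a bistable structure (illustrative).
import numpy as np

def energy(x: np.ndarray, a: float = 1.0, b: float = 1.0) -> np.ndarray:
    return a * x**4 - b * x**2            # two wells at x = ±sqrt(b / (2a))

def barrier(a: float, b: float) -> float:
    return b**2 / (4 * a)                 # well-to-hump energy barrier

x = np.linspace(-1.5, 1.5, 201)
for b in (1.0, 0.2):                      # stiff vs. "ultra-sensitive" setting
    E = energy(x, b=b)
    print(f"b={b}: well depth {E.min():+.3f}, barrier {barrier(1.0, b):.3f}")
```

Lowering b collapses the barrier, which is the qualitative sense in which a trigger's sensitivity, and hence the working speed range, can be reprogrammed.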

AAAI Conference 2023 Conference Paper

3D Assembly Completion

  • Weihao Wang
  • Rufeng Zhang
  • Mingyu You
  • Hongjun Zhou
  • Bin He

Automatic assembly is a promising research topic in 3D computer vision and robotics. Existing works focus on generating an assembly (e.g., IKEA furniture) from scratch with a set of parts, namely 3D part assembly. In practice, there are higher demands for a robot to take over and finish an incomplete assembly (e.g., half-assembled IKEA furniture) with an off-the-shelf toolkit, especially in human-robot and multi-agent collaborations. Compared to 3D part assembly, this is inherently more complicated and remains unexplored. The robot must understand the incomplete structure, infer which parts are missing, single out the correct parts from the toolkit, and finally assemble them with appropriate poses to finish the incomplete assembly. Geometrically similar parts in the toolkit can interfere, and this problem is exacerbated as more parts are missing. To tackle this issue, we propose a novel task called 3D assembly completion. Given an incomplete assembly, it aims to find the missing parts in a toolkit and predict the 6-DoF poses that make the assembly complete. To this end, we propose FiT, a framework for Finishing the incomplete 3D assembly with a Transformer. We employ the encoder to model the incomplete assembly into memories. Candidate parts interact with the memories in a memory-query paradigm for final candidate classification and pose prediction. Bipartite part matching and symmetric transformation consistency are embedded to refine the completion. For reasonable evaluation and further reference, we design two standard toolkits of different difficulty, containing different compositions of candidate parts. Extensive comparisons with several baseline methods and ablation studies demonstrate the effectiveness of the proposed method.
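The memory-query paradigm maps naturally onto a standard Transformer encoder-decoder: the encoder turns the incomplete assembly into memories, and candidate parts attend to those memories as decoder queries, after which small heads score each candidate and regress its pose. This is a minimal sketch of that pattern with assumed dimensions and head designs, not the FiT implementation.

```python
# Minimal sketch of the memory-query pattern (illustrative, not FiT).
import torch
import torch.nn as nn

d = 128
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True),
    num_layers=2)
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model=d, nhead=4, batch_first=True),
    num_layers=2)
select_head = nn.Linear(d, 1)   # is this candidate a missing part?
pose_head = nn.Linear(d, 6)     # 6-DoF pose (translation + rotation params)

assembly_tokens = torch.randn(1, 10, d)   # 10 parts already assembled
candidate_tokens = torch.randn(1, 5, d)   # 5 candidate parts from the toolkit

memory = encoder(assembly_tokens)          # incomplete assembly -> memories
out = decoder(candidate_tokens, memory)    # candidates query the memories
print(select_head(out).shape, pose_head(out).shape)  # (1, 5, 1) (1, 5, 6)
```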

ICRA Conference 2023 Conference Paper

Continuous-Time LiDAR-Inertial-Vehicle Odometry Method with Lateral Acceleration Constraint

  • Bin He
  • Weichen Dai 0001
  • Zeyu Wan
  • Hong Zhang
  • Yu Zhang 0018

In this paper, we propose a continuous-time LiDAR-inertial-vehicle odometry method that tightly fuses data from Light Detection and Ranging (LiDAR), an inertial measurement unit (IMU), and vehicle measurements. A lateral acceleration constraint is further added to the trajectory estimation so that the estimated trajectory follows the motion characteristics of vehicles. In addition, since vehicle model parameters vary with motion conditions and tyre pressure, we estimate vehicle correction factors that rectify changes in the vehicle model parameters online, and we analyze the observability of these correction factors. In experiments, the proposed method is evaluated against state-of-the-art methods on a public dataset. The experimental results show that the proposed method achieves more accurate results in all sequences, since it fuses additional sensor measurements and exploits the characteristics of vehicle motion to constrain the trajectory estimation. The ablation study also demonstrates the effectiveness of the continuous-time representation, the online correction factor estimation, and the incorporation of the lateral acceleration constraint.
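The lateral acceleration constraint can be pictured as one extra residual in the least-squares objective: for planar car-like motion, lateral acceleration should roughly equal forward speed times yaw rate (a_lat ≈ v·ω), so deviations from this relation are penalized. The sketch below shows only that residual, with made-up numbers; the continuous-time spline machinery and weighting of the actual method are omitted.

```python
# Toy lateral-acceleration residual for car-like motion (illustrative).
def lateral_accel_residual(v: float, yaw_rate: float, a_lat_meas: float) -> float:
    # v: forward speed [m/s]; yaw_rate [rad/s]; a_lat_meas: measured lateral
    # acceleration [m/s^2]. For planar car-like motion, a_lat ~= v * yaw_rate.
    return a_lat_meas - v * yaw_rate

# Sampled along the estimated trajectory (e.g., at IMU timestamps):
r = lateral_accel_residual(v=10.0, yaw_rate=0.1, a_lat_meas=1.05)
print(r**2)  # squared residual that would enter the least-squares objective
```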

AAAI Conference 2014 Conference Paper

Representing Words as Lymphocytes

  • Jinfeng Yang
  • Yi Guan
  • Xishuang Dong
  • Bin He

Computing similarity between words is a generic problem in many applications of computational linguistics, and word similarities are determined by the underlying word representations. Inspired by the analogies between words and lymphocytes, a lymphocyte-style word representation is proposed. The representation is built on the dependency syntax of sentences and represents a word's context as the head properties and dependent properties of that word. Lymphocyte-style word representations are evaluated by computing similarities between words, with experiments conducted on the Penn Chinese Treebank 5.1. Experimental results indicate that the proposed word representations are effective.
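Here is a minimal sketch of a dependency-based word representation in the spirit of the abstract: each word is described by the set of contexts in which it appears as a head and as a dependent, and similarity is measured as the overlap of those property sets. The toy triples and the Jaccard similarity are illustrative assumptions; the paper's actual feature scheme is not reproduced.

```python
# Toy head/dependent property representation (illustrative, not the paper's).
from collections import defaultdict

# Dependency triples: (head, relation, dependent)
triples = [("eat", "dobj", "apple"), ("eat", "dobj", "bread"),
           ("eat", "nsubj", "cat"), ("drink", "dobj", "water"),
           ("drink", "nsubj", "cat")]

props = defaultdict(set)
for head, rel, dep in triples:
    props[head].add(("head", rel, dep))   # contexts where the word is a head
    props[dep].add(("dep", rel, head))    # contexts where it is a dependent

def similarity(w1: str, w2: str) -> float:
    a, b = props[w1], props[w2]
    return len(a & b) / len(a | b) if a | b else 0.0

print(similarity("eat", "drink"))  # 0.25: both take "cat" as subject
```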