Author name cluster

Mingyu You

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers

1 author row

AAAI Conference 2025 Conference Paper

Imagine: Image-Guided 3D Part Assembly with Structure Knowledge Graph

Weihao Wang
Yu Lan
Mingyu You
Bin He

3D part assembly is a promising task in 3D computer vision and robotics, focusing on assembling 3D parts together by predicting their 6-DoF poses. Like most 3D shape understanding tasks, existing methods primarily address this task by memorizing the poses of parts during the training process, leading to inaccuracies in complex assemblies and poor generalization to novel categories. To fundamentally improve performance, structure knowledge of the target assembly, which abstracts the potential part composition and the parts' structural relationships, is indispensable before assembling. An image of the target assembly can serve as a common source for constructing this structure knowledge. Nevertheless, the image alone is far from enough, as its knowledge can be incomplete and ambiguous due to part occlusion and varying views. To tackle these issues, we propose Imagine, a novel Image-guided 3D part assembly framework with a structure knowledge graph. As a novel assembly prior, the structure knowledge graph originates from the image and is refined by understanding the 3D parts. It encodes robust part-aware structural and semantic information of the assembly, guides the 3D parts from a coarse super-structure to a fine assembly, and co-evolves progressively throughout the assembly process. Extensive experiments demonstrate the state-of-the-art performance of our framework, along with strong generalization to novel images and categories.


AAAI Conference 2023 Conference Paper

3D Assembly Completion

Weihao Wang
Rufeng Zhang
Mingyu You
Hongjun Zhou
Bin He

Automatic assembly is a promising research topic in 3D computer vision and robotics. Existing works focus on generating an assembly (e.g., IKEA furniture) from scratch with a set of parts, namely 3D part assembly. In practice, there are higher demands for the robot to take over and finish an incomplete assembly (e.g., a half-assembled piece of IKEA furniture) with an off-the-shelf toolkit, especially in human-robot and multi-agent collaborations. Compared to 3D part assembly, this is more complicated in nature and remains unexplored. The robot must understand the incomplete structure, infer what parts are missing, single out the correct parts from the toolkit, and finally assemble them with appropriate poses to finish the incomplete assembly. Geometrically similar parts in the toolkit can interfere, and this problem is exacerbated as more parts are missing. To tackle this issue, we propose a novel task called 3D assembly completion. Given an incomplete assembly, it aims to find its missing parts from a toolkit and predict the 6-DoF poses to make the assembly complete. To this end, we propose FiT, a framework for Finishing the incomplete 3D assembly with Transformer. We employ the encoder to model the incomplete assembly into memories. Candidate parts interact with memories in a memory-query paradigm for final candidate classification and pose prediction. Bipartite part matching and symmetric transformation consistency are embedded to refine the completion. For reasonable evaluation and further reference, we design two standard toolkits of different difficulty, containing different compositions of candidate parts. We conduct extensive comparisons with several baseline methods and ablation studies, demonstrating the effectiveness of the proposed method.


AAAI Conference 2021 Conference Paper

Diverse Knowledge Distillation for End-to-End Person Search

Xinyu Zhang
Xinlong Wang
Jia-Wang Bian
Chunhua Shen
Mingyu You

Person search aims to localize and identify a specific person from a gallery of images. Recent methods can be categorized into two groups, i.e., two-step and end-to-end approaches. The former views person search as two independent tasks and achieves dominant results using separately trained person detection and re-identification (Re-ID) models. The latter performs person search in an end-to-end fashion. Although the end-to-end approaches yield higher inference efficiency, they largely lag behind their two-step counterparts in terms of accuracy. In this paper, we argue that the gap between the two kinds of methods is mainly caused by the Re-ID subnetworks of end-to-end methods. To this end, we propose a simple yet strong end-to-end network with diverse knowledge distillation to break the bottleneck. We also design a spatial-invariant augmentation to help the model become invariant to inaccurate detection results. Experimental results on the CUHK-SYSU and PRW datasets demonstrate the superiority of our method against existing approaches: it achieves accuracy on par with state-of-the-art two-step methods while maintaining high efficiency due to the single joint model. Code is available at: https://git.io/DKD-PersonSearch
