Author name cluster

Chi Xu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers

2 author rows

AAAI Conference 2026 Conference Paper

VPHO: Joint Visual-Physical Cue Learning and Aggregation for Hand-Object Pose Estimation

Jun Zhou
Chi Xu
Kaifeng Tang
Yuting Ge
Tingrui Guo
Li Cheng

Estimating the 3D poses of hands and objects from a single RGB image is a fundamental yet challenging problem, with broad applications in augmented reality and human-computer interaction. Existing methods largely rely on visual cues alone, often producing results that violate physical constraints such as interpenetration or non-contact. Recent efforts to incorporate physics reasoning typically depend on post-optimization or non-differentiable physics engines, which compromise visual consistency and end-to-end trainability. To overcome these limitations, we propose a novel framework that jointly integrates visual and physical cues for hand-object pose estimation. This integration is achieved through two key ideas: 1) joint visual-physical cue learning: The model is trained to extract 2D visual cues and 3D physical cues, thereby enabling more comprehensive representation learning for hand-object interactions; 2) candidate pose aggregation: A novel refinement process that aggregates multiple diffusion-generated candidate poses by leveraging both visual and physical predictions, yielding a final estimate that is visually consistent and physically plausible. Extensive experiments demonstrate that our method significantly outperforms existing state-of-the-art approaches in both pose accuracy and physical plausibility.


IJCAI Conference 2025 Conference Paper

Exploring Multimodal Foundation AI and Expert-in-the-Loop for Sustainable Management of Wild Salmon Fisheries in Indigenous Rivers

Chi Xu
Yili Jin
Sami Ma
Rongsheng Qian
Hao Fang
Jiangchuan Liu
Xue Liu
Edith C. H. Ngai

Wild salmon are essential to the ecological, economic, and cultural sustainability of the North Pacific Rim. Yet climate variability, habitat loss, and data limitations in remote ecosystems that lack basic infrastructure support pose significant challenges to effective fisheries management. This project explores the integration of multimodal foundation AI and expert-in-the-loop frameworks to enhance wild salmon monitoring and sustainable fisheries management in Indigenous rivers across the Pacific Northwest. By leveraging video and sonar-based monitoring, we develop AI-powered tools for automated species identification, counting, and length measurement, reducing manual effort, expediting delivery of results, and improving decision-making accuracy. Expert validation and active learning frameworks ensure ecological relevance while reducing annotation burdens. To address unique technical and societal challenges, we bring together a cross-domain, interdisciplinary team of university researchers, fisheries biologists, Indigenous stewardship practitioners, government agencies, and conservation organizations. Through these collaborations, our research fosters ethical AI co-development, open data sharing, and culturally informed fisheries management.


ICML Conference 2024 Conference Paper

MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts

Jianan Zhou 0002
Zhiguang Cao
Yaoxin Wu
Wen Song 0004
Yining Ma 0001
Jie Zhang 0002
Chi Xu

Learning to solve vehicle routing problems (VRPs) has garnered much attention. However, most neural solvers are only structured and trained independently on a specific problem, making them less generic and practical. In this paper, we aim to develop a unified neural solver that can cope with a range of VRP variants simultaneously. Specifically, we propose a multi-task vehicle routing solver with mixture-of-experts (MVMoE), which greatly enhances the model capacity without a proportional increase in computation. We further develop a hierarchical gating mechanism for the MVMoE, delivering a good trade-off between empirical performance and computational complexity. Experimentally, our method significantly promotes zero-shot generalization performance on 10 unseen VRP variants, and showcases decent results on the few-shot setting and real-world benchmark instances. We further conduct extensive studies on the effect of MoE configurations in solving VRPs, and observe the superiority of hierarchical gating when facing out-of-distribution data. The source code is available at: https://github.com/RoyalSkye/Routing-MVMoE.
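The claim that a mixture-of-experts layer adds capacity without a proportional increase in computation can be illustrated with a minimal top-k gating sketch. This is not the MVMoE implementation and does not show its hierarchical gating; the names `top_k_gate` and `moe_forward`, the linear gate, and the toy scaling experts are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(gate_logits, k):
    """Return {expert_index: weight} for the k highest-scoring experts."""
    order = sorted(range(len(gate_logits)),
                   key=lambda i: gate_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([gate_logits[i] for i in top])
    return dict(zip(top, weights))


def moe_forward(x, experts, gate_weights, k=2):
    """Route x through only k experts and mix their outputs.

    x: list of floats; experts: list of callables mapping a list to a list;
    gate_weights: one weight vector per expert (hypothetical linear gate).
    """
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    routing = top_k_gate(logits, k)
    out = [0.0] * len(x)
    for idx, weight in routing.items():
        y = experts[idx](x)  # only the selected experts are evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, routing


# Toy setup: four experts that scale their input by different factors.
experts = [lambda v, s=s: [s * t for t in v] for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
output, routing = moe_forward([2.0, 1.0], experts, gate_weights, k=2)
```

With four experts and `k=2`, each input pays for two expert evaluations regardless of how many experts exist, which is the sense in which capacity and per-input cost are decoupled.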


TIST Journal 2023 Journal Article

COMET: Convolutional Dimension Interaction for Collaborative Filtering

Zhuoyi Lin
Lei Feng
Xingzhi Guo
Yu Zhang
Rui Yin
Chee Keong Kwoh
Chi Xu

Representation learning-based recommendation models play a dominant role among recommendation techniques. However, most of the existing methods assume both historical interactions and embedding dimensions are independent of each other, and thus regrettably ignore the high-order interaction information among historical interactions and embedding dimensions. In this article, we propose a novel representation learning-based model called COMET (COnvolutional diMEnsion inTeraction), which simultaneously models the high-order interaction patterns among historical interactions and embedding dimensions. To be specific, COMET stacks the embeddings of historical interactions horizontally at first, which results in two “embedding maps”. In this way, internal interactions and dimensional interactions can be exploited by convolutional neural networks (CNN) with kernels of different sizes simultaneously. A fully connected multi-layer perceptron (MLP) is then applied to obtain two interaction vectors. Lastly, the representations of users and items are enriched by the learnt interaction vectors, which can further be used to produce the final prediction. Extensive experiments and ablation studies on various public implicit feedback datasets clearly demonstrate the effectiveness and rationality of our proposed method.


ICRA Conference 2023 Conference Paper

Graph-based Pose Estimation of Texture-less Surgical Tools for Autonomous Robot Control

Haozheng Xu
Mark Runciman
João Cartucho
Chi Xu
Stamatia Giannarou

In Robot-assisted Minimally Invasive Surgery (RMIS), the estimation of the pose of surgical tools is crucial for applications such as surgical navigation, visual servoing, autonomous robotic task execution and augmented reality. A plethora of hardware-based and vision-based methods have been proposed in the literature. However, direct application of these methods to RMIS has significant limitations due to partial tool visibility, occlusions and changes in the surgical scene. In this work, a novel keypoint-graph-based network is proposed to estimate the pose of texture-less cylindrical surgical tools of small diameter. To deal with the challenges in RMIS, keypoint object representation is used and for the first time, temporal information is combined with spatial information in keypoint graph representation, for keypoint refinement. Finally, stable and accurate tool pose is computed using a PnP solver. Our performance evaluation study has shown that the proposed method is able to accurately predict the pose of a textureless robotic shaft with an ADD-S score of over 98%. The method outperforms state-of-the-art pose estimation models under challenging conditions such as object occlusion and changes in the lighting of the scene.


EAAI Journal 2022 Journal Article

Learning variable ordering heuristics for solving Constraint Satisfaction Problems

Wen Song
Zhiguang Cao
Jie Zhang
Chi Xu
Andrew Lim

Backtracking search algorithms are often used to solve the Constraint Satisfaction Problem (CSP), which is widely applied in various domains such as automated planning and scheduling. The efficiency of backtracking search depends greatly on the variable ordering heuristics. Currently, the most commonly used heuristics are hand-crafted based on expert knowledge. In this paper, we propose a deep reinforcement learning based approach to automatically discover new variable ordering heuristics that are better adapted for a given class of CSP instances, without the need of relying on hand-crafted features and heuristics. We show that directly optimizing the search tree size is not convenient for learning, and propose to optimize the expected cost of reaching a leaf node in the search tree. To capture the complex relations among the variables and constraints, we design a representation scheme based on Graph Neural Network that can process CSP instances with different sizes and constraint arities. Experimental results on random CSP instances show that on small and medium sized instances, the learned policies outperform classical hand-crafted heuristics with smaller search tree (up to 10.36% reduction). Moreover, without further training, our policies directly generalize to instances of larger sizes and much harder to solve than those in training, with even larger reduction in the search tree size (up to 18.74%).
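To make the role of the variable ordering heuristic concrete, here is a minimal backtracking sketch in which the heuristic is a pluggable function. It is not the paper's method: the learned policy and the Graph Neural Network representation are not shown, and `choose_var` is simply the slot where such a policy could plug in. The two heuristics below (first unassigned variable, and the classical hand-crafted minimum-remaining-values rule) and all function names are illustrative assumptions.

```python
def consistent(var, value, assignment, constraints):
    """Check every binary constraint between var and the assigned variables."""
    for (a, b), pred in constraints.items():
        if a == var and b in assignment and not pred(value, assignment[b]):
            return False
        if b == var and a in assignment and not pred(assignment[a], value):
            return False
    return True


def live_values(var, domains, assignment, constraints):
    """Values of var that do not conflict with the current assignment."""
    return [v for v in domains[var]
            if consistent(var, v, assignment, constraints)]


def first_unassigned(unassigned, domains, assignment, constraints):
    """Baseline ordering: take variables in declaration order."""
    return unassigned[0]


def min_remaining_values(unassigned, domains, assignment, constraints):
    """Classical hand-crafted heuristic: fewest live values first."""
    return min(unassigned, key=lambda v: len(
        live_values(v, domains, assignment, constraints)))


def backtrack(domains, constraints, choose_var, assignment=None, stats=None):
    """Depth-first search; stats["nodes"] counts value assignments tried."""
    assignment = {} if assignment is None else assignment
    stats = {"nodes": 0} if stats is None else stats
    if len(assignment) == len(domains):
        return dict(assignment), stats
    unassigned = [v for v in domains if v not in assignment]
    var = choose_var(unassigned, domains, assignment, constraints)
    for value in live_values(var, domains, assignment, constraints):
        stats["nodes"] += 1
        assignment[var] = value
        solution, _ = backtrack(domains, constraints, choose_var,
                                assignment, stats)
        if solution is not None:
            return solution, stats
        del assignment[var]
    return None, stats


# Toy instance: four variables, pairwise "not equal" constraints.
def differ(x, y):
    return x != y


domains = {"A": [1, 2, 3], "B": [1, 2], "C": [1], "D": [1, 2, 3]}
constraints = {("A", "B"): differ, ("B", "C"): differ,
               ("A", "C"): differ, ("C", "D"): differ}

lex_solution, lex_stats = backtrack(domains, constraints, first_unassigned)
mrv_solution, mrv_stats = backtrack(domains, constraints, min_remaining_values)
```

Comparing `lex_stats["nodes"]` with `mrv_stats["nodes"]` shows how the ordering alone changes the size of the explored search tree on the same instance.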


IJCAI Conference 2017 Conference Paper

MRLR: Multi-level Representation Learning for Personalized Ranking in Recommendation

Zhu Sun
Jie Yang
Jie Zhang
Alessandro Bozzon
Yu Chen
Chi Xu

Representation learning (RL) has recently proven to be effective in capturing local item relationships by modeling item co-occurrence in an individual user's interaction record. However, the value of RL for recommendation has not reached its full potential due to two major drawbacks: 1) recommendation is modeled as a rating prediction problem but should essentially be a personalized ranking one; 2) multi-level organizations of items are neglected for fine-grained item relationships. We design a unified Bayesian framework MRLR to learn user and item embeddings from a multi-level item organization, thus benefiting from RL as well as achieving the goal of personalized ranking. Extensive validation on real-world datasets shows that MRLR consistently outperforms state-of-the-art algorithms.
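The distinction between rating prediction and personalized ranking can be sketched with two toy loss functions. This is not MRLR's objective: the pairwise logistic loss below is a generic ranking loss swapped in for illustration, and the embeddings, function names, and dot-product scoring are assumptions.

```python
import math


def dot(u, v):
    """Dot product used here as a hypothetical user-item score."""
    return sum(a * b for a, b in zip(u, v))


def rating_loss(user, item, rating):
    """Rating prediction: penalize distance from an absolute target score."""
    return (dot(user, item) - rating) ** 2


def pairwise_ranking_loss(user, pos_item, neg_item):
    """Ranking: penalize only a wrong relative order, -log(sigmoid(margin))."""
    margin = dot(user, pos_item) - dot(user, neg_item)
    return math.log(1.0 + math.exp(-margin))


user = [1.0, 0.5]
preferred = [1.0, 1.0]   # item the user interacted with
other = [0.2, 0.0]       # item the user did not interact with

good_order = pairwise_ranking_loss(user, preferred, other)
bad_order = pairwise_ranking_loss(user, other, preferred)
```

The ranking loss depends only on the score difference between the two items, so it is small whenever the preferred item is scored higher, whatever the absolute scores are.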
