Arrow Research search

Author name cluster

Feng Xu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

27 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

VGGTFace: Topologically Consistent Facial Geometry Reconstruction in the Wild

  • Xin Ming
  • Yuxuan Han
  • Tianyu Huang
  • Feng Xu

Reconstructing topologically consistent facial geometry is crucial for digital avatar creation pipelines. Existing methods either require tedious manual effort, lack generalization to in-the-wild data, or are constrained by the limited expressiveness of 3D Morphable Models. To address these limitations, we propose VGGTFace, an automatic approach that applies the 3D foundation model VGGT to topologically consistent facial geometry reconstruction from in-the-wild multi-view images captured by everyday users. Our key insight is that, by leveraging VGGT, our method naturally inherits strong generalization ability and expressive power from its large-scale training and point map representation. However, it is unclear how to reconstruct a topologically consistent mesh from VGGT, as the topology information is missing in its prediction. To this end, we augment VGGT with Pixel3DMM to inject topology information via pixel-aligned UV values. In this manner, we convert the pixel-aligned point map of VGGT to a point cloud with topology. Tailored to this point cloud with known topology, we propose a novel Topology-Aware Bundle Adjustment strategy to fuse them, where we construct a Laplacian energy for the Bundle Adjustment objective. Our method achieves high-quality reconstruction in 10 seconds for 16 views on a single NVIDIA RTX 4090. Experiments demonstrate state-of-the-art results on benchmarks and impressive generalization to in-the-wild data.
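
The abstract mentions a Laplacian energy on a point cloud with known topology but does not give its form. As a rough illustration only, the sketch below computes a generic uniform-Laplacian smoothness energy on a mesh: the sum over vertices of the squared distance to the centroid of their neighbors. The function name, the uniform weighting, and the toy mesh are assumptions, not the paper's formulation.

```python
from collections import defaultdict


def laplacian_energy(vertices, edges):
    """Uniform-Laplacian energy: sum_i || v_i - mean(neighbors of i) ||^2.

    vertices: list of (x, y, z) tuples.
    edges: list of (i, j) index pairs taken from the known mesh topology.
    This is a generic smoothness term, assumed here for illustration.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    energy = 0.0
    for i, v in enumerate(vertices):
        nbrs = neighbors[i]
        if not nbrs:
            continue  # isolated vertices contribute nothing
        centroid = [sum(vertices[j][k] for j in nbrs) / len(nbrs) for k in range(3)]
        energy += sum((v[k] - centroid[k]) ** 2 for k in range(3))
    return energy


# Toy example: three collinear points connected in a chain.
toy_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_edges = [(0, 1), (1, 2)]
toy_energy = laplacian_energy(toy_vertices, toy_edges)
```

In an optimization setting, a term of this kind would typically be added to the data term so that neighboring vertices move coherently; how it enters the Bundle Adjustment objective here is not stated in the abstract.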

JBHI Journal 2025 Journal Article

MAT: Mixing Attention Transfer from Multiple Transformers for Medical Tasks

  • Zi-Hao Bo
  • Yuchen Guo
  • Xiangru Chen
  • Jing Xie
  • Lishan Ye
  • Feng Xu

Transformer has been widely used for image analysis tasks, but in medicine, it suffers from limited data availability. To overcome this challenge, we propose a novel approach specially designed for transformers to transfer knowledge from multiple sources to target medical tasks with limited data, named Mixing Attention Transfer (MAT). MAT aims to harness and merge knowledge from multiple source transformers at the token and layer level to improve the performance of target medical tasks. The core component of MAT is the Mixing Attention layer, which encompasses: 1. token-level Routing and Fusion modules that allocate input images to adequate source modules; 2. sequence-level Aligned-Attention module that adaptively aligns outputs produced by different source modules. To the best of our knowledge, this is the first multi-source transfer learning approach specifically designed for transformers. Through extensive evaluations, we demonstrate the effectiveness of MAT on three medical scenarios: noisy-labeled, class-imbalanced, and fine-grained tasks.

IJCAI Conference 2025 Conference Paper

Simulate, Refine and Integrate: Strategy Synthesis for Efficient SMT Solving

  • Bingzhe Zhou
  • Hannan Wang
  • Yuan Yao
  • Taolue Chen
  • Feng Xu
  • Xiaoxing Ma

Satisfiability Modulo Theories (SMT) solvers are crucial in many applications, yet their performance is often a bottleneck. This paper introduces SIRISMT, a novel framework that employs machine learning techniques for the automatic synthesis of efficient SMT-solving strategies. Specifically, SIRISMT targets Z3 and consists of three key stages. First, given a set of training SMT formulas, SIRISMT simulates the solving process by leveraging reinforcement learning to guide its exploration within the strategy space. Next, SIRISMT refines the collected strategies by pruning redundant tactics and generating augmented strategies based on the subsequence structure of the learned strategies. These refined strategies are then fed back into the reinforcement learning model. Finally, the refined and optimized strategies are integrated into one strategy, which can be directly plugged into modern SMT solvers. Extensive evaluations show the superior performance of SIRISMT over the baseline methods. For example, compared to the default Z3, it solves 26.8% more formulas and achieves up to an 86.3% improvement in the Par-2 score on benchmark datasets. Additionally, we show that the synthesized strategy can improve the code coverage by up to 11.8% in a downstream symbolic execution benchmark.
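
For readers unfamiliar with the Par-2 score mentioned above: it is a penalized runtime metric in which each unsolved (or timed-out) instance is charged twice the time limit. A minimal sketch, assuming runtimes are given in seconds and unsolved instances are marked with `None`; the function name and input convention are hypothetical, and some evaluations report the average rather than the sum.

```python
def par2_score(runtimes, timeout):
    """Penalized runtime: solved instances count their runtime,
    unsolved or timed-out instances count 2 * timeout. Lower is better."""
    total = 0.0
    for t in runtimes:
        if t is None or t > timeout:
            total += 2 * timeout  # penalty for an unsolved instance
        else:
            total += t
    return total


# Three formulas with a 10-second limit: two solved, one unsolved.
example_score = par2_score([1.0, None, 3.0], timeout=10.0)  # 1 + 20 + 3 = 24.0
```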

AAMAS Conference 2024 Conference Paper

Foresight Distribution Adjustment for Off-policy Reinforcement Learning

  • Ruifeng Chen
  • Xu-Hui Liu
  • Tian-Shuo Liu
  • Shengyi Jiang
  • Feng Xu
  • Yang Yu

Off-policy reinforcement learning algorithms maintain a replay buffer to utilize samples obtained from earlier policies. The sampling strategy, which prioritizes certain data in the buffer to train the value function or the policy, has been shown to significantly influence the sample efficiency and the final performance of the algorithm. However, which distribution is the best choice for experience prioritization has not been explored thoroughly. In this paper, we prove that the post-update policy distribution (i.e., the visitation distribution of the policy after the current iteration of update) is the best Q training distribution to benefit the policy improvement. Nevertheless, accessing this "future" distribution is not straightforward. In this work, we find that the current experiences can be modulated by the critic information to simulate the post-update policy distribution. Technically, we derive the gradient of the visitation distribution with respect to the policy parameter and obtain an explicit expression to approximate the post-update policy distribution. The derived method is named Foresight Distribution Adjustment (FoDA), and it integrates seamlessly with conventional off-policy actor-critic algorithms. Our experiments validate FoDA’s ability to closely approximate the post-update policy distribution, and demonstrate its utility in enhancing performance across continuous control task benchmarks.

AAAI Conference 2024 Conference Paper

Inspecting Prediction Confidence for Detecting Black-Box Backdoor Attacks

  • Tong Wang
  • Yuan Yao
  • Feng Xu
  • Miao Xu
  • Shengwei An
  • Ting Wang

Backdoor attacks have been shown to be a serious security threat against deep learning models, and various defenses have been proposed to detect whether a model is backdoored or not. However, as indicated by a recent black-box attack, existing defenses can be easily bypassed by implanting the backdoor in the frequency domain. To this end, we propose a new defense DTInspector against black-box backdoor attacks, based on a new observation related to the prediction confidence of learning models. That is, to achieve a high attack success rate with a small amount of poisoned data, backdoor attacks usually render a model exhibiting statistically higher prediction confidences on the poisoned samples. We provide both theoretical and empirical evidence for the generality of this observation. DTInspector then carefully examines the prediction confidences of data samples, and decides the existence of backdoor using the shortcut nature of backdoor triggers. Extensive evaluations on six backdoor attacks, four datasets, and three advanced attacking types demonstrate the effectiveness of the proposed defense.

ICRA Conference 2024 Conference Paper

OptiState: State Estimation of Legged Robots using Gated Networks with Transformer-based Vision and Kalman Filtering

  • Alexander Schperberg
  • Yusuke Tanaka
  • Saviz Mowlavi
  • Feng Xu
  • Bharathan Balaji
  • Dennis W. Hong

State estimation for legged robots is challenging due to their highly dynamic motion and limitations imposed by sensor accuracy. By integrating Kalman filtering, optimization, and learning-based modalities, we propose a hybrid solution that combines proprioception and exteroceptive information for estimating the state of the robot’s trunk. Leveraging joint encoder and IMU measurements, our Kalman filter is enhanced through a single-rigid body model that incorporates ground reaction force control outputs from convex Model Predictive Control optimization. The estimation is further refined through Gated Recurrent Units, which also considers semantic insights and robot height from a Vision Transformer autoencoder applied on depth images. This framework not only furnishes accurate robot state estimates, including uncertainty evaluations, but can minimize the nonlinear errors that arise from sensor measurements and model simplifications through learning. The proposed methodology is evaluated in hardware using a quadruped robot on various terrains, yielding a 65% improvement on the Root Mean Squared Error compared to our VIO SLAM baseline. Code example: https://github.com/AlexS28/OptiState

AAAI Conference 2024 Conference Paper

Relightable and Animatable Neural Avatars from Videos

  • Wenbin Lin
  • Chengwei Zheng
  • Jun-Hai Yong
  • Feng Xu

Lightweight creation of 3D digital avatars is a highly desirable but challenging task. With only sparse videos of a person under unknown illumination, we propose a method to create relightable and animatable neural avatars, which can be used to synthesize photorealistic images of humans under novel viewpoints, body poses, and lighting. The key challenge here is to disentangle the geometry, material of the clothed body, and lighting, which becomes more difficult due to the complex geometry and shadow changes caused by body motions. To solve this ill-posed problem, we propose novel techniques to better model the geometry and shadow changes. For geometry change modeling, we propose an invertible deformation field, which helps to solve the inverse skinning problem and leads to better geometry quality. To model the spatial and temporal varying shading cues, we propose a pose-aware part-wise light visibility network to estimate light occlusion. Extensive experiments on synthetic and real datasets show that our approach reconstructs high-quality geometry and generates realistic shadows under different body poses. Code and data are available at https://wenbin-lin.github.io/RelightableAvatar-page.

AAMAS Conference 2023 Conference Paper

How To Guide Your Learner: Imitation Learning with Active Adaptive Expert Involvement

  • Xu-Hui Liu
  • Feng Xu
  • Xinyu Zhang
  • Tianyuan Liu
  • Shengyi Jiang
  • Ruifeng Chen
  • Zongzhang Zhang
  • Yang Yu

Imitation learning aims to mimic the behavior of experts without explicit reward signals. Passive imitation learning methods which use static expert datasets typically suffer from compounding error, low sample efficiency, and high hyper-parameter sensitivity. In contrast, active imitation learning methods solicit expert interventions to address the limitations. However, recent active imitation learning methods are designed based on human intuitions or empirical experience without theoretical guarantee. In this paper, we propose a novel active imitation learning framework based on a teacher-student interaction model, in which the teacher’s goal is to identify the best teaching behavior and actively affect the student’s learning process. By solving the optimization objective of this framework, we propose a practical implementation, naming it AdapMen. Theoretical analysis shows that AdapMen can improve the error bound and avoid compounding error under mild conditions. Experiments on the MetaDrive benchmark and Atari 2600 games validate our theoretical analysis and show that our method achieves near-expert performance with much less expert involvement and total sampling steps than previous methods. The code is available at https://github.com/liuxhym/AdapMen.

JBHI Journal 2023 Journal Article

Multi-Level Constrained Intra and Inter Subject Feature Representation for Facial Video Based BVP Signal Measurement

  • Bin Li
  • Wei Zhang
  • Hong Fu
  • Hao Liu
  • Feng Xu

Facial video-based blood volume pulse (BVP) signal measurement holds great potential for remote health monitoring, while existing methods have issues with convolutional kernel perceptual field constraints. This article proposes an end-to-end multi-level constrained spatiotemporal representation structure for facial video-based BVP signal measurement. First, an intra- and inter-subject feature representation is proposed to strengthen the BVP-related features generation at high, semantic, and shallow levels, respectively. Second, the global-local association is presented to enhance BVP signal period pattern learning, and the global temporal features are introduced into the local spatial convolution of each frame by adaptive kernel weights. Finally, the multi-dimensional fused features are mapped to one-dimensional BVP signals by the task-oriented signal estimator. The experimental results on the publicly available MMSE-HR dataset demonstrate that the proposed structure outperforms state-of-the-art methods (e.g., AutoHR) in BVP signal measurement, with a 20% and 40% reduction in mean absolute error and root mean squared error, respectively. The proposed structure would be a powerful tool for telemedical and non-contact heart health monitoring.
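
The two error metrics quoted in this abstract can be computed as in the sketch below. The function names and the sample values are illustrative only and are not taken from the paper.

```python
import math


def mean_absolute_error(predicted, reference):
    """Average of |predicted - reference| over all samples."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def root_mean_squared_error(predicted, reference):
    """Square root of the average squared difference; penalizes large errors more."""
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


# Made-up heart-rate estimates (beats per minute) against a reference.
estimated = [70.0, 72.0, 75.0]
reference = [70.0, 72.0, 73.0]
mae = mean_absolute_error(estimated, reference)
rmse = root_mean_squared_error(estimated, reference)
```

Because RMSE squares each difference before averaging, a few large errors raise it more than they raise MAE, which is why the two metrics can improve by different percentages.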

NeurIPS Conference 2021 Conference Paper

Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning

  • Xiong-Hui Chen
  • Shengyi Jiang
  • Feng Xu
  • Zongzhang Zhang
  • Yang Yu

In visual-input sim-to-real scenarios, to overcome the reality gap between images rendered in simulators and those from the real world, domain adaptation, i.e., learning an aligned representation space between simulators and the real world, then training and deploying policies in the aligned representation, is a promising direction. Previous methods focus on same-modal domain adaptation. However, those methods require building and running simulators that render high-quality images, which can be difficult and costly. In this paper, we consider a more cost-efficient setting of visual-input sim-to-real where only low-dimensional states are simulated. We first point out that the objective of learning mapping functions in previous methods that align the representation spaces is ill-posed and prone to yield an incorrect mapping. When the mapping crosses modalities, previous methods fail more easily. Our algorithm, Cross-mOdal Domain Adaptation with Sequential structure (CODAS), mitigates the ill-posedness by utilizing the sequential nature of the data sampling process in RL tasks. Experiments on MuJoCo and Hand Manipulation Suite tasks show that the agents deployed with our method achieve performance similar to that in the source domain, while those deployed with previous methods designed for same-modal domain adaptation suffer a larger performance gap.

AAAI Conference 2021 Short Paper

Enhancing Context-Based Meta-Reinforcement Learning Algorithms via An Efficient Task Encoder (Student Abstract)

  • Feng Xu
  • Shengyi Jiang
  • Hao Yin
  • Zongzhang Zhang
  • Yang Yu
  • Ming Li
  • Dong Li
  • Wulong Liu

Meta-Reinforcement Learning (meta-RL) algorithms enable agents to adapt to new tasks from small amounts of exploration, based on the experience of similar tasks. Recent studies have pointed out that a good representation of a task is key to the success of off-policy context-based meta-RL. Inspired by contrastive methods in unsupervised representation learning, we propose a new method to learn the task representation based on the mutual information between transition tuples in a trajectory and the task embedding. We also propose a new estimation for task similarity based on Q-function, which can be used to form a constraint on the distribution of the encoded task variables, making the task encoder encode the task variables more effective on new tasks. Experiments on meta-RL tasks show that the newly proposed method outperforms existing meta-RL algorithms.

NeurIPS Conference 2021 Conference Paper

Regret Minimization Experience Replay in Off-Policy Reinforcement Learning

  • Xu-Hui Liu
  • Zhenghai Xue
  • Jingcheng Pang
  • Shengyi Jiang
  • Feng Xu
  • Yang Yu

In reinforcement learning, experience replay stores past samples for further reuse. Prioritized sampling is a promising technique to better utilize these samples. Previous criteria of prioritization include TD error, recentness and corrective feedback, which are mostly heuristically designed. In this work, we start from the regret minimization objective, and obtain an optimal prioritization strategy for Bellman update that can directly maximize the return of the policy. The theory suggests that data with higher hindsight TD error, better on-policiness and more accurate Q value should be assigned with higher weights during sampling. Thus most previous criteria only consider this strategy partially. We not only provide theoretical justifications for previous criteria, but also propose two new methods to compute the prioritization weight, namely ReMERN and ReMERT. ReMERN learns an error network, while ReMERT exploits the temporal ordering of states. Both methods outperform previous prioritized sampling algorithms in challenging RL benchmarks, including MuJoCo, Atari and Meta-World.
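
As background for the prioritization discussed above, the sketch below shows generic priority-proportional sampling from a replay buffer using only the standard library. It is not ReMERN or ReMERT: the priority values are placeholders supplied by the caller, and the function name is hypothetical.

```python
import random


def sample_prioritized(buffer, priorities, batch_size, rng):
    """Draw a batch with replacement, with probability proportional to priority.

    buffer: list of stored transitions (any objects).
    priorities: non-negative weights, one per transition, not all zero.
    rng: a random.Random instance, passed in for reproducibility.
    """
    return rng.choices(buffer, weights=priorities, k=batch_size)


# Toy buffer: the third transition has all of the priority mass.
transitions = ["t0", "t1", "t2"]
batch = sample_prioritized(transitions, [0.0, 0.0, 1.0], batch_size=4, rng=random.Random(0))
```

Different prioritization criteria (TD error, recentness, or the weights derived in the paper) would differ only in how the `priorities` list is computed.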

IROS Conference 2020 Conference Paper

SQUIRL: Robust and Efficient Learning from Video Demonstration of Long-Horizon Robotic Manipulation Tasks

  • Bohan Wu
  • Feng Xu
  • Zhanpeng He
  • Abhi Gupta
  • Peter K. Allen

Recent advances in deep reinforcement learning (RL) have demonstrated its potential to learn complex robotic manipulation tasks. However, RL still requires the robot to collect a large amount of real-world experience. To address this problem, recent works have proposed learning from expert demonstrations (LfD), particularly via inverse reinforcement learning (IRL), given its ability to achieve robust performance with only a small number of expert demonstrations. Nevertheless, deploying IRL on real robots is still challenging due to the large number of robot experiences it requires. This paper aims to address this scalability challenge with a robust, sample-efficient, and general meta-IRL algorithm, SQUIRL, that performs a new but related long-horizon task robustly given only a single video demonstration. First, this algorithm bootstraps the learning of a task encoder and a task-conditioned policy using behavioral cloning (BC). It then collects real-robot experiences and bypasses reward learning by directly recovering a Q-function from the combined robot and expert trajectories. Next, this algorithm uses the learned Q-function to re-evaluate all cumulative experiences collected by the robot to improve the policy quickly. In the end, the policy performs more robustly (90%+ success) than BC on new tasks while requiring no experiences at test time. Finally, our real-robot and simulated experiments demonstrate our algorithm's generality across different state spaces, action spaces, and vision-based manipulation tasks, e.g., pick-pour-place and pick-carry-drop.

NeurIPS Conference 2020 Conference Paper

Trading Personalization for Accuracy: Data Debugging in Collaborative Filtering

  • Long Chen
  • Yuan Yao
  • Feng Xu
  • Miao Xu
  • Hanghang Tong

Collaborative filtering has been widely used in recommender systems. Existing work has primarily focused on improving the prediction accuracy mainly via either building refined models or incorporating additional side information, yet has largely ignored the inherent distribution of the input rating data. In this paper, we propose a data debugging framework to identify overly personalized ratings whose existence degrades the performance of a given collaborative filtering model. The key idea of the proposed approach is to search for a small set of ratings whose editing (e.g., modification or deletion) would near-optimally improve the recommendation accuracy of a validation set. Experimental results demonstrate that the proposed approach can significantly improve the recommendation accuracy. Furthermore, we observe that the identified ratings significantly deviate from the average ratings of the corresponding items, and the proposed approach tends to modify them towards the average. This result sheds light on the design of future recommender systems in terms of balancing between the overall accuracy and personalization.

AAAI Conference 2019 Conference Paper

An Integral Tag Recommendation Model for Textual Content

  • Shijie Tang
  • Yuan Yao
  • Suwei Zhang
  • Feng Xu
  • Tianxiao Gu
  • Hanghang Tong
  • Xiaohui Yan
  • Jian Lu

Recommending suitable tags for online textual content is a key building block for better content organization and consumption. In this paper, we identify three pillars that impact the accuracy of tag recommendation: (1) sequential text modeling meaning that the intrinsic sequential ordering as well as different areas of text might have an important implication on the corresponding tag(s), (2) tag correlation meaning that the tags for a certain piece of textual content are often semantically correlated with each other, and (3) content-tag overlapping meaning that the vocabularies of content and tags are overlapped. However, none of the existing methods consider all these three aspects, leading to a suboptimal tag recommendation. In this paper, we propose an integral model to encode all the three aspects in a coherent encoder-decoder framework. In particular, (1) the encoder models the semantics of the textual content via Recurrent Neural Networks with the attention mechanism, (2) the decoder tackles the tag correlation with a prediction path, and (3) a shared embedding layer and an indicator function across encoder-decoder address the content-tag overlapping. Experimental results on three real-world datasets demonstrate that the proposed method significantly outperforms the existing methods in terms of recommendation accuracy.

IJCAI Conference 2019 Conference Paper

Commit Message Generation for Source Code Changes

  • Shengbin Xu
  • Yuan Yao
  • Feng Xu
  • Tianxiao Gu
  • Hanghang Tong
  • Jian Lu

Commit messages, which summarize the source code changes in natural language, are essential for program comprehension and software evolution understanding. Unfortunately, due to the lack of direct motivation, commit messages are sometimes neglected by developers, making it necessary to automatically generate such messages. State-of-the-art adopts learning based approaches such as neural machine translation models for the commit message generation problem. However, they tend to ignore the code structure information and suffer from the out-of-vocabulary issue. In this paper, we propose CoDiSum to address the above two limitations. In particular, we first extract both code structure and code semantics from the source code changes, and then jointly model these two sources of information so as to better learn the representations of the code changes. Moreover, we augment the model with copying mechanism to further mitigate the out-of-vocabulary issue. Experimental evaluations on real data demonstrate that the proposed approach significantly outperforms the state-of-the-art in terms of accurately generating the commit messages.

AAAI Conference 2019 Conference Paper

Hashtag Recommendation for Photo Sharing Services

  • Suwei Zhang
  • Yuan Yao
  • Feng Xu
  • Hanghang Tong
  • Xiaohui Yan
  • Jian Lu

Hashtags can greatly facilitate content navigation and improve user engagement in social media. Meaningful as it might be, recommending hashtags for photo sharing services such as Instagram and Pinterest remains a daunting task due to the following two reasons. On the endogenous side, posts in photo sharing services often contain both images and text, which are likely to be correlated with each other. Therefore, it is crucial to coherently model both image and text as well as the interaction between them. On the exogenous side, hashtags are generated by users and different users might come up with different tags for similar posts, due to their different preference and/or community effect. Therefore, it is highly desirable to characterize the users’ tagging habits. In this paper, we propose an integral and effective hashtag recommendation approach for photo sharing services. In particular, the proposed approach considers both the endogenous and exogenous effects by a content modeling module and a habit modeling module, respectively. For the content modeling module, we adopt the parallel co-attention mechanism to coherently model both image and text as well as the interaction between them; for the habit modeling module, we introduce an external memory unit to characterize the historical tagging habit of each user. The overall hashtag recommendations are generated on the basis of both the post features from the content modeling module and the habit influences from the habit modeling module. We evaluate the proposed approach on real Instagram data. The experimental results demonstrate that the proposed approach significantly outperforms the state-of-the-art methods in terms of recommendation accuracy, and that both content modeling and habit modeling contribute significantly to the overall recommendation accuracy.

IJCAI Conference 2018 Conference Paper

Instance Weighting with Applications to Cross-domain Text Classification via Trading off Sample Selection Bias and Variance

  • Rui Xia
  • Zhenchun Pan
  • Feng Xu

Domain adaptation is an important problem in natural language processing (NLP) due to the distributional difference between the labeled source domain and the target domain. In this paper, we study the domain adaptation problem from the instance weighting perspective. By using density ratio as the instance weight, the traditional instance weighting approaches can potentially correct the sample selection bias in domain adaptation. However, researchers often failed to achieve good performance when applying instance weighting to domain adaptation in NLP and many negative results were reported in the literature. In this work, we conduct an in-depth study on the causes of the failure, and find that previous work only focused on reducing the sample selection bias, but ignored another important factor, sample selection variance, in domain adaptation. On this basis, we propose a new instance weighting framework by trading off two factors in instance weight learning. We evaluate our approach on two cross-domain text classification tasks and compare it with eight instance weighting methods. The results prove our approach's advantages in domain adaptation performance, optimization efficiency and parameter stability.
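
To make the density-ratio idea above concrete, the sketch below weights each source-domain example by the ratio of target to source density and uses those weights in a log loss. It is a toy illustration under strong assumptions: both domains are taken to be known 1-D Gaussians (in practice the ratio has to be estimated), all names are hypothetical, and it covers only the bias-correction side, not the bias-variance trade-off the paper proposes.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def density_ratio_weights(xs, source, target):
    """Instance weight w(x) = p_target(x) / p_source(x).

    source, target: (mean, std) pairs, assumed known for this toy example.
    """
    return [gaussian_pdf(x, *target) / gaussian_pdf(x, *source) for x in xs]


def weighted_log_loss(probs, labels, weights):
    """Binary log loss where each example contributes in proportion to its weight."""
    loss = 0.0
    for p, y, w in zip(probs, labels, weights):
        loss -= w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return loss / sum(weights)


# Source examples near the target mean receive larger weights.
weights = density_ratio_weights([0.0, 1.0], source=(0.0, 1.0), target=(1.0, 1.0))
```

Weights of this kind can become very large where the source density is small, which inflates the variance of the weighted estimate; that is the failure mode the abstract attributes to pure bias correction.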

IS Journal 2018 Journal Article

Instance-based Domain Adaptation via Multiclustering Logistic Approximation

  • Feng Xu
  • Jianfei Yu
  • Rui Xia

With the explosive growth of the Internet online texts, we could nowadays easily collect a large amount of labeled training data from different source domains. However, a basic assumption in building statistical machine learning models for sentiment analysis is that the training and test data must be drawn from the same distribution. Directly training a statistical model usually results in poor performance, when the training and test data have different distributions. Faced with the massive labeled data from different domains, it is therefore important to identify the source-domain training instances that are closely relevant to the target domain, and make better use of them. In this work, we propose a new approach, called multiclustering logistic approximation (MLA), to address this problem. In MLA, we adapt the source-domain training data to the target domain via a framework of multiclustering logistic approximation. Experimental results demonstrate that MLA has significant advantages over the state-of-the-art instance adaptation methods, especially in the scenario of multidistributional training data.

AAAI Conference 2014 Conference Paper

Instance-Based Domain Adaptation in NLP via In-Target-Domain Logistic Approximation

  • Rui Xia
  • Jianfei Yu
  • Feng Xu
  • Shumei Wang

In the field of NLP, most of the existing domain adaptation studies belong to the feature-based adaptation, while the research of instance-based adaptation is very scarce. In this work, we propose a new instance-based adaptation model, called in-target-domain logistic approximation (ILA). In ILA, we adapt the source-domain data to the target domain by a logistic approximation. The normalized in-target-domain probability is assigned as an instance weight to each of the source-domain training data. An instance-weighted classification model is trained finally for the cross-domain classification problem. Compared to the previous techniques, ILA conducts instance adaptation in a dimensionality-reduced linear feature space to ensure efficiency in high-dimensional NLP tasks. The instance weights in ILA are learnt by leveraging the criteria of both maximum likelihood and minimum statistical distance. The empirical results on two NLP tasks including text categorization and sentiment classification show that our ILA model has advantages over the state-of-the-art instance adaptation methods, in cross-domain classification accuracy, parameter stability and computational efficiency.