Arrow Research search

Author name cluster

Hao Fu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

16 papers
2 author rows

Possible papers

16

AAAI Conference 2026 Conference Paper

ENHash: Error Notebook-Guided Fine-Grained Learning for Unsupervised Cross-Modal Hashing

  • Hao Fu
  • Zebing Yao
  • Chuangchuang Tan
  • Guanghua Gu

Without manual annotations, unsupervised cross-modal hashing (UCMH) aims to achieve efficient clustering and retrieval by leveraging data interrelationships. However, retrieval accuracy is constrained by two main issues: 1) insufficient exploration of data relationships; 2) misalignment between existing knowledge mining strategies and the architectural properties of multilayer perceptrons. By summarizing experience and analyzing errors, the human brain achieves fast learning from minimal data. Inspired by this cognitive process, we propose a novel Error Notebook strategy, named ENHash, to more effectively capture similarity information between multi-modal data for fine-grained unsupervised clustering. First, simulating the human process of summarizing experiences, ENHash gradually integrates the information from each batch into a global clustering representation. Second, drawing upon human error analysis capabilities, ENHash utilizes the summarized experiences to identify and record incorrectly predicted hash codes. Finally, by leveraging the knowledge derived from this analysis, ENHash guides the hash function to learn fine-grained patterns from the errors. To the best of our knowledge, ENHash represents the first attempt to integrate cognitively inspired mechanisms into fine-grained UCMH optimization paradigms. We evaluate the proposed ENHash against eight state-of-the-art methods on three widely used datasets and one fine-grained cross-modal dataset. Experimental results show that ENHash achieves substantial improvements over existing approaches.

AAAI Conference 2026 Conference Paper

SPAN: Benchmarking and Improving Cross-Calendar Temporal Reasoning of Large Language Models

  • Zhongjian Miao
  • Hao Fu
  • Chen Wei

Temporal reasoning is a fundamental capability for large language models (LLMs) to understand real-world dynamics. Existing research on temporal reasoning has predominantly focused on the Gregorian calendar. However, as many countries and regions concurrently adopt multiple calendar systems, temporal reasoning across calendars becomes crucial for LLMs in global and multicultural contexts. Unfortunately, cross-calendar temporal reasoning remains underexplored, with no dedicated benchmark available to evaluate this capability. To bridge this gap, we introduce SPAN, a cross-calendar temporal reasoning benchmark, which requires LLMs to perform intra-calendar temporal reasoning and inter-calendar temporal conversion. SPAN features ten cross-calendar temporal reasoning directions, two reasoning types, and two question formats across six calendars. To enable time-variant and contamination-free evaluation, we propose a template-driven protocol for dynamic instance generation that enables assessment on a user-specified Gregorian date. We conduct extensive experiments on both open- and closed-source state-of-the-art (SOTA) LLMs over a range of dates spanning 100 years from 1960 to 2060. Our evaluations show that these LLMs achieve an average accuracy of only 34.5%, with none exceeding 80%, indicating that this task remains challenging. Through in-depth analysis of reasoning types, question formats, and temporal reasoning directions, we identify two key obstacles for LLMs: Future-Date Degradation and Calendar Asymmetry Bias. To strengthen LLMs' cross-calendar temporal reasoning capability, we further develop an LLM-powered Time Agent that leverages tool-augmented code generation. Empirical results show that Time Agent achieves an average accuracy of 95.31%, outperforming several competitive baselines, highlighting the potential of tool-augmented code generation to advance cross-calendar temporal reasoning. We hope this work will inspire further efforts toward more temporally and culturally adaptive LLMs.

AAAI Conference 2025 Conference Paper

FedCross: Intertemporal Federated Learning Under Evolutionary Games

  • Jianfeng Lu
  • Ying Zhang
  • Riheng Jia
  • Shuqin Cao
  • Jing Liu
  • Hao Fu

Federated Learning (FL) mitigates privacy leakage in decentralized machine learning by allowing multiple clients to train collaboratively on their local data. However, dynamic mobile networks with high mobility, intermittent connectivity, and limited bandwidth severely hinder model updates to the cloud server. Although previous studies have typically addressed the user mobility issue through task reassignment or predictive modeling, frequent migrations may result in high communication overhead. Addressing this challenge involves not only dealing with resource constraints but also mitigating the effects of user migrations. We therefore propose an intertemporal incentive framework, FedCross, which ensures the continuity of FL tasks by migrating interrupted training tasks to feasible mobile devices. FedCross comprises two distinct stages. In Stage 1, we address the task allocation problem across regions under resource constraints by employing a multi-objective migration algorithm to identify the optimal task receivers. Moreover, we adopt evolutionary game theory to capture the dynamic decision-making of users, forecasting the evolution of user proportions across different regions to mitigate frequent migrations. In Stage 2, we utilize a procurement auction mechanism to allocate rewards among base stations, ensuring that those providing high-quality models receive optimal compensation. This approach incentivizes sustained user participation, thereby ensuring the overall feasibility of FedCross. Finally, experimental results validate the theoretical soundness of FedCross and demonstrate that it significantly reduces communication overhead.

IROS Conference 2025 Conference Paper

Integrating Offline Pre-Training with Online Fine-Tuning: A Reinforcement Learning Approach for Robot Social Navigation

  • Run Su
  • Hao Fu
  • Shuai Zhou
  • Yingao Fu

Offline reinforcement learning (RL) has emerged as a promising framework for addressing robot social navigation challenges. However, inherent uncertainties in pedestrian behavior and limited environmental interaction during training often lead to suboptimal exploration and distributional shifts between offline training and online deployment. To overcome these limitations, this paper proposes a novel offline-to-online fine-tuning RL algorithm for robot social navigation by integrating Return-to-Go (RTG) prediction into a causal Transformer architecture. Our algorithm features a spatiotemporal fusion model designed to precisely estimate RTG values in real-time by jointly encoding temporal pedestrian motion patterns and spatial crowd dynamics. This RTG prediction framework mitigates distribution shift by aligning offline policy training with online environmental interactions. Furthermore, a hybrid offline-online experience sampling mechanism is built to stabilize policy updates during fine-tuning, ensuring balanced integration of pre-trained knowledge and real-time adaptation. Extensive experiments in simulated social navigation environments demonstrate that our method achieves a higher success rate and lower collision rate compared to state-of-the-art baselines. These results underscore the efficacy of our algorithm in enhancing navigation policy robustness and adaptability. This work paves the way for more reliable and adaptive robotic navigation systems in real-world applications.

AAAI Conference 2025 Conference Paper

TextToucher: Fine-Grained Text-to-Touch Generation

  • Jiahang Tu
  • Hao Fu
  • Fengyu Yang
  • Hanbin Zhao
  • Chao Zhang
  • Hui Qian

Tactile sensation plays a crucial role in the development of multi-modal large models and embodied intelligence. To collect tactile data at minimal cost, a series of studies has attempted to generate tactile images by vision-to-touch image translation. However, compared to the text modality, visual modality-driven tactile generation cannot accurately depict human tactile sensation. In this work, we analyze the characteristics of tactile images in detail from two granularities: object-level (tactile texture, tactile shape) and sensor-level (gel status). We model these granularities of information through text descriptions and propose a fine-grained Text-to-Touch generation method (TextToucher) to generate high-quality tactile samples. Specifically, we introduce a multimodal large language model to build the text sentences about object-level tactile information and employ a set of learnable text prompts to represent the sensor-level tactile information. To better guide the tactile generation process with the built text information, we fuse the dual grains of text information and explore various dual-grain text conditioning methods within the diffusion transformer architecture. Furthermore, we propose a Contrastive Text-Touch Pre-training (CTTP) metric to precisely evaluate the quality of text-driven generated tactile data. Extensive experiments demonstrate the superiority of our TextToucher method.

AAAI Conference 2025 Conference Paper

TRAIL: Trust-Aware Client Scheduling for Semi-Decentralized Federated Learning

  • Gangqiang Hu
  • Jianfeng Lu
  • Jianmin Han
  • Shuqin Cao
  • Jing Liu
  • Hao Fu

Due to the sensitivity of data, Federated Learning (FL) is employed to enable distributed machine learning while safeguarding data privacy and accommodating the requirements of various devices. However, in the context of semi-decentralized FL, clients' communication and training states are dynamic. This variability arises from local training fluctuations, heterogeneous data distributions, and intermittent client participation. Most existing studies primarily focus on stable client states, neglecting the dynamic challenges inherent in real-world scenarios. To tackle this issue, we propose a TRust-Aware clIent scheduLing mechanism called TRAIL, which assesses client states and contributions, enhancing model training efficiency through selective client participation. We focus on a semi-decentralized FL framework where edge servers and clients train a shared global model using unreliable intra-cluster model aggregation and inter-cluster model consensus. First, we propose an adaptive hidden semi-Markov model to estimate clients' communication states and contributions. Next, we address a client-server association optimization problem to minimize global training loss. Using convergence analysis, we propose a greedy client scheduling algorithm. Finally, our experiments conducted on real-world datasets demonstrate that TRAIL outperforms state-of-the-art baselines, achieving an improvement of 8.7% in test accuracy and a reduction of 15.3% in training loss.

NeurIPS Conference 2024 Conference Paper

Adaptive Domain Learning for Cross-domain Image Denoising

  • Zian Qian
  • Chenyang Qi
  • Ka L. Law
  • Hao Fu
  • Chenyang Lei
  • Qifeng Chen

Different camera sensors have different noise patterns, and thus an image denoising model trained on one sensor often does not generalize well to a different sensor. One plausible solution is to collect a large dataset for each sensor for training or fine-tuning, which is inevitably time-consuming. To address this cross-domain challenge, we present a novel adaptive domain learning (ADL) scheme for cross-domain RAW image denoising by utilizing existing data from different sensors (source domain) plus a small amount of data from the new sensor (target domain). The ADL training scheme automatically removes the data in the source domain that are harmful to fine-tuning a model for the target domain (some data are harmful because adding them during training lowers performance due to domain gaps). Also, we introduce a modulation module that incorporates sensor-specific information (sensor type and ISO) to help the model understand the input data for image denoising. We conduct extensive experiments on public datasets with various smartphone and DSLR cameras, which show that our proposed model outperforms prior work on cross-domain image denoising, given a small amount of image data from the target domain sensor.

JBHI Journal 2022 Journal Article

HMRNet: High and Multi-Resolution Network With Bidirectional Feature Calibration for Brain Structure Segmentation in Radiotherapy

  • Hao Fu
  • Guotai Wang
  • Wenhui Lei
  • Wei Xu
  • Qianfei Zhao
  • Shichuan Zhang
  • Kang Li
  • Shaoting Zhang

Accurate segmentation of Anatomical brain Barriers to Cancer spread (ABCs) plays an important role in the automatic delineation of the Clinical Target Volume (CTV) of brain tumors in radiotherapy. Although variants of U-Net are state-of-the-art segmentation models, they have limited performance when dealing with ABCs structures of various shapes and sizes, especially thin structures (e.g., the falx cerebri) that span only a few slices. To deal with this problem, we propose a High and Multi-Resolution Network (HMRNet) that consists of a multi-scale feature learning branch and a high-resolution branch, which can maintain high-resolution contextual information and extract more robust representations of anatomical structures at various scales. We further design a Bidirectional Feature Calibration (BFC) block to enable the two branches to generate spatial attention maps for mutual feature calibration. Considering the different sizes and positions of ABCs structures, our network was applied after a rough localization of each structure to obtain fine segmentation results. Experiments on the MICCAI 2020 ABCs challenge dataset showed that: 1) our proposed two-stage segmentation strategy largely outperformed methods segmenting all the structures in just one stage; 2) the proposed HMRNet with two branches can maintain high-resolution representations and is effective in improving performance on thin structures; 3) the proposed BFC block outperformed existing attention methods using monodirectional feature calibration. Our method won second place in the ABCs 2020 challenge and has potential for more accurate and reasonable delineation of the CTV of brain tumors.

AAAI Conference 2021 Conference Paper

LRC-BERT: Latent-representation Contrastive Knowledge Distillation for Natural Language Understanding

  • Hao Fu
  • Shaojun Zhou
  • Qihong Yang
  • Junjie Tang
  • Guiquan Liu
  • Kaikui Liu
  • Xiaolong Li

Pre-trained models such as BERT have achieved great results on various natural language processing tasks. However, their large number of parameters requires significant memory and inference time, which makes them difficult to deploy on edge devices. In this work, we propose a knowledge distillation method, LRC-BERT, based on contrastive learning to fit the output of the intermediate layer from the angular distance aspect, which is not considered by existing distillation methods. Furthermore, we introduce a gradient perturbation-based training architecture in the training phase to increase the robustness of LRC-BERT, which is the first such attempt in knowledge distillation. Additionally, in order to better capture the distribution characteristics of the intermediate layer, we design a two-stage training method for the total distillation loss. Finally, on 9 datasets from the General Language Understanding Evaluation (GLUE) benchmark, the proposed LRC-BERT outperforms existing state-of-the-art methods, demonstrating the effectiveness of our approach.

TIST Journal 2019 Journal Article

Personalized Reason Generation for Explainable Song Recommendation

  • Guoshuai Zhao
  • Hao Fu
  • Ruihua Song
  • Tetsuya Sakai
  • Zhongxia Chen
  • Xing Xie
  • Xueming Qian

Personalized recommendation has received a lot of attention as a highly practical research topic. However, existing recommender systems provide recommendations with a generic statement such as "Customers who bought this item also bought…". Explainable recommendation, which makes a user aware of why such items are recommended, is in demand. The goal of our research is to make users feel as if they are receiving recommendations from their friends. To this end, we formulate a new challenging problem called personalized reason generation for explainable recommendation for songs in conversation applications and propose a solution that generates a natural language explanation of the reason for recommending a song to that particular user. For example, if the user is a student, our method can generate an output such as "Campus radio plays this song at noon every day, and I think it sounds wonderful," which the student may find easy to relate to. In offline experiments with manual assessments, the gain of our method over baselines is statistically significant in terms of relevance to songs and personalization to users. Large-scale online experiments show that our method outperforms manually selected reasons by 8.2% in terms of click-through rate. Evaluation results indicate that our generated reasons are relevant to songs and personalized to users, and they attract users to click the recommendations.

TIST Journal 2017 Journal Article

Robust Spammer Detection in Microblogs

  • Hao Fu
  • Xing Xie
  • Yong Rui
  • Neil Zhenqiang Gong
  • Guangzhong Sun
  • Enhong Chen

Microblogging Web sites, such as Twitter and Sina Weibo, have become popular platforms for socializing and sharing information in recent years. Spammers have also discovered this new opportunity to unfairly overpower normal users with unsolicited content, namely social spam. Although it is intuitive for everyone to follow legitimate users, recent studies show that both legitimate users and spammers follow spammers for different reasons. Evidence of users seeking out spammers on purpose is also observed. We regard this behavior as useful information for spammer detection. In this article, we approach the problem of spammer detection by leveraging the "carefulness" of users, which indicates how careful a user is when she is about to follow a potential spammer. We propose a framework to measure this carefulness and develop a supervised learning algorithm to estimate it based on known spammers and legitimate users. We illustrate how the robustness of detection algorithms can be improved with the aid of the proposed measure. Evaluations are performed on two real datasets from Sina Weibo and Twitter, each with millions of users, as well as an online test on Sina Weibo. The results show that our approach indeed captures carefulness, and it is effective for detecting spammers. In addition, we find that our measure is also beneficial for other applications, such as link prediction.

TIST Journal 2015 Journal Article

Effective Social Graph Deanonymization Based on Graph Structure and Descriptive Information

  • Hao Fu
  • Aston Zhang
  • Xing Xie

The study of online social networks has attracted increasing interest. However, concerns have been raised about the privacy risks of user data, which are frequently shared among researchers, advertisers, and application developers. To solve this problem, a number of anonymization algorithms have recently been developed for protecting the privacy of social graphs. In this article, we proposed a graph node similarity measure that considers both graph structure and descriptive information, along with a deanonymization algorithm based on this measure. Using the proposed algorithm, we evaluated the privacy risks of several typical anonymization algorithms on social graphs with thousands of nodes from Microsoft Academic Search, LiveJournal, and the Enron email dataset, as well as a social graph with millions of nodes from Tencent Weibo. Our results showed that the proposed algorithm was efficient and effective in deanonymizing social graphs without any initial seed mappings. Based on the experiments, we also offered suggestions on how to better maintain data utility while preserving privacy.