Arrow Research search

Author name cluster

Fan Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

56 papers
2 author rows

Possible papers

56

AAAI Conference 2026 Conference Paper

EcoAgent: An Efficient Device-Cloud Collaborative Multi-Agent Framework for Mobile Automation

  • Biao Yi
  • Xueyu Hu
  • Yurun Chen
  • Shengyu Zhang
  • Hongxia Yang
  • Fan Wu

To tackle increasingly complex tasks, recent research on mobile agents has shifted towards multi-agent collaboration. Current mobile multi-agent systems are primarily deployed in the cloud, leading to high latency and operational costs. A straightforward idea is to deploy a device–cloud collaborative multi-agent system, which is nontrivial, as directly extending existing systems introduces new challenges: (1) reliance on cloud-side verification requires uploading mobile screenshots, compromising user privacy; and (2) open-loop cooperation without device-to-cloud feedback underutilizes device resources and increases latency. To overcome these limitations, we propose EcoAgent, a closed-loop device-cloud collaborative multi-agent framework designed for privacy-aware, efficient, and responsive mobile automation. EcoAgent integrates a novel reasoning approach, Dual-ReACT, into the cloud-based Planning Agent, fully exploiting cloud reasoning to compensate for limited on-device capacity, thereby enabling device-side verification and lightweight feedback. Furthermore, the device-based Observation Agent leverages a Pre-understanding Module to summarize screen content into concise textual descriptions, significantly reducing token usage and device-cloud communication overhead while preserving privacy. Experiments on AndroidWorld demonstrate that EcoAgent matches the task success rates of fully cloud-based agents, while reducing resource consumption and response latency.

JBHI Journal 2026 Journal Article

Privacy-Preserving Data Augmentation for Digital Pathology Using Improved DCGAN

  • Fengjun Hu
  • Fan Wu
  • Dongping Zhang
  • Hanjie Gu

The intelligent analysis of Whole Slide Images (WSI) in digital pathology is critical for advancing precision medicine, particularly in oncology. However, the availability of WSI datasets is often limited by privacy regulations, which constrains the performance and generalizability of deep learning models. To address this challenge, this paper proposes an improved data augmentation method based on Deep Convolutional Generative Adversarial Network (DCGAN). Our approach leverages self-supervised pretraining with the CTransPath model to extract diverse and representationally rich WSI features, which guide the generation of high-quality synthetic images. We further enhance the model by introducing a least-squares adversarial loss and a frequency domain loss to improve pixel-level accuracy and structural fidelity, while incorporating residual blocks and skip connections to increase network depth, mitigate gradient vanishing, and improve training stability. Experimental results on the PatchCamelyon dataset demonstrate that our improved DCGAN achieves superior SSIM and FID scores compared to traditional models. The augmented datasets significantly enhance the performance of downstream classification tasks, improving accuracy, AUC, and F1 scores.

NeurIPS Conference 2025 Conference Paper

A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning

  • Yuzheng Hu
  • Fan Wu
  • Haotian Ye
  • David Forsyth
  • James Zou
  • Nan Jiang
  • Jiaqi Ma
  • Han Zhao

Online reinforcement learning (RL) excels in complex, safety-critical domains but suffers from sample inefficiency, training instability, and limited interpretability. Data attribution provides a principled way to trace model behavior back to training samples, yet existing methods assume fixed datasets, an assumption violated in online RL, where each experience both updates the policy and shapes future data collection. In this paper, we initiate the study of data attribution for online RL, focusing on the widely used Proximal Policy Optimization (PPO) algorithm. We start by establishing a local attribution framework, interpreting model checkpoints with respect to the records in the recent training buffer. We design two target functions, capturing agent action and cumulative return respectively, and measure each record's contribution through gradient similarity between its training loss and these targets. We demonstrate the power of this framework through three concrete applications: diagnosis of learning, temporal analysis of behavior formation, and targeted intervention during training. Leveraging this framework, we further propose an algorithm, iterative influence-based filtering (IIF), for online RL training that iteratively performs experience filtering to refine policy updates. From standard RL benchmarks (classic control, navigation, locomotion) to RLHF for large language models, IIF reduces sample complexity, speeds up training, and achieves higher returns. Together, these results open a new direction for making online RL more interpretable, efficient, and effective.
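The gradient-similarity scoring and iterative filtering described in the abstract can be sketched in a few lines; `influence_scores` and `iif_filter` below are hypothetical names, and this is a minimal NumPy sketch of the idea under stated assumptions, not the paper's implementation.

```python
import numpy as np

def influence_scores(sample_grads, target_grad):
    """Score each buffer record by the cosine similarity between its
    per-sample training gradient and the gradient of a target function
    (e.g., agent action likelihood or cumulative return)."""
    target = target_grad / (np.linalg.norm(target_grad) + 1e-12)
    scores = []
    for g in sample_grads:
        g_norm = np.linalg.norm(g) + 1e-12
        scores.append(float(np.dot(g, target) / g_norm))
    return np.array(scores)

def iif_filter(sample_grads, target_grad, keep_ratio=0.8):
    """One illustrative filtering round: keep the records whose
    influence on the target is highest, dropping the least helpful
    fraction before the next policy update."""
    scores = influence_scores(sample_grads, target_grad)
    k = max(1, int(keep_ratio * len(scores)))
    keep_idx = np.argsort(scores)[::-1][:k]
    return np.sort(keep_idx)
```

A record whose loss gradient points in the same direction as the target gradient scores near +1 and is retained; one that opposes the target scores near −1 and is filtered first.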

AAAI Conference 2025 Conference Paper

AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference

  • Zhuomin He
  • Yizhen Yao
  • Pengfei Zuo
  • Bin Gao
  • Qinya Li
  • Zhenzhe Zheng
  • Fan Wu

Long-context inference for large language models (LLMs) is increasingly critical, motivating a number of studies devoted to alleviating the substantial storage and computational costs in such scenarios. Layer-wise skipping methods are promising optimizations but rarely explored in long-context inference. We observe that existing layer-wise skipping strategies have several limitations when applied in long-context inference, including the inability to adapt to model and context variability, disregard for sublayer significance, and inapplicability to the prefilling phase. This paper proposes AdaSkip, an adaptive sublayer skipping method specifically designed for long-context inference. AdaSkip adaptively identifies less important layers by leveraging on-the-fly similarity information, enables sublayer-wise skipping, and accelerates both the prefilling and decoding phases. The effectiveness of AdaSkip is demonstrated through extensive experiments on various long-context benchmarks and models, showcasing its superior inference performance over existing baselines.
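A minimal sketch of similarity-based sublayer importance, assuming redundancy is measured as cosine similarity between a sublayer's input and output hidden states (the paper's exact on-the-fly criterion may differ); `sublayer_redundancy` and `pick_skippable` are hypothetical helpers.

```python
import numpy as np

def sublayer_redundancy(h_in, h_out):
    """Cosine similarity between a sublayer's input and output hidden
    states; values near 1 suggest the sublayer barely transforms the
    representation, making it a skip candidate."""
    a, b = h_in.ravel(), h_out.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def pick_skippable(hidden_pairs, n_skip):
    """Given (input, output) hidden-state pairs per sublayer, return the
    indices of the n_skip most redundant sublayers."""
    sims = [sublayer_redundancy(hi, ho) for hi, ho in hidden_pairs]
    return sorted(np.argsort(sims)[::-1][:n_skip].tolist())
```

Because the similarity is computed from activations observed during the current request, the skip set can adapt per model and per context rather than being fixed offline.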

JBHI Journal 2025 Journal Article

Comparative Efficacy of Commercial Wearables for Circadian Rhythm Home Monitoring From Activity, Heart Rate, and Core Body Temperature

  • Fan Wu
  • Patrick Langer
  • Jinjoo Shim
  • Elgar Fleisch
  • Filipe Barata

Circadian rhythms govern biological patterns that follow a 24-hour cycle. Dysfunctions in circadian rhythms can contribute to various health problems, such as sleep disorders. Current circadian rhythm assessment methods, often invasive or subjective, limit circadian rhythm monitoring to laboratories. Hence, this study aims to investigate scalable consumer-centric wearables for circadian rhythm monitoring outside traditional laboratories. In a two-week longitudinal study conducted in real-world settings, 36 participants wore an Actigraph, a smartwatch, and a core body temperature sensor to collect activity, temperature, and heart rate data. We evaluated circadian rhythms calculated from commercial wearables by comparing them with circadian rhythm reference measures, i.e., Actigraph activities and chronotype questionnaire scores. The circadian rhythm metric acrophases, determined from commercial wearables using activity, heart rate, and temperature data, significantly correlated with the acrophase derived from Actigraph activities (r = 0.96, r = 0.87, r = 0.79; all p $<$ 0.001) and chronotype questionnaire (r = −0.66, r = −0.73, r = −0.61; all p $<$ 0.001). The acrophases obtained concurrently from consumer sensors significantly predicted the chronotype ($R^{2}$ = 0.64; p $<$ 0.001). Our study validates commercial sensors for circadian rhythm assessment, highlighting their potential to support maintaining healthy rhythms and provide scalable and timely health monitoring in real-life scenarios.
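The acrophase metric above (the clock time at which the fitted rhythm peaks) is conventionally obtained from a cosinor, i.e. cosine-fit, regression; the sketch below assumes that standard formulation and is not necessarily the study's exact pipeline.

```python
import numpy as np

def cosinor_acrophase(t_hours, y, period=24.0):
    """Least-squares cosinor fit y ≈ M + a·cos(ωt) + b·sin(ωt) with
    ω = 2π/period; the acrophase is the clock time (in hours) at which
    the fitted rhythm peaks."""
    w = 2.0 * np.pi / period
    X = np.column_stack([np.ones_like(t_hours),
                         np.cos(w * t_hours),
                         np.sin(w * t_hours)])
    _, a, b = np.linalg.lstsq(X, y, rcond=None)[0]
    phi = np.arctan2(b, a)      # a·cos(wt) + b·sin(wt) = A·cos(wt − phi)
    return (phi / w) % period   # peak time in hours
```

Applied to activity, heart-rate, or temperature time series, the same fit yields per-signal acrophases that can then be correlated across sensors, as in the study.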

NeurIPS Conference 2025 Conference Paper

CORE: Reducing UI Exposure in Mobile Agents via Collaboration Between Cloud and Local LLMs

  • Gucongcong Fan
  • Chaoyue Niu
  • Chengfei Lyu
  • Fan Wu
  • Guihai Chen

Mobile agents rely on Large Language Models (LLMs) to plan and execute tasks on smartphone user interfaces (UIs). While cloud-based LLMs achieve high task accuracy, they require uploading the full UI state at every step, exposing unnecessary and often irrelevant information. In contrast, local LLMs avoid UI uploads but suffer from limited capacity, resulting in lower task success rates. We propose $\textbf{CORE}$, a $\textbf{CO}$llaborative framework that combines the strengths of cloud and local LLMs to $\textbf{R}$educe UI $\textbf{E}$xposure, while maintaining task accuracy for mobile agents. CORE comprises three key components: (1) $\textbf{Layout-aware block partitioning}$, which groups semantically related UI elements based on the XML screen hierarchy; (2) $\textbf{Co-planning}$, where local and cloud LLMs collaboratively identify the current sub-task; and (3) $\textbf{Co-decision-making}$, where the local LLM ranks relevant UI blocks, and the cloud LLM selects specific UI elements within the top-ranked block. CORE further introduces a multi-round accumulation mechanism to mitigate local misjudgment or limited context. Experiments across diverse mobile apps and tasks show that CORE reduces UI exposure by up to 55.6\% while maintaining task success rates slightly below cloud-only agents, effectively mitigating unnecessary privacy exposure to the cloud. The code is available at https://github.com/Entropy-Fighter/CORE.
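Under a simplifying assumption, layout-aware block partitioning can be approximated by grouping leaf elements under their parent container in the XML screen hierarchy; `partition_blocks` is a hypothetical helper for an Android-style UI dump, not the released implementation.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def partition_blocks(ui_xml):
    """Group leaf UI elements by their parent container: elements that
    share a container in the XML hierarchy form one block, a rough
    proxy for semantic relatedness in the screen layout."""
    root = ET.fromstring(ui_xml)
    blocks = defaultdict(list)
    for parent in root.iter():
        for child in parent:
            if len(child) == 0:  # leaf element with no nested children
                label = child.get("text") or child.get("content-desc") or child.tag
                blocks[id(parent)].append(label)
    return [members for members in blocks.values() if members]
```

In the framework described above, only the top-ranked block (rather than the whole screen) would then be forwarded to the cloud LLM, which is what limits UI exposure.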

IJCAI Conference 2025 Conference Paper

Device-Cloud Collaborative Correction for On-Device Recommendation

  • Tianyu Zhan
  • Shengyu Zhang
  • Zheqi Lv
  • Jieming Zhu
  • Jiwei Li
  • Fan Wu
  • Fei Wu

With the rapid development of recommendation models and device computing power, device-based recommendation has become an important research area due to its better real-time performance and privacy protection. Previously, Transformer-based sequential recommendation models have been widely applied in this field because they outperform Recurrent Neural Network (RNN)-based recommendation models in terms of performance. However, as the length of interaction sequences increases, Transformer-based models introduce significantly more space and computational overhead compared to RNN-based models, posing challenges for device-based recommendation. To balance real-time performance and high performance on devices, we propose Device-Cloud Collaborative Correction Framework for On-Device Recommendation (CoCorrRec). CoCorrRec uses a self-correction network (SCN) to correct parameters with extremely low time cost. By updating model parameters during testing based on the input token, it achieves performance comparable to current optimal but more complex Transformer-based models. Furthermore, to prevent SCN from overfitting, we design a global correction network (GCN) that processes hidden states uploaded from devices and provides a global correction solution. Extensive experiments on multiple datasets show that CoCorrRec outperforms existing Transformer-based and RNN-based device recommendation models in terms of performance, with fewer parameters and lower FLOPs, thereby achieving a balance between real-time performance and high efficiency. Code is available at https://github.com/Yuzt-zju/CoCorrRec.

NeurIPS Conference 2025 Conference Paper

FedRAM: Federated Reweighting and Aggregation for Multi-Task Learning

  • Fan Wu
  • Xinyu Yan
  • Jiabei Liu
  • Wei Yang Bryan Lim

Federated Multi-Task Learning (FL-MTL) enables clients with heterogeneous data to collaboratively train models capable of handling multiple downstream tasks. However, FL-MTL faces key challenges, including statistical heterogeneity, task interference, and the need to balance local learning with global knowledge sharing. Traditional methods like FedAvg struggle in such settings due to the lack of explicit mechanisms to address these issues. In this paper, we propose FedRAM, a three-step framework that progressively updates two scalar hyperparameters: the task importance weight and the client aggregation coefficient. FedRAM introduces a reference-proxy-agent strategy, where the proxy model serves as an intermediate between the local reference model and the global agent model. This design reduces the need for repeated local training while preserving local performance. Extensive experiments on six real-world FL-MTL benchmarks show that FedRAM improves performance by at least 3$\%$ over the best baseline on both in-domain and out-of-domain tasks, while reducing computational cost by 15$\times$. These results make FedRAM a robust and practical solution for large-scale FL-MTL applications. The code is available at \url{https://github.com/wwffvv/FedRAM}.

ICLR Conference 2025 Conference Paper

Generative Monoculture in Large Language Models

  • Fan Wu
  • Emily Black
  • Varun Chandrasekaran

We introduce {\em generative monoculture}, a behavior observed in large language models (LLMs) characterized by a significant narrowing of model output diversity relative to available training data for a given task: for example, generating only positive book reviews for books with a mixed reception. While in some cases, generative monoculture enhances performance (e.g., LLMs more often produce efficient code), the dangers are exacerbated in others (e.g., LLMs refuse to share diverse opinions). As LLMs are increasingly used in high-impact settings such as education and web search, careful maintenance of LLM output diversity is essential to ensure a variety of facts and perspectives are preserved over time. We experimentally demonstrate the prevalence of generative monoculture through analysis of book review and code generation tasks, and find that simple countermeasures such as altering sampling or prompting strategies are insufficient to mitigate the behavior. Moreover, our results suggest that the root causes of generative monoculture are likely embedded within the LLM's alignment processes, suggesting a need for developing fine-tuning paradigms that preserve or promote diversity.

AAAI Conference 2025 Conference Paper

How Does the Smoothness Approximation Method Facilitate Generalization for Federated Adversarial Learning?

  • Wenjun Ding
  • Ying An
  • Lixing Chen
  • Shichao Kan
  • Fan Wu
  • Zhe Qu

Federated Adversarial Learning (FAL) is a robust framework for resisting adversarial attacks on federated learning. Although some FAL studies have developed efficient algorithms, they primarily focus on convergence performance and overlook generalization. Generalization is crucial for evaluating algorithm performance on unseen data. However, generalization analysis is more challenging due to non-smooth adversarial loss functions. A common approach to addressing this issue is to leverage smoothness approximation. In this paper, we develop algorithm stability measures to evaluate the generalization performance of two popular FAL algorithms: Vanilla FAL (VFAL) and Slack FAL (SFAL), using three different smooth approximation methods: (1) Surrogate Smoothness Approximation (SSA), (2) Randomized Smoothness Approximation (RSA), and (3) Over-Parameterized Smoothness Approximation (OPSA). Based on our in-depth analysis, we answer how to properly set the smoothness approximation method to mitigate generalization error in FAL. Moreover, we identify RSA as the most effective generalization error reduction method. In highly data-heterogeneous scenarios, we also recommend employing SFAL to mitigate the deterioration of generalization performance caused by heterogeneity. Based on our theoretical results, we provide insights to help develop more efficient FAL algorithms, such as designing new metrics and dynamic aggregation rules to mitigate heterogeneity.

ICML Conference 2025 Conference Paper

It's My Data Too: Private ML for Datasets with Multi-User Training Examples

  • Arun Ganesh
  • Ryan McKenna
  • H. Brendan McMahan
  • Adam Smith 0006
  • Fan Wu

We initiate a study of algorithms for model training with user-level differential privacy (DP), where each example may be attributed to multiple users, which we call the multi-attribution model. We first provide a carefully chosen definition of user-level DP under the multi-attribution model. Training in the multi-attribution model is facilitated by solving the contribution bounding problem, i.e., the problem of selecting a subset of the dataset for which each user is associated with a limited number of examples. We propose a greedy baseline algorithm for the contribution bounding problem. We then empirically study this algorithm for a synthetic logistic regression task and a transformer training task, including studying variants of this baseline algorithm that optimize the subset chosen using different techniques and criteria. We find that the baseline algorithm remains competitive with its variants in most settings, and build a better understanding of the practical importance of a bias-variance tradeoff inherent in solutions to the contribution bounding problem.
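The greedy baseline for contribution bounding can be read as: scan the examples in order and keep one only if every user attributed to it is still under a per-user budget `k`. The sketch below is an illustrative reading of the abstract, not the paper's exact algorithm.

```python
from collections import Counter

def bound_contributions(examples, k):
    """Greedy contribution bounding: each element of `examples` is the
    set of users attributed to that example; keep an example only if
    every one of its users has fewer than k kept examples so far."""
    used = Counter()
    kept = []
    for idx, users in enumerate(examples):
        if all(used[u] < k for u in users):
            for u in users:
                used[u] += 1
            kept.append(idx)
    return kept
```

Because dropping an example affects the budgets of all its users at once, scan order matters, which is one reason the paper studies variants that optimize the chosen subset.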

AAAI Conference 2025 Conference Paper

Learning Optimal Auctions with Correlated Value Distributions

  • Da Huo
  • Zhenzhe Zheng
  • Fan Wu

The correlation of values commonly exists in auctions, which can be further exploited to improve revenue. However, the complex correlation structure makes it hard to manually design the optimal auction mechanism. Data-driven auction mechanisms, powered by machine learning, make it possible to design auctions directly from historical auction data, without relying on specific value distributions. In this work, we synthesize the learning-based auction and the characteristics of strategy-proofness in the correlated value setting, and propose a new auction mechanism, namely Conditional Auction Net (CAN). The CAN can encode the correlation of values into the rank score of each bidder, and further adjust the allocation rule to approach the optimal revenue. The property of strategy-proofness is guaranteed by encoding the game theoretical condition into the neural network structure. Furthermore, all operations in the designed auctions are differentiable to enable an end-to-end training paradigm. We also show that CAN can provide a large solution space to adequately encode the correlation of values. Experimental results demonstrate that the proposed auction mechanism can represent almost any strategy-proof auction mechanism, and outperforms the auction mechanisms widely used in the correlated value settings.

NeurIPS Conference 2025 Conference Paper

Learning Temporal 3D Semantic Scene Completion via Optical Flow Guidance

  • Meng Wang
  • Fan Wu
  • Ruihui Li
  • Qin Yunchuan
  • Zhuo Tang
  • Li Ken Li

3D Semantic Scene Completion (SSC) provides comprehensive scene geometry and semantics for autonomous driving perception, which is crucial for enabling accurate and reliable decision-making. However, existing SSC methods are limited to capturing sparse information from the current frame or naively stacking multi-frame temporal features, thereby failing to acquire effective scene context. These approaches ignore critical motion dynamics and struggle to achieve temporal consistency. To address the above challenges, we propose a novel temporal SSC method FlowScene: Learning Temporal 3D Semantic Scene Completion via Optical Flow Guidance. By leveraging optical flow, FlowScene can integrate motion, different viewpoints, occlusions, and other contextual cues, thereby significantly improving the accuracy of 3D scene completion. Specifically, our framework introduces two key components: (1) a Flow-Guided Temporal Aggregation module that aligns and aggregates temporal features using optical flow, capturing motion-aware context and deformable structures; and (2) an Occlusion-Guided Voxel Refinement module that injects occlusion masks and temporally aggregated features into 3D voxel space, adaptively refining voxel representations for explicit geometric modeling. Experimental results demonstrate that FlowScene achieves state-of-the-art performance, with mIoU of 17.70 and 20.81 on the SemanticKITTI and SSCBench-KITTI-360 benchmarks.

AAAI Conference 2025 Conference Paper

MergeNet: Knowledge Migration Across Heterogeneous Models, Tasks, and Modalities

  • Kunxi Li
  • Tianyu Zhan
  • Kairui Fu
  • Shengyu Zhang
  • Kun Kuang
  • Jiwei Li
  • Zhou Zhao
  • Fan Wu

In this study, we focus on heterogeneous knowledge transfer across entirely different model architectures, tasks, and modalities. Existing knowledge transfer methods (e.g., backbone sharing, knowledge distillation) often hinge on shared elements within model structures or task-specific features/labels, limiting transfers to complex model types or tasks. To overcome these challenges, we present MergeNet, which learns to bridge the gap of parameter spaces of heterogeneous models, facilitating the direct interaction, extraction, and application of knowledge within these parameter spaces. The core mechanism of MergeNet lies in the parameter adapter, which operates by querying the source model's low-rank parameters and adeptly learning to identify and map parameters into the target model. MergeNet is learned alongside both models, allowing our framework to dynamically transfer and adapt knowledge relevant to the current stage, including the training trajectory knowledge of the source model. Extensive experiments on heterogeneous knowledge transfer demonstrate significant improvements in challenging settings, where representative approaches may falter or prove less applicable.

NeurIPS Conference 2025 Conference Paper

On the Stability and Generalization of Meta-Learning: the Impact of Inner-Levels

  • Wenjun Ding
  • Jingling Liu
  • Lixing Chen
  • Xiu Su
  • Tao Sun
  • Fan Wu
  • Zhe Qu

Meta-learning has achieved significant advancements, with generalization emerging as a key metric for evaluating meta-learning algorithms. While recent studies have mainly focused on training strategies, data-split methods, and tightening generalization bounds, they often ignore the impact of inner-levels on generalization. To bridge this gap, this paper focuses on several prominent meta-learning algorithms and establishes two generalization analytical frameworks for them based on their inner-level processes: the Gradient Descent Framework (GDF) and the Proximal Descent Framework (PDF). Within these frameworks, we introduce two novel algorithmic stability definitions and derive the corresponding generalization bounds. Our findings reveal a trade-off of inner-levels under GDF, whereas PDF exhibits a beneficial relationship. Moreover, we highlight the critical role of the meta-objective function in minimizing generalization error. Inspired by this, we propose a new, simplified meta-objective function definition to enhance generalization performance. Extensive real-world experiments support our findings and demonstrate the improvement brought by the new meta-objective function.

IJCAI Conference 2025 Conference Paper

Optimizing the Battery-Swapping Problem in Urban E-Bike Systems with Reinforcement Learning

  • Wenjing Li
  • Zhao Li
  • Xuanwu Liu
  • Ruihao Zhu
  • Zhenzhe Zheng
  • Fan Wu

E-bikes (EBs) are a key transportation mode in urban areas, especially for couriers of delivery platforms, but underdeveloped EB systems can hinder couriers' productivity due to limited battery capacity. Battery-swapping stations address this issue by enabling riders to exchange depleted batteries for fully charged ones. However, managing supply and demand (SnD) imbalances at these stations has become increasingly complex. To address this, we introduce a new approach that formulates the Battery-Swapping Problem (BSP) as a discrete-time Markov Decision Process (MDP) to capture the dynamics of SnD imbalances. Building on it, we propose a Wasserstein-enhanced Proximal Policy Optimization (W-PPO) algorithm, which integrates Wasserstein distance with reinforcement learning to improve the robustness against uncertainty in forecasting SnD. W-PPO provides a BSP-specific, accurate loss function that reflects reward variations between two policies under real-world simulation. The algorithm's effectiveness is assessed using key metrics: Shared Battery Utilization Ratio (SBUR) and Battery Supply Ratio (BSR). Simulations on real-world datasets show that W-PPO achieves a 30.59% improvement in SBUR and a 16.09% increase in BSR, demonstrating practical applicability. By optimizing battery utilization and improving EB delivery systems, this work highlights the potential of AI for creating efficient and sustainable urban transportation solutions.

NeurIPS Conference 2025 Conference Paper

PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation

  • Wei Zhou
  • Guoliang Li
  • Haoyu Wang
  • Yuxing Han
  • Xufei Wu
  • Fan Wu
  • Xuanhe Zhou

Large language models (LLMs) have shown increasing effectiveness in Text-to-SQL tasks. However, another closely related problem, Cross-System SQL Translation (a.k.a. SQL-to-SQL), which adapts a query written for one database system (e.g., MySQL) into its equivalent one for another system (e.g., ClickHouse), is of great practical importance but remains underexplored. Existing SQL benchmarks are not well-suited for SQL-to-SQL evaluation, as they (1) focus on a limited set of database systems (often just SQLite) and (2) cannot capture many system-specific SQL dialects (e.g., customized functions, data types, and syntax rules). Thus, in this paper, we introduce PARROT, a Practical And Realistic BenchmaRk for CrOss-System SQL Translation. PARROT comprises 598 translation pairs from 38 open-source benchmarks and real-world business services, specifically prepared to challenge system-specific SQL understanding (e.g., LLMs achieve lower than 38.53% accuracy on average). We also provide multiple benchmark variants, including PARROT-Diverse with 28,003 translation pairs (for extensive syntax testing) and PARROT-Simple with 5,306 representative samples (for focused stress testing), covering 22 production-grade database systems. To promote future research, we release a public leaderboard and source code at: https://code4db.github.io/parrot-bench/.

NeurIPS Conference 2025 Conference Paper

RAGRouter: Learning to Route Queries to Multiple Retrieval-Augmented Language Models

  • Jiarui Zhang
  • Xiangyu Liu
  • Yong Hu
  • Chaoyue Niu
  • Fan Wu
  • Guihai Chen

Retrieval-Augmented Generation (RAG) significantly improves the performance of Large Language Models (LLMs) on knowledge-intensive tasks. However, varying response quality across LLMs under RAG necessitates intelligent routing mechanisms, which select the most suitable model for each query from multiple retrieval-augmented LLMs via a dedicated router model. We observe that external documents dynamically affect LLMs' ability to answer queries, while existing routing methods, which rely on static parametric knowledge representations, exhibit suboptimal performance in RAG scenarios. To address this, we formally define the new retrieval-augmented LLM routing problem, incorporating the influence of retrieved documents into the routing framework. We propose RAGRouter, a RAG-aware routing design, which leverages document embeddings and RAG capability embeddings with contrastive learning to capture knowledge representation shifts and enable informed routing decisions. Extensive experiments on diverse knowledge-intensive tasks and retrieval settings, covering open and closed-source LLMs, show that RAGRouter outperforms the best individual LLM and existing routing methods. With an extended score-threshold-based mechanism, it also achieves strong performance-efficiency trade-offs under low-latency constraints. The code and data are available at https://github.com/OwwO99/RAGRouter.

AAAI Conference 2024 Conference Paper

G-NAS: Generalizable Neural Architecture Search for Single Domain Generalization Object Detection

  • Fan Wu
  • Jinling Gao
  • Lanqing Hong
  • Xinbing Wang
  • Chenghu Zhou
  • Nanyang Ye

In this paper, we focus on a realistic yet challenging task, Single Domain Generalization Object Detection (S-DGOD), where object detectors are trained on data from only one source domain but must generalize to multiple distinct target domains. In S-DGOD, both high-capacity fitting and generalization abilities are needed due to the task's complexity. Differentiable Neural Architecture Search (NAS) is known for its high capacity for complex data fitting and we propose to leverage Differentiable NAS to solve S-DGOD. However, it may confront severe over-fitting issues due to the feature imbalance phenomenon, where parameters optimized by gradient descent are biased to learn from the easy-to-learn features, which are usually non-causal and spuriously correlated to ground truth labels, such as the features of background in object detection data. Consequently, this leads to serious performance degradation, especially in generalizing to unseen target domains with huge domain gaps between the source domain and target domains. To address this issue, we propose the Generalizable loss (G-loss), which is an OoD-aware objective, preventing NAS from over-fitting by using gradient descent to optimize parameters not only on a subset of easy-to-learn features but also the remaining predictive features for generalization, and the overall framework is named G-NAS. Experimental results on the S-DGOD urban-scene datasets demonstrate that the proposed G-NAS achieves SOTA performance compared to baseline methods. Codes are available at https://github.com/wufan-cse/G-NAS.

AAAI Conference 2024 Conference Paper

OCEAN-MBRL: Offline Conservative Exploration for Model-Based Offline Reinforcement Learning

  • Fan Wu
  • Rui Zhang
  • Qi Yi
  • Yunkai Gao
  • Jiaming Guo
  • Shaohui Peng
  • Siming Lan
  • Husheng Han

Model-based offline reinforcement learning (RL) algorithms have emerged as a promising paradigm for offline RL. These algorithms usually learn a dynamics model from a static dataset of transitions, use the model to generate synthetic trajectories, and perform conservative policy optimization within these trajectories. However, our observations indicate that policy optimization methods used in these model-based offline RL algorithms are not effective at exploring the learned model and induce biased exploration, which ultimately impairs the performance of the algorithm. To address this issue, we propose Offline Conservative ExplorAtioN (OCEAN), a novel rollout approach to model-based offline RL. In our method, we incorporate additional exploration techniques and introduce three conservative constraints based on uncertainty estimation to mitigate the potential impact of significant dynamic errors resulting from exploratory transitions. Our work is a plug-in method and can be combined with classical model-based RL algorithms, such as MOPO, COMBO, and RAMBO. Experimental results on the D4RL MuJoCo benchmark show that OCEAN significantly improves the performance of existing algorithms.

ICLR Conference 2024 Conference Paper

Privately Aligning Language Models with Reinforcement Learning

  • Fan Wu
  • Huseyin A. Inan
  • Arturs Backurs
  • Varun Chandrasekaran
  • Janardhan Kulkarni
  • Robert Sim

Positioned between pre-training and user deployment, aligning large language models (LLMs) through reinforcement learning (RL) has emerged as a prevailing strategy for training instruction-following models such as ChatGPT. In this work, we initiate the study of privacy-preserving alignment of LLMs through Differential Privacy (DP) in conjunction with RL. Following the influential work of Ziegler et al. (2020), we study two dominant paradigms: (i) alignment via RL without human in the loop (e.g., positive review generation) and (ii) alignment via RL from human feedback (RLHF) (e.g., summarization in a human-preferred way). We give a new DP framework to achieve alignment via RL, and prove its correctness. Our experimental results validate the effectiveness of our approach, offering competitive utility while ensuring strong privacy protections.

IROS Conference 2024 Conference Paper

Revolutionizing Battery Disassembly: The Design and Implementation of a Battery Disassembly Autonomous Mobile Manipulator Robot (BEAM-1)

  • Yanlong Peng
  • Zhigang Wang
  • Yisheng Zhang
  • Shengmin Zhang
  • Nan Cai
  • Fan Wu
  • Ming Chen

The efficient disassembly of end-of-life electric vehicle batteries (EOL-EVBs) is crucial for green manufacturing and sustainable development. The current pre-programmed disassembly conducted by the Autonomous Mobile Manipulator Robot (AMMR) struggles to meet disassembly requirements in dynamic environments, complex scenarios, and unstructured processes. In this paper, we propose a Battery Disassembly AMMR (BEAM-1) system based on Neuro-Symbolic AI. It detects the environmental state by leveraging a combination of multi-sensors and neural predicates and then translates this information into a quasi-symbolic space. In real time, it identifies the optimal sequence of action primitives through LLM-heuristic tree search, ensuring high-precision execution of these primitives. Additionally, it employs positional speculative sampling using intuitive networks and achieves the disassembly of various bolt types with a meticulously designed end-effector. Importantly, BEAM-1 is a continuously learning embodied intelligence system capable of human-like subjective reasoning and intuition. A large number of real-scene experiments have shown that it can autonomously perceive, decide, and execute to complete the continuous disassembly of bolts in multiple, multi-category, and complex situations, with a success rate of 98.78%. This research attempts to use Neuro-Symbolic AI to give robots real autonomous reasoning, planning, and learning capabilities. BEAM-1 realizes the revolution of battery disassembly. Its framework can be easily ported to any robotic system to realize different application scenarios, which provides a ground-breaking idea for the design and implementation of future embodied intelligent robotic systems.

AAMAS Conference 2024 Conference Paper

Towards Efficient Auction Design with ROI Constraints

  • Xinyu Tang
  • Hongtao Lv
  • Yingjie Gao
  • Fan Wu
  • Lei Liu
  • Lizhen Cui

Online advertising stands as a significant revenue source of the Internet. Recently, the trend among advertisers tilting towards the use of auto-bidding tools has heralded the emergence of a new model of bidders operating with constraints related to return on investment (ROI). However, most current research on ROI-constrained bidders in auction design assumes that only the ROI constraints or only the values of bidders are private, while in reality it is more practical to keep both private. Designing a truthful mechanism for bidders with both private values and private ROI constraints introduces complexities because it is a multi-parameter mechanism design problem. To remedy this, we divide bidders into two classes: the traditional utility maximizers (UMs), who can be viewed as having an ROI constraint of 1, and the ROI-constrained bidders (RBs), who share a fixed ROI constraint denoted by γ. This framework retains the essence of multi-parameter mechanism design while transitioning it into a more tractable form. We then introduce a novel auction mechanism, cleverly combining the conventional VCG mechanism with an existing mechanism for public ROI-constrained bidders known as Cavallo's mechanism. Our mechanism achieves an approximation ratio of 3/2 on social welfare. Additionally, we unearth new insights into the limitations posed by ROI constraints: when the ROI constraint γ exceeds 2, the lower bound on social welfare is 5/4; when it falls below 2, the lower bound becomes (3 + γ)/(2 + 3γ − γ²).
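The RB constraint and the guarantees the abstract states can be written compactly as follows; the symbols here are our transcription, not necessarily the paper's notation:

```latex
% ROI constraint for an RB with value v_i and payment p_i:
\frac{v_i}{p_i} \ge \gamma
\quad\Longleftrightarrow\quad
p_i \le \frac{v_i}{\gamma}, \qquad \gamma \ge 1.
% Guarantees stated in the abstract:
%   approximation ratio on social welfare: 3/2;
%   lower bounds: 5/4 for \gamma > 2, and
%   (3+\gamma)/(2+3\gamma-\gamma^2) for \gamma < 2.
```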

AAAI Conference 2024 Conference Paper

VIGC: Visual Instruction Generation and Correction

  • Bin Wang
  • Fan Wu
  • Xiao Han
  • Jiahui Peng
  • Huaping Zhong
  • Pan Zhang
  • Xiaoyi Dong
  • Weijia Li

The integration of visual encoders and large language models (LLMs) has driven recent progress in multimodal large language models (MLLMs). However, the scarcity of high-quality instruction-tuning data for vision-language tasks remains a challenge. The current leading paradigm, exemplified by LLaVA, relies on language-only GPT-4 to generate data, which requires pre-annotated image captions and detection bounding boxes and thus struggles to capture image details. A practical solution to this problem would be to utilize available multimodal large language models to generate instruction data for vision-language tasks. However, the currently accessible MLLMs are not as powerful as their LLM counterparts, as they tend to produce inadequate responses and generate false information. To address this issue, this paper proposes the Visual Instruction Generation and Correction (VIGC) framework, which enables multimodal large language models to generate instruction-tuning data and progressively enhance their quality on the fly. Specifically, Visual Instruction Generation (VIG) guides the vision-language model to generate diverse instruction-tuning data. To ensure generation quality, Visual Instruction Correction (VIC) adopts an iterative update mechanism to correct any inaccuracies in data produced by VIG, effectively reducing the risk of hallucination. Leveraging the diverse, high-quality data generated by VIGC, we finetune mainstream models and validate data quality based on various evaluations. Experimental results demonstrate that VIGC not only compensates for the shortcomings of language-only data generation methods, but also effectively enhances benchmark performance. The models, datasets, and code are available at https://opendatalab.github.io/VIGC

NeurIPS Conference 2023 Conference Paper

Context Shift Reduction for Offline Meta-Reinforcement Learning

  • Yunkai Gao
  • Rui Zhang
  • Jiaming Guo
  • Fan Wu
  • Qi Yi
  • Shaohui Peng
  • Siming Lan
  • Ruizhi Chen

Offline meta-reinforcement learning (OMRL) utilizes pre-collected offline datasets to enhance the agent's generalization ability on unseen tasks. However, the context shift problem arises due to the distribution discrepancy between the contexts used for training (from the behavior policy) and testing (from the exploration policy). The context shift problem leads to incorrect task inference and further deteriorates the generalization ability of the meta-policy. Existing OMRL methods either overlook this problem or attempt to mitigate it with additional information. In this paper, we propose a novel approach called Context Shift Reduction for OMRL (CSRO) to address the context shift problem with only offline datasets. The key insight of CSRO is to minimize the influence of policy in context during both the meta-training and meta-test phases. During meta-training, we design a max-min mutual information representation learning mechanism to diminish the impact of the behavior policy on task representation. In the meta-test phase, we introduce the non-prior context collection strategy to reduce the effect of the exploration policy. Experimental results demonstrate that CSRO significantly reduces the context shift and improves the generalization ability, surpassing previous methods across various challenging domains.

NeurIPS Conference 2023 Conference Paper

Contrastive Modules with Temporal Attention for Multi-Task Reinforcement Learning

  • Siming Lan
  • Rui Zhang
  • Qi Yi
  • Jiaming Guo
  • Shaohui Peng
  • Yunkai Gao
  • Fan Wu
  • Ruizhi Chen

In the field of multi-task reinforcement learning, the modular principle, which involves specializing functionalities into different modules and combining them appropriately, has been widely adopted as a promising approach to prevent the negative transfer problem, i.e., performance degradation due to conflicts between tasks. However, most existing multi-task RL methods only combine shared modules at the task level, ignoring that there may be conflicts within a task. In addition, these methods do not take into account that, without constraints, some modules may learn similar functions, restricting the expressiveness and generalization capability of modular methods. In this paper, we propose the Contrastive Modules with Temporal Attention (CMTA) method to address these limitations. CMTA constrains the modules to be different from each other via contrastive learning and combines shared modules at a finer granularity than the task level with temporal attention, alleviating negative transfer within the task and improving the generalization ability and performance of multi-task RL. We conducted experiments on Meta-World, a multi-task RL benchmark containing various robotic manipulation tasks. Experimental results show that CMTA outperforms learning each task individually for the first time and achieves substantial performance improvements over the baselines.

NeurIPS Conference 2023 Conference Paper

Decompose a Task into Generalizable Subtasks in Multi-Agent Reinforcement Learning

  • Zikang Tian
  • Ruizhi Chen
  • Xing Hu
  • Ling Li
  • Rui Zhang
  • Fan Wu
  • Shaohui Peng
  • Jiaming Guo

In recent years, Multi-Agent Reinforcement Learning (MARL) techniques have made significant strides in achieving high asymptotic performance on single tasks. However, there has been limited exploration of model transferability across tasks. Training a model from scratch for each task can be time-consuming and expensive, especially for large-scale Multi-Agent Systems. Therefore, it is crucial to develop methods for generalizing the model across tasks. Considering that there exist task-independent subtasks across MARL tasks, a model that can decompose such subtasks from the source task could generalize to target tasks. However, ensuring true task-independence of subtasks poses a challenge. In this paper, we propose to decompose a task into a series of generalizable subtasks (DT2GS), a novel framework that addresses this challenge by utilizing a scalable subtask encoder and an adaptive subtask semantic module. We show that these components endow subtasks with two properties critical for task-independence: avoiding overfitting to the source task and maintaining consistent yet scalable semantics across tasks. Empirical results demonstrate that DT2GS possesses sound zero-shot generalization capability across tasks, exhibits sufficient transferability, and outperforms existing methods in both multi-task and single-task problems.

TIST Journal 2023 Journal Article

Enough Waiting for the Couriers: Learning to Estimate Package Pick-up Arrival Time from Couriers’ Spatial-Temporal Behaviors

  • Haomin Wen
  • Youfang Lin
  • Fan Wu
  • Huaiyu Wan
  • Zhongxiang Sun
  • Tianyue Cai
  • Hongyu Liu
  • Shengnan Guo

In intelligent logistics systems, predicting the Estimated Time of Pick-up Arrival (ETPA) of packages is a crucial task, which aims to predict the courier's arrival time for all unpicked-up packages at any time. Accurate prediction of ETPA can help systems alleviate customers' waiting anxiety and improve their experience. We identify three main challenges of this problem. First, unlike the travel time estimation problem in other fields like ride-hailing, the ETPA task is distinctively a multi-destination and path-free prediction problem. Second, an intuitive idea for solving ETPA is to predict the pick-up route and then the time in two stages. However, it is difficult to accurately and efficiently predict couriers' future routes in the route prediction step, since their behaviors are affected by multiple complex factors. Third, in the time prediction step, the requirement to provide the ETPA of all of a courier's unpicked-up packages at once in real time makes the problem even more challenging. To tackle the preceding challenges, we propose RankETPA, which integrates route inference into ETPA prediction. First, a learning-based pick-up route predictor is designed to learn the route-ranking strategies of couriers from their massive spatial-temporal behaviors. Then, a spatial-temporal attention-based arrival time predictor is designed for real-time ETPA inference via capturing the spatial-temporal correlations between the unpicked-up packages. Extensive experiments on two real-world datasets and a synthetic dataset demonstrate that RankETPA achieves significant performance improvement against the baseline models.

JMLR Journal 2023 Journal Article

Factor Graph Neural Networks

  • Zhen Zhang
  • Mohammed Haroon Dupty
  • Fan Wu
  • Javen Qinfeng Shi
  • Wee Sun Lee

In recent years, we have witnessed a surge of Graph Neural Networks (GNNs), most of which can learn powerful representations in an end-to-end fashion with great success in many real-world applications. They resemble Probabilistic Graphical Models (PGMs), but break free from some limitations of PGMs. By aiming to provide expressive methods for representation learning instead of computing marginals or most likely configurations, GNNs provide flexibility in the choice of information-flow rules while maintaining good performance. Despite their success and inspirations, they lack efficient ways to represent and learn higher-order relations among variables/nodes. More expressive higher-order GNNs which operate on k-tuples of nodes need increased computational resources in order to process higher-order tensors. We propose Factor Graph Neural Networks (FGNNs) to effectively capture higher-order relations for inference and learning. To do so, we first derive an efficient approximate Sum-Product loopy belief propagation inference algorithm for discrete higher-order PGMs. We then neuralize the novel message passing scheme into a Factor Graph Neural Network (FGNN) module by allowing richer representations of the message update rules; this facilitates both efficient inference and powerful end-to-end learning. We further show that with a suitable choice of message aggregation operators, our FGNN is also able to represent Max-Product belief propagation, providing a single family of architecture that can represent both Max and Sum-Product loopy belief propagation. Our extensive experimental evaluation on synthetic as well as real datasets demonstrates the potential of the proposed model.

JBHI Journal 2023 Journal Article

Fetal Ultrasound Standard Plane Detection With Coarse-to-Fine Multi-Task Learning

  • Juncheng Guo
  • Guanghua Tan
  • Fan Wu
  • Huaxuan Wen
  • Kenli Li

The ultrasound standard plane plays an important role in prenatal fetal growth parameter measurement and disease diagnosis in prenatal screening. However, obtaining standard planes in a fetal ultrasound video is not only laborious and time-consuming but also depends on the clinical experience of sonographers to a certain extent. To improve the acquisition efficiency and accuracy of the ultrasound standard plane, we propose a novel detection framework that utilizes both the coarse-to-fine detection strategy and multi-task learning mechanism for feature-fused images. First, traditional manually-designed features and deep learning-based features are fused to obtain low-level shared features, which can enhance the model's feature expression ability. Inspired by the process of human recognition, ultrasound standard plane detection is divided into a coarse process of plane type classification and a fine process of standard-or-not detection, which is implemented via an end-to-end multi-task learning network. The region-of-interest area is also recognised in our detection framework to suppress the influence of a variable maternal background. Extensive experiments are conducted on three ultrasound planes of the first-class fetal examination, i.e., the femur, thalamus, and abdomen ultrasound images. The experiment results show that our method outperforms competing methods in terms of accuracy, which demonstrates the efficacy of the proposed method and can reduce the workload of sonographers in prenatal screening.

AAAI Conference 2023 Conference Paper

GMDNet: A Graph-Based Mixture Density Network for Estimating Packages’ Multimodal Travel Time Distribution

  • Xiaowei Mao
  • Huaiyu Wan
  • Haomin Wen
  • Fan Wu
  • Jianbin Zheng
  • Yuting Qiang
  • Shengnan Guo
  • Lixia Wu

In the logistics network, accurately estimating packages' Travel Time Distribution (TTD) given the routes greatly benefits both consumers and platforms. Although recent works perform well in predicting an expected time or a time distribution in a road network, they could not be well applied to estimate TTD in logistics networks. Because TTD prediction in the logistics network requires modeling packages' multimodal TTD (MTTD, i.e., there can be more than one likely output with a given input) while leveraging the complex correlations in the logistics network. To this end, this work opens appealing research opportunities in studying MTTD learning conditioned on graph-structure data by investigating packages' travel time distribution in the logistics network. We propose a Graph-based Mixture Density Network, named GMDNet, which takes the benefits of both graph neural network and mixture density network for estimating MTTD conditioned on graph-structure data (i.e., the logistics network). Furthermore, we adopt the Expectation-Maximization (EM) framework in the training process to guarantee local convergence and thus obtain more stable results than gradient descent. Extensive experiments on two real-world datasets demonstrate the superiority of our proposed model.

Corrigendum Notice: In the initial publication of this article, the authors (Mao et al. 2023) acknowledged that although it referred to an earlier paper already presented and published in ICML-21 (Errica et al. 2021), it insufficiently acknowledged the extent to which it incorporated and made extensive use of techniques therein. We are providing a Corrigendum Note, "PDF (2024-09-25)," alongside the original published version. The Corrigendum Note summarizes the main novel contributions of this paper.

Errica, F.; Bacciu, D.; and Micheli, A. 2021. Graph Mixture Density Networks. In Proceedings of the 38th International Conference on Machine Learning (PMLR-28), 3025–3035. PMLR.
Mao, X.; Wan, H.; Wen, H.; Wu, F.; Zheng, J.; Qiang, Y.; Guo, S.; Wu, L.; Hu, H.; and Lin, Y. 2023. GMDNet: A Graph-Based Mixture Density Network for Estimating Packages' Multimodal Travel Time Distribution. In Proceedings of the 37th AAAI Conference on Artificial Intelligence.
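To make the "multimodal TTD" idea concrete, here is a minimal sketch of the Gaussian-mixture density that a mixture-density-network head outputs; GMDNet itself conditions these parameters on the logistics graph, and the function names and 1-D setup below are our illustrative assumptions:

```python
import numpy as np

def mixture_density(t, weights, means, stds):
    """Density of a 1-D Gaussian mixture at travel time(s) t: the kind of
    multimodal output an MDN head produces (illustrative notation)."""
    t = np.atleast_1d(t)[:, None]                              # (n, 1)
    comp = np.exp(-0.5 * ((t - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))
    return comp @ weights                                      # (n,)

def mdn_nll(t, weights, means, stds):
    """Negative log-likelihood of observed times, the usual MDN training loss."""
    return -np.log(mixture_density(t, weights, means, stds)).mean()
```

A mixture with two well-separated components assigns high density near each mode and low density between them, which a single-Gaussian (expected-time) predictor cannot express.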

IJCAI Conference 2023 Conference Paper

Truthful Auctions for Automated Bidding in Online Advertising

  • Yidan Xing
  • Zhilin Zhang
  • Zhenzhe Zheng
  • Chuan Yu
  • Jian Xu
  • Fan Wu
  • Guihai Chen

Automated bidding, an emerging intelligent decision-making paradigm powered by machine learning, has become popular in online advertising. Advertisers in automated bidding evaluate cumulative utilities and have private financial constraints over multiple ad auctions in a long-term period. Based on these distinct features, we consider a new ad auction model for automated bidding: the values of advertisers are public while the financial constraints, such as budget and return on investment (ROI) rate, are private types. We derive the truthfulness conditions with respect to private constraints for this multi-dimensional setting, and demonstrate that any feasible allocation rule can be equivalently reduced to a series of non-decreasing functions on budget. However, the resulting allocation mapped from these non-decreasing functions generally follows an irregular shape, making it difficult to obtain a closed-form expression for the auction objective. To overcome this design difficulty, we propose a family of truthful automated bidding auctions with personalized rank scores, similar to the Generalized Second-Price (GSP) auction. The intuition behind our design is to leverage personalized rank scores as the criteria to allocate items, and to compute a critical ROI that transforms the budget constraint into the same dimension as ROI. The experimental results demonstrate that the proposed auction mechanism outperforms widely used ad auctions, such as the first-price auction and second-price auction, in various automated bidding environments.

AAAI Conference 2023 Conference Paper

Utility Maximizer or Value Maximizer: Mechanism Design for Mixed Bidders in Online Advertising

  • Hongtao Lv
  • Zhilin Zhang
  • Zhenzhe Zheng
  • Jinghan Liu
  • Chuan Yu
  • Lei Liu
  • Lizhen Cui
  • Fan Wu

Digital advertising constitutes one of the main revenue sources for online platforms. In recent years, some advertisers tend to adopt auto-bidding tools to facilitate advertising performance optimization, so that the classical utility-maximizer model in auction theory no longer fits well. Some recent studies proposed a new model, called value maximizer, for auto-bidding advertisers with return-on-investment (ROI) constraints. However, the model of either utility maximizer or value maximizer could only characterize partial advertisers in real-world advertising platforms. In a mixed environment where utility maximizers and value maximizers coexist, the truthful ad auction design would be challenging, since bidders could manipulate both their values and affiliated classes, leading to a multi-parameter mechanism design problem. In this work, we address this issue by proposing a payment rule which combines the corresponding ones in classical VCG and GSP mechanisms in a novel way. Based on this payment rule, we propose a truthful auction mechanism with an approximation ratio of 2 on social welfare, which is close to the lower bound of at least 5/4 that we also prove. The designed auction mechanism is a generalization of VCG for utility maximizers and GSP for value maximizers.

TIST Journal 2022 Journal Article

DeepRoute+: Modeling Couriers’ Spatial-temporal Behaviors and Decision Preferences for Package Pick-up Route Prediction

  • Haomin Wen
  • Youfang Lin
  • Huaiyu Wan
  • Shengnan Guo
  • Fan Wu
  • Lixia Wu
  • Chao Song
  • Yinghui Xu

Over 10 billion packages are picked up every day in China. A fundamental task raised in the emerging intelligent logistics systems is the couriers’ package pick-up route prediction, which is beneficial for package dispatching, arrival-time estimation and overdue-risk evaluation, by leveraging the predicted routes to improve those downstream tasks. In the package pick-up scene, the decision-making of a courier is affected by strict spatial-temporal constraints (e.g., package location, promised pick-up time, current time, and courier’s current location). Furthermore, couriers have different decision preferences on various factors (e.g., time factor, distance factor, and balance of both), based on their own perception of the environments and work experience. In this article, we propose a novel model, named DeepRoute+, to predict couriers’ future package pick-up routes according to the couriers’ decision experience and preference learned from the historical behaviors. Specifically, DeepRoute+ consists of three layers: (1) The representation layer produces experience- and preference-aware representations for the unpicked-up packages, in which a decision preference module can dynamically adjust the importance of factors that affects the courier’s decision under the current situation. (2) The transformer encoder layer encodes the representations of packages while considering the spatial-temporal correlations among them. (3) The attention-based decoder layer uses the attention mechanism to generate the whole pick-up route recurrently. Experiments on a real-world logistics dataset demonstrate the state-of-the-art performance of our model.

NeurIPS Conference 2022 Conference Paper

Federated Submodel Optimization for Hot and Cold Data Features

  • Yucheng Ding
  • Chaoyue Niu
  • Fan Wu
  • Shaojie Tang
  • Chengfei Lyu
  • Yanghe Feng
  • Guihai Chen

We focus on federated learning in practical recommender systems and natural language processing scenarios. The global model for federated optimization typically contains a large and sparse embedding layer, while each client's local data tend to interact with part of features, updating only a small submodel with the feature-related embedding vectors. We identify a new and important issue that distinct data features normally involve different numbers of clients, generating the differentiation of hot and cold features. We further reveal that the classical federated averaging algorithm (FedAvg) or its variants, which randomly selects clients to participate and uniformly averages their submodel updates, will be severely slowed down, because different parameters of the global model are optimized at different speeds. More specifically, the model parameters related to hot (resp., cold) features will be updated quickly (resp., slowly). We thus propose federated submodel averaging (FedSubAvg), which introduces the number of feature-related clients as the metric of feature heat to correct the aggregation of submodel updates. We prove that due to the dispersion of feature heat, the global objective is ill-conditioned, and FedSubAvg works as a suitable diagonal preconditioner. We also rigorously analyze FedSubAvg's convergence rate to stationary points. We finally evaluate FedSubAvg over several public and industrial datasets. The evaluation results demonstrate that FedSubAvg significantly outperforms FedAvg and its variants.
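A minimal sketch of the aggregation correction, assuming a flat parameter vector and a per-feature count of feature-related clients as the "heat" metric; the names and shapes here are our illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def fed_sub_avg(submodel_updates, feature_client_counts):
    """Aggregate sparse submodel updates, dividing each coordinate by the
    number of clients whose data involve that feature (its 'heat') rather
    than by a uniform participant count, so hot and cold parameters move
    at comparable effective rates."""
    dim = len(feature_client_counts)
    agg = np.zeros(dim)
    for indices, update in submodel_updates:   # each client touches a subset
        agg[indices] += update
    heat = np.maximum(feature_client_counts, 1)  # guard untouched features
    return agg / heat                            # per-coordinate correction
```

Dividing coordinate-wise by feature heat is exactly a diagonal rescaling of the averaged update, which is the preconditioning view described in the abstract.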

AAMAS Conference 2022 Conference Paper

The Spoofing Resistance of Frequent Call Markets

  • Buhong Liu
  • Maria Polukarov
  • Carmine Ventre
  • Lingbo Li
  • Leslie Kanthan
  • Fan Wu
  • Michail Basios

We study the effects of spoofing attacks on frequent call markets (FCMs). Spoofing—a strategic manipulation to mislead market participants by creating spurious limit orders—could harm the market efficiency and has been declared illegal in many countries. However, this practice is still very common, which inspired extensive research on measuring, detecting and curbing such manipulation in the popular market model of continuous double auctions (CDAs). In this paper, we extend this research to frequent call markets and use agent-based modelling to simulate spoofing in this context. Specifically, we apply empirical game-theoretic analysis (EGTA) to compute equilibria of agent-based markets, and demonstrate that while spoofing may be profitable in both market models, it has less impact on FCMs as opposed to CDAs. Finally, we explore several FCM mechanism designs to help to curb this type of market manipulation even further.

IJCAI Conference 2021 Conference Paper

Heterogeneous Graph Information Bottleneck

  • Liang Yang
  • Fan Wu
  • Zichen Zheng
  • Bingxin Niu
  • Junhua Gu
  • Chuan Wang
  • Xiaochun Cao
  • Yuanfang Guo

Most attempts on extending Graph Neural Networks (GNNs) to Heterogeneous Information Networks (HINs) implicitly take the direct assumption that the multiple homogeneous attributed networks induced by different meta-paths are complementary. Doubts about this complementarity hypothesis motivate an alternative assumption of consensus. That is, the aggregated node attributes shared by multiple homogeneous attributed networks are essential for node representations, while the specific ones in each homogeneous attributed network should be discarded. In this paper, a novel Heterogeneous Graph Information Bottleneck (HGIB) is proposed to implement the consensus hypothesis in an unsupervised manner. To this end, the information bottleneck (IB) is extended to unsupervised representation learning by leveraging a self-supervision strategy. Specifically, HGIB simultaneously maximizes the mutual information between one homogeneous network and the representation learned from another homogeneous network, while minimizing the mutual information between the specific information contained in one homogeneous network and the representation learned from this homogeneous network. Model analysis reveals that the two extreme cases of HGIB correspond to the supervised heterogeneous GNN and the infomax on homogeneous graph, respectively. Extensive experiments on real datasets demonstrate that the consensus-based unsupervised HGIB significantly outperforms most semi-supervised SOTA methods based on the complementary assumption.

NeurIPS Conference 2021 Conference Paper

Implicit Regularization in Matrix Sensing via Mirror Descent

  • Fan Wu
  • Patrick Rebeschini

We study discrete-time mirror descent applied to the unregularized empirical risk in matrix sensing. In both the general case of rectangular matrices and the particular case of positive semidefinite matrices, a simple potential-based analysis in terms of the Bregman divergence allows us to establish convergence of mirror descent---with different choices of the mirror maps---to a matrix that, among all global minimizers of the empirical risk, minimizes a quantity explicitly related to the nuclear norm, the Frobenius norm, and the von Neumann entropy. In both cases, this characterization implies that mirror descent, a first-order algorithm minimizing the unregularized empirical risk, recovers low-rank matrices under the same set of assumptions that are sufficient to guarantee recovery for nuclear-norm minimization. When the sensing matrices are symmetric and commute, we show that gradient descent with full-rank factorized parametrization is a first-order approximation to mirror descent, in which case we obtain an explicit characterization of the implicit bias of gradient flow as a by-product.
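The discrete-time update analyzed here can be transcribed as follows; this is the standard mirror descent formulation in our own notation (the paper studies several choices of the mirror map Φ, which this sketch leaves abstract):

```latex
% Mirror descent on the unregularized empirical risk L with mirror map \Phi:
\nabla\Phi(X_{t+1}) = \nabla\Phi(X_t) - \eta\,\nabla L(X_t),
% equivalently, the proximal form via the Bregman divergence
X_{t+1} = \operatorname*{arg\,min}_{X}\;
  \eta\,\langle \nabla L(X_t),\, X \rangle + D_\Phi(X, X_t),
\qquad
D_\Phi(X, Y) = \Phi(X) - \Phi(Y) - \langle \nabla\Phi(Y),\, X - Y \rangle .
```

The potential-based analysis mentioned in the abstract tracks a Bregman divergence of this form between the iterate and a reference minimizer.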

AAAI Conference 2021 Conference Paper

Toward Understanding the Influence of Individual Clients in Federated Learning

  • Yihao Xue
  • Chaoyue Niu
  • Zhenzhe Zheng
  • Shaojie Tang
  • Chengfei Lyu
  • Fan Wu
  • Guihai Chen

Federated learning allows mobile clients to jointly train a global model without sending their private data to a central server. Extensive works have studied the performance guarantee of the global model; however, it is still unclear how each individual client influences the collaborative training process. In this work, we define a new notion, called Fed-Influence, to quantify this influence over the model parameters, and propose an effective and efficient algorithm to estimate this metric. In particular, our design satisfies several desirable properties: (1) it requires neither retraining nor retracing, adding only linear computational overhead to clients and the server; (2) it strictly maintains the tenets of federated learning, without revealing any client's local private data; and (3) it works well on both convex and non-convex loss functions, and does not require the final model to be optimal. Empirical results on a synthetic dataset and the FEMNIST dataset demonstrate that our estimation method can approximate Fed-Influence with small bias. Further, we show an application of Fed-Influence in model debugging.

NeurIPS Conference 2020 Conference Paper

A Continuous-Time Mirror Descent Approach to Sparse Phase Retrieval

  • Fan Wu
  • Patrick Rebeschini

We analyze continuous-time mirror descent applied to sparse phase retrieval, which is the problem of recovering sparse signals from a set of magnitude-only measurements. We apply mirror descent to the unconstrained empirical risk minimization problem (batch setting), using the square loss and square measurements. We provide a full convergence analysis of the algorithm in this non-convex setting and prove that, with the hypentropy mirror map, mirror descent recovers any $k$-sparse vector $\mathbf{x}^\star\in\mathbb{R}^n$ with minimum (in modulus) non-zero entry on the order of $\| \mathbf{x}^\star \|_2/\sqrt{k}$ from $k^2$ Gaussian measurements, modulo logarithmic terms. This yields a simple algorithm which, unlike most existing approaches to sparse phase retrieval, adapts to the sparsity level, without including thresholding steps or adding regularization terms. Our results also provide a principled theoretical understanding for Hadamard Wirtinger flow [54], as Euclidean gradient descent applied to the empirical risk problem with Hadamard parametrization can be recovered as a first-order approximation to mirror descent in discrete time.

NeurIPS Conference 2020 Conference Paper

Factor Graph Neural Networks

  • Zhen Zhang
  • Fan Wu
  • Wee Sun Lee

Most of the successful deep neural network architectures are structured, often consisting of elements like convolutional neural networks and gated recurrent neural networks. Recently, graph neural networks (GNNs) have been successfully applied to graph-structured data such as point clouds and molecular data. These networks often only consider pairwise dependencies, as they operate on a graph structure. We generalize the GNN into a factor graph neural network (FGNN), providing a simple way to incorporate dependencies among multiple variables. We show that FGNN is able to represent Max-Product belief propagation, an approximate inference method on probabilistic graphical models, providing a theoretical understanding of the capabilities of FGNN and related GNNs. Experiments on synthetic and real datasets demonstrate the potential of the proposed architecture.

AAAI Conference 2020 Conference Paper

Mechanism Design with Predicted Task Revenue for Bike Sharing Systems

  • Hongtao Lv
  • Chaoli Zhang
  • Zhenzhe Zheng
  • Tie Luo
  • Fan Wu
  • Guihai Chen

Bike sharing systems have been widely deployed around the world in recent years. A core problem in such systems is to reposition the bikes so that the distribution of bike supply is reshaped to better match the dynamic bike demand. When the bike-sharing company or platform is able to predict the revenue of each reposition task based on historic data, an additional constraint is to cap the payment for each task below its predicted revenue. In this paper, we propose an incentive mechanism called TruPreTar to incentivize users to park bicycles at locations desired by the platform toward rebalancing supply and demand. TruPreTar possesses four important economic and computational properties such as truthfulness and budget feasibility. Furthermore, we prove that even when the payment budget is tight, the total revenue still exceeds or equals the budget. Otherwise, TruPreTar achieves 2-approximation as compared to the optimal (revenue-maximizing) solution, which is close to the lower bound of at least √2 that we also prove. Using an industrial dataset obtained from a large bike-sharing company, our experiments show that TruPreTar is effective in rebalancing bike supply and demand and, as a result, generates high revenue that outperforms several benchmark mechanisms.

AAAI Conference 2019 Conference Paper

DeepETA: A Spatial-Temporal Sequential Neural Network Model for Estimating Time of Arrival in Package Delivery System

  • Fan Wu
  • Lixia Wu

Over 100 million packages are delivered every day in China due to the fast development of e-commerce. Precisely estimating the time of packages’ arrival (ETA) is of significant importance to improving customers’ experience and raising the efficiency of package dispatching. Existing methods mainly focus on predicting the time from an origin to a destination. However, in the package delivery problem, one trip contains multiple destinations and the delivery time of all destinations should be predicted at any time. Furthermore, the ETA is affected by many factors, especially the sequence of the latest route, the regularity of the delivery pattern, and the sequence of packages to be delivered, which are difficult to learn by traditional models. This paper proposes a novel spatial-temporal sequential neural network model (DeepETA) to take full advantage of the above factors. DeepETA is an end-to-end network that mainly consists of three parts. First, the spatial encoding and the recurrent cells are proposed to capture the spatial-temporal and sequential features of the latest delivery route. Then, two attention-based layers are designed to indicate the most possible ETA from historical frequent and relative delivery routes based on the similarity of the latest route and the future destinations. Finally, a fully connected layer is utilized to jointly learn the delivery time. Experiments on a real logistics dataset demonstrate that the proposed approach outperforms existing methods.

IJCAI Conference 2019 Conference Paper

Masked Graph Convolutional Network

  • Liang Yang
  • Fan Wu
  • Yingkui Wang
  • Junhua Gu
  • Yuanfang Guo

Semi-supervised classification is a fundamental technology to process structured and unstructured data in the field of machine learning. The traditional attribute-graph based semi-supervised classification methods propagate labels over the graph which is usually constructed from the data features, while the graph convolutional neural networks smooth the node attributes, i.e., propagate the attributes, over the real graph topology. In this paper, they are interpreted from the perspective of propagation, and accordingly categorized into symmetric and asymmetric propagation based methods. From the perspective of propagation, both the traditional and network based methods are propagating certain objects over the graph. However, different from label propagation, the intuition “the connected data samples tend to be similar in terms of the attributes” behind attribute propagation is only partially valid. Therefore, a masked graph convolution network (Masked GCN) is proposed by only propagating a certain portion of the attributes to the neighbours according to a masking indicator, which is learned for each node by jointly considering the attribute distributions in local neighbourhoods and the impact on the classification results. Extensive experiments on transductive and inductive node classification tasks have demonstrated the superiority of the proposed method.

AAAI Conference 2019 Conference Paper

Unsupervised Fake News Detection on Social Media: A Generative Approach

  • Shuo Yang
  • Kai Shu
  • Suhang Wang
  • Renjie Gu
  • Fan Wu
  • Huan Liu

Social media has become one of the main channels for people to access and consume news, due to the rapidness and low cost of news dissemination on it. However, such properties of social media also make it a hotbed of fake news dissemination, bringing negative impacts on both individuals and society. Therefore, detecting fake news has become a crucial problem attracting tremendous research effort. Most existing methods of fake news detection are supervised, which require an extensive amount of time and labor to build a reliably annotated dataset. In search of an alternative, in this paper, we investigate if we could detect fake news in an unsupervised manner. We treat the truths of news and users’ credibility as latent random variables, and exploit users’ engagements on social media to identify their opinions towards the authenticity of news. We leverage a Bayesian network model to capture the conditional dependencies among the truths of news, the users’ opinions, and the users’ credibility. To solve the inference problem, we propose an efficient collapsed Gibbs sampling approach to infer the truths of news and the users’ credibility without any labelled data. Experimental results on two datasets show that the proposed method significantly outperforms the compared unsupervised methods.

AAMAS Conference 2018 Conference Paper

Efficient Auctions with Identity-Dependent Negative Externalities

  • Chaoli Zhang
  • Xiang Wang
  • Fan Wu
  • Xiaohui Bei

We investigate a class of single-item multi-supply auctions (including digital goods auctions with unlimited supply) with bidders who have identity-based negative externalities. In such an auction, each bidder has a set of competitors. Her private valuation from winning an item decreases with the number of her winning competitors. Negative externalities are prevalent in many applications, in which the auctioned goods play a role in future interactions among the auction’s participants, such as patent licensing and sponsored search auctions. However, the development of auctions with such externalities is stymied by the computational difficulty of the underlying welfare maximization allocation problem; even without consideration of truthfulness, the problem of social welfare maximization with general competition relations is NP-hard and even hard to approximate within a constant factor (unless P=NP). In this work, we design polynomial time and strategy-proof mechanisms under different restrictions on the underlying competition graph structure. Our results can be summarized as follows. (1) When each bidder has only one competitor, we propose a truthful and welfare maximizing mechanism. (2) We design a truthful and (1 + ϵ)-approximation mechanism when the underlying competition graph is planar. (3) We give two truthful mechanisms when bidders have arbitrary competition relations, with welfare approximation ratios n/log n and ⌈(d + 1)/3⌉, respectively, where d is the maximum degree of the “undirected” competition graph.

AAMAS Conference 2018 Conference Paper

On Designing Optimal Data Purchasing Strategies for Online Ad Auctions

  • Zun Li
  • Zhenzhe Zheng
  • Fan Wu
  • Guihai Chen

In online advertising, advertisers can purchase consumer relevant data from data marketplaces with a certain expenditure, and exploit the purchased data to guide the bidding process in ad auctions. One of the pressing problems faced by advertisers is to design the optimal data purchasing strategy (how much data to purchase to be competitive in the bidding process) in online ad auctions. In this paper, we model the data purchasing strategy design as a convex optimization problem, jointly considering the expenditure paid during data purchasing and the benefits obtained from ad auctions. Using techniques from Bayesian game theory and convex analysis, we derive the optimal purchasing strategies for advertisers in different market scenarios. We also theoretically prove that the resulting strategy profile is the unique one that achieves Nash Equilibrium. Our analysis shows that the proposed data purchasing strategy can handle diverse ad auctions and valuation learning models. Our numerical results empirically reveal how the equilibrium state changes with variation of the strategic environment.

IJCAI Conference 2018 Conference Paper

Online Pricing for Revenue Maximization with Unknown Time Discounting Valuations

  • Weichao Mao
  • Zhenzhe Zheng
  • Fan Wu
  • Guihai Chen

Online pricing mechanisms have been widely applied to resource allocation in multi-agent systems. However, most of the existing online pricing mechanisms assume buyers have fixed valuations over the time horizon, which cannot capture the dynamic nature of valuation in emerging applications. In this paper, we study the problem of revenue maximization in online auctions with unknown time discounting valuations, and model it as non-stationary multi-armed bandit optimization. We design an online pricing mechanism, namely Biased-UCB, based on unique features of the discounting valuations. We use competitive analysis to theoretically evaluate the performance guarantee of our pricing mechanism, and derive the competitive ratio. Numerical results show that our design achieves good performance in terms of revenue maximization on a real-world bidding dataset.

AAMAS Conference 2016 Conference Paper

Strategy-Proof Data Auctions with Negative Externalities (Extended Abstract)

  • Xiang Wang
  • Zhenzhe Zheng
  • Fan Wu
  • Xiaoju Dong
  • Shaojie Tang
  • Guihai Chen

Data has appeared to be a new kind of commodity with distinctive characteristics, which make it fundamentally different from physical goods as well as traditional digital goods. Therefore, new trading mechanisms for data need to be designed. In this paper, we model the data market as an auction with negative externalities, and design practical mechanisms for data trading. Specifically, we propose a family of Data auctIons in CompetiTive mArkets, namely DICTA. DICTA contains two mechanisms, DICTA-FUL and DICTA-PAR. DICTA-FUL is a direct revelation auction mechanism in full competition markets, achieving strategy-proofness and optimal social welfare. In partial competition markets, we show that the allocation problem is NP-hard. Therefore, we present an approximation algorithm for winner determination. By carefully integrating this approximation allocation algorithm and a charging scheme, DICTA-PAR achieves both strategy-proofness and d-approximation, where d is the maximum degree of the underlying undirected competition graph.

AAAI Conference 2014 Conference Paper

A Strategy-Proof Online Auction with Time Discounting Values

  • Fan Wu
  • Junming Liu
  • Zhenzhe Zheng
  • Guihai Chen

Online mechanism design has been widely applied to various practical applications. However, designing a strategy-proof online mechanism is much more challenging than in a static scenario, due to the lack of knowledge of future information. In this paper, we investigate online auctions with time discounting values, in contrast to the flat values studied in most existing work. We present a strategy-proof 2-competitive online auction mechanism in the presence of time discounting values. We also implement our design and compare it with the offline optimal solution. Our numerical results show that our design achieves good performance in terms of social welfare, revenue, average winning delay, and average valuation loss.