Arrow Research search

Author name cluster

Jun Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

50 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

Model-Agnostic Sentiment Distribution Stability Analysis for Robust LLM-Generated Texts Detection

  • Siyuan Li
  • Xi Lin
  • Guangyan Li
  • Zehao Liu
  • Aodu Wulianghai
  • Li Ding
  • Jun Wu
  • Jianhua Li

The rapid advancement of large language models (LLMs) has resulted in increasingly sophisticated AI-generated content, posing significant challenges in distinguishing LLM-generated text from human-written language. Existing detection methods, primarily based on lexical heuristics or fine-tuned classifiers, often suffer from limited generalizability and are vulnerable to paraphrasing, adversarial perturbations, and cross-domain shifts. In this work, we propose SentiDetect, a model-agnostic framework for detecting LLM-generated text by analyzing the divergence in sentiment distribution stability. Our method is motivated by the empirical observation that LLM outputs tend to exhibit emotionally consistent patterns, whereas human-written texts display greater emotional variability. To capture this phenomenon, we define two complementary metrics: sentiment distribution consistency and sentiment distribution preservation, which quantify stability under sentiment-altering and semantic-preserving transformations. We evaluate SentiDetect on five diverse domains and a range of advanced LLMs, including Gemini-1.5-Pro, Claude-3, GPT-4-0613, and LLaMa-3.3. Experimental results demonstrate its superiority over state-of-the-art baselines, with over 16% and 11% F1 score improvements on Gemini-1.5-Pro and GPT-4-0613, respectively. Moreover, SentiDetect also shows greater robustness to paraphrasing, adversarial attacks, and text length variations, outperforming existing detectors in challenging scenarios.

AAAI Conference 2026 Conference Paper

Multi-Modal Style Transfer-based Prompt Tuning for Efficient Federated Domain Generalization

  • Yuliang Chen
  • Xi Lin
  • Jun Wu
  • Xiangrui Cai
  • Qiaolun Zhang
  • Xichun Fan
  • Jiapeng Xu
  • Xiu Su

Federated Domain Generalization (FDG) aims to collaboratively train a global model across distributed clients that can generalize well on unseen domains. However, existing FDG methods typically struggle with cross-client data heterogeneity and incur significant communication and computation overhead. To address these challenges, this paper presents a new FDG framework, dubbed FaST-PT, which facilitates local feature augmentation and efficient unseen domain adaptation in a distributed manner. First, we propose a lightweight Multi-Modal Style Transfer (MST) method to transform image embedding under text supervision, which could expand the training data distribution and mitigate domain shift. We then design a dual-prompt module that decomposes the prompt into global and domain prompts. Specifically, global prompts capture general knowledge from augmented embedding across clients, while domain prompts capture domain-specific knowledge from local data. Besides, Domain-aware Prompt Generation (DPG) is introduced to adaptively generate suitable prompts for each sample, which facilitates unseen domain adaptation through knowledge fusion. Extensive experiments on four cross-domain benchmark datasets, e.g., PACS and DomainNet, demonstrate the superior performance of FaST-PT over SOTA FDG methods such as FedDG-GA and DiPrompt. Ablation studies further validate the effectiveness and efficiency of FaST-PT.

AAAI Conference 2025 Conference Paper

Addressing Cold-Start Problem in Click-Through Rate Prediction via Supervised Diffusion Modeling

  • Wenqiao Zhu
  • Lulu Wang
  • Jun Wu

Predicting Click-Through Rates is a crucial function within recommendation and advertising platforms, as the output of CTR prediction determines the order of items shown to users. The Embedding and MLP paradigm has become a standard approach for industrial recommendation systems and has been widely deployed. However, this paradigm suffers from cold-start problems, where there is either no or only limited user action data available, leading to poorly learned ID embeddings. The cold-start problem hampers the performance of new items. To address this problem, we design a novel diffusion model to generate a warmed-up embedding for new items. Specifically, we define a novel diffusion process between the ID embedding space and the side information space. In addition, we can derive a sub-sequence from the diffusion steps to expedite training, given that our diffusion model is non-Markovian. Our diffusion model is supervised by both the variational inference and binary cross-entropy objectives, enabling it to generate warmed-up embeddings for items in both the cold-start and warm-up phases. Additionally, we have conducted extensive experiments on three recommendation datasets. The results confirmed the effectiveness of our approach.

AAAI Conference 2025 Short Paper

Alleviating Dual Biases in Recommendation (Student Abstract)

  • Sijin Lu
  • Fangyuan Luo
  • Jun Wu

Causal Inference (CI) plays a crucial role in building unbiased recommender systems. However, most current CI-based debiasing methods address only one of popularity bias or conformity bias. This paper presents a Disentangled Counterfactual Reasoning framework, called DCR, to alleviate dual biases in recommendation. Concretely, we consider the impact of both item popularity and user conformity during training, and separate their indirect effects by disentangling user and item embeddings into biased and unbiased components. In the inference stage, we perform counterfactual reasoning to simultaneously mitigate the indirect and direct effects of bias factors. Experimental results demonstrate the effectiveness of our DCR.

ICML Conference 2025 Conference Paper

BackSlash: Rate Constrained Optimized Training of Large Language Models

  • Jun Wu
  • Jiangtao Wen
  • Yuxing Han 0001

The rapid advancement of large-language models (LLMs) has driven extensive research into parameter compression after training has been completed, yet compression during the training phase remains largely unexplored. In this work, we introduce Rate-Constrained Training (BackSlash), a novel training-time compression approach based on rate-distortion optimization (RDO). BackSlash enables a flexible trade-off between model accuracy and complexity, significantly reducing parameter redundancy while preserving performance. Experiments in various architectures and tasks demonstrate that BackSlash can reduce memory usage by 60% - 90% without accuracy loss and provides significant compression gain compared to compression after training. Moreover, BackSlash proves to be highly versatile: it enhances generalization with small Lagrange multipliers, improves model robustness to pruning (maintaining accuracy even at 80% pruning rates), and enables network simplification for accelerated inference on edge devices.

AAAI Conference 2025 Conference Paper

KGCRR: An Effective Metric-Driven Knowledge Graph Completion Framework by Designing a Novel Upper Bound Function with Adaptive Approximation to Reciprocal Rank

  • Kuan Xu
  • Kuo Yang
  • Jian Liu
  • Xiangkui Lu
  • Jun Wu
  • Xuezhong Zhou

Knowledge Graph Embedding (KGE) methods have achieved great success in predicting missing links in knowledge graphs, a task also known as Knowledge Graph Completion (KGC). Under this task, the Reciprocal Rank (RR) of ground-truth items serves as a key indicator for evaluating the method’s performance. However, most existing studies have overlooked the inconsistency between the ranking metric, RR, and the optimization objective functions, resulting in sub-optimal KGC performance. To address this issue, we propose a KGC framework called KGCRR by designing a novel upper bound function named CRR. By introducing the parameter-pressure ρ to shift the sigmoid function, CRR achieves a better approximation to RR compared with existing objective functions. We theoretically proved that by adjusting ρ, CRR can achieve a more effective approximation to RR. By narrowing the discrepancy with RR and alleviating the gradient vanishing issue associated with the direct optimization of RR loss, CRR demonstrates an advantage in optimizing RR. CRR serves as a plug-and-play objective, capable of seamless integration into various KGE methods. Through extensive experiments conducted on FB15k-237 and WN18RR datasets, we have obtained promising results, with an average improvement of 19.06% in MRR, indicating that CRR significantly enhances the performance of existing methods.
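
The gap between a ranking metric and a trainable objective can be illustrated with a minimal sketch. It computes the exact reciprocal rank and a generic sigmoid-smoothed surrogate. This is not the CRR function from the paper, whose exact form the abstract does not give. The function names, the `temperature` parameter, and the scores are assumptions made for illustration.

```python
import math


def reciprocal_rank(scores, target):
    """Exact reciprocal rank of the item at index `target` (higher score = better)."""
    rank = 1 + sum(1 for j, s in enumerate(scores) if j != target and s > scores[target])
    return 1.0 / rank


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def smooth_reciprocal_rank(scores, target, temperature=1.0):
    """Differentiable surrogate: each hard comparison is replaced by a sigmoid."""
    soft_rank = 1.0 + sum(
        sigmoid((s - scores[target]) / temperature)
        for j, s in enumerate(scores)
        if j != target
    )
    return 1.0 / soft_rank


# Hypothetical model scores for four candidate entities; index 0 is the ground truth.
scores = [2.0, 0.5, 3.0, 1.0]
exact = reciprocal_rank(scores, target=0)  # one candidate scores higher, so the rank is 2
approx = smooth_reciprocal_rank(scores, target=0, temperature=0.1)
```

With a small temperature the surrogate approaches the exact value, but its gradients shrink, which is one reason the shape and shift of the sigmoid matter in objectives of this kind.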

NeurIPS Conference 2025 Conference Paper

MobileODE: An Extra Lightweight Network

  • Le Yu
  • Jun Wu
  • Bo Gou
  • Xiangde Min
  • Lei Zhang
  • Zhang Yi
  • Tao He

Depthwise-separable convolution has emerged as a significant milestone in the lightweight development of Convolutional Neural Networks (CNNs) over the past decade. This technique consists of two key components: depthwise convolution, which captures spatial information, and pointwise convolution, which enhances channel interactions. In this paper, we propose a novel method for making CNNs more lightweight through the discretization of Ordinary Differential Equations (ODEs). Specifically, we optimize depthwise-separable convolution by replacing the pointwise convolution with a discrete ODE module, termed the Channelwise ODE Solver (COS). The COS module is constructed by a simple yet efficient direct differentiation Euler algorithm, using learnable increment parameters. This replacement reduces parameters by over 98.36% compared to conventional pointwise convolution. By integrating COS into MobileNet, we develop a new extra lightweight network called MobileODE. With carefully designed basic and inverse residual blocks, the resulting MobileODEV1 and MobileODEV2 reduce channel interaction parameters by 71.0% and 69.2%, respectively, compared to MobileNetV1, while achieving higher accuracy across various tasks, including image classification, object detection, and semantic segmentation. The code is available at https://github.com/cashily/MobileODE.
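
For readers unfamiliar with Euler discretization, the following minimal sketch shows the generic explicit Euler update on a scalar toy problem. It does not reproduce the COS module. The function name `euler_solve`, the fixed step size, and the toy ODE are assumptions for illustration, whereas the abstract describes learnable increment parameters.

```python
import math


def euler_solve(f, x0, step, num_steps):
    """Explicit Euler: x_{k+1} = x_k + step * f(x_k)."""
    x = x0
    for _ in range(num_steps):
        x = x + step * f(x)
    return x


# Toy ODE dx/dt = -x with x(0) = 1, whose exact solution at t = 1 is exp(-1).
approx = euler_solve(lambda x: -x, x0=1.0, step=0.01, num_steps=100)
error = abs(approx - math.exp(-1.0))
```

Each update needs only the current state and one evaluation of `f`, which is why Euler-style modules can be cheap in parameters.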

IROS Conference 2025 Conference Paper

NeuroLoc: Encoding Navigation Cells for 6-DOF Camera Localization

  • Xun Li
  • Jian Yang 0034
  • Fenli Jia
  • Muyu Wang
  • Jun Wu
  • Jinpeng Mi
  • Jilin Hu
  • Peidong Liang

Recently, camera localization has been widely adopted in autonomous robotic navigation due to its efficiency and convenience. However, autonomous navigation in unknown environments often suffers from scene ambiguity, environmental disturbances, and dynamic object transformation in camera localization. To address this problem, inspired by the brain cognitive navigation mechanism (such as grid cells, place cells, and head direction cells), we propose a novel neurobiological camera localization method, namely NeuroLoc. Firstly, we designed a Hebbian learning module driven by place cells to save and replay historical information, aiming to restore the details of historical representations and solve the issue of scene fuzziness. Secondly, we utilized the head direction cell-inspired internal direction learning as multi-head attention embedding to help restore the true orientation in similar scenes. Finally, we added a 3D grid center prediction in the pose regression module to reduce the final wrong prediction. We evaluate the proposed NeuroLoc on commonly used benchmark indoor and outdoor datasets. The experimental results show that our NeuroLoc can enhance the robustness in complex environments and improve the performance of pose regression by using only a single image.

AAMAS Conference 2025 Conference Paper

OGS-SLAM: Hybrid ORB-Gaussian Splatting SLAM

  • Xiaohan Li
  • Wenxiang Shen
  • Dong Liu
  • Jun Wu

Traditional visual SLAM systems (dense and sparse) focus on building metric maps, but the internal representations are misaligned with human vision, making them insufficient for assisting robots in scene perception and interpretation. Conversely, aligning robot scene representation with human vision enables more intuitive human-to-robot commands and improves the generalization capability of deployed neural networks trained on natural images. Neural scene representation-based visual SLAM system, with its consistent and high-fidelity mapping, provides a novel way to assist robots in detailed scene depiction and comprehensive perception. However, end-to-end methods suffer from low accuracy in robot localization, which inevitably degrades mapping quality and limits their practical applications. In this paper, we propose a robust hybrid SLAM system, named OGS-SLAM, which integrates traditional visual SLAM with 3D Gaussian Splatting (3D GS) mapping. This system inherits the high localization accuracy of traditional SLAM while providing a scene model that aligns with human cognition, thereby offering a reliable foundation for downstream human-robot interaction tasks. Experiments demonstrate that our method outperforms state-of-the-art (SOTA) end-to-end SLAM systems in localization, mapping, and map semantic segmentation. Code will be available at: https://github.com/realXiaohan/OGS-SLAM.

NeurIPS Conference 2025 Conference Paper

Purest Quantum State Identification

  • Yingqi Yu
  • Honglin Chen
  • Jun Wu
  • Wei Xie
  • Xiangyang Li

Quantum noise constitutes a fundamental obstacle to realizing practical quantum technologies. To address the pivotal challenge of identifying quantum systems least affected by noise, we introduce the purest quantum state identification, which can be used to improve the accuracy of quantum computation and communication. We formulate a rigorous paradigm for identifying the purest quantum state among $K$ unknown $n$-qubit quantum states using a total of $N$ quantum state copies. For incoherent strategies, we derive the first adaptive algorithm achieving error probability $\exp\left(- \Omega\left(\frac{N H_1}{\log(K) 2^n }\right) \right)$, fundamentally improving quantum property learning through measurement optimization. By developing a coherent measurement protocol with error bound $\exp\left(- \Omega\left(\frac{N H_2}{\log(K) }\right) \right)$, we demonstrate a significant separation from incoherent strategies, formally quantifying the power of quantum memory and coherent measurement. Furthermore, we establish a lower bound by demonstrating that all strategies with fixed two-outcome incoherent POVM must suffer error probability exceeding $ \exp\left( - O\left(\frac{NH_1}{2^n}\right)\right)$. This research advances the characterization of quantum noise through efficient learning frameworks. Our results establish theoretical foundations for noise-adaptive quantum property learning while delivering practical protocols for enhancing the reliability of quantum hardware.

AAAI Conference 2025 Short Paper

Top-one Recommendation with Anonymous User Behaviors (Student Abstract)

  • Xiangkui Lu
  • Jun Wu

Top-one recommendation with anonymous user behaviors, also known as session-based recommendation (SBR), faces challenges of top-one ranking and short anonymous sequences. To this end, we propose a novel objective that combines (1) a reciprocal rank loss to directly optimize the benchmark metric of top-one recommendation, with (2) a listwise contrastive loss to handle short sequences through listwise augmented consistency regularization. Empirical studies demonstrate that optimizing the proposed objective significantly improves the performance of existing SBR baselines.

AAAI Conference 2025 Conference Paper

Towards Trustworthy Machine Learning Under Distribution Shifts

  • Jun Wu

Transfer learning aims to transfer knowledge or information from a source domain to a relevant target domain. It involves two key challenges: distribution shifts and trustworthiness concerns. With these challenges in mind, my research focuses on understanding transfer learning from the perspective of knowledge transferability (e.g., IID and non-IID learning tasks) and trustworthiness (e.g., adversarial robustness, data privacy, and performance fairness).

JAIR Journal 2025 Journal Article

Trustworthy Transfer Learning: A Survey

  • Jun Wu
  • Jingrui He

Transfer learning aims to transfer knowledge or information from a source domain to a relevant target domain. In this paper, we understand transfer learning from the perspectives of knowledge transferability and trustworthiness. This involves two research questions: How is knowledge transferability quantitatively measured and enhanced across domains? Can we trust the transferred knowledge in the transfer learning process? To answer these questions, this paper provides a comprehensive review of trustworthy transfer learning from various aspects, including problem definitions, theoretical analysis, empirical algorithms, and real-world applications. Specifically, we summarize recent theories and algorithms for understanding knowledge transferability under (within-domain) IID and non-IID assumptions. In addition to knowledge transferability, we review the impact of trustworthiness on transfer learning, e.g., whether the transferred knowledge is adversarially robust or algorithmically fair, how to transfer the knowledge under privacy-preserving constraints, etc. Beyond discussing the current advancements, we highlight the open questions and future directions for understanding transfer learning in a reliable and trustworthy manner.

IJCAI Conference 2024 Conference Paper

A Lightweight U-like Network Utilizing Neural Memory Ordinary Differential Equations for Slimming the Decoder

  • Quansong He
  • Xiaojun Yao
  • Jun Wu
  • Zhang Yi
  • Tao He

In recent years, advanced U-like networks have demonstrated remarkable performance in medical image segmentation tasks. However, their drawbacks, including excessive parameters, high computational complexity, and slow inference speed, pose challenges for practical implementation in scenarios with limited computational resources. Existing lightweight U-like networks have alleviated some problems, but they often have pre-designed structures and consist of non-detachable modules, limiting their application scenarios. In this paper, we propose three plug-and-play decoders by employing different discretization methods of the neural memory Ordinary Differential Equation (nmODE). These decoders integrate features at various levels of abstraction by processing information from skip connections and performing numerical operations on upward paths. Through experiments on the PH2, ISIC2017, and ISIC2018 datasets, we embed these decoders into different U-like networks, demonstrating their effectiveness in significantly reducing the number of parameters and computation while maintaining performance. In summary, the proposed discretized nmODE decoder is capable of reducing the number of parameters by about 20% ~ 50% and computation by up to 74%, while being adaptive to all U-like networks. Our code is available at https://github.com/nayutayuki/Lightweight-nmODE-Decoders-For-U-like-networks.

JBHI Journal 2024 Journal Article

An Interpretable Deep Embedding Model for Few and Imbalanced Biomedical Data

  • Haishuai Wang
  • Jianjun Yang
  • Guangyu Tao
  • JiaLi Ma
  • Lianhua Chi
  • Jun Wu
  • Ziping Zhao

In healthcare, training examples are usually hard to obtain (e.g., cases of a rare disease), or the cost of labelling data is high. With a large number of features ($p$) measured in a relatively small number of samples ($N$), the “big $p$, small $N$” problem is an important subject in healthcare studies, especially on genomic data. Another major challenge in effectively analyzing medical data is the skewed class distribution caused by the imbalance between different class labels. In addition, feature importance and interpretability play a crucial role in the success of solving medical problems. Therefore, in this paper, we present an interpretable deep embedding model (IDEM) to classify new data having seen only a few training examples with highly skewed class distribution. The IDEM model consists of a feature attention layer to learn the informative features, a feature embedding layer to directly deal with both numerical and categorical features, and a Siamese network with contrastive loss to compare the similarity between learned embeddings of two input samples. Experiments on both synthetic data and real-world medical data demonstrate that our IDEM model has better generalization power than conventional approaches with few and imbalanced training medical samples, and it is able to identify which features contribute to the classifier in distinguishing case and control.

AAAI Conference 2024 Conference Paper

CTO-SLAM: Contour Tracking for Object-Level Robust 4D SLAM

  • Xiaohan Li
  • Dong Liu
  • Jun Wu

The demand for 4D (3D+time) SLAM systems is increasingly urgent, especially for decision-making and scene understanding. However, most existing simultaneous localization and mapping (SLAM) systems primarily assume static environments. They fail to represent dynamic scenarios due to the challenge of establishing robust long-term spatiotemporal associations in dynamic object tracking. We address this limitation and propose CTO-SLAM, a monocular and RGB-D object-level 4D SLAM system to track moving objects and estimate their motion simultaneously. In this paper, we propose contour tracking, which introduces contour features to enhance the keypoint representation of dynamic objects and is coupled with pixel tracking to achieve long-term robust object tracking. Based on contour tracking, we propose a novel sampling-based object pose initialization algorithm and a subsequent adapted bundle adjustment (BA) optimization algorithm to estimate dynamic object poses with high accuracy. The CTO-SLAM system is verified on both KITTI and VKITTI datasets. The experimental results demonstrate that our system effectively addresses cumulative errors in long-term spatiotemporal association and hence obtains substantial improvements over the state-of-the-art systems. The source code is available at https://github.com/realXiaohan/CTO-SLAM.

AAAI Conference 2024 Short Paper

Optimizing Recall in Deep Graph Hashing Framework for Item Retrieval (Student Abstract)

  • Fangyuan Luo
  • Jun Wu

Hashing-based recommendation (HR) methods, whose core idea is mapping users and items into Hamming space, are common practice to improve item retrieval efficiency. However, existing HR methods fail to align the optimization objective (i.e., Bayesian Personalized Ranking) with the evaluation metric (i.e., Recall), leading to suboptimal performance. In this paper, we propose a smooth recall loss (termed SRLoss), which targets Recall as the optimization objective. Due to the existence of discrete constraints, the optimization problem is NP-hard. To this end, we propose an approximation-adjustable gradient estimator to solve our problem. Experimental results demonstrate the effectiveness of our proposed method.

NeurIPS Conference 2023 Conference Paper

Graph-Structured Gaussian Processes for Transferable Graph Learning

  • Jun Wu
  • Lisa Ainsworth
  • Andrew Leakey
  • Haixun Wang
  • Jingrui He

Transferable graph learning involves knowledge transferability from a source graph to a relevant target graph. The major challenge of transferable graph learning is the distribution shift between source and target graphs induced by individual node attributes and complex graph structures. To solve this problem, in this paper, we propose a generic graph-structured Gaussian process framework (GraphGP) for adaptively transferring knowledge across graphs with either homophily or heterophily assumptions. Specifically, GraphGP is derived from a novel graph structure-aware neural network in the limit on the layer width. The generalization analysis of GraphGP explicitly investigates the connection between knowledge transferability and graph domain similarity. Extensive experiments on several transferable graph learning benchmarks demonstrate the efficacy of GraphGP over state-of-the-art Gaussian process baselines.

AAAI Conference 2023 Conference Paper

Non-IID Transfer Learning on Graphs

  • Jun Wu
  • Jingrui He
  • Elizabeth Ainsworth

Transfer learning refers to the transfer of knowledge or information from a relevant source domain to a target domain. However, most existing transfer learning theories and algorithms focus on IID tasks, where the source/target samples are assumed to be independent and identically distributed. Very little effort is devoted to theoretically studying the knowledge transferability on non-IID tasks, e.g., cross-network mining. To bridge the gap, in this paper, we propose rigorous generalization bounds and algorithms for cross-network transfer learning from a source graph to a target graph. The crucial idea is to characterize the cross-network knowledge transferability from the perspective of the Weisfeiler-Lehman graph isomorphism test. To this end, we propose a novel Graph Subtree Discrepancy to measure the graph distribution shift between source and target graphs. Then the generalization error bounds on cross-network transfer learning, including both cross-network node classification and link prediction tasks, can be derived in terms of the source knowledge and the Graph Subtree Discrepancy across domains. This thereby motivates us to propose a generic graph adaptive network (GRADE) to minimize the distribution shift between source and target graphs for cross-network transfer learning. Experimental results verify the effectiveness and efficiency of our GRADE framework on both cross-network node classification and cross-domain recommendation tasks.
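
The Weisfeiler-Lehman test mentioned above rests on iterative colour refinement, sketched below in a minimal form. This shows only the classic 1-WL procedure, not the Graph Subtree Discrepancy or GRADE, which the abstract does not define in detail. The function name `wl_histogram` and the example graphs are assumptions for illustration.

```python
from collections import Counter


def wl_histogram(adjacency, labels, iterations=2):
    """Run 1-WL colour refinement and return a histogram of all colours seen."""
    colors = dict(labels)
    histogram = Counter(colors.values())
    for _ in range(iterations):
        new_colors = {}
        for node, neighbors in adjacency.items():
            # New colour = own colour plus the sorted multiset of neighbour colours.
            new_colors[node] = (colors[node], tuple(sorted(colors[n] for n in neighbors)))
        colors = new_colors
        histogram.update(colors.values())
    return histogram


# Hypothetical toy graphs: a 3-node path, the same path with renamed nodes, and a triangle.
path = {0: [1], 1: [0, 2], 2: [1]}
renamed_path = {"x": ["y"], "y": ["x", "z"], "z": ["y"]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

path_hist = wl_histogram(path, {n: "a" for n in path})
renamed_hist = wl_histogram(renamed_path, {n: "a" for n in renamed_path})
triangle_hist = wl_histogram(triangle, {n: "a" for n in triangle})
```

Isomorphic graphs always produce identical histograms, while differing histograms prove two graphs are not isomorphic. The converse does not hold in general, since some non-isomorphic graphs are indistinguishable under 1-WL.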

AAAI Conference 2023 Short Paper

Semi-supervised Review-Aware Rating Regression (Student Abstract)

  • Xiangkui Lu
  • Jun Wu

Semi-supervised learning is a promising solution to mitigate data sparsity in review-aware rating regression (RaRR), but it bears the risk of learning with noisy pseudo-labelled data. In this paper, we propose a paradigm called co-training-teaching (CoT2), which integrates the merits of both co-training and co-teaching towards the robust semi-supervised RaRR. Concretely, CoT2 employs two predictors and each of them alternately plays the roles of "labeler" and "validator" to generate and validate pseudo-labelled instances. Extensive experiments show that CoT2 considerably outperforms state-of-the-art RaRR techniques, especially when training data is severely insufficient.

JAAMAS Journal 2022 Journal Article

A Bayesian optimal social law synthesizing mechanism for strategical agents

  • Jun Wu
  • Jie Cao
  • Chongjun Wang

One of the effective and well-studied approaches for coordinating multiagent systems is to synthesize social laws which restrict the behavior of individual agents. We show that when rational behavior of the agents and private information are considered, the optimal social law synthesizing problem naturally evolves into a setting which can be handled by the framework of algorithmic mechanism design. We focus on the Bayesian case in this paper, that is, the probability distribution of each agent’s private cost is known by the public. In this case, our problem closely relates to path/spanning-tree auctions and Myerson’s optimal auction mechanism, but the optimization objective is new, that is, we focus on profit maximization instead of payment maximization in a reverse auction, and as far as we know in this setting no existing mechanism can be directly applied. By studying this problem, we make the following contributions: (1) we extend the logic-based framework of the social law optimization problem to the strategic case, and show that it becomes a new problem of algorithmic mechanism design; (2) we find a mechanism that is incentive compatible, individually rational, and maximizes the expected profit for all input cost profiles; (3) we show that this mechanism is computationally intractable (FP$^{NP}$-complete); and (4) we specify Computation Tree Logic semantics as a set of linear-integer constraints, design an Integer-Linear Programming based algorithm for computing the proposed mechanism so that it can be handled by current ILP solvers, and finally find a tractable 2-approximation mechanism.

AAAI Conference 2022 Short Paper

A Hybrid Evolutionary Algorithm for the Diversified Top-k Weight Clique Search Problem (Student Abstract)

  • Jun Wu
  • Minghao Yin

The diversified top-k weight clique (DTKWC) search problem is an important generalization of the diversified top-k clique (DTKC) search problem, which extends the DTKC search problem by taking into account the weight of vertices. This problem involves finding at most k maximal weighted cliques that cover the maximum weight of vertices with low overlap in a given graph. In this study, a mixed integer linear program constraint formulation is proposed to model the DTKWC search problem, and an efficient hybrid evolutionary algorithm (HEA-D) based on some heuristic strategies is proposed to tackle it. Experiments on two sets of 110 graphs show that HEA-D outperforms the state-of-the-art methods.

IJCAI Conference 2022 Conference Paper

A Unified Meta-Learning Framework for Dynamic Transfer Learning

  • Jun Wu
  • Jingrui He

Transfer learning refers to the transfer of knowledge or information from a relevant source task to a target task. However, most existing works assume both tasks are sampled from a stationary task distribution, thereby leading to the sub-optimal performance for dynamic tasks drawn from a non-stationary task distribution in real scenarios. To bridge this gap, in this paper, we study a more realistic and challenging transfer learning setting with dynamic tasks, i.e., source and target tasks are continuously evolving over time. We theoretically show that the expected error on the dynamic target task can be tightly bounded in terms of source knowledge and consecutive distribution discrepancy across tasks. This result motivates us to propose a generic meta-learning framework L2E for modeling the knowledge transferability on dynamic tasks. It is centered around a task-guided meta-learning problem with a group of meta-pairs of tasks, based on which we are able to learn the prior model initialization for fast adaptation on the newest target task. L2E enjoys the following properties: (1) effective knowledge transferability across dynamic tasks; (2) fast adaptation to the new target task; (3) mitigation of catastrophic forgetting on historical target tasks; and (4) flexibility in incorporating any existing static transfer learning algorithms. Extensive experiments on various image data sets demonstrate the effectiveness of the proposed L2E framework.

TMLR Journal 2022 Journal Article

Can You Win Everything with A Lottery Ticket?

  • Tianlong Chen
  • Zhenyu Zhang
  • Jun Wu
  • Randy Huang
  • Sijia Liu
  • Shiyu Chang
  • Zhangyang Wang

The $\textit{lottery ticket hypothesis}$ (LTH) has been shown to yield independently trainable and highly sparse neural networks (a.k.a. $\textit{winning tickets}$), whose test set accuracies can be surprisingly on par with or even better than those of dense models. However, accuracy is far from the only evaluation metric, and perhaps not always the most important one. Hence it might be myopic to conclude that a sparse subnetwork can replace its dense counterpart, even if the accuracy is preserved. Spurred by that, we perform the first comprehensive assessment of lottery tickets from diverse aspects beyond test accuracy, including $\textit{(i)}$ generalization to distribution shifts, $\textit{(ii)}$ prediction uncertainty, $\textit{(iii)}$ interpretability, and $\textit{(iv)}$ geometry of loss landscapes. With extensive experiments across datasets {CIFAR-10, CIFAR-100, and ImageNet}, model architectures, as well as tens of sparsification methods, we thoroughly characterize the trade-off between model sparsity and the all-dimension model capabilities. We find that an appropriate sparsity (e.g., $20\%\sim99.08\%$) can yield the winning ticket to perform comparably or even better $\textbf{in all above four aspects}$, although some aspects (generalization to certain distribution shifts, and uncertainty) appear more sensitive to the sparsification than others. We term it an $\texttt{LTH-PASS}$. Overall, our results endorse choosing a good sparse subnetwork of a larger dense model, over directly training a small dense model of similar parameter counts. We hope that our study can offer more in-depth insights on pruning, for researchers and engineers who seek to incorporate sparse neural networks for user-facing deployments. Code is available at: https://github.com/VITA-Group/LTH-Pass.

IJCAI Conference 2022 Conference Paper

Discrete Listwise Personalized Ranking for Fast Top-N Recommendation with Implicit Feedback

  • Fangyuan Luo
  • Jun Wu
  • Tao Wang

We address the efficiency problem of personalized ranking from implicit feedback by hashing users and items with binary codes, so that top-N recommendation can be executed quickly in a Hamming space by bit operations. However, current hashing methods for top-N recommendation fail to align their learning objectives (such as pointwise or pairwise loss) with the benchmark metrics for ranking quality (e.g., Average Precision, AP), resulting in sub-optimal accuracy. To this end, we propose a Discrete Listwise Personalized Ranking (DLPR) model that optimizes AP under discrete constraints for fast and accurate top-N recommendation. To resolve the challenging DLPR problem, we devise an efficient algorithm that can directly learn binary codes in a relaxed continuous solution space. Specifically, theoretical analysis shows that the optimal solution to the relaxed continuous optimization problem is exactly the same as that of the original discrete DLPR problem. Through extensive experiments on two real-world datasets, we show that DLPR consistently surpasses state-of-the-art hashing methods for top-N recommendation.
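The idea of ranking in Hamming space and scoring the result with AP can be illustrated with a minimal sketch. It is not the DLPR learning algorithm: the binary codes are assumed to be already learned and stored as Python integers, and the function names (`hamming_distance`, `top_n`, `average_precision`) are hypothetical. AP is normalized here by the total number of relevant items, which is one common convention.

```python
from typing import List, Sequence, Set


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def top_n(user_code: int, item_codes: Sequence[int], n: int) -> List[int]:
    """Indices of the n items closest to the user code in Hamming space.

    Ties are broken by item index so the ranking is deterministic.
    """
    order = sorted(
        range(len(item_codes)),
        key=lambda i: (hamming_distance(user_code, item_codes[i]), i),
    )
    return order[:n]


def average_precision(ranked: Sequence[int], relevant: Set[int]) -> float:
    """Sum of precision@k at each relevant rank, divided by len(relevant)."""
    hits = 0
    total = 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / k
    return total / len(relevant) if relevant else 0.0


# Toy 4-bit codes (made-up values, for illustration only).
user = 0b1010
items = [0b1010, 0b0101, 0b1000, 0b1111]
ranking = top_n(user, items, n=3)
ap = average_precision(ranking, relevant={0, 3})
```

The XOR-and-count step is the bit operation that makes retrieval cheap; a listwise objective such as AP depends on the whole ranked order rather than on individual item scores.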

NeurIPS Conference 2022 Conference Paper

Distribution-Informed Neural Networks for Domain Adaptation Regression

  • Jun Wu
  • Jingrui He
  • Sheng Wang
  • Kaiyu Guan
  • Elizabeth Ainsworth

In this paper, we study the problem of domain adaptation regression, which learns a regressor for a target domain by leveraging the knowledge from a relevant source domain. We start by proposing a distribution-informed neural network, which aims to build distribution-aware relationship of inputs and outputs from different domains. This allows us to develop a simple domain adaptation regression framework, which subsumes popular domain adaptation approaches based on domain invariant representation learning, reweighting, and adaptive Gaussian process. The resulting findings not only explain the connections of existing domain adaptation approaches, but also motivate the efficient training of domain adaptation approaches with overparameterized neural networks. We also analyze the convergence and generalization error bound of our framework based on the distribution-informed neural network. Specifically, our generalization bound focuses explicitly on the maximum mean discrepancy in the RKHS induced by the neural tangent kernel of distribution-informed neural network. This is in sharp contrast to the existing work which relies on domain discrepancy in the latent feature space heuristically formed by one or several hidden neural layers. The efficacy of our framework is also empirically verified on a variety of domain adaptation regression benchmarks.
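For readers unfamiliar with maximum mean discrepancy (MMD), the quantity can be sketched as follows. This is a hedged illustration only: it uses a Gaussian (RBF) kernel as a stand-in, whereas the abstract refers to the RKHS induced by the neural tangent kernel, and the function names and the `gamma` parameter are assumptions made for this example.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def rbf_kernel(x: Vector, y: Vector, gamma: float = 1.0) -> float:
    """Gaussian kernel exp(-gamma * ||x - y||^2); a stand-in for the NTK."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(source: Sequence[Vector], target: Sequence[Vector],
                gamma: float = 1.0) -> float:
    """Biased empirical estimate of squared MMD between two samples."""

    def mean_kernel(xs: Sequence[Vector], ys: Sequence[Vector]) -> float:
        total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


# Toy one-dimensional samples (made-up values).
src = [[0.0], [1.0]]
tgt = [[5.0], [6.0]]
same = mmd_squared(src, src)
shifted = mmd_squared(src, tgt)
```

The estimate is zero when both samples coincide and grows as the two samples move apart, which is why it can serve as a discrepancy term between source and target domains.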

IJCAI Conference 2022 Conference Paper

HEA-D: A Hybrid Evolutionary Algorithm for Diversified Top-k Weight Clique Search Problem

  • Jun Wu
  • Chu-Min Li
  • Yupeng Zhou
  • Minghao Yin
  • Xin Xu
  • Dangdang Niu

The diversified top-k weight clique (DTKWC) search problem is an important generalization of the diversified top-k clique (DTKC) search problem with extensive applications, which extends the DTKC search problem by taking into account the weight of vertices. In this paper, we formulate DTKWC search problem using mixed integer linear program constraints and propose an efficient hybrid evolutionary algorithm (HEA-D) that combines a clique-based crossover operator and an effective simulated annealing-based local optimization procedure to find high-quality local optima. The experimental results show that HEA-D performs much better than the existing methods on two representative real-world benchmarks.

JBHI Journal 2022 Journal Article

Interpretability Analysis of One-Year Mortality Prediction for Stroke Patients Based on Deep Neural Network

  • Shuo Zhang
  • Jing Wang
  • Lulu Pei
  • Kai Liu
  • Yuan Gao
  • Hui Fang
  • Rui Zhang
  • Lu Zhao

Clinically, physicians collect the benchmark medical data to establish archives for a stroke patient and then add the follow-up data regularly. This is of great significance for prognosis prediction for stroke patients. In this paper, we present an interpretable deep learning model to predict the one-year mortality risk on stroke. We design sub-modules to reconstruct features from original clinical data that highlight the dissimilarity and temporality of different variables. The model consists of Bidirectional Long Short-Term Memory (Bi-LSTM), in which a novel correlation attention module is proposed that takes the correlation of variables into consideration. In experiments, datasets are collected clinically from the department of neurology in a local AAA hospital. It consists of 2,275 stroke patients hospitalized in the department of neurology from 2014 to 2016. Our model achieves a precision of 0.9414, a recall of 0.9502 and an F1-score of 0.9415. In addition, we provide the analysis of the interpretability by visualizations with reference to clinical professional guidelines.

AAAI Conference 2021 Conference Paper

DeHiB: Deep Hidden Backdoor Attack on Semi-supervised Learning via Adversarial Perturbation

  • Zhicong Yan
  • Gaolei Li
  • Yuan Tian
  • Jun Wu
  • Shenghong Li
  • Mingzhe Chen
  • H. Vincent Poor

The threat of data-poisoning backdoor attacks on learning algorithms typically comes from the labeled data used for learning. However, in deep semi-supervised learning (SSL), unknown threats mainly stem from unlabeled data. In this paper, we propose a novel deep hidden backdoor (DeHiB) attack for SSL-based systems. In contrast to the conventional attacking methods, the DeHiB can feed malicious unlabeled training data to the semi-supervised learner so as to enable the SSL model to output premeditated results. In particular, a robust adversarial perturbation generator regularized by a unified objective function is proposed to generate poisoned data. To alleviate the negative impact of trigger patterns on model accuracy and improve the attack success rate, a novel contrastive data poisoning strategy is designed. Using the proposed data poisoning scheme, one can implant the backdoor into the SSL model using the raw data without handcrafted labels. Extensive experiments based on CIFAR10 and CIFAR100 datasets demonstrate the effectiveness and crypticity of the proposed scheme.

AAAI Conference 2021 Conference Paper

Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks

  • Xin Chen
  • Lingxi Xie
  • Jun Wu
  • Longhui Wei
  • Yuhui Xu
  • Qi Tian

Neural architecture search has attracted wide attention in both academia and industry. To accelerate it, researchers proposed weight-sharing methods which first train a super-network to reuse computation among different operators, from which exponentially many sub-networks can be sampled and efficiently evaluated. These methods enjoy great advantages in terms of computational costs, but the sampled sub-networks are not guaranteed to be estimated precisely unless an individual training process is taken. This paper attributes such inaccuracy to the inevitable mismatch between assembled network layers, so that there is a random error term added to each estimation. We alleviate this issue by training a graph convolutional network to fit the performance of sampled sub-networks so that the impact of random errors becomes minimal. With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates, which consequently leads to better performance of the final architecture. In addition, our approach also enjoys the flexibility of being used under different hardware constraints, since the graph convolutional network has provided an efficient lookup table of the performance of architectures in the entire search space.

AAAI Conference 2021 Short Paper

Local Search for Diversified Top-k s-plex Search Problem (Student Abstract)

  • Jun Wu
  • Minghao Yin

The diversified top-k s-plex (DTKSP) search problem aims to find k maximal s-plexes that cover the maximum number of vertices with lower overlapping in a given graph. In this paper, we first formalize the diversified top-k s-plex search problem and prove its NP-hardness. Second, we propose a local search algorithm for solving the diversified top-k s-plex search problem based on some novel ideas. Experiments on real-world massive graphs show the effectiveness of our algorithm.
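As background, an s-plex is commonly defined as a vertex set S in which every vertex is adjacent to at least |S| − s other vertices of S, so a 1-plex is a clique. The abstract does not state this definition, so it is used here as an assumption. The sketch below checks that property and counts covered vertices; it is not the local search algorithm of the paper, and the names `is_s_plex` and `coverage` are hypothetical.

```python
from typing import Dict, Iterable, Set

Graph = Dict[int, Set[int]]  # adjacency sets of an undirected graph


def is_s_plex(graph: Graph, vertices: Iterable[int], s: int) -> bool:
    """True if every vertex has at least |S| - s neighbours inside S."""
    members = set(vertices)
    needed = len(members) - s
    return all(len(graph[v] & members) >= needed for v in members)


def coverage(plexes: Iterable[Iterable[int]]) -> int:
    """Number of distinct vertices covered by a collection of s-plexes."""
    covered: Set[int] = set()
    for plex in plexes:
        covered |= set(plex)
    return len(covered)


# Toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
toy: Graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
triangle_is_clique = is_s_plex(toy, {0, 1, 2}, s=1)
whole_is_2_plex = is_s_plex(toy, {0, 1, 2, 3}, s=2)
whole_is_3_plex = is_s_plex(toy, {0, 1, 2, 3}, s=3)
covered = coverage([{0, 1, 2}, {2, 3}])
```

In the toy graph the triangle is a 1-plex, while the full vertex set only qualifies once s is relaxed to 3, because vertex 3 has a single neighbour.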

AAAI Conference 2021 Short Paper

Semi-Discrete Social Recommendation (Student Abstract)

  • Fangyuan Luo
  • Jun Wu
  • Haishuai Wang

Combining matrix factorization (MF) with network embedding (NE) has been a promising solution to social recommender systems. However, such a scheme suffers from the online predictive efficiency issue due to the ever-growing users and items. In this paper, we propose a novel hashing-based social recommendation model, called semi-discrete socially embedded matrix factorization (S2MF), which leverages the dual advantages of social information for recommendation effectiveness and hashing trick for online predictive efficiency. Experimental results demonstrate the advantages of S2MF over state-of-the-art discrete recommendation models and its real-valued competitors.

AAAI Conference 2020 Conference Paper

A Multi-Unit Profit Competitive Mechanism for Cellular Traffic Offloading

  • Jun Wu
  • Yu Qiao
  • Lei Zhang
  • Chongjun Wang
  • Meilin Liu

Cellular traffic offloading is nowadays an important problem in mobile networking. We model it as a procurement problem where each agent sells multi-units of a homogeneous item with privately known capacity and unit cost, and the auctioneer’s demand valuation function is symmetric submodular. Based on the framework of random sampling and profit extraction, we aim to design a prior-free mechanism which guarantees a profit competitive to the omniscient single-price auction. However, the symmetric submodular demand valuation function and 2-parameter setting present new challenges. By adopting the highest feasible clear price, we successfully design a truthful profit extractor, and then we propose a mechanism which is proved to be truthful, individually rational and constant-factor competitive in a fixed market.

IS Journal 2020 Journal Article

Semi-discrete Matrix Factorization

  • Jun Wu
  • Fangyuan Luo
  • Yujia Zhang
  • Haishuai Wang

Discrete matrix factorization (DMF) has been a promising solution to improve the inferring efficiency of matrix factorization (MF) against the rapidly growing numbers of users and items. However, DMF suffers from a serious encoding loss due to its oversimplified modeling on the original data geometry. In this article, we propose a semi-discrete matrix factorization (SDMF) model to combine the predicting efficacy of MF with the inferring efficiency of DMF. It first learns real-valued latent features by MF, and then, taking them as group-wise and point-wise smoothness, learns binary codes in the DMF framework, for preserving the geometrical structures collectively hidden in users and items, as well as aligning binary codes originated from Hamming space with their real-valued counterparts learned from vector space. Particularly, we devise a computationally efficient optimization algorithm to estimate model parameters. Extensive evaluations on three real-world datasets clearly demonstrate the superiority of our SDMF model over state-of-the-art hash-based recommendation methods.

UAI Conference 2020 Conference Paper

Structure Learning for Cyclic Linear Causal Models

  • Carlos Améndola
  • Philipp Dettling
  • Mathias Drton
  • Federica Onori
  • Jun Wu

We consider the problem of structure learning for linear causal models based on observational data. We treat models given by possibly cyclic mixed graphs, which allow for feedback loops and effects of latent confounders. Generalizing related work on bow-free acyclic graphs, we assume that the underlying graph is simple. This entails that any two observed variables can be related through at most one direct causal effect and that (confounding-induced) correlation between error terms in structural equations occurs only in absence of direct causal effects. We show that, despite new subtleties in the cyclic case, the considered simple cyclic models are of expected dimension and that a previously considered criterion for distributional equivalence of bow-free acyclic graphs has an analogue in the cyclic case. Our result on model dimension justifies in particular score-based methods for structure learning of linear Gaussian mixed graph models, which we implement via greedy search.

AAMAS Conference 2019 Conference Paper

Multi-unit Budget Feasible Mechanisms for Cellular Traffic Offloading

  • Jun Wu
  • Yuan Zhang
  • Yu Qiao
  • Lei Zhang
  • Chongjun Wang
  • Junyuan Xie

Cellular traffic offloading is nowadays an important problem in mobile networking. Since the offloading resource owners (agents) are self-interested and have private costs, it is highly challenging to design procurement mechanisms that motivate agents to reveal their true costs and achieve guaranteed performance under the constraint of a strict budget. In this paper, we model cellular traffic offloading as a multi-unit budget feasible procurement auction design problem with diminishing return valuations. We design a novel greedy-based randomized mechanism, and prove it is budget-feasible, truthful, individually rational and a (3 + 2 ln 𝑁)-approximation, where 𝑁 is the total number of available resource units. We also propose a deterministic mechanism which achieves a (2 + ln 𝑁 + √(2 + 3 ln 𝑁 + ln² 𝑁))-approximation. We prove no budget-feasible and truthful mechanism can do better than ln 𝑁-approximation in our setting, thus our mechanism approaches the optimal to a constant factor. In addition to solving the cellular traffic offloading problem, our work successfully extends solvable valuation class of greedy-based multi-unit budget-feasible mechanism with performance guarantees from the concave-additive valuations to more general local diminishing return valuations.
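To get a feel for how these guarantees compare, the three expressions can be evaluated numerically. This is a small arithmetic sketch, assuming the square root in the deterministic bound spans the whole expression 2 + 3 ln N + ln² N; the function names are hypothetical and the values of N are arbitrary.

```python
import math


def randomized_ratio(n_units: int) -> float:
    """Approximation ratio 3 + 2 ln N of the randomized mechanism."""
    return 3.0 + 2.0 * math.log(n_units)


def deterministic_ratio(n_units: int) -> float:
    """Ratio 2 + ln N + sqrt(2 + 3 ln N + ln^2 N), radical assumed to span all three terms."""
    log_n = math.log(n_units)
    return 2.0 + log_n + math.sqrt(2.0 + 3.0 * log_n + log_n ** 2)


def lower_bound(n_units: int) -> float:
    """The ln N bound that no budget-feasible truthful mechanism can beat."""
    return math.log(n_units)


# Tabulate the three quantities for a few arbitrary market sizes.
table = {
    n: (randomized_ratio(n), deterministic_ratio(n), lower_bound(n))
    for n in (1, 10, 100, 1000)
}
```

Under this reading, 2 + 3 ln N + ln² N = (ln N + 1)(ln N + 2), so the deterministic ratio lies strictly between 3 + 2 ln N and 4 + 2 ln N, and both guarantees grow like a constant multiple of the ln N lower bound.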

AAMAS Conference 2018 Conference Paper

Optimal Constraint Collection for Core-Selecting Path Mechanism

  • Hao Cheng
  • Lei Zhang
  • Yi Zhang
  • Jun Wu
  • Chongjun Wang

In path auctions, strategic bidders make bids for commodities. Each edge of the graph stands for a commodity and the weight on the edge represents the prime cost. The auctioneer needs to purchase a sequence of edges in order to get a path from one vertex to another at a low cost. Path auctions can be considered as a kind of combinatorial reverse auction. Computing prices in core-selecting combinatorial auctions is a computationally hard problem, and the same is true in core-selecting path auctions. This problem can be solved by the core constraint generation (CCG) algorithm. However, we find that there are many redundant constraints and the constraint collection can be more concise in the core-selecting path mechanism. In this paper, 1) we put forward a new approach to get the constraint collection, and reduce the constraint number from exponential O(2^n) to polynomial O(n^2), where n is the network diameter; 2) we prove that the new constraint collection is not only equivalent to the original collection, but also has no redundant constraint in the worst case; 3) we validate our approach on real-world datasets and obtain excellent results. Furthermore, we provide new insights to think over the core-selecting mechanism in combinatorial auctions.

AAAI Conference 2017 Short Paper

Boosting for Real-Time Multivariate Time Series Classification

  • Haishuai Wang
  • Jun Wu

Multivariate time series (MTS) are useful for detecting abnormal cases in healthcare. In this paper, we propose an ensemble boosting algorithm to classify abnormal surgery time series based on learning shapelet features. Specifically, we first learn shapelets by logistic regression from multivariate time series. Based on the learnt shapelets, we propose an MTS ensemble boosting approach for the case where the time series arrive in a streaming fashion. Experimental results on a real-world medical dataset demonstrate the effectiveness of the proposed methods.

AAMAS Conference 2017 Conference Paper

Mechanism Design for Social Law Synthesis under Incomplete Information

  • Jun Wu
  • Lei Zhang
  • Chongjun Wang
  • Junyuan Xie

For the social law synthesis problem, when the agents are rational in the sense of game theory and hold some information we need as private information, it naturally evolves into a setting that is perfectly addressed by the framework of algorithmic mechanism design. In this strategic setting, we are not only required to find out the feasible social law for the objective, but also required to formulate the right payment to the agents to induce incentive compatibility and individual rationality. We design a mechanism for this setting, prove that it satisfies all the required formal properties, and characterize the conditions for the existence of feasible mechanisms. Moreover, we show that the upper-bound of the total payment of the proposed mechanism is high.

AAMAS Conference 2017 Conference Paper

Synthesizing Optimal Social Laws for Strategical Agents via Bayesian Mechanism Design

  • Jun Wu
  • Lei Zhang
  • Chongjun Wang
  • Junyuan Xie

When rational behavior of the agents and private information are considered, the optimal social law synthesizing problem naturally evolves into a setting which can be handled by the framework of algorithmic mechanism design. We focus on the Bayesian case in this paper, that is, the probability distribution of each agent’s cost is known. It is easy to see that in this case our problem closely relates to path/spanning-tree auctions and Myerson’s optimal auction mechanism, but the optimization objective is new, that is, we focus on profit maximization instead of payment maximization. By studying this problem: we further extend the logic-based framework of social law optimization problem to the strategic case, and show that it becomes a new problem of algorithmic mechanism design; we find out a mechanism that is incentive compatible, individually rational and maximizes the expected profit for all input cost profiles; however, we can show that this mechanism is computationally intractable; so, we finally find out a tractable constant-factor approximation mechanism.

AAMAS Conference 2013 Conference Paper

On the Complexity of Undominated Core and Farsighted Solution Concepts in Coalitional Games

  • Yusen Zhan
  • Jun Wu
  • Chongjun Wang
  • Meilin Liu
  • Junyuan Xie

In this paper, we study the computational complexity of solution concepts in the context of coalitional games. Firstly, we distinguish two different kinds of core, the undominated core and excess core, and investigate the difference and relationship between them. Secondly, we thoroughly investigate the computational complexity of undominated core and three farsighted solution concepts—farsighted core, farsighted stable set and largest consistent set.

ICRA Conference 2011 Conference Paper

Control of upper-limb power-assist exoskeleton based on motion intention recognition

  • Weiguang Huo
  • Jian Huang 0001
  • Yongji Wang 0001
  • Jun Wu
  • Lei Cheng

Recognizing the user motion intention plays an important role in the study of power-assist robots. An intention-guided control strategy is proposed for the upper-limb power-assist exoskeleton. A force sensor system comprised of force sensing resistors (FSRs) is designed to online estimate the motion intention of user upper limb. A new concept called "intentional reaching direction (IRD)" is proposed to quantitatively describe this intention. Both the state model and the observation model of IRD are obtained by enumerating the upper limb behavior modes and analyzing the relationship between the measured force signals and the motion intention. Based on these two models, the IRD can be online inferred by applying filtering technology. Guided by the estimated IRD, an admittance control strategy is assumed to control the motions of three DC motors in the joints of the robotic arm. The effectiveness of the proposed approaches is finally confirmed by the experiments on a 3-DOF robotic exoskeleton.