Author name cluster

Zhao Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

28 papers

2 author rows

AAAI Conference 2026 Conference Paper

EnViT: Enhancing the Performance of Early-Exit Vision Transformers via Exit-Aware Structured Dropout-Enabled Self-Distillation

Yonghao Dong
Qiang He
Penghong Rui
Zhenzhe Zheng
Zhao Li
Feifei Chen
Hai Jin
Yun Yang

Vision Transformers (ViTs) have gained significant attention and widespread adoption due to their impressive performance in various computer vision tasks. However, in practice, their substantial computational overhead often leads to high inference latency and increased overheads when deployed on resource-constrained edge devices like smartphones, autonomous vehicles, and robots. To address these challenges, Early Exit (EE) has emerged as a promising approach for lightweight inference on edge devices. It accelerates inference and reduces computational overhead by adaptively producing predictions through early exits based on sample complexity. Existing EE methods typically suffer from substantial accuracy decreases in late exits while providing only marginal accuracy improvements to early exits. This paper presents EnViT, an exit-aware structured dropout-enabled self-distillation approach that enhances the performance of early exits without compromising late exits. EnViT leverages structured dropout to enable self-distillation, where the full model serves as the teacher and its own virtual sub-models generated by structured dropout as students. This mechanism effectively distills knowledge from the full model to early exits and avoids performance degradation in late exits by mitigating parameter conflicts across exits during training. Evaluation on five datasets shows that our EnViT achieves accuracy improvements ranging from 0.36% to 7.92% while maintaining competitive speed-up ratios of 1.72x to 2.23x.

AAAI Conference 2026 Conference Paper

LGAN: An Efficient High-Order Graph Neural Network via the Line Graph Aggregation

Lin Du
Lu Bai
Jincheng Li
Lixin Cui
Hangyuan Du
Lichi Zhang
Yuting Chen
Zhao Li

Graph Neural Networks (GNNs) have emerged as a dominant paradigm for graph classification. Specifically, most existing GNNs mainly rely on the message passing strategy between neighbor nodes, where the expressivity is limited by the 1-dimensional Weisfeiler-Lehman (1-WL) test. Although a number of k-WL-based GNNs have been proposed to overcome this limitation, their computational cost increases rapidly with k, significantly restricting the practical applicability. Moreover, since the k-WL models mainly operate on node tuples, these k-WL-based GNNs cannot retain fine-grained node- or edge-level semantics required by attribution methods (e.g., Integrated Gradients), leading to the less interpretable problem. To overcome the above shortcomings, in this paper, we propose a novel Line Graph Aggregation Network (LGAN), that constructs a line graph from the induced subgraph centered at each node to perform the higher-order aggregation. We theoretically prove that the LGAN not only possesses the greater expressive power than the 2-WL under injective aggregation assumptions, but also has lower time complexity. Empirical evaluations on benchmarks demonstrate that the LGAN outperforms state-of-the-art k-WL-based GNNs, while offering better interpretability.

AAAI Conference 2026 Conference Paper

Multi-Granular Graph Learning with Fine-Grained Behavioral Pattern Awareness for Session-Based Recommendation

Ming Li
Zihao Yan
Yuting Chen
Lixin Cui
Lu Bai
Feilong Cao
Ke Lv
Zhao Li

Session-based recommendation aims to predict users’ next actions by modeling their ongoing interaction sequences, particularly in scenarios where long-term user profiles are unavailable. While existing methods have achieved promising results by leveraging sequential and graph-based structures, they often rely on global aggregation strategies that emphasize dominant user interests while overlooking the transient and fine-grained behavior patterns embedded in sessions. In practice, user intent evolves across sessions and is reflected through diverse behavioral patterns, ranging from immediate preferences to segmented co-occurrence interests and long-range goals. To address these limitations, we propose GraphFine, a novel multi-granular graph learning framework that achieves fine-grained behavioral pattern awareness for session-based recommendation. Our approach models user behavior at different temporal and semantic granularities through a combination of graph and hypergraph neural networks. Specifically, we employ a position-aware graph to capture short-term item transitions, and construct segmented co-occurrence hypergraphs to uncover high-order semantic relations among co-occurred items. To preserve diverse user intents, we further introduce a multi-view intent readout mechanism that extracts and adaptively integrates intent signals from short-term actions, segmented co-occurrence patterns, and entire sessions. Extensive experiments on benchmark datasets demonstrate that GraphFine consistently outperforms existing state-of-the-art methods, confirming its effectiveness in capturing fine-grained and dynamic user preferences for more accurate recommendation.

AAAI Conference 2026 Conference Paper

PriAgent: A Collaborative Multi-Agent Framework for Auditing Android Privacy Compliance

Ziwei Zhang
Zhao Li
Zhuojun Jiang
Jiangyi Yin
Xuebin Wang
Jiangchao Chen
Qingyun Liu

Stringent regulations like General Data Protection Regulation (GDPR) mandate that an application's code-level data handling must align with its natural-language privacy policy, creating a critical auditing challenge. However, existing methods, predominantly reliant on static analysis, suffer from a critical limitation: in their pursuit of soundness via over-approximation, they exhibit "semantic blindness"—detecting what data flows exist but not why. This leads to an overwhelming volume of false positives, rendering automated auditing impractical. To bridge this gap, we introduce PriAgent, a novel framework that approaches compliance auditing as a multi-stage, AI-driven reasoning task. Instead of a monolithic model, PriAgent deploys a team of specialized agents that execute a divide-and-conquer strategy. They systematically prune the analysis space by abstracting data flows, pinpoint semantic loci critical for inspection, and perform on-demand summarization of large code blocks to ensure scalability. PriAgent leverages Retrieval-Augmented Generation (RAG) with a curated knowledge base of Android APIs, equipping agents to discern potentially non-compliant behavior from benign functionality. By correlating code-level evidence with the app's stated privacy policy, PriAgent delivers a holistic and explainable verdict for each potential violation. Our evaluations demonstrate that PriAgent significantly reduces false positives, enabling a more scalable and precise compliance audit.

AAAI Conference 2026 Conference Paper

rMMEA: Robust Multi-Modal Entity Alignment with Missing and Noise Visual Modality

Lingbing Guo
Zhuo Chen
Yichi Zhang
Wenbin Guo
Haonan Yang
Zhao Li
Zirui Chen
Xin Wang

Recently, multi-modal embedding methods have flourished in entity alignment. As state-of-the-art approaches evolve rapidly, visual modality (i.e., images) missing emerges as a critical challenge. While visual modality typically offers the most informative signals in multi-modal entity alignment (MMEA), it is frequently unavailable for many entities. The existing methods commonly use dummy vectors to represent visual-missing embeddings, which negatively impacts both model training and inference. In this paper, we propose robust multi-modal entity alignment (rMMEA), which leverages ranking-based knowledge distillation and mutual information (MI) estimation to address missing modalities while enhancing noise robustness. Unlike conventional teacher-student distillation that requires the student to replicate teacher outputs, our rMMEA learns soft rankings from pure and complete modality sides while capturing implicit key semantics of teacher embeddings through mutual information maximization, allowing rMMEA to avoid strict point-to-point alignment. The experimental results across multiple benchmarks and settings demonstrate that rMMEA significantly outperforms the state-of-the-art anti-modality-missing methods in terms of effectiveness and efficiency.

ICML Conference 2025 Conference Paper

EGPlace: An Efficient Macro Placement Method via Evolutionary Search with Greedy Repositioning Guided Mutation

Ji Deng
Zhao Li
Ji Zhang
Jun Gao

Macro placement, which involves optimizing the positions of modules, is a critical phase in modern integrated circuit design and significantly influences chip performance. The growing complexity of integrated circuits demands increasingly sophisticated placement solutions. Existing approaches have evolved along two primary paths (e.g., constructive and adjustment methods), but they face significant practical limitations that affect real-world chip design. Recent hybrid frameworks such as WireMask-EA have attempted to combine these strategies, but significant technical barriers still remain, including the computational overhead from separated layout adjustment and reconstruction that often require complete layout rebuilding, the inefficient exploration of design spaces due to random mutation operations, and the computational complexity of mask-based construction methods that limit scalability. To overcome these limitations, we introduce EGPlace, a novel evolutionary optimization framework that combines guided mutation strategies with efficient layout reconstruction. EGPlace introduces two key innovations: a greedy repositioning-guided mutation operator that systematically identifies and optimizes critical layout regions, and an efficient mask computation algorithm that accelerates layout evaluation. Our extensive evaluation using ISPD2005 and Ariane RISC-V CPU benchmarks demonstrates that EGPlace reduces wirelength by 10.8% and 9.3% compared to WireMask-EA and the state-of-the-art reinforcement learning-based constructive method EfficientPlace, respectively, while achieving speedups of 7.8× and 2.8× over these methods.

AAAI Conference 2025 Conference Paper

Learning Together Securely: Prototype-Based Federated Multi-Modal Hashing for Safe and Efficient Multi-Modal Retrieval

Ruifan Zuo
Chaoqun Zheng
Lei Zhu
Wenpeng Lu
Yuanyuan Xiang
Zhao Li
Xiaofeng Qu

With the proliferation of multi-modal data, safe and efficient multi-modal hashing retrieval has become a pressing research challenge, particularly due to concerns over data privacy during centralized processing. To address this, we propose Prototype-based Federated Multi-modal Hashing (PFMH), an innovative framework that seamlessly integrates federated learning with multi-modal hashing techniques. PFMH achieves fine-grained fusion of heterogeneous multi-modal data, enhancing retrieval accuracy while ensuring data privacy through prototype-based communication, thereby reducing communication costs and mitigating risks of data leakage. Furthermore, using a prototype completion strategy, PFMH tackles class imbalance and statistical heterogeneity in multi-modal data, improving model generalization and performance across diverse data distributions. Extensive experiments demonstrate the efficiency and effectiveness of PFMH within the federated learning framework, enabling distributed training for secure and precise multi-modal retrieval in real-world scenarios.

JBHI Journal 2025 Journal Article

Multi-View Fused Nonnegative Matrix Completion Methods for Drug-Target Interaction Prediction

Ting Li
Chuanqi Lao
Zhao Li
Hongyang Chen

Accurate prediction of drug-target interactions (DTIs) is crucial for accelerating drug discovery and reducing experimental costs. However, challenges such as sparse interactions and heterogeneous datasets complicate this prediction. In this study, we hypothesize that leveraging nonnegative matrix completion and integrating heterogeneous similarity information from multiple biological views can improve the accuracy, interpretability, and scalability of DTI prediction. To validate this, we propose two multi-view fused nonnegative matrix completion methods that combine three key components: (1) a nonnegative matrix completion framework that avoids heuristic rank selection and ensures biologically interpretable predictions; (2) a linear multi-view fusion mechanism, where weights over multiple drug and target similarity matrices are jointly learned through linearly constrained quadratic programming; and (3) multi-graph Laplacian regularization to preserve structural properties within each view. The optimization is performed using two efficient proximal linearization-incorporated block coordinate descent algorithms. Extensive experiments on four gold-standard datasets and a larger real-world dataset demonstrate that our models consistently outperform state-of-the-art single-view, multi-view and deep learning-based DTI prediction methods. Furthermore, ablation studies confirm the contribution of each model component, and scalability analysis highlights the computational efficiency of our approach.

IJCAI Conference 2025 Conference Paper

Optimizing the Battery-Swapping Problem in Urban E-Bike Systems with Reinforcement Learning

Wenjing Li
Zhao Li
Xuanwu Liu
Ruihao Zhu
Zhenzhe Zheng
Fan Wu

E-bikes (EBs) are a key transportation mode in urban areas, especially for couriers of delivery platforms, but underdeveloped EB systems can hinder couriers' productivity due to limited battery capacity. Battery-swapping stations address this issue by enabling riders to exchange depleted batteries for fully charged ones. However, managing supply and demand (SnD) imbalances at these stations has become increasingly complex. To address this, we introduce a new approach that formulates the Battery-Swapping Problem (BSP) as a discrete-time Markov Decision Process (MDP) to capture the dynamics of SnD imbalances. Building on it, we propose a Wasserstein-enhanced Proximal Policy Optimization (W-PPO) algorithm, which integrates Wasserstein distance with reinforcement learning to improve the robustness against uncertainty in forecasting SnD. W-PPO provides a BSP-specific, accurate loss function that reflects reward variations between two policies under real-world simulation. The algorithm’s effectiveness is assessed using key metrics: Shared Battery Utilization Ratio (SBUR) and Battery Supply Ratio (BSR). Simulations on real-world datasets show that W-PPO achieves a 30.59% improvement in SBUR and a 16.09% increase in BSR, supporting its practical applicability. By optimizing battery utilization and improving EB delivery systems, this work highlights the potential of AI for creating efficient and sustainable urban transportation solutions.

IJCAI Conference 2025 Conference Paper

Sharpness-aware Zeroth-order Optimization for Graph Transformers

Yang Liu
Chuan Zhou
Yuhan Lin
Shuai Zhang
Yang Gao
Zhao Li
Shirui Pan

Graph Transformers (GTs) have emerged as powerful tools for handling graph-structured data through global attention mechanisms. While GTs can effectively capture long-range dependencies, they introduce difficulties in optimization due to their complex, non-differentiable operators, which cannot be directly handled by standard gradient-based optimizers (such as Adam or AdamW). To investigate the above issues, this work adopts the line of Zeroth-Order Optimization (ZOO) technique. However, direct integration of ZOO incurs considerable challenges due to the sharp loss landscape and steep gradients within the GT parameter space. Under the above observations, we propose a Sharpness-aware Zeroth-order Optimizer (SZO) that combines Sharpness-Aware Minimization (SAM) technique facilitating convergence within a flatter neighborhood, and leverages parallel computing for efficient gradient estimation. Theoretically, we provide a comprehensive analysis of the optimizer from both convergence and generalization perspectives. Empirically, we conduct extensive experiments on various classical GTs across a wide range of benchmark datasets, which underscore the superior performance of SZO over the state-of-the-art optimizers.

IJCAI Conference 2025 Conference Paper

T2S: High-resolution Time Series Generation with Text-to-Series Diffusion Models

Yunfeng Ge
Jiawei Li
Yiji Zhao
Haomin Wen
Zhao Li
Meikang Qiu
Hongyan Li
Ming Jin

Text-to-Time Series generation holds significant potential to address challenges such as data sparsity, imbalance, and limited availability of multimodal time series data across domains. While diffusion models have achieved remarkable success in Text-to-X (e.g., vision and audio data) generation, their use in time series generation remains limited. Existing approaches face two critical limitations: (1) reliance on domain-specific captions that generalize poorly, and (2) inability to generate time series of arbitrary length, limiting real-world use. In this work, we first introduce a new multimodal dataset containing over 600,000 high-resolution text-time series pairs. Second, we propose Text-to-Series (T2S), a diffusion-based framework that bridges the gap between natural language and time series in a domain-agnostic manner. It employs a length-adaptive VAE to encode time series of varying lengths into consistent latent embeddings. On top of that, T2S effectively aligns textual representations with latent embeddings by utilizing Flow Matching and employing DiT as the denoiser. We train T2S in an interleaved paradigm across multiple lengths, allowing it to generate sequences of arbitrary lengths. Extensive evaluations demonstrate that T2S achieves state-of-the-art performance across 13 datasets spanning 12 domains.

IJCAI Conference 2025 Conference Paper

Understanding PII Leakage in Large Language Models: A Systematic Survey

Shuai Cheng
Zhao Li
Shu Meng
Mengxia Ren
Haitao Xu
Shuai Hao
Chuan Yue
Fan Zhang

Large Language Models (LLMs) have demonstrated exceptional success across a variety of tasks, particularly in natural language processing, leading to their growing integration into numerous facets of daily life. However, this widespread deployment has raised substantial privacy concerns, especially regarding personally identifiable information (PII), which can be directly associated with specific individuals. The leakage of such information presents significant real-world privacy threats. In this paper, we conduct a systematic investigation into existing research on PII leakage in LLMs, encompassing commonly utilized PII datasets, evaluation metrics, and current studies on both PII leakage attacks and defensive strategies. Finally, we identify unresolved challenges in the current research landscape and suggest future research directions.

IJCAI Conference 2022 Conference Paper

RMGN: A Regional Mask Guided Network for Parser-free Virtual Try-on

Chao Lin
Zhao Li
Sheng Zhou
Shichang Hu
Jialun Zhang
Linhao Luo
Jiarun Zhang
Longtao Huang

Virtual try-on (VTON) aims at fitting target clothes to reference person images, which is widely adopted in e-commerce. Existing VTON approaches can be narrowly categorized into Parser-Based (PB) and Parser-Free (PF) by whether relying on the parser information to mask the persons’ clothes and synthesize try-on images. Although abandoning parser information has improved the applicability of PF methods, the ability of detail synthesizing has also been sacrificed. As a result, the distraction from original cloth may persist in synthesized images, especially in complicated postures and high resolution applications. To address the aforementioned issue, we propose a novel PF method named Regional Mask Guided Network (RMGN). More specifically, a regional mask is proposed to explicitly fuse the features of target clothes and reference persons so that the persisted distraction can be eliminated. A posture awareness loss and a multi-level feature extractor are further proposed to handle the complicated postures and synthesize high resolution images. Extensive experiments demonstrate that our proposed RMGN outperforms both state-of-the-art PB and PF methods. Ablation studies further verify the effectiveness of modules in RMGN. Code is available at https://github.com/jokerlc/RMGN-VITON.

IJCAI Conference 2021 Conference Paper

MDNN: A Multimodal Deep Neural Network for Predicting Drug-Drug Interaction Events

Tengfei Lyu
Jianliang Gao
Ling Tian
Zhao Li
Peng Zhang
Ji Zhang

The interaction of multiple drugs could lead to serious events, which cause injuries and huge medical costs. Accurate prediction of drug-drug interaction (DDI) events can help clinicians make effective decisions and establish appropriate therapy programs. Recently, many AI-based techniques have been proposed for predicting DDI associated events. However, most existing methods pay less attention to the potential correlations between DDI events and other multimodal data such as targets and enzymes. To address this problem, we propose a Multimodal Deep Neural Network (MDNN) for DDI events prediction. In MDNN, we design a two-pathway framework including drug knowledge graph (DKG) based pathway and heterogeneous feature (HF) based pathway to obtain drug multimodal representations. Finally, a multimodal fusion neural layer is designed to explore the complementarity among the drug multimodal representations. We conduct extensive experiments on a real-world dataset. The results show that MDNN can accurately predict DDI events and outperform the state-of-the-art models.

TIST Journal 2021 Journal Article

TARA-Net: A Fusion Network for Detecting Takeaway Rider Accidents

Yifan He
Zhao Li
Lei Fu
Anhui Wang
Peng Zhang
Shuigeng Zhou
Ji Zhang
Ting Yu

In the emerging business of food delivery, rider traffic accidents raise financial cost and social traffic burden. Although there has been much effort on traffic accident forecasting using temporal-spatial prediction models, none of the existing work studies the problem of detecting the takeaway rider accidents based on food delivery trajectory data. In this article, we aim to detect whether a takeaway rider meets an accident on a certain time period based on trajectories of food delivery and riders’ contextual information. The food delivery data has a heterogeneous information structure and carries contextual information such as weather and delivery history, and trajectory data are collected as a spatial-temporal sequence. In this article, we propose a TakeAway Rider Accident detection fusion network TARA-Net to jointly model these heterogeneous and spatial-temporal sequence data. We utilize the residual network to extract basic contextual information features and take advantage of a transformer encoder to capture trajectory features. These embedding features are concatenated into a pyramidal feed-forward neural network. We jointly train the above three components to combine the benefits of spatial-temporal trajectory data and sparse basic contextual data for early detecting traffic accidents. Furthermore, although traffic accidents rarely happen in food delivery, we propose a sampling mechanism to alleviate the imbalance of samples when training the model. We evaluate the model on a transportation mode classification dataset Geolife and a real-world Ele.me dataset with over 3 million riders. The experimental results show that the proposed model is superior to the state-of-the-art.

IJCAI Conference 2020 Conference Paper

Collaboration Based Multi-Label Propagation for Fraud Detection

Haobo Wang
Zhao Li
Jiaming Huang
Pengrui Hui
Weiwei Liu
Tianlei Hu
Gang Chen

Detecting fraud users, who fraudulently promote certain target items, is a challenging issue faced by e-commerce platforms. Generally, many fraud users have different spam behaviors simultaneously, e.g., spam transactions, clicks, reviews and so on. Existing solutions have two main limitations: 1) the correlations among multiple spam behaviors are neglected; 2) large-scale computations are intractable when dealing with an enormous user set. To remedy these problems, this work proposes a collaboration based multi-label propagation (CMLP) algorithm. We first introduce a general-purpose version that involves collaboration technique to exploit label correlations. Specifically, it breaks the final prediction into two parts: 1) its own prediction part; 2) the prediction of others, i.e., collaborative part. Then, to accelerate it on large-scale e-commerce data, we propose a heterogeneous graph based variant that detects communities on the user-item graph directly. Both theoretical analysis and empirical results clearly validate the effectiveness and scalability of our proposals.

AAAI Conference 2020 Conference Paper

Multi-Question Learning for Visual Question Answering

Chenyi Lei
Lei Wu
Dong Liu
Zhao Li
Guoxin Wang
Haihong Tang
Houqiang Li

Visual Question Answering (VQA) raises a great challenge for computer vision and natural language processing communities. Most of the existing approaches consider video-question pairs individually during training. However, we observe that there are usually multiple (either sequentially generated or not) questions for the target video in a VQA task, and the questions themselves have abundant semantic relations. To explore these relations, we propose a new paradigm for VQA termed Multi-Question Learning (MQL). Inspired by the multi-task learning, MQL learns from multiple questions jointly together with their corresponding answers for a target video sequence. The learned representations of video-question pairs are then more general to be transferred for new questions. We further propose an effective VQA framework and design a training procedure for MQL, where the specifically designed attention network models the relation between input video and corresponding questions, enabling multiple video-question pairs to be co-trained. Experimental results on public datasets show the favorable performance of the proposed MQL-VQA framework compared to state-of-the-arts.

IJCAI Conference 2020 Conference Paper

TransRHS: A Representation Learning Method for Knowledge Graphs with Relation Hierarchical Structure

Fuxiang Zhang
Xin Wang
Zhao Li
Jianxin Li

Representation learning of knowledge graphs aims to project both entities and relations as vectors in a continuous low-dimensional space. Relation Hierarchical Structure (RHS), which is constructed by a generalization relationship named subRelationOf between relations, can improve the overall performance of knowledge representation learning. However, most of the existing methods ignore this critical information, and a straightforward way of considering RHS may have a negative effect on the embeddings and thus reduce the model performance. In this paper, we propose a novel method named TransRHS, which is able to incorporate RHS seamlessly into the embeddings. More specifically, TransRHS encodes each relation as a vector together with a relation-specific sphere in the same space. Our TransRHS employs the relative positions among the vectors and spheres to model the subRelationOf, which embodies the inherent generalization relationships among relations. We evaluate our model on two typical tasks, i.e., link prediction and triple classification. The experimental results show that our TransRHS model significantly outperforms all baselines on both tasks, which verifies that the RHS information is significant to representation learning of knowledge graphs, and TransRHS can effectively and efficiently fuse RHS into knowledge graph embeddings.

IJCAI Conference 2019 Conference Paper

ActiveHNE: Active Heterogeneous Network Embedding

Xia Chen
Guoxian Yu
Jun Wang
Carlotta Domeniconi
Zhao Li
Xiangliang Zhang

Heterogeneous network embedding (HNE) is a challenging task due to the diverse node types and/or diverse relationships between nodes. Existing HNE methods are typically unsupervised. To maximize the profit of utilizing the rare and valuable supervised information in HNEs, we develop a novel Active Heterogeneous Network Embedding (ActiveHNE) framework, which includes two components: Discriminative Heterogeneous Network Embedding (DHNE) and Active Query in Heterogeneous Networks (AQHN). In DHNE, we introduce a novel semi-supervised heterogeneous network embedding method based on graph convolutional neural network. In AQHN, we first introduce three active selection strategies based on uncertainty and representativeness, and then derive a batch selection method that assembles these strategies using a multi-armed bandit mechanism. ActiveHNE aims at improving the performance of HNE by feeding the most valuable supervision obtained by AQHN into DHNE. Experiments on public datasets demonstrate the effectiveness of ActiveHNE and its advantage on reducing the query cost.

IJCAI Conference 2019 Conference Paper

AddGraph: Anomaly Detection in Dynamic Graph Using Attention-based Temporal GCN

Li Zheng
Zhenpeng Li
Jian Li
Zhao Li
Jun Gao

Anomaly detection in dynamic graphs becomes very critical in many different application scenarios, e.g., recommender systems, while it also raises huge challenges due to the high flexible nature of anomaly and lack of sufficient labelled data. It is better to learn the anomaly patterns by considering all possible features including the structural, content and temporal features, rather than utilizing heuristic rules over the partial features. In this paper, we propose AddGraph, a general end-to-end anomalous edge detection framework using an extended temporal GCN (Graph Convolutional Network) with an attention model, which can capture both long-term patterns and the short-term patterns in dynamic graphs. In order to cope with insufficient explicit labelled data, we employ the negative sampling and margin loss in training of AddGraph in a semi-supervised fashion. We conduct extensive experiments on real-world datasets, and illustrate that AddGraph can outperform the state-of-the-art competitors in anomaly detection significantly.

AAAI Conference 2019 Conference Paper

Deep Cascade Multi-Task Learning for Slot Filling in Online Shopping Assistant

Yu Gong
Xusheng Luo
Yu Zhu
Wenwu Ou
Zhao Li
Muhua Zhu
Kenny Q. Zhu
Lu Duan

Slot filling is a critical task in natural language understanding (NLU) for dialog systems. State-of-the-art approaches treat it as a sequence labeling problem and adopt such models as BiLSTM-CRF. While these models work relatively well on standard benchmark datasets, they face challenges in the context of E-commerce where the slot labels are more informative and carry richer expressions. In this work, inspired by the unique structure of E-commerce knowledge base, we propose a novel multi-task model with cascade and residual connections, which jointly learns segment tagging, named entity tagging and slot filling. Experiments show the effectiveness of the proposed cascade and residual structures. Our model has a 14.6% advantage in F1 score over the strong baseline methods on a new Chinese E-commerce shopping assistant dataset, while achieving competitive accuracies on a standard dataset. Furthermore, online test deployed on such dominant E-commerce platform shows 130% improvement on accuracy of understanding user utterances. Our model has already gone into production in the E-commerce platform.

TIST Journal 2019 Journal Article

Deep Multi-scale Discriminative Networks for Double JPEG Compression Forensics

Cheng Deng
Zhao Li
Xinbo Gao
Dacheng Tao

As JPEG is the most widely used image format, the importance of tampering detection for JPEG images in blind forensics is self-evident. In this area, extracting effective statistical characteristics from a JPEG image for classification remains a challenge. Effective features are designed manually in traditional methods, suggesting that extensive labor-consuming research and derivation is required. In this article, we propose a novel image tampering detection method based on deep multi-scale discriminative networks (MSD-Nets). The multi-scale module is designed to automatically extract multiple features from the discrete cosine transform (DCT) coefficient histograms of the JPEG image. This module can capture the characteristic information in different scale spaces. In addition, a discriminative module is also utilized to improve the detection effect of the networks in those difficult situations when the first compression quality (QF1) is higher than the second one (QF2). A special network in this module is designed to distinguish the small statistical difference between authentic and tampered regions in these cases. Finally, a probability map can be obtained and the specific tampering area is located using the last classification results. Extensive experiments demonstrate the superiority of our proposed method in both quantitative and qualitative metrics when compared with state-of-the-art approaches.

AAMAS Conference 2019 Conference Paper

Fraud Regulating Policy for E-Commerce via Constrained Contextual Bandits

Zehong Hu
Zhen Wang
Zhao Li
Shichang Hu
Shasha Ruan
Jie Zhang

Fraud sellers in e-commerce often promote themselves via fake visits or purchases to increase sales, jeopardizing the business environment of the platform. How to regulate the exposure of these sellers to buyers without affecting normal online business remains a challenging problem, since blocking them entirely without discrimination may kill the normal transactions and could potentially decrease the total transactions of the platform. To address this problem, we introduce a regulating valve which blocks fraud sellers with a certain probability. To learn the optimal blocking policy, we model the regulating valve as a contextual bandit problem with a constraint on the total transaction decline. Since existing bandit algorithms are unable to incorporate the transaction constraint, we propose a novel bandit algorithm, which decides the policy based on a set of neural networks and iteratively updates the neural networks with online observations and the constraint. Experiments on synthetic data and one of the largest e-commerce platforms in the world both show that our algorithm effectively and efficiently outperforms existing bandit algorithms by a large margin.

AAAI Conference 2019 Conference Paper

General Robustness Evaluation of Incentive Mechanism against Bounded Rationality Using Continuum-Armed Bandits

Zehong Hu
Jie Zhang
Zhao Li

Incentive mechanisms that assume agents to be fully rational may fail due to the bounded rationality of agents in practice. It is thus crucial to evaluate to what extent mechanisms can resist agents’ bounded rationality, termed robustness. In this paper, we propose a general empirical framework for robustness evaluation. One novelty of our framework is to develop a robustness formulation that is generally applicable to different types of incentive mechanisms and bounded rationality models. This formulation considers not only the incentives to agents but also the performance of mechanisms. The other novelty lies in converting the empirical robustness computation into a continuum-armed bandit problem, and then developing an efficient solver that has a theoretically guaranteed upper bound on the error rate. We also conduct extensive experiments using various mechanisms to verify the advantages and practicability of our robustness evaluation framework.

IJCAI Conference 2018 Conference Paper

Impression Allocation for Combating Fraud in E-commerce Via Deep Reinforcement Learning with Action Norm Penalty

Mengchen Zhao
Zhao Li
Bo An
Haifeng Lu
Yifan Yang
Chen Chu

Conducting fraud transactions has become popular among e-commerce sellers to make their products favorable to the platform and buyers, which decreases the utilization efficiency of buyer impressions and jeopardizes the business environment. Fraud detection techniques are necessary but not enough for the platform since it is impossible to recognize all the fraud transactions. In this paper, we focus on improving the platform's impression allocation mechanism to maximize its profit and reduce the sellers' fraudulent behaviors simultaneously. First, we learn a seller behavior model to predict the sellers' fraudulent behaviors from the real-world data provided by one of the largest e-commerce companies in the world. Then, we formulate the platform's impression allocation problem as a continuous Markov Decision Process (MDP) with unbounded action space. In order to make the action executable in practice and facilitate learning, we propose a novel deep reinforcement learning algorithm DDPG-ANP that introduces an action norm penalty to the reward function. Experimental results show that our algorithm significantly outperforms existing baselines in terms of scalability and solution quality.
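The idea of adding an action norm penalty to the reward can be sketched as follows. This is only an illustration of the general technique, not the DDPG-ANP algorithm itself: the use of the Euclidean norm, the linear form of the penalty, and the names `penalized_reward` and `penalty_coeff` are assumptions.

```python
import math


def penalized_reward(reward, action, penalty_coeff=0.1):
    """Subtract a penalty proportional to the L2 norm of the action.

    Assumption: the penalty is linear in the Euclidean norm. Large
    actions are discouraged, which keeps an otherwise unbounded
    action space within a practically executable range.
    """
    action_norm = math.sqrt(sum(a * a for a in action))
    return reward - penalty_coeff * action_norm


if __name__ == "__main__":
    # The same raw reward is worth less when reached with a larger action.
    print(penalized_reward(1.0, [0.3, 0.4]))
    print(penalized_reward(1.0, [3.0, 4.0]))
```

In an actor-critic setup, a shaped reward of this kind would replace the raw environment reward when computing the critic's learning targets.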

NeurIPS Conference 2018 Conference Paper

Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing

Zehong Hu
Yitao Liang
Jie Zhang
Zhao Li
Yang Liu

Incentive mechanisms for crowdsourcing are designed to incentivize financially self-interested workers to generate and report high-quality labels. Existing mechanisms are often developed as one-shot static solutions, assuming a certain level of knowledge about worker models (expertise levels, costs for exerting efforts, etc.). In this paper, we propose a novel inference aided reinforcement mechanism that acquires data sequentially and requires no such prior assumptions. Specifically, we first design a Gibbs sampling augmented Bayesian inference algorithm to estimate workers' labeling strategies from the collected labels at each step. Then we propose a reinforcement incentive learning (RIL) method, building on top of the above estimates, to uncover how workers respond to different payments. RIL dynamically determines the payment without accessing any ground-truth labels. We theoretically prove that RIL is able to incentivize rational workers to provide high-quality labels both at each step and in the long run. Empirical results show that our mechanism performs consistently well under both rational and non-fully rational (adaptive learning) worker models. Besides, the payments offered by RIL are more robust and have lower variances compared to existing one-shot mechanisms.

AAAI Conference 2013 Conference Paper

Online Group Feature Selection from Feature Streams

Haiguang Li
Xindong Wu
Zhao Li
Wei Ding

Standard feature selection algorithms deal with given candidate feature sets at the individual feature level. When features exhibit certain group structures, it is beneficial to conduct feature selection in a grouped manner. For high-dimensional features, it can be far preferable to generate and process features online, one at a time, rather than wait for all features to be generated before learning begins. In this paper, we discuss a new and interesting problem of online group feature selection from feature streams, at both the group and individual feature levels simultaneously. Extensive experiments on both real-world and synthetic datasets demonstrate the superiority of the proposed algorithm.

AAAI Conference 2010 Conference Paper

A Phrase-Based Method for Hierarchical Clustering of Web Snippets

Zhao Li
Xindong Wu

Document clustering has been applied in web information retrieval, which facilitates users’ quick browsing by organizing retrieved results into different groups. Meanwhile, a tree-like hierarchical structure is well-suited for organizing the retrieved results in favor of web users. In this regard, we introduce a new method for hierarchical clustering of web snippets by exploiting a phrase-based document index. In our method, a hierarchy of web snippets is built based on phrases instead of all snippets, and the snippets are then assigned to the corresponding clusters consisting of phrases. We show that, as opposed to the traditional hierarchical clustering, our method not only presents meaningful cluster labels but also improves clustering performance.