Arrow Research search

Author name cluster

He Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

36 papers
2 author rows

Possible papers


TAAS Journal 2026 Journal Article

Adaptive Scheduling of Multimodal Large Language Model in Intelligent Edge Computing

  • Xingyu Yuan
  • He Li
  • Mianxiong Dong
  • Kaoru Ota

Multimodal Large Language Models (MLLMs) integrate multimodal encoders with Large Language Models (LLMs) to overcome the limitations of text-only models. Traditional LLMs are deployed on high-performance cloud servers, but MLLMs, which process multimodal data, face high transmission latency and privacy risks when tasks are offloaded to the cloud. Intelligent edge computing is a promising solution for supporting such latency-sensitive and privacy-sensitive tasks. However, the heterogeneity of edge environments makes efficient MLLM inference challenging. In this work, we enhance MLLM inference efficiency in heterogeneous edge environments by decoupling the MLLM into an LLM and multimodal encoders, deploying the LLM on high-performance devices and the multimodal encoders on lower-capability devices. Additionally, we observe that processing MLLM tasks in edge environments involves numerous configuration parameters that impact inference speed and energy consumption in an unknown and possibly time-varying fashion. To address this challenge, we present an adaptive scheduling algorithm that assigns parameters to tasks to minimize energy consumption while meeting maximum latency constraints. The results of extensive experimental trials demonstrate that the proposed approach consistently outperforms existing state-of-the-art methods, achieving significant improvements in both latency reduction and energy efficiency.

AAAI Conference 2026 Conference Paper

ARGH-Mark: Anchor-Synchronized Watermarking with Hamming Correction for Robust and Quality-Preserving LLM Attribution

  • He Li
  • Xiaojun Chen
  • Jingcheng He
  • Zhendong Zhao
  • Shuguang Yuan
  • Xin Zhao
  • Yunfei Yang

The proliferation of large language models has intensified demands for reliable content attribution, yet existing watermarking techniques face a fundamental trilemma: they cannot simultaneously optimize for robustness against attacks, minimal text quality degradation, and detection efficiency. To resolve this challenge, we propose ARGH-Mark, a novel watermarking framework that integrates three synergistic innovations: (1) Anchor-synchronized phase recovery for maintaining detection integrity under insertion/deletion attacks, (2) RG-balanced vocabulary modulation that dynamically partitions lexicons via contextual hashing to preserve generation quality, and (3) Hamming-based error correction enabling single-bit error rectification through algebraic coding. Comprehensive evaluations across question answering (ELI5), summarization (CNN/DailyMail), and text generation (C4) demonstrate state-of-the-art performance: the proposed ARGH-Mark framework achieves near-perfect match rate and bit accuracy across diverse configurations, while preserving the quality of the generated text. It significantly reduces detection latency, enabling real-time extraction, and maintains high robustness against token tampering attacks through integrated Hamming error correction, ensuring reliable attribution in adversarial settings. ARGH-Mark achieves a new Pareto frontier in the watermarking design space and advances trustworthy deployment of generative AI in alignment-critical applications.
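The abstract above mentions Hamming-based error correction for single-bit errors. As a minimal sketch of that general idea only, the following implements the textbook Hamming(7,4) code in Python. The function names `hamming74_encode` and `hamming74_decode` are hypothetical, and nothing here is taken from the ARGH-Mark implementation, whose code parameters and bit layout are not stated in the abstract.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout (1-indexed positions): p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit; return (data_bits, error_position).

    error_position is 0 when no error was detected, otherwise the
    1-indexed position of the bit that was flipped back.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # the syndrome is the position of the error
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    message = [1, 0, 1, 1]
    sent = hamming74_encode(message)
    received = list(sent)
    received[4] ^= 1  # simulate one tampered bit
    recovered, position = hamming74_decode(received)
    print(recovered == message, position)
```

A code of this kind corrects any single flipped bit per 7-bit block; two or more flips in one block are beyond what it can repair.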

AAAI Conference 2026 Conference Paper

DeepTracer: Tracing Stolen Model via Deep Coupled Watermarks

  • Yunfei Yang
  • Xiaojun Chen
  • Yuexin Xuan
  • Zhendong Zhao
  • Xin Zhao
  • He Li

Model watermarking techniques can embed watermark information into the protected model for ownership declaration by constructing specific input-output pairs. However, existing watermarks are easily removed under model stealing attacks, making it difficult for model owners to effectively verify the copyright of stolen models. In this paper, we analyze the root cause of the failure of current watermarking methods under model stealing scenarios and then explore potential solutions. Specifically, we introduce a robust watermarking framework, DeepTracer, which leverages a novel watermark sample construction method and a same-class coupling loss constraint. DeepTracer induces a tight coupling between the watermark task and the primary task, so that adversaries inevitably learn the hidden watermark task when stealing the primary task functionality. Furthermore, we propose an effective watermark sample filtering mechanism that carefully selects the watermark key samples used in model ownership verification to enhance the reliability of watermarks. Extensive experiments across multiple datasets and models demonstrate that our method surpasses existing approaches in defending against various model stealing attacks, as well as watermark attacks, and achieves new state-of-the-art effectiveness and robustness.

JBHI Journal 2025 Journal Article

A Home-based Dual-mode Upper Limb Rehabilitation System: Teleoperation Mode and Bilateral Mode with sEMG and IMU

  • He Li
  • Shuxiang Guo
  • Ruijie He
  • Bin Wang
  • Mingchao Ding

Upper limb hemiplegia is a common functional disorder among stroke patients, significantly affecting their quality of life. To address this issue, robot-assisted upper limb rehabilitation training has emerged as a new therapeutic approach, breaking through time and space limitations of traditional rehabilitation. Based on the above, a home-based dual-mode upper limb rehabilitation system is built, including teleoperation mode based on a cloud server and bilateral mode with fusion of Surface Electromyography (sEMG) and Inertial Measurement Unit (IMU). In the telerehabilitation mode, patients can receive professional guidance and regular training at home, greatly enhancing the accessibility of rehabilitation services. The experiments with the master side in Beijing City (China) and the slave side in three different cities are conducted through a cloud server. The slave side is controlled by the master side, and the contact force is sent back to the master side. In the bilateral mode, the intention of continuous movements across subjects can be accurately predicted via the fusion of sEMG and IMU, improving the naturalness of human-robot interaction. In the subject-independent modeling, the Root Mean Square Error (RMSE) under fusion showed a relative decrease of 15.0329% (p < 10^-4) compared to IMU data alone, and a significantly greater reduction of 61.9376% (p < 10^-4) in comparison with sEMG data alone. Robot-assisted upper limb exoskeleton, cloud-based teleoperation and bilateral training based on sEMG and IMU collectively form a new rehabilitation system, representing part of the future rehabilitation trend.

NeurIPS Conference 2025 Conference Paper

DKDR: Dynamic Knowledge Distillation for Reliability in Federated Learning

  • Yueyang Yuan
  • Wenke Huang
  • Frank Wan
  • Kaiqi Guan
  • He Li
  • Mang Ye

Federated Learning (FL) has demonstrated a promising future in privacy-friendly collaboration, but it faces the data heterogeneity problem. Knowledge Distillation (KD) can serve as an effective method to address this issue. However, challenges arise from the unreliability of existing distillation methods in multi-domain scenarios. Prevalent distillation solutions primarily aim to fit the distributions of the global model directly by minimizing forward Kullback-Leibler divergence (KLD). This results in significant bias when the outputs of the global model are multi-peaked, which indicates the unreliability of the distillation pathway. Meanwhile, cross-domain update conflicts can notably reduce the accuracy of the global model (teacher model) in certain domains, reflecting the unreliability of the teacher model in these domains. In this work, we propose DKDR (Dynamic Knowledge Distillation for Reliability in Federated Learning), which dynamically assigns weights to forward and reverse KLD based on knowledge discrepancies. This enables clients to fit the outputs from the teacher precisely. Moreover, we use knowledge decoupling to identify domain experts, so that clients can acquire reliable domain knowledge from experts. Empirical results from single-domain and multi-domain image classification tasks demonstrate the effectiveness of the proposed method and the efficiency of its key modules. The code is available at https://github.com/YueyangYuan/DKDR.
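To make the forward versus reverse KLD distinction in the abstract above concrete, here is a minimal Python sketch of a weighted mix of the two divergences between a teacher and a student distribution. The names `kl_divergence` and `mixed_kl` and the fixed weight `alpha` are assumptions for illustration; the abstract says the weights are assigned dynamically from knowledge discrepancies but does not give the rule, so this is not the DKDR weighting scheme.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # a zero-probability term contributes nothing
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


def mixed_kl(teacher, student, alpha):
    """Weighted sum of forward and reverse KL.

    alpha is a hypothetical fixed weight in [0, 1]; a dynamic scheme
    would compute it per sample instead of taking it as an argument.
    """
    forward = kl_divergence(teacher, student)  # KL(teacher || student)
    reverse = kl_divergence(student, teacher)  # KL(student || teacher)
    return alpha * forward + (1.0 - alpha) * reverse


if __name__ == "__main__":
    teacher = [0.7, 0.2, 0.1]
    student = [0.5, 0.3, 0.2]
    print(kl_divergence(teacher, student))
    print(kl_divergence(student, teacher))
    print(mixed_kl(teacher, student, alpha=0.5))
```

The two directions generally differ: forward KL penalizes the student for missing any mode of the teacher, while reverse KL penalizes the student for placing mass where the teacher has little, which is why mixing them can matter for multi-peaked teacher outputs.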

IJCAI Conference 2025 Conference Paper

Horae: A Domain-Agnostic Language for Automated Service Regulation

  • Yutao Sun
  • Mingshuai Chen
  • Tiancheng Zhao
  • Kangjia Zhao
  • He Li
  • Jintao Chen
  • Zhongyi Wang
  • Liqiang Lu

Artificial intelligence is rapidly encroaching on the field of service regulation. However, existing AI-based regulation techniques are often tailored to specific application domains and thus are difficult to generalize in an automated manner. This paper presents Horae, a unified specification language for modeling (multimodal) regulation rules across a diverse set of domains. We showcase how Horae facilitates an intelligent service regulation pipeline by further exploiting a fine-tuned large language model named RuleGPT that automates the Horae modeling process, thereby yielding an end-to-end framework for fully automated intelligent service regulation. The feasibility and effectiveness of our framework are demonstrated over a benchmark of various real-world regulation domains. In particular, we show that our open-sourced, fine-tuned RuleGPT with 7B parameters suffices to outperform GPT-3.5 and perform on par with GPT-4o.

NeurIPS Conference 2025 Conference Paper

MoodAngels: A Retrieval-augmented Multi-agent Framework for Psychiatry Diagnosis

  • Mengxi Xiao
  • Ben Liu
  • He Li
  • Jimin Huang
  • Qianqian Xie
  • Xiaofen Zong
  • Mang Ye
  • Min Peng

The application of AI in psychiatric diagnosis faces significant challenges, including the subjective nature of mental health assessments, symptom overlap across disorders, and privacy constraints limiting data availability. To address these issues, we present MoodAngels, the first specialized multi-agent framework for mood disorder diagnosis. Our approach combines granular-scale analysis of clinical assessments with a structured verification process, enabling more accurate interpretation of complex psychiatric data. Complementing this framework, we introduce MoodSyn, an open-source dataset of 1,173 synthetic psychiatric cases that preserves clinical validity while ensuring patient privacy. Experimental results demonstrate that MoodAngels outperforms conventional methods, with our baseline agent achieving 12.3% higher accuracy than GPT-4o on real-world cases, and our full multi-agent system delivering further improvements. Together, these contributions provide both an advanced diagnostic tool and a critical research resource for computational psychiatry, bridging important gaps in AI-assisted mental health assessment.

AAAI Conference 2025 Conference Paper

NightReID: A Large-Scale Nighttime Person Re-Identification Benchmark

  • Yuxuan Zhao
  • Weijian Ruan
  • He Li
  • Mang Ye

Person re-identification (Re-ID) is crucial for intelligent surveillance systems, facilitating the identification of individuals across multiple camera views. While significant advancements have been made for daytime scenarios, ensuring reliable Re-ID performance during nighttime remains a significant challenge. Given the cost and limited accessibility of infrared cameras, we investigate a critical question: Can RGB cameras be effectively utilized for accurate Re-ID during nighttime? To address this, we introduce NightReID, a large-scale RGB Re-ID dataset collected from a real-world nighttime surveillance system. NightReID includes 1,500 identities and over 53,000 images, capturing diverse scenes with complex lighting and adverse weather conditions. This rich dataset provides a valuable benchmark for advancing nighttime Re-ID research. Moreover, we propose the Enhancement, Denoising, and Alignment (EDA) framework with two novel modules to enhance nighttime Re-ID performance. First, an unsupervised Image Enhancement and Denoising (IED) method is designed to improve the quality of nighttime images, preserving critical details while removing noise without requiring paired ground truth. Second, we introduce Data Distribution Alignment (DDA) through statistical priors, aligning the distributions between pre-training data and nighttime data to mitigate domain shift. Extensive experiments on multiple nighttime Re-ID datasets demonstrate the significance of NightReID and validate the efficacy, flexibility, and applicability of the EDA framework.

IJCAI Conference 2025 Conference Paper

Pixel-wise Divide and Conquer for Federated Vessel Segmentation

  • Tian Chen
  • Wenke Huang
  • Zhihao Wang
  • Zekun Shi
  • He Li
  • Wenhui Dong
  • Mang Ye
  • Bo Du

Accurate vessel segmentation is essential for diagnosing and managing vascular and ophthalmic diseases. Traditional learning-based vessel segmentation methods heavily rely on high-quality, pixel-level annotated datasets. However, segmentation performance suffers significantly when applied in federated learning settings due to vessel morphology inconsistency and vessel-background imbalance. The former limits the ability of models to capture fine-grained vessels, while the latter overemphasizes background pixels and biases the model towards them. To address these challenges, we propose a novel method named Federated Vessel-Aware Calibration (FVAC), which leverages global uncertainty to provide differentiated guidance for clients, focusing on pixels of various morphologies that are difficult to distinguish. Furthermore, we introduce a foreground-background decoupling alignment strategy that utilizes more stable and balanced global features to mitigate semantic drift caused by vessel-background imbalance in local clients. Comprehensive experiments confirm the effectiveness of our method.

IJCAI Conference 2025 Conference Paper

Prototype-guided Knowledge Propagation with Adaptive Learning for Lifelong Person Re-identification

  • Zhijie Lu
  • Wuxuan Shi
  • He Li
  • Mang Ye

Lifelong Person Re-identification (LReID) is essential in dynamic camera networks, which continually adapts to new environments while preserving previously acquired knowledge. Existing LReID techniques often preserve samples from past datasets to maintain old knowledge, potentially leading to privacy risks. While prototype-based methods offer privacy advantages, current approaches primarily focus on adjusting classifiers for image classification tasks, neglecting representation biases between old and new identities in person re-identification. This study introduces a novel Prototype-guided Knowledge Propagation (PKP) method, which mitigates discrepancies in similar identity images between old and new tasks by guiding prototype construction through triplet loss constraints. Additionally, to address disparities between prototypes and the updated feature extractor, an Adaptive Parameter Evolution (APE) strategy is proposed. APE optimizes the integration of the old and new models by assessing the importance of the new tasks, dynamically selecting the most pertinent parameters for updates according to their contribution to the current task. Extensive experiments on the LReID benchmark demonstrate that our approach surpasses state-of-the-art prototype-based LReID methods in terms of mAP and rank-1 accuracy. Code is available at https://github.com/joyner-7/IJCAI2025-PKA.

IROS Conference 2025 Conference Paper

TextInPlace: Indoor Visual Place Recognition in Repetitive Structures with Scene Text Spotting and Verification

  • Huaqi Tao
  • Bingxi Liu 0001
  • Calvin Chen
  • Tingjun Huang
  • He Li
  • Jinqiang Cui
  • Hong Zhang 0013

Visual Place Recognition (VPR) is a crucial capability for long-term autonomous robots, enabling them to identify previously visited locations using visual information. However, existing methods remain limited in indoor settings due to the highly repetitive structures inherent in such environments. We observe that scene texts frequently appear in indoor spaces and can help distinguish visually similar but different places. This inspires us to propose TextInPlace, a simple yet effective VPR framework that integrates Scene Text Spotting (STS) to mitigate visual perceptual ambiguity in repetitive indoor environments. Specifically, TextInPlace adopts a dual-branch architecture within a local parameter sharing network. The VPR branch employs attention-based aggregation to extract global descriptors for coarse-grained retrieval, while the STS branch utilizes a bridging text spotter to detect and recognize scene texts. Finally, the discriminative texts are filtered to compute text similarity and re-rank the top-K retrieved images. To bridge the gap between current text-based repetitive indoor scene datasets and the typical scenarios encountered in robot navigation, we establish an indoor VPR benchmark dataset, called Maze-with-Text. Extensive experiments on both custom and public datasets demonstrate that TextInPlace achieves superior performance over existing methods that rely solely on appearance information. The dataset, code, and trained models are publicly available at https://github.com/HqiTao/TextInPlace.

ICML Conference 2025 Conference Paper

Transformer-Based Spatial-Temporal Counterfactual Outcomes Estimation

  • He Li
  • Haoang Chi
  • Mingyu Liu
  • Wanrong Huang
  • Liyang Xu
  • Wenjing Yang 0002

The real world naturally has dimensions of time and space. Therefore, estimating the counterfactual outcomes with spatial-temporal attributes is a crucial problem. However, previous methods are based on classical statistical models, which still have limitations in performance and generalization. This paper proposes a novel framework for estimating counterfactual outcomes with spatial-temporal attributes using the Transformer, exhibiting stronger estimation ability. Under mild assumptions, the proposed estimator within this framework is consistent and asymptotically normal. To validate the effectiveness of our approach, we conduct simulation experiments and real data experiments. Simulation experiments show that our estimator has a stronger estimation capability than baseline methods. Real data experiments provide a valuable conclusion to the causal effect of conflicts on forest loss in Colombia. The source code is available at this URL.

NeurIPS Conference 2024 Conference Paper

Autoregressive Image Generation without Vector Quantization

  • Tianhong Li
  • Yonglong Tian
  • He Li
  • Mingyang Deng
  • Kaiming He

Conventional wisdom holds that autoregressive models for image generation are typically accompanied by vector-quantized tokens. We observe that while a discrete-valued space can facilitate representing a categorical distribution, it is not a necessity for autoregressive modeling. In this work, we propose to model the per-token probability distribution using a diffusion procedure, which allows us to apply autoregressive models in a continuous-valued space. Rather than using categorical cross-entropy loss, we define a Diffusion Loss function to model the per-token probability. This approach eliminates the need for discrete-valued tokenizers. We evaluate its effectiveness across a wide range of cases, including standard autoregressive models and generalized masked autoregressive (MAR) variants. By removing vector quantization, our image generator achieves strong results while enjoying the speed advantage of sequence modeling. We hope this work will motivate the use of autoregressive generation in other continuous-valued domains and applications. Code is available at https://github.com/LTH14/mar.

NeurIPS Conference 2024 Conference Paper

Parameter Disparities Dissection for Backdoor Defense in Heterogeneous Federated Learning

  • Wenke Huang
  • Mang Ye
  • Zekun Shi
  • Guancheng Wan
  • He Li
  • Bo Du

Backdoor attacks pose a serious threat to federated systems, where malicious clients optimize on the triggered distribution to mislead the global model towards a predefined target. Existing backdoor defense methods typically require either homogeneous assumption, validation datasets, or client optimization conflicts. In our work, we observe that benign heterogeneous distributions and malicious triggered distributions exhibit distinct parameter importance degrees. We introduce the Fisher Discrepancy Cluster and Rescale (FDCR) method, which utilizes Fisher Information to calculate the degree of parameter importance for local distributions. This allows us to reweight client parameter updates and identify those with large discrepancies as backdoor attackers. Furthermore, we prioritize rescaling important parameters to expedite adaptation to the target distribution, encouraging significant elements to contribute more while diminishing the influence of trivial ones. This approach enables FDCR to handle backdoor attacks in heterogeneous federated learning environments. Empirical results on various heterogeneous federated scenarios under backdoor attacks demonstrate the effectiveness of our method.

NeurIPS Conference 2024 Conference Paper

Unveiling Causal Reasoning in Large Language Models: Reality or Mirage?

  • Haoang Chi
  • He Li
  • Wenjing Yang
  • Feng Liu
  • Long Lan
  • Xiaoguang Ren
  • Tongliang Liu
  • Bo Han

Causal reasoning capability is critical in advancing large language models (LLMs) towards artificial general intelligence (AGI). While versatile LLMs appear to have demonstrated capabilities in understanding contextual causality and providing responses that obey the laws of causality, it remains unclear whether they perform genuine causal reasoning akin to humans. However, current evidence indicates the contrary. Specifically, LLMs are only capable of performing shallow (level-1) causal reasoning, primarily attributed to the causal knowledge embedded in their parameters, but they lack the capacity for genuine human-like (level-2) causal reasoning. To support this hypothesis, methodologically, we delve into the autoregression mechanism of transformer-based LLMs, revealing that it is not inherently causal. Empirically, we introduce a new causal Q&A benchmark named CausalProbe 2024, whose corpus is fresh and nearly unseen for the studied LLMs. Empirical results show a significant performance drop on CausalProbe 2024 compared to earlier benchmarks, indicating that LLMs primarily engage in level-1 causal reasoning. To bridge the gap towards level-2 causal reasoning, we draw inspiration from the fact that human reasoning is usually facilitated by general knowledge and intended goals. Inspired by this, we propose G$^2$-Reasoner, an LLM causal reasoning method that incorporates general knowledge and goal-oriented prompts into LLMs' causal reasoning processes. Experiments demonstrate that G$^2$-Reasoner significantly enhances LLMs' causal reasoning capability, particularly in fresh and fictitious contexts. This work sheds light on a new path for LLMs to advance towards genuine causal reasoning, going beyond level-1 and making strides towards level-2.

TIST Journal 2023 Journal Article

3D-Guided Frontal Face Generation for Pose-Invariant Recognition

  • Hao Wu
  • Jianyang Gu
  • Xiaojin Fan
  • He Li
  • Lidong Xie
  • Jian Zhao

Although deep learning techniques have achieved extraordinary accuracy in recognizing human faces, the pose variances of images captured in real-world scenarios still hinder reliable model application. To mitigate this gap, we propose to recognize faces by generating frontal face images with a 3D-Guided Deep Pose-Invariant Face Recognition Model (3D-PIM) consisting of a simulator and a refiner module. The simulator employs a 3D Morphable Model (3DMM) to fit the shape and appearance features and recover primary frontal images with less training data. The refiner further enhances the image realism on both global facial structure and local details with adversarial training, while keeping the discriminative identity information consistent with original images. An Adaptive Weighting (AW) metric is then adopted to leverage the complementary information from recovered frontal faces and original profile faces and to obtain credible similarity scores for recognition. Extended experiments verify the superiority of the proposed “recognition via generation” framework over the state of the art.

ICML Conference 2023 Conference Paper

Distribution Free Domain Generalization

  • Peifeng Tong
  • Wu Su
  • He Li
  • Jialin Ding
  • Zhan Haoxiang
  • Song Xi Chen

Accurate prediction of out-of-distribution data is desired for a learning algorithm. In domain generalization, training data from source domains tend to have different distributions from that of the target domain, while the target data are absent from the training process. We propose a Distribution Free Domain Generalization (DFDG) procedure for classification by conducting standardization to avoid the dominance of a few domains in the training process. The essence of DFDG is that it reformulates the cross domain/class discrepancy through pairwise two-sample test statistics and weights their importance or covariance structures equally to avoid a dominant domain or class. A theoretical generalization bound is established for the multi-class classification problem. The DFDG is shown to offer superior performance in empirical studies with fewer hyperparameters, which means faster and easier implementation.

IJCAI Conference 2023 Conference Paper

Dynamic Group Link Prediction in Continuous-Time Interaction Network

  • Shijie Luo
  • He Li
  • Jianbin Huang

Recently, group link prediction has received increasing attention due to its important role in analyzing relationships between individuals and groups. However, most existing group link prediction methods emphasize static settings or only make cursory exploitation of historical information, so they fail to obtain good performance in dynamic applications. To this end, we attempt to solve the group link prediction problem in continuous-time dynamic scenes with fine-grained temporal information. We propose a novel continuous-time group link prediction method CTGLP to capture the patterns of future link formation between individuals and groups. A new graph neural network CTGNN is presented to learn the latent representations of individuals by biasedly aggregating neighborhood information. Moreover, we design an importance-based group modeling function to model the embedding of a group based on its known members. CTGLP eventually learns a probability distribution and predicts the link target. Experimental results on various datasets with and without unseen nodes show that CTGLP outperforms the state-of-the-art methods by 13.4% and 13.2% on average.

TIST Journal 2022 Journal Article

Deep Reinforcement Learning-based Trajectory Pricing on Ride-hailing Platforms

  • Jianbin Huang
  • Longji Huang
  • Meijuan Liu
  • He Li
  • Qinglin Tan
  • Xiaoke Ma
  • Jiangtao Cui
  • De-Shuang Huang

Dynamic pricing plays an important role in solving problems such as traffic load reduction, congestion control, and revenue improvement. Efficient dynamic pricing strategies can increase capacity utilization, total revenue of service providers, and the satisfaction of both passengers and drivers. Many proposed dynamic pricing technologies focus on short-term optimization and scale poorly when modeling long-term goals because of limited solution optimality and prohibitive computation. In this article, a deep reinforcement learning framework is proposed to tackle the dynamic pricing problem for ride-hailing platforms. A soft actor-critic (SAC) algorithm is adopted in the reinforcement learning framework. First, the dynamic pricing problem is translated into a Markov Decision Process (MDP) and is set up in continuous action spaces, which removes the need to discretize the action space. Then, a new reward function is obtained from the order response rate and the KL-divergence between supply distribution and demand distribution. Experiments and case studies demonstrate that the proposed method outperforms the baselines in terms of order response rate and total revenue.

TIST Journal 2022 Journal Article

Deep Spatio-temporal Adaptive 3D Convolutional Neural Networks for Traffic Flow Prediction

  • He Li
  • Xuejiao Li
  • Liangcai Su
  • Duo Jin
  • Jianbin Huang
  • Deshuang Huang

Traffic flow prediction is the upstream problem of path planning, intelligent transportation systems, and other tasks. Many studies have been carried out on the traffic flow prediction of the spatio-temporal network, but the effects of spatio-temporal flexibility (historical data of the same type of time intervals in the same location will change flexibly) and spatio-temporal correlation (different road conditions have different effects at different times) have not been considered at the same time. We propose the Deep Spatio-temporal Adaptive 3D Convolution Neural Network (ST-A3DNet), which is a new scheme that addresses both spatio-temporal correlation and flexibility, and considers spatio-temporal complexity (complex external factors, such as weather and holidays). Different from other traffic forecasting models, ST-A3DNet captures the spatio-temporal relationship at the same time through the Adaptive 3D convolution module, assigns different weights flexibly according to the influence of historical data, and obtains the impact of external factors on the flow through the ex-mask module. Considering the holidays and weather conditions, we train our model for experiments in Xi’an and Chengdu. We evaluate ST-A3DNet, and the results show that it outperforms 11 baselines.

IROS Conference 2021 Conference Paper

Decentralized, Unlabeled Multi-Agent Navigation in Obstacle-Rich Environments using Graph Neural Networks

  • Xuebo Ji
  • He Li
  • Zherong Pan
  • Xifeng Gao
  • Changhe Tu

We propose a decentralized, learning-based solution to the challenging problem of unlabeled multi-agent navigation among obstacles, where robots need to simultaneously tackle the problems of goal assignment, local collision avoidance, and navigation. Our method has each robot infer its desired action by communicating with the other robots as well as a set of position-fixed routers. The inference is carried out on a graph neural network (GNN) with both robot and router nodes. We train our GNN using imitation learning on a small group of robots, where we modify the centralized version of the concurrent goal assignment and planning algorithm (CAPT) as our expert. By sharing weights among all robots and routers, our model can scale to unseen environments with any number of possibly kinodynamic agents during test time. We have achieved a success rate of 91.2% and 85.6% for point and car-like robots, respectively. Source code will be publicly available upon the publication of the work.

AAAI Conference 2012 Conference Paper

Discovering Spammers in Social Networks

  • Yin Zhu
  • Xiao Wang
  • Erheng Zhong
  • Nathan Liu
  • He Li
  • Qiang Yang

As the popularity of social media increases, as evidenced in Twitter, Facebook and China’s Renren, spamming activities have also picked up in numbers and variety. On social network sites, spammers often disguise themselves by creating fake accounts and hijacking normal users’ accounts for personal gains. Different from the spammers in traditional systems such as SMS and email, spammers in social media behave like normal users and they continue to change their spamming strategies to fool anti-spamming systems. However, due to privacy and resource concerns, many social media websites cannot fully monitor all the contents of users, making many of the previous approaches, such as topology-based and content-classification-based methods, infeasible to use. In this paper, we propose a Supervised Matrix Factorization method with Social Regularization (SMFSR) for spammer detection in social networks that exploits both social activities as well as users’ social relations in an innovative and highly scalable manner. The proposed method detects spammers collectively based on users’ social actions and social relations. We have empirically tested our method on data from Renren.com, which is one of the largest social networks in China, and demonstrated that our new method can improve the detection performance significantly.