Arrow Research search

Author name cluster

Ying Sun

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

36 papers
2 author rows

Possible papers

36

JBHI Journal 2026 Journal Article

A Local-Global Multi-View Diffusion Variational Graph Auto-Encoder for lncRNA-Protein Interaction Prediction

  • Dongdong Mao
  • Ying Sun

Long non-coding RNAs (lncRNAs) interact with proteins, influencing cell growth, differentiation, and disease onset. Despite significant advancements in computational methods, current approaches rely heavily on manually engineered features and require improved feature fusion techniques. Furthermore, prior studies have predominantly utilized supervised and semi-supervised learning techniques, which fail to effectively harness limited data from various sources, significantly constraining their generalizability and performance across diverse scenarios. Additionally, existing variational graph auto-encoders (VGAE) do not adequately capture long-range interactions of biomolecules. Therefore, this study introduces the Local-Global Multi-View Diffusion Variational Graph Auto-encoder ( LG - MDVGA ) for predicting lncRNA-protein interactions (LPIs). LG-MDVGA integrates a feature construction and fusion module that creates parameterized feature matrices for lncRNAs and proteins, which are updated through backpropagation. To capture local features effectively, separate adaptive local multi-modal feature matrices for lncRNAs and proteins are constructed. To fully utilize limited data to capture global data features and enhance predictive accuracy and generalization, LG-MDVGA incorporates a global multi-space collaborative computation by self-supervised learning. In addition, it introduces a diffusion variational graph auto-encoder (DVGA) to address the limitation that traditional VGAE have difficulty in capturing the complex patterns and relationships of LPIs. Experimental results show that LG-MDVGA significantly outperforms current methods and holds potential for discovering new LPIs. Additionally, LG-MDVGA was tested on five datasets involving three other types of biological entities and consistently attained superior performance. This underscores the generalizability and high precision of LG-MDVGA in accurately predicting associations among biological entities.

AAAI Conference 2026 Conference Paper

Discovering Decoupled Functional Modules in Large Language Models

  • Yanke Yu
  • Jin Li
  • Ying Sun
  • Ping Li
  • Zhefeng Wang
  • Yi Zheng

Understanding the internal functional organization of Large Language Models (LLMs) is crucial for improving their trustworthiness and performance. However, how LLMs organize different functions into modules remains highly unexplored. To bridge this gap, we formulate a function module discovery problem and propose an Unsupervised LLM Cross-layer MOdule Discovery (ULCMOD) framework that simultaneously disentangles the large set of neurons in the entire LLM into modules while discovering the topics of input samples related to these modules. Our framework introduces a novel objective function and an efficient Iterative Decoupling (IterD) algorithm. Extensive experiments show that our method discovers high-quality, disentangled modules that capture more meaningful semantic information and achieve superior performance in various downstream tasks. Moreover, our qualitative analysis reveals that the discovered modules show function comprehensiveness, function hierarchy, and clear function spatial arrangement within LLMs. Our work provides a novel tool for interpreting LLMs' function modules, filling a critical gap in LLMs' interpretability research.

AAAI Conference 2026 Conference Paper

NGTM: Substructure-based Neural Graph Topic Model for Interpretable Graph Generation

  • Yuanxin Zhuang
  • Dazhong Shen
  • Ying Sun

Graph generation plays a pivotal role across numerous domains, including molecular design and knowledge graph construction. Although existing methods achieve considerable success in generating realistic graphs, their interpretability remains limited, often obscuring the rationale behind structural decisions. To address this challenge, we propose the Neural Graph Topic Model (NGTM), a novel generative framework inspired by topic modeling in natural language processing. NGTM represents graphs as mixtures of latent topics, each defining a distribution over semantically meaningful substructures, which facilitates explicit interpretability at both local and global scales. The generation process transparently integrates these topic distributions with a global structural variable, enabling clear semantic tracing of each generated graph. Experiments demonstrate that NGTM achieves competitive generation quality while uniquely enabling fine-grained control and interpretability, allowing users to tune structural features or induce biological properties through topic-level adjustments.

AAAI Conference 2026 Conference Paper

Towards Illumination-Aware Restoration of Metalens-Captured Images: A New Dataset and a Strong Baseline

  • Fen Fang
  • Xinan Liang
  • Muli Yang
  • Jinghong Zheng
  • Tobias Mass
  • Ying Sun
  • Xulei Yang
  • Xuewu Xu

Metalenses offer compelling advantages such as lightweight and ultra-thin design, making them promising alternatives to conventional lenses. However, their widespread adoption is hindered by image quality degradation caused by chromatic and angular aberrations. To mitigate this, restoration processes are often necessary to recover high-quality RGB images from metalens-captured inputs. While recent deep learning-based restoration methods show promise, they typically (1) blur or distort peripheral regions, or (2) fail entirely under unseen illumination conditions. To advance metalens image restoration, we introduce IlluMeta---the first and largest real-world, illumination-aware metalens image dataset—captured across diverse lighting environments. In addition, we propose a novel end-to-end restoration framework that directs attention to challenging regions and adaptively adjusts to varying illuminations via reinforcement learning. Experiments show that our method can be applied in a plug-and-play manner to enhance existing models, significantly improving image restoration quality, especially under unseen lighting conditions, paving the way for broader real-world deployment of metalens technologies.

AAAI Conference 2026 Conference Paper

Your AI-Generated Image Detector Can Secretly Achieve SOTA Accuracy, If Calibrated

  • Muli Yang
  • Gabriel James Goenawan
  • Henan Wang
  • Huaiyuan Qin
  • Chenghao Xu
  • Yanhua Yang
  • Fen Fang
  • Ying Sun

Despite being trained on balanced datasets, existing AI-generated image detectors often exhibit systematic bias at test time, frequently misclassifying fake images as real. We hypothesize that this behavior stems from distributional shift in fake samples and implicit priors learned during training. Specifically, models tend to overfit to superficial artifacts that do not generalize well across different generation methods, leading to a misaligned decision threshold when faced with test-time distribution shift. To address this, we propose a theoretically grounded post-hoc calibration framework based on Bayesian decision theory. In particular, we introduce a learnable scalar correction to the model’s logits, optimized on a small validation set from the target distribution while keeping the backbone frozen. This parametric adjustment compensates for distributional shift in model output, realigning the decision boundary even without requiring ground-truth labels. Experiments on challenging benchmarks show that our approach significantly improves robustness without retraining, offering a lightweight and principled solution for reliable and adaptive AI-generated image detection in the open world.

JMLR Journal 2025 Journal Article

Decentralized Sparse Linear Regression via Gradient-Tracking

  • Marie Maros
  • Gesualdo Scutari
  • Ying Sun
  • Guang Cheng

We study sparse linear regression over a network of agents, modeled as an undirected graph without a center node. The estimation of the $s$-sparse parameter is formulated as a constrained LASSO problem wherein each agent owns a subset of the $N$ total observations. We analyze the convergence rate and statistical guarantees of a distributed projected gradient tracking-based algorithm under high-dimensional scaling, allowing the ambient dimension $d$ to grow with (and possibly exceed) the sample size $N$. Our theory shows that, under standard notions of restricted strong convexity and smoothness of the average loss functions, suitable conditions on the network connectivity and algorithm tuning, the distributed algorithm converges globally at a linear rate to an estimate that is within the centralized statistical precision of the model, $O(s\log d/N)$. When $s\log d/N=o(1)$, a condition necessary for statistical consistency, an $\varepsilon$-optimal solution is attained after ${O}(\kappa \log (1/\varepsilon))$ gradient computations and $O(\kappa/(1-\rho) \log (1/\varepsilon))$ communication rounds, where $\kappa$ is the restricted condition number of the loss function and $\rho$ measures the network connectivity. The computation cost matches that of the centralized projected gradient algorithm despite having data distributed; whereas the communication rounds reduce as the network connectivity improves. Overall, our study reveals interesting connections between statistical efficiency, network connectivity and topology, and convergence rate in the high dimensional setting. [abs] [ pdf ][ bib ] &copy JMLR 2025. ( edit, beta )

JMLR Journal 2025 Journal Article

Distributed Stochastic Bilevel Optimization: Improved Complexity and Heterogeneity Analysis

  • Youcheng Niu
  • Jinming Xu
  • Ying Sun
  • Yan Huang
  • Li Chai

This paper considers solving a class of nonconvex-strongly-convex distributed stochastic bilevel optimization (DSBO) problems with personalized inner-level objectives. Most existing algorithms require computational loops for hypergradient estimation, leading to computational inefficiency. Moreover, the impact of data heterogeneity on convergence in bilevel problems is not explicitly characterized yet. To address these issues, we propose LoPA, a loopless personalized distributed algorithm that leverages a tracking mechanism for iterative approximation of inner-level solutions and Hessian-inverse matrices without relying on extra computation loops. Our theoretical analysis explicitly characterizes the heterogeneity across nodes (denoted by $b$), and establishes a sublinear rate of $\mathcal{O}( {\frac{1}{{{{\left( {1 - \rho } \right)}}K}}\!+ \!\frac{{(\frac{b}{\sqrt{m}})^{\frac{2}{3}} }}{{\left( {1 - \rho } \right)^{\frac{2}{3}} K^{\frac{2}{3}} }} \!+ \!\frac{1}{\sqrt{ K }}( {\sigma _{\operatorname{p} }} + \frac{1}{\sqrt{m}}{\sigma _{\operatorname{c} }} ) } )$ without the boundedness of local hypergradients, where ${\sigma _{\operatorname{p} }}$ and ${\sigma _{\operatorname{c} }}$ represent the gradient sampling variances associated with the inner- and outer-level variables, respectively. We also integrate LoPA with a gradient tracking scheme to eliminate the impact of data heterogeneity, yielding an improved rate of ${{\mathcal{O}}}(\frac{{1}}{{ (1-\rho)^2K }} \!+\! \frac{1}{{\sqrt{K}}}( \sigma_{\rm{p}} \!+\! \frac{1}{\sqrt{m}}\sigma_{\rm{c}} ) )$. The computational complexity of LoPA is of ${{\mathcal{O}}}({\epsilon^{-2}})$ to an $\epsilon$-stationary point, matching the communication complexity due to the loopless structure, which outperforms existing counterparts for DSBO. Numerical experiments validate the effectiveness of the proposed algorithm. [abs] [ pdf ][ bib ] &copy JMLR 2025. ( edit, beta )

AAAI Conference 2025 Conference Paper

EFSkip: A New Error Feedback with Linear Speedup for Compressed Federated Learning with Arbitrary Data Heterogeneity

  • Hongyan Bao
  • Pengwen Chen
  • Ying Sun
  • Zhize Li

Due to the communication bottleneck in distributed and decentralized federated learning applications, algorithms using compressed communication have attracted significant attention. The Error Feedback (EF) is a widely-studied compression framework for convergence with biased compressors such as top-k sparsification. Although various improvements have been obtained in recent years, the theoretical guarantee for EF-type framework is still limited. Previous works either 1) rely on strong assumptions such as bounded gradient/dissimilarity assumptions, thus can not deal with arbitrary data heterogeneity and also slow the convergence speed, or 2) can not enjoy linear speedup in the number of clients. In this work, we propose a new EFSkip framework which removes the strong assumptions to allow arbitrary data heterogeneity and enjoys linear speedup for significantly improving upon previous results. In particular, EFSkip achieves a substantially lower computational complexity compared to the previous EF21, i.e., EFSkip enjoys the linear speedup in the number of clients (reducing the result linearly using more clients). We also show that EFSkip enjoys linear speedup and achieves faster convergence for nonconvex problems satisfying Polyak-Lojasiewicz (PL) condition. We believe that the new EFSkip framework will have a large impact on the communication- and computation-efficient distributed and decentralized federated learning.

NeurIPS Conference 2025 Conference Paper

Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification for Long Video Understanding

  • Weiyu Guo
  • Ziyang Chen
  • Shaoguang WANG
  • Jianxiang He
  • Yijie Xu
  • Jinhui Ye
  • Ying Sun
  • Hui Xiong

Understanding long video content is a complex endeavor that often relies on densely sampled frame captions or end-to-end feature selectors, yet these techniques commonly overlook the logical relationships between textual queries and visual elements. In practice, computational constraints necessitate coarse frame subsampling, a challenge analogous to “finding a needle in a haystack. ” To address this issue, we introduce a semantics-driven search framework that reformulates keyframe selection under the paradigm of Visual Semantic-Logical Search (VSLS). Specifically, we systematically define four fundamental logical dependencies: 1) spatial co-occurrence, 2) temporal proximity, 3) attribute dependency, and 4) causal order. These relations dynamically update frame sampling distributions through an iterative refinement process, enabling context-aware identification of semantically critical frames tailored to specific query requirements. Our method establishes new state-of-the-art performance on the manually annotated benchmark in keyframe selection metrics. Furthermore, when applied to downstream video question-answering tasks, the proposed approach demonstrates the best performance gains over existing methods on LongVideoBench and Video-MME, validating its effectiveness in bridging the logical gap between textual queries and visual-temporal reasoning. The code will be publicly available.

AAAI Conference 2025 Conference Paper

Non-Convex Tensor Recovery from Local Measurements

  • Tongle Wu
  • Ying Sun
  • Jicong Fan

Motivated by the settings where sensing the entire tensor is infeasible, this paper proposes a novel tensor compressed sensing model, where measurements are only obtained from sensing each lateral slice via mutually independent matrices. Leveraging the low tubal rank structure, we reparameterize the unknown tensor?* using two compact tensor factors and formulate the recovery problem as a nonconvex minimization problem. To solve the problem, we first propose an alternating minimization algorithm, termed Alt-PGD-Min, that iteratively optimizes the two factors using a projected gradient descent and an exact minimization step, respectively. Despite nonconvexity, we prove that Alt-PGD-Min achieves ϵ-accuracy recovery with?(?²log1/?) iteration complexity and?(?⁶rn₃logn₃(?²r(n₁+n₂)+n₁log1/ε)) sample complexity, where? denotes tensor condition number of?*. To further accelerate the convergence, especially when the tensor is ill-conditioned with large?, we prove Alt-ScalePGD-Min that preconditions the gradient update using an approximate Hessian that can be computed efficiently. We show that Alt-ScalePGD-Min achieves? independent iteration complexity?(log1/ε) and improves the sample complexity to?(?⁴rn₃log n₃(?⁴ r(n₁ + n₂)+n₁log 1/ε)). Experiments validate the effectiveness of the proposed methods.

NeurIPS Conference 2025 Conference Paper

Non-Convex Tensor Recovery from Tube-Wise Sensing

  • Tongle Wu
  • Ying Sun

In this paper, we propose a novel tube-wise local tensor compressed sensing (CS) model, where sensing operators are independently applied to each tube of a third-order tensor. To recover the low-rank ground truth tensor, we minimize a non-convex objective via Burer–Monteiro factorization and solve it using gradient descent with spectral initialization. We prove that this approach achieves exact recovery with a linear convergence rate. Notably, our method attains provably lower sample complexity than existing TCS methods. Our proof leverages the leave-one-out technique to show that gradient descent generates iterates implicitly biased towards solutions with bounded incoherence, which ensures contraction of optimization error in consecutive iterates. Empirical results validate the effectiveness of GD in solving the proposed local TCS model.

NeurIPS Conference 2025 Conference Paper

OnlineSplatter: Pose-Free Online 3D Reconstruction for Free-Moving Objects

  • Mark H. Huang
  • Lin Geng Foo
  • Christian Theobalt
  • Ying Sun
  • De Wen Soh

Free-moving object reconstruction from monocular video remains challenging, particularly without reliable pose or depth cues and under arbitrary object motion. We introduce OnlineSplatter, a novel online feed-forward framework generating high-quality, object-centric 3D Gaussians directly from RGB frames without requiring camera pose, depth priors, or bundle optimization. Our approach anchors reconstruction using the first frame and progressively refines the object representation through a dense Gaussian primitive field, maintaining constant computational cost regardless of video sequence length. Our core contribution is a dual-key memory module combining latent appearance-geometry keys with explicit directional keys, robustly fusing current frame features with temporally aggregated object states. This design enables effective handling of free-moving objects via spatial-guided memory readout and an efficient sparsification mechanism, ensuring comprehensive yet compact object coverage. Evaluations on real-world datasets demonstrate that OnlineSplatter significantly outperforms state-of-the-art pose-free reconstruction baselines, consistently improving with more observations while maintaining constant memory and runtime.

ICML Conference 2025 Conference Paper

Understanding the Statistical Accuracy-Communication Trade-off in Personalized Federated Learning with Minimax Guarantees

  • Xin Yu
  • Zelin He
  • Ying Sun
  • Lingzhou Xue
  • Runze Li

Personalized federated learning (PFL) offers a flexible framework for aggregating information across distributed clients with heterogeneous data. This work considers a personalized federated learning setting that simultaneously learns global and local models. While purely local training has no communication cost, collaborative learning among the clients can leverage shared knowledge to improve statistical accuracy, presenting an accuracy-communication trade-off in personalized federated learning. However, the theoretical analysis of how personalization quantitatively influences sample and algorithmic efficiency and their inherent trade-off is largely unexplored. This paper makes a contribution towards filling this gap, by providing a quantitative characterization of the personalization degree on the tradeoff. The results further offer theoretical insights for choosing the personalization degree. As a side contribution, we establish the minimax optimality in terms of statistical accuracy for a widely studied PFL formulation. The theoretical result is validated on both synthetic and real-world datasets and its generalizability is verified in a non-convex setting.

IJCAI Conference 2024 Conference Paper

Anomaly Subgraph Detection through High-Order Sampling Contrastive Learning

  • Ying Sun
  • Wenjun Wang
  • Nannan Wu
  • Chunlong Bao

Anomaly subgraph detection is a crucial task in various real-world applications, including identifying high-risk areas, detecting river pollution, and monitoring disease outbreaks. Early traditional graph-based methods can obtain high-precision detection results in scenes with small-scale graphs and obvious anomaly features. Most existing anomaly detection methods based on deep learning primarily concentrate on identifying anomalies at the node level, while neglecting to detect anomaly groups in the internal structure. In this paper, we propose a novel end-to-end Graph Neural Network (GNN) based anomaly subgraph detection approach(ASD-HC) in graph-structured data. 1)We propose a high-order neighborhood sampling strategy to construct our node and k-order neighbor-subgraph instance pairs. 2)Anomaly features of nodes are captured through a self-supervised contrastive learning model. 3) Detecting the maximum connected anomaly subgraph is performed by integrating the Non-parameter Graph Scan statistics and a Random Walk module. We evaluate ASD-HC against five state-of-the-art baselines using five benchmark datasets. ASD-HC outperforms the baselines by over 13. 01% in AUC score. Various experiments demonstrate that our approach effectively detects anomaly subgraphs within large-scale graphs.

IJCAI Conference 2024 Conference Paper

Implicit Anomaly Subgraph Detection (IASD) in Multi-Domain Attribute Networks

  • Ying Sun

Anomaly subgraph detection is a vital task in various real applications. However, with the advancement of AI technology, it faces new challenges: 1) Anomaly features are often deeply hidden within large datasets, and 2) Anomaly detection approaches are required to unveil the mechanisms behind anomaly generation. Our study focuses on detecting hidden anomaly subgraphs within big data and offering improved explanations for the root cause of anomalies by integrating multi-domain datasets.

NeurIPS Conference 2024 Conference Paper

Implicit Regularization of Decentralized Gradient Descent for Sparse Regression

  • Tongle Wu
  • Ying Sun

We consider learning a sparse model from linear measurements taken by a network of agents. Different from existing decentralized methods designed based on the LASSO regression with explicit $\ell_1$ norm regularization, we exploit the implicit regularization of decentralized optimization method applied to an over-parameterized nonconvex least squares formulation without penalization. Our first result shows that despite nonconvexity, if the network connectivity is good, the well-known decentralized gradient descent algorithm (DGD) with small initialization and early stopping can compute the statistically optimal solution. Sufficient conditions on the initialization scale, choice of step size, network connectivity, and stopping time are further provided to achieve convergence. Our result recovers the convergence rate of gradient descent in the centralized setting, showing its tightness. Based on the analysis of DGD, we further propose a communication-efficient version, termed T-DGD, by truncating the iterates before transmission. In the high signal-to-noise ratio (SNR) regime, we show that T-DGD achieves comparable statistical accuracy to DGD, while the communication cost is logarithmic in the number of parameters. Numerical results are provided to validate the effectiveness of DGD and T-DGD for sparse learning through implicit regularization.

JBHI Journal 2024 Journal Article

Model Generalizability Investigation for GFCE-MRI Synthesis in NPC Radiotherapy Using Multi-Institutional Patient-Based Data Normalization

  • Wen Li
  • Saikit Lam
  • Yinghui Wang
  • Chenyang Liu
  • Tian Li
  • Jens Kleesiek
  • Andy Lai-Yin Cheung
  • Ying Sun

Recently, deep learning has been demonstrated to be feasible in eliminating the use of gadoliniumbased contrast agents (GBCAs) through synthesizing gadolinium-free contrast-enhanced MRI (GFCE-MRI) from contrast-free MRI sequences, providing the community with an alternative to get rid of GBCAs-associated safety issues in patients. Nevertheless, generalizability assessment of the GFCE-MRI model has been largely challenged by the high inter-institutional heterogeneity of MRI data, on top of the scarcity of multi-institutional data itself. Although various data normalization methods have been adopted to address the heterogeneity issue, it has been limited to single-institutional investigation and there is no standard normalization approach presently. In this study, we aimed at investigating generalizability of GFCE-MRI model using data from seven institutions by manipulating heterogeneity of MRI data under five popular normalization approaches. Three state-of-the-art neural networks were applied to map from T1-weighted and T2-weighted MRI to contrast-enhanced MRI (CE-MRI) for GFCE-MRI synthesis in patients with nasopharyngeal carcinoma. MRI data from three institutions were used separately to generate three uni-institution models and jointly for a tri-institution model. The five normalization methods were applied to normalize the data of each model. MRI data from the remaining four institutions served as external cohorts for model generalizability assessment. Quality of GFCE-MRI was quantitatively evaluated against ground-truth CE-MRI using mean absolute error (MAE) and peak signal-to-noise ratio(PSNR). Results showed that performance of all uni-institution models remarkably dropped on the external cohorts. By contrast, model trained using multi-institutional data with Z-Score normalization yielded the best model generalizability improvement.

NeurIPS Conference 2024 Conference Paper

Plan-on-Graph: Self-Correcting Adaptive Planning of Large Language Model on Knowledge Graphs

  • Liyi Chen
  • Panrong Tong
  • Zhongming Jin
  • Ying Sun
  • Jieping Ye
  • Hui Xiong

Large Language Models (LLMs) have shown remarkable reasoning capabilities on complex tasks, but they still suffer from out-of-date knowledge, hallucinations, and opaque decision-making. In contrast, Knowledge Graphs (KGs) can provide explicit and editable knowledge for LLMs to alleviate these issues. Existing paradigm of KG-augmented LLM manually predefines the breadth of exploration space and requires flawless navigation in KGs. However, this paradigm cannot adaptively explore reasoning paths in KGs based on the question semantics and self-correct erroneous reasoning paths, resulting in a bottleneck in efficiency and effect. To address these limitations, we propose a novel self-correcting adaptive planning paradigm for KG-augmented LLM named Plan-on-Graph (PoG), which first decomposes the question into several sub-objectives and then repeats the process of adaptively exploring reasoning paths, updating memory, and reflecting on the need to self-correct erroneous reasoning paths until arriving at the answer. Specifically, three important mechanisms of Guidance, Memory, and Reflection are designed to work together, to guarantee the adaptive breadth of self-correcting planning for graph reasoning. Finally, extensive experiments on three real-world datasets demonstrate the effectiveness and efficiency of PoG.

IROS Conference 2024 Conference Paper

Risk-Aware Non-Myopic Motion Planner for Large-Scale Robotic Swarm Using CVaR Constraints

  • Xuru Yang
  • Yunze Hu
  • Han Gao
  • Kang Ding
  • Zhaoyang Li
  • Pingping Zhu
  • Ying Sun
  • Chang Liu

Swarm robotics has garnered significant attention due to its ability to accomplish elaborate and synchronized tasks. Existing methodologies for motion planning of swarm robotic systems mainly encounter difficulties in scalability and safety guarantee. To address these limitations, we propose a Risk-aware swarm mOtion planner using conditional ValuE-at-Risk (ROVER) that systematically navigates large-scale swarms through cluttered environments while ensuring safety. ROVER formulates a finite-time model predictive control (FTMPC) problem predicated upon the macroscopic state of the robot swarm represented by a Gaussian Mixture Model (GMM) and integrates conditional value-at-risk (CVaR) to ensure collision avoidance. The key component of ROVER is imposing a CVaR constraint on the distribution of the Signed Distance Function between the swarm GMM and obstacles in the FTMPC to enforce collision avoidance. Utilizing the analytical expression of CVaR of a GMM derived in this work, we develop a computationally efficient solution to solve the non-linear constrained FTMPC through sequential linear programming. Simulations and comparisons with representative benchmark approaches demonstrate the effectiveness of ROVER in flexibility, scalability, and safety guarantee.

NeurIPS Conference 2024 Conference Paper

SpGesture: Source-Free Domain-adaptive sEMG-based Gesture Recognition with Jaccard Attentive Spiking Neural Network

  • Weiyu Guo
  • Ying Sun
  • Yijie Xu
  • Ziyue Qiao
  • Yongkui Yang
  • Hui Xiong

Surface electromyography (sEMG) based gesture recognition offers a natural and intuitive interaction modality for wearable devices. Despite significant advancements in sEMG-based gesture recognition models, existing methods often suffer from high computational latency and increased energy consumption. Additionally, the inherent instability of sEMG signals, combined with their sensitivity to distribution shifts in real-world settings, compromises model robustness. To tackle these challenges, we propose a novel SpGesture framework based on Spiking Neural Networks, which possesses several unique merits compared with existing methods: (1) Robustness: By utilizing membrane potential as a memory list, we pioneer the introduction of Source-Free Domain Adaptation into SNN for the first time. This enables SpGesture to mitigate the accuracy degradation caused by distribution shifts. (2) High Accuracy: With a novel Spiking Jaccard Attention, SpGesture enhances the SNNs' ability to represent sEMG features, leading to a notable rise in system accuracy. To validate SpGesture's performance, we collected a new sEMG gesture dataset which has different forearm postures, where SpGesture achieved the highest accuracy among the baselines ($89. 26\%$). Moreover, the actual deployment on the CPU demonstrated a latency below 100ms, well within real-time requirements. This impressive performance showcases SpGesture's potential to enhance the applicability of sEMG in real-world scenarios. The code is available at https: //github. com/guoweiyu/SpGesture/.

NeurIPS Conference 2024 Conference Paper

Tackling Uncertain Correspondences for Multi-Modal Entity Alignment

  • Liyi Chen
  • Ying Sun
  • Shengzhe Zhang
  • Yuyang Ye
  • Wei Wu
  • Hui Xiong

Recently, multi-modal entity alignment has emerged as a pivotal endeavor for the integration of Multi-Modal Knowledge Graphs (MMKGs) originating from diverse data sources. Existing works primarily focus on fully depicting entity features by designing various modality encoders or fusion approaches. However, uncertain correspondences between inter-modal or intra-modal cues, such as weak inter-modal associations, description diversity, and modality absence, still severely hinder the effective exploration of aligned entity similarities. To this end, in this paper, we propose a novel Tackling uncertain correspondences method for Multi-modal Entity Alignment (TMEA). Specifically, to handle diverse attribute knowledge descriptions, we design alignment-augmented abstract representation that incorporates the large language model and in-context learning into attribute alignment and filtering for generating and embedding the attribute abstract. In order to mitigate the influence of the modality absence, we propose to unify all modality features into a shared latent subspace and generate pseudo features via variational autoencoders according to existing modal features. Then, we develop an inter-modal commonality enhancement mechanism based on cross-attention with orthogonal constraints, to address weak semantic associations between modalities. Extensive experiments on two real-world datasets validate the effectiveness of TMEA with a clear improvement over competitive baselines.

IJCAI Conference 2023 Conference Paper

Beyond Homophily: Robust Graph Anomaly Detection via Neural Sparsification

  • Zheng Gong
  • Guifeng Wang
  • Ying Sun
  • Qi Liu
  • Yuting Ning
  • Hui Xiong
  • Jingyu Peng

Recently, graph-based anomaly detection (GAD) has attracted rising attention due to its effectiveness in identifying anomalies in relational and structured data. Unfortunately, the performance of most existing GAD methods suffers from the inherent structural noises of graphs induced by hidden anomalies connected with considerable benign nodes. In this work, we propose SparseGAD, a novel GAD framework that sparsifies the structures of target graphs to effectively reduce noises and collaboratively learns node representations. It then robustly detects anomalies by uncovering the underlying dependency among node pairs in terms of homophily and heterophily, two essential connection properties of GAD. Extensive experiments on real-world datasets of GAD demonstrate that the proposed framework achieves significantly better detection quality compared with the state-of-the-art methods, even when the graph is heavily attacked. Code will be available at https: //github. com/KellyGong/SparseGAD. git.

AAAI Conference 2023 Conference Paper

Counterfactual Dynamics Forecasting – a New Setting of Quantitative Reasoning

  • Yanzhu Liu
  • Ying Sun
  • Joo-Hwee Lim

Rethinking and introspection are important elements of human intelligence. To mimic these capabilities, counterfactual reasoning has attracted attention of AI researchers recently, which aims to forecast the alternative outcomes for hypothetical scenarios (“what-if”). However, most existing approaches focused on qualitative reasoning (e.g., casual-effect relationship). It lacks a well-defined description of the differences between counterfactuals and facts, as well as how these differences evolve over time. This paper defines a new problem formulation - counterfactual dynamics forecasting - which is described in middle-level abstraction under the structural causal models (SCM) framework and derived as ordinary differential equations (ODEs) as low-level quantitative computation. Based on it, we propose a method to infer counterfactual dynamics considering the factual dynamics as demonstration. Moreover, the evolution of differences between facts and counterfactuals are modelled by an explicit temporal component. The experimental results on two dynamical systems demonstrate the effectiveness of the proposed method.

JMLR Journal 2023 Journal Article

Distributed Sparse Regression via Penalization

  • Yao Ji
  • Gesualdo Scutari
  • Ying Sun
  • Harsha Honnappa

We study sparse linear regression over a network of agents, modeled as an undirected graph (with no centralized node). The estimation problem is formulated as the minimization of the sum of the local LASSO loss functions plus a quadratic penalty of the consensus constraint—the latter being instrumental to obtain distributed solution methods. While penalty-based consensus methods have been extensively studied in the optimization literature, their statistical and computational guarantees in the high dimensional setting remain unclear. This work provides an answer to this open problem. Our contribution is two-fold. First, we establish statistical consistency of the estimator: under a suitable choice of the penalty parameter, the optimal solution of the penalized problem achieves near optimal minimax rate $O(s \log d/N)$ in $\ell_2$-loss, where $s$ is the sparsity value, $d$ is the ambient dimension, and $N$ is the total sample size in the network—this matches centralized sample rates. Second, we show that the proximal-gradient algorithm applied to the penalized problem, which naturally leads to distributed implementations, converges linearly up to a tolerance of the order of the centralized statistical error---the rate scales as $O(d)$, revealing an unavoidable speed-accuracy dilemma. Numerical results demonstrate the tightness of the derived sample rate and convergence rate scalings. [abs] [ pdf ][ bib ] &copy JMLR 2023. ( edit, beta )

ICLR Conference 2022 Conference Paper

Hybrid Local SGD for Federated Learning with Heterogeneous Communications

  • Yuanxiong Guo
  • Ying Sun
  • Rui Hu 0005
  • Yanmin Gong 0001

Communication is a key bottleneck in federated learning where a large number of edge devices collaboratively learn a model under the orchestration of a central server without sharing their own training data. While local SGD has been proposed to reduce the number of FL rounds and become the algorithm of choice for FL, its total communication cost is still prohibitive when each device needs to communicate with the remote server repeatedly for many times over bandwidth-limited networks. In light of both device-to-device (D2D) and device-to-server (D2S) cooperation opportunities in modern communication networks, this paper proposes a new federated optimization algorithm dubbed hybrid local SGD (HL-SGD) in FL settings where devices are grouped into a set of disjoint clusters with high D2D communication bandwidth. HL-SGD subsumes previous proposed algorithms such as local SGD and gossip SGD and enables us to strike the best balance between model accuracy and runtime. We analyze the convergence of HL-SGD in the presence of heterogeneous data for general nonconvex settings. We also perform extensive experiments and show that the use of hybrid model aggregation via D2D and D2S communications in HL-SGD can largely speed up the training time of federated learning.

IJCAI Conference 2022 Conference Paper

Monitoring Vegetation From Space at Extremely Fine Resolutions via Coarsely-Supervised Smooth U-Net

  • Joshua Fan
  • Di Chen
  • Jiaming Wen
  • Ying Sun
  • Carla Gomes

Monitoring vegetation productivity at extremely fine resolutions is valuable for real-world agricultural applications, such as detecting crop stress and providing early warning of food insecurity. Solar-Induced Chlorophyll Fluorescence (SIF) provides a promising way to directly measure plant productivity from space. However, satellite SIF observations are only available at a coarse spatial resolution, making it impossible to monitor how individual crop types or farms are doing. This poses a challenging coarsely-supervised regression (or downscaling) task; at training time, we only have SIF labels at a coarse resolution (3km), but we want to predict SIF at much finer spatial resolutions (e. g. 30m, a 100x increase). We also have additional fine-resolution input features, but the relationship between these features and SIF is unknown. To address this, we propose Coarsely-Supervised Smooth U-Net (CS-SUNet), a novel method for this coarse supervision setting. CS-SUNet combines the expressive power of deep convolutional networks with novel regularization methods based on prior knowledge (such as a smoothness loss) that are crucial for preventing overfitting. Experiments show that CS-SUNet resolves fine-grained variations in SIF more accurately than existing methods.

NeurIPS Conference 2021 Conference Paper

Discerning Decision-Making Process of Deep Neural Networks with Hierarchical Voting Transformation

  • Ying Sun
  • Hengshu Zhu
  • Chuan Qin
  • Fuzhen Zhuang
  • Qing He
  • Hui Xiong

Neural network based deep learning techniques have shown great success for numerous applications. While it is expected to understand their intrinsic decision-making processes, these deep neural networks often work in a black-box way. To this end, in this paper, we aim to discern the decision-making processes of neural networks through a hierarchical voting strategy by developing an explainable deep learning model, namely Voting Transformation-based Explainable Neural Network (VOTEN). Specifically, instead of relying on massive feature combinations, VOTEN creatively models expressive single-valued voting functions between explicitly modeled latent concepts to achieve high fitting ability. Along this line, we first theoretically analyze the major components of VOTEN and prove the relationship and advantages of VOTEN compared with Multi-Layer Perceptron (MLP), the basic structure of deep neural networks. Moreover, we design efficient algorithms to improve the model usability by explicitly showing the decision processes of VOTEN. Finally, extensive experiments on multiple real-world datasets clearly validate the performances and explainability of VOTEN.

TIST Journal 2021 Journal Article

Exploring the Risky Travel Area and Behavior of Car-hailing Service

  • Hongting Niu
  • Hengshu Zhu
  • Ying Sun
  • Xinjiang Lu
  • Jing Sun
  • Zhiyuan Zhao
  • Hui Xiong
  • Bo Lang

Recent years have witnessed the rapid development of car-hailing services, which provide a convenient approach for connecting passengers and local drivers using their personal vehicles. At the same time, the concern on passenger safety has gradually emerged and attracted more and more attention. While car-hailing service providers have made considerable efforts on developing real-time trajectory tracking systems and alarm mechanisms, most of them only focus on providing rescue-supporting information rather than preventing potential crimes. Recently, the newly available large-scale car-hailing order data have provided an unparalleled chance for researchers to explore the risky travel area and behavior of car-hailing services, which can be used for building an intelligent crime early warning system. To this end, in this article, we propose a Risky Area and Risky Behavior Evaluation System (RARBEs) based on the real-world car-hailing order data. In RARBEs, we first mine massive multi-source urban data and train an effective area risk prediction model, which estimates area risk at the urban block level. Then, we propose a transverse and longitudinal double detection method, which estimates behavior risk based on two aspects, including fraud trajectory recognition and fraud patterns mining. In particular, we creatively propose a bipartite graph-based algorithm to model the implicit relationship between areas and behaviors, which collaboratively adjusts area risk and behavior risk estimation based on random walk regularization. Finally, extensive experiments on multi-source real-world urban data clearly validate the effectiveness and efficiency of our system.

AAAI Conference 2021 System Paper

TAILOR: Teaching with Active and Incremental Learning for Object Registration

  • Qianli Xu
  • Nicolas Gauthier
  • Wenyu Liang
  • Fen Fang
  • Hui Li Tan
  • Ying Sun
  • Yan Wu
  • Liyuan Li

When deploying a robot to a new task, one often has to train it to detect novel objects, which is time-consuming and laborintensive. We present TAILOR - a method and system for object registration with active and incremental learning. When instructed by a human teacher to register an object, TAILOR is able to automatically select viewpoints to capture informative images by actively exploring viewpoints, and employs a fast incremental learning algorithm to learn new objects without potential forgetting of previously learned objects. We demonstrate the effectiveness of our method with a KUKA robot to learn novel objects used in a real-world gearbox assembly task through natural interactions.

JMLR Journal 2019 Journal Article

Decentralized Dictionary Learning Over Time-Varying Digraphs

  • Amir Daneshmand
  • Ying Sun
  • Gesualdo Scutari
  • Francisco Facchinei
  • Brian M. Sadler

This paper studies Dictionary Learning problems wherein the learning task is distributed over a multi-agent network, modeled as a time-varying directed graph. This formulation is relevant, for instance, in Big Data scenarios where massive amounts of data are collected/stored in different locations (e.g., sensors, clouds) and aggregating and/or processing all data in a fusion center might be inefficient or unfeasible, due to resource limitations, communication overheads or privacy issues. We develop a unified decentralized algorithmic framework for this class of nonconvex problems, which is proved to converge to stationary solutions at a sublinear rate. The new method hinges on Successive Convex Approximation techniques, coupled with a decentralized tracking mechanism aiming at locally estimating the gradient of the smooth part of the sum-utility. To the best of our knowledge, this is the first provably convergent decentralized algorithm for Dictionary Learning and, more generally, bi-convex problems over (time-varying) (di)graphs. [abs] [ pdf ][ bib ] &copy JMLR 2019. ( edit, beta )