Author name cluster

Ying Sun

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

36 papers

2 author rows

JBHI Journal 2026 Journal Article

A Local-Global Multi-View Diffusion Variational Graph Auto-Encoder for lncRNA-Protein Interaction Prediction

Dongdong Mao
Ying Sun

Long non-coding RNAs (lncRNAs) interact with proteins, influencing cell growth, differentiation, and disease onset. Despite significant advancements in computational methods, current approaches rely heavily on manually engineered features and require improved feature fusion techniques. Furthermore, prior studies have predominantly utilized supervised and semi-supervised learning techniques, which fail to effectively harness limited data from various sources, significantly constraining their generalizability and performance across diverse scenarios. Additionally, existing variational graph auto-encoders (VGAEs) do not adequately capture long-range interactions of biomolecules. Therefore, this study introduces the Local-Global Multi-View Diffusion Variational Graph Auto-Encoder (LG-MDVGA) for predicting lncRNA-protein interactions (LPIs). LG-MDVGA integrates a feature construction and fusion module that creates parameterized feature matrices for lncRNAs and proteins, which are updated through backpropagation. To capture local features effectively, separate adaptive local multi-modal feature matrices for lncRNAs and proteins are constructed. To fully utilize limited data to capture global data features and enhance predictive accuracy and generalization, LG-MDVGA incorporates a global multi-space collaborative computation by self-supervised learning. In addition, it introduces a diffusion variational graph auto-encoder (DVGA) to address the limitation that traditional VGAEs have difficulty capturing the complex patterns and relationships of LPIs. Experimental results show that LG-MDVGA significantly outperforms current methods and holds potential for discovering new LPIs. Additionally, LG-MDVGA was tested on five datasets involving three other types of biological entities and consistently attained superior performance. This underscores the generalizability and high precision of LG-MDVGA in accurately predicting associations among biological entities.

Details DOI

EAAI Journal 2026 Journal Article

Causal knowledge modulated temporal generative adversarial network for carbon emissions forecasting

Xiao Yao
Ying Sun
Yuanrong Wang
Changfeng Shi

Details DOI

AAAI Conference 2026 Conference Paper

Discovering Decoupled Functional Modules in Large Language Models

Yanke Yu
Jin Li
Ying Sun
Ping Li
Zhefeng Wang
Yi Zheng

Understanding the internal functional organization of Large Language Models (LLMs) is crucial for improving their trustworthiness and performance. However, how LLMs organize different functions into modules remains highly unexplored. To bridge this gap, we formulate a function module discovery problem and propose an Unsupervised LLM Cross-layer MOdule Discovery (ULCMOD) framework that simultaneously disentangles the large set of neurons in the entire LLM into modules while discovering the topics of input samples related to these modules. Our framework introduces a novel objective function and an efficient Iterative Decoupling (IterD) algorithm. Extensive experiments show that our method discovers high-quality, disentangled modules that capture more meaningful semantic information and achieve superior performance in various downstream tasks. Moreover, our qualitative analysis reveals that the discovered modules show function comprehensiveness, function hierarchy, and clear function spatial arrangement within LLMs. Our work provides a novel tool for interpreting LLMs' function modules, filling a critical gap in LLMs' interpretability research.

PDF Details DOI

AAAI Conference 2026 Conference Paper

NGTM: Substructure-based Neural Graph Topic Model for Interpretable Graph Generation

Yuanxin Zhuang
Dazhong Shen
Ying Sun

Graph generation plays a pivotal role across numerous domains, including molecular design and knowledge graph construction. Although existing methods achieve considerable success in generating realistic graphs, their interpretability remains limited, often obscuring the rationale behind structural decisions. To address this challenge, we propose the Neural Graph Topic Model (NGTM), a novel generative framework inspired by topic modeling in natural language processing. NGTM represents graphs as mixtures of latent topics, each defining a distribution over semantically meaningful substructures, which facilitates explicit interpretability at both local and global scales. The generation process transparently integrates these topic distributions with a global structural variable, enabling clear semantic tracing of each generated graph. Experiments demonstrate that NGTM achieves competitive generation quality while uniquely enabling fine-grained control and interpretability, allowing users to tune structural features or induce biological properties through topic-level adjustments.

PDF Details DOI

AAAI Conference 2026 Conference Paper

Towards Illumination-Aware Restoration of Metalens-Captured Images: A New Dataset and a Strong Baseline

Fen Fang
Xinan Liang
Muli Yang
Jinghong Zheng
Tobias Mass
Ying Sun
Xulei Yang
Xuewu Xu

Metalenses offer compelling advantages such as lightweight and ultra-thin design, making them promising alternatives to conventional lenses. However, their widespread adoption is hindered by image quality degradation caused by chromatic and angular aberrations. To mitigate this, restoration processes are often necessary to recover high-quality RGB images from metalens-captured inputs. While recent deep learning-based restoration methods show promise, they typically (1) blur or distort peripheral regions, or (2) fail entirely under unseen illumination conditions. To advance metalens image restoration, we introduce IlluMeta, the first and largest real-world, illumination-aware metalens image dataset, captured across diverse lighting environments. In addition, we propose a novel end-to-end restoration framework that directs attention to challenging regions and adaptively adjusts to varying illuminations via reinforcement learning. Experiments show that our method can be applied in a plug-and-play manner to enhance existing models, significantly improving image restoration quality, especially under unseen lighting conditions, paving the way for broader real-world deployment of metalens technologies.

PDF Details DOI

AAAI Conference 2026 Conference Paper

Your AI-Generated Image Detector Can Secretly Achieve SOTA Accuracy, If Calibrated

Muli Yang
Gabriel James Goenawan
Henan Wang
Huaiyuan Qin
Chenghao Xu
Yanhua Yang
Fen Fang
Ying Sun

Despite being trained on balanced datasets, existing AI-generated image detectors often exhibit systematic bias at test time, frequently misclassifying fake images as real. We hypothesize that this behavior stems from distributional shift in fake samples and implicit priors learned during training. Specifically, models tend to overfit to superficial artifacts that do not generalize well across different generation methods, leading to a misaligned decision threshold when faced with test-time distribution shift. To address this, we propose a theoretically grounded post-hoc calibration framework based on Bayesian decision theory. In particular, we introduce a learnable scalar correction to the model’s logits, optimized on a small validation set from the target distribution while keeping the backbone frozen. This parametric adjustment compensates for distributional shift in model output, realigning the decision boundary even without requiring ground-truth labels. Experiments on challenging benchmarks show that our approach significantly improves robustness without retraining, offering a lightweight and principled solution for reliable and adaptive AI-generated image detection in the open world.

PDF Details DOI
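The scalar logit correction described in the abstract above can be illustrated with a minimal sketch. The fitting rule used here (choosing the bias so that the mean predicted fake probability on unlabeled target samples matches an assumed rate) is an illustrative assumption and not necessarily the objective used in the paper; the names `fit_bias` and `target_fake_rate` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_bias(logits, target_fake_rate=0.5, lo=-50.0, hi=50.0, iters=100):
    """Bisection for a scalar b with mean(sigmoid(logit + b)) == target_fake_rate.

    Assumption: the target rate is a prior chosen by the user; no labels are used.
    """
    def mean_prob(b):
        return sum(sigmoid(z + b) for z in logits) / len(logits)

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # mean_prob is increasing in b, so move the bracket toward the target.
        if mean_prob(mid) < target_fake_rate:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Frozen-detector logits on unlabeled target-domain images (made-up values);
# positive means "fake". They skew negative, i.e. biased toward "real".
val_logits = [-4.0, -3.0, -2.5, -1.0, 0.5, 1.0]
b = fit_bias(val_logits)
calibrated = [sigmoid(z + b) for z in val_logits]
```

Only the single scalar `b` is fitted; the detector itself is untouched, which mirrors the frozen-backbone setting the abstract describes.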

JMLR Journal 2025 Journal Article

Decentralized Sparse Linear Regression via Gradient-Tracking

Marie Maros
Gesualdo Scutari
Ying Sun
Guang Cheng

We study sparse linear regression over a network of agents, modeled as an undirected graph without a center node. The estimation of the $s$-sparse parameter is formulated as a constrained LASSO problem wherein each agent owns a subset of the $N$ total observations. We analyze the convergence rate and statistical guarantees of a distributed projected gradient tracking-based algorithm under high-dimensional scaling, allowing the ambient dimension $d$ to grow with (and possibly exceed) the sample size $N$. Our theory shows that, under standard notions of restricted strong convexity and smoothness of the average loss functions, suitable conditions on the network connectivity and algorithm tuning, the distributed algorithm converges globally at a linear rate to an estimate that is within the centralized statistical precision of the model, $O(s\log d/N)$. When $s\log d/N=o(1)$, a condition necessary for statistical consistency, an $\varepsilon$-optimal solution is attained after $O(\kappa \log (1/\varepsilon))$ gradient computations and $O(\kappa/(1-\rho) \log (1/\varepsilon))$ communication rounds, where $\kappa$ is the restricted condition number of the loss function and $\rho$ measures the network connectivity. The computation cost matches that of the centralized projected gradient algorithm despite having data distributed; whereas the communication rounds reduce as the network connectivity improves. Overall, our study reveals interesting connections between statistical efficiency, network connectivity and topology, and convergence rate in the high dimensional setting.

PDF Details
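To make the gradient-tracking idea in the abstract above concrete, here is a minimal sketch of the standard (unprojected) gradient-tracking recursion on a toy problem. It is not the paper's algorithm: the projection step and the sparse regression loss are omitted, each agent holds a scalar quadratic $f_i(x) = \frac{1}{2}(x - a_i)^2$, and the ring mixing weights and step size are assumptions chosen for illustration.

```python
def gradient_tracking(targets, step=0.1, iters=300):
    """Each agent i minimizes the sum of f_j(x) = 0.5 * (x - targets[j])**2
    using only neighbor averaging on a ring (needs at least 3 agents)."""
    n = len(targets)

    def mix(v):
        # Doubly stochastic ring weights: 1/2 self, 1/4 each neighbor.
        return [0.5 * v[i] + 0.25 * v[(i - 1) % n] + 0.25 * v[(i + 1) % n]
                for i in range(n)]

    def grad(x):
        return [x[i] - targets[i] for i in range(n)]

    x = [0.0] * n
    g = grad(x)
    y = list(g)  # tracker of the network-average gradient
    for _ in range(iters):
        mixed_x = mix(x)
        x_new = [mixed_x[i] - step * y[i] for i in range(n)]
        g_new = grad(x_new)
        mixed_y = mix(y)
        # Tracking update: mix, then add the local gradient increment.
        y = [mixed_y[i] + g_new[i] - g[i] for i in range(n)]
        x, g = x_new, g_new
    return x

estimates = gradient_tracking([1.0, 2.0, 3.0, 6.0])
```

All agents should end up near the average of the local targets (3.0 here), which is the minimizer of the summed objective, even though no agent ever sees the other targets directly.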

JMLR Journal 2025 Journal Article

Distributed Stochastic Bilevel Optimization: Improved Complexity and Heterogeneity Analysis

Youcheng Niu
Jinming Xu
Ying Sun
Yan Huang
Li Chai

This paper considers solving a class of nonconvex-strongly-convex distributed stochastic bilevel optimization (DSBO) problems with personalized inner-level objectives. Most existing algorithms require computational loops for hypergradient estimation, leading to computational inefficiency. Moreover, the impact of data heterogeneity on convergence in bilevel problems is not explicitly characterized yet. To address these issues, we propose LoPA, a loopless personalized distributed algorithm that leverages a tracking mechanism for iterative approximation of inner-level solutions and Hessian-inverse matrices without relying on extra computation loops. Our theoretical analysis explicitly characterizes the heterogeneity across nodes (denoted by $b$), and establishes a sublinear rate of $\mathcal{O}\left( \frac{1}{(1-\rho)K} + \frac{(b/\sqrt{m})^{2/3}}{(1-\rho)^{2/3} K^{2/3}} + \frac{1}{\sqrt{K}} \left( \sigma_{\mathrm{p}} + \frac{1}{\sqrt{m}} \sigma_{\mathrm{c}} \right) \right)$ without the boundedness of local hypergradients, where $\sigma_{\mathrm{p}}$ and $\sigma_{\mathrm{c}}$ represent the gradient sampling variances associated with the inner- and outer-level variables, respectively. We also integrate LoPA with a gradient tracking scheme to eliminate the impact of data heterogeneity, yielding an improved rate of $\mathcal{O}\left( \frac{1}{(1-\rho)^2 K} + \frac{1}{\sqrt{K}} \left( \sigma_{\mathrm{p}} + \frac{1}{\sqrt{m}} \sigma_{\mathrm{c}} \right) \right)$. The computational complexity of LoPA is $\mathcal{O}(\epsilon^{-2})$ to reach an $\epsilon$-stationary point, matching the communication complexity due to the loopless structure, which outperforms existing counterparts for DSBO. Numerical experiments validate the effectiveness of the proposed algorithm.

PDF Details

AAAI Conference 2025 Conference Paper

EFSkip: A New Error Feedback with Linear Speedup for Compressed Federated Learning with Arbitrary Data Heterogeneity

Hongyan Bao
Pengwen Chen
Ying Sun
Zhize Li

Due to the communication bottleneck in distributed and decentralized federated learning applications, algorithms using compressed communication have attracted significant attention. Error Feedback (EF) is a widely studied compression framework that ensures convergence with biased compressors such as top-k sparsification. Although various improvements have been obtained in recent years, the theoretical guarantees for EF-type frameworks are still limited. Previous works either 1) rely on strong assumptions such as bounded gradient or bounded dissimilarity, and thus cannot deal with arbitrary data heterogeneity and also converge more slowly, or 2) cannot enjoy linear speedup in the number of clients. In this work, we propose a new EFSkip framework which removes the strong assumptions to allow arbitrary data heterogeneity and enjoys linear speedup, significantly improving upon previous results. In particular, EFSkip achieves a substantially lower computational complexity than the previous EF21, i.e., EFSkip enjoys linear speedup in the number of clients (the complexity decreases linearly as more clients are used). We also show that EFSkip enjoys linear speedup and achieves faster convergence for nonconvex problems satisfying the Polyak-Łojasiewicz (PL) condition. We believe that the new EFSkip framework will have a large impact on communication- and computation-efficient distributed and decentralized federated learning.

PDF Details DOI
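As background for the entry above, the following sketch shows the classic error-feedback loop with a top-k compressor on a single client. It illustrates the general EF mechanism the abstract refers to, not EFSkip or EF21 themselves; the helper names `top_k` and `ef_step` and the gradient values are made up for illustration.

```python
def top_k(vec, k):
    """Biased compressor: keep the k largest-magnitude entries, zero the rest."""
    keep = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k]
    out = [0.0] * len(vec)
    for i in keep:
        out[i] = vec[i]
    return out

def ef_step(grad, residual, k):
    """One error-feedback step: compress gradient plus carried-over error."""
    corrected = [g + e for g, e in zip(grad, residual)]
    message = top_k(corrected, k)                    # what is transmitted
    new_residual = [c - m for c, m in zip(corrected, message)]
    return message, new_residual

grads = [[0.5, -2.0, 0.25], [1.0, 0.5, -0.25], [0.25, 0.25, 2.0]]
residual = [0.0, 0.0, 0.0]
sent_total = [0.0, 0.0, 0.0]
for g in grads:
    message, residual = ef_step(g, residual, k=1)
    sent_total = [s + m for s, m in zip(sent_total, message)]
```

The residual keeps whatever the compressor dropped, so the transmitted messages plus the final residual always add up to the sum of the raw gradients; nothing is lost, only delayed.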

NeurIPS Conference 2025 Conference Paper

Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification for Long Video Understanding

Weiyu Guo
Ziyang Chen
Shaoguang WANG
Jianxiang He
Yijie Xu
Jinhui Ye
Ying Sun
Hui Xiong

Understanding long video content is a complex endeavor that often relies on densely sampled frame captions or end-to-end feature selectors, yet these techniques commonly overlook the logical relationships between textual queries and visual elements. In practice, computational constraints necessitate coarse frame subsampling, a challenge analogous to “finding a needle in a haystack.” To address this issue, we introduce a semantics-driven search framework that reformulates keyframe selection under the paradigm of Visual Semantic-Logical Search (VSLS). Specifically, we systematically define four fundamental logical dependencies: 1) spatial co-occurrence, 2) temporal proximity, 3) attribute dependency, and 4) causal order. These relations dynamically update frame sampling distributions through an iterative refinement process, enabling context-aware identification of semantically critical frames tailored to specific query requirements. Our method establishes new state-of-the-art performance on the manually annotated benchmark in keyframe selection metrics. Furthermore, when applied to downstream video question-answering tasks, the proposed approach demonstrates the best performance gains over existing methods on LongVideoBench and Video-MME, validating its effectiveness in bridging the logical gap between textual queries and visual-temporal reasoning. The code will be publicly available.

PDF Details

AAAI Conference 2025 Conference Paper

Non-Convex Tensor Recovery from Local Measurements

Tongle Wu
Ying Sun
Jicong Fan

Motivated by the settings where sensing the entire tensor is infeasible, this paper proposes a novel tensor compressed sensing model, where measurements are only obtained from sensing each lateral slice via mutually independent matrices. Leveraging the low tubal rank structure, we reparameterize the unknown tensor $\mathcal{X}^*$ using two compact tensor factors and formulate the recovery problem as a nonconvex minimization problem. To solve the problem, we first propose an alternating minimization algorithm, termed Alt-PGD-Min, that iteratively optimizes the two factors using a projected gradient descent and an exact minimization step, respectively. Despite nonconvexity, we prove that Alt-PGD-Min achieves $\epsilon$-accuracy recovery with $O(\kappa^2 \log(1/\epsilon))$ iteration complexity and $O(\kappa^6 r n_3 \log n_3 (\kappa^2 r (n_1 + n_2) + n_1 \log(1/\epsilon)))$ sample complexity, where $\kappa$ denotes the tensor condition number of $\mathcal{X}^*$. To further accelerate the convergence, especially when the tensor is ill-conditioned with large $\kappa$, we propose Alt-ScalePGD-Min, which preconditions the gradient update using an approximate Hessian that can be computed efficiently. We show that Alt-ScalePGD-Min achieves $\kappa$-independent iteration complexity $O(\log(1/\epsilon))$ and improves the sample complexity to $O(\kappa^4 r n_3 \log n_3 (\kappa^4 r (n_1 + n_2) + n_1 \log(1/\epsilon)))$. Experiments validate the effectiveness of the proposed methods.

PDF Details DOI

NeurIPS Conference 2025 Conference Paper

Non-Convex Tensor Recovery from Tube-Wise Sensing

Tongle Wu
Ying Sun

In this paper, we propose a novel tube-wise local tensor compressed sensing (TCS) model, where sensing operators are independently applied to each tube of a third-order tensor. To recover the low-rank ground truth tensor, we minimize a non-convex objective via Burer–Monteiro factorization and solve it using gradient descent (GD) with spectral initialization. We prove that this approach achieves exact recovery with a linear convergence rate. Notably, our method attains provably lower sample complexity than existing TCS methods. Our proof leverages the leave-one-out technique to show that gradient descent generates iterates implicitly biased towards solutions with bounded incoherence, which ensures contraction of optimization error in consecutive iterates. Empirical results validate the effectiveness of GD in solving the proposed local TCS model.

PDF Details

NeurIPS Conference 2025 Conference Paper

OnlineSplatter: Pose-Free Online 3D Reconstruction for Free-Moving Objects

Mark H. Huang
Lin Geng Foo
Christian Theobalt
Ying Sun
De Wen Soh

Free-moving object reconstruction from monocular video remains challenging, particularly without reliable pose or depth cues and under arbitrary object motion. We introduce OnlineSplatter, a novel online feed-forward framework generating high-quality, object-centric 3D Gaussians directly from RGB frames without requiring camera pose, depth priors, or bundle optimization. Our approach anchors reconstruction using the first frame and progressively refines the object representation through a dense Gaussian primitive field, maintaining constant computational cost regardless of video sequence length. Our core contribution is a dual-key memory module combining latent appearance-geometry keys with explicit directional keys, robustly fusing current frame features with temporally aggregated object states. This design enables effective handling of free-moving objects via spatial-guided memory readout and an efficient sparsification mechanism, ensuring comprehensive yet compact object coverage. Evaluations on real-world datasets demonstrate that OnlineSplatter significantly outperforms state-of-the-art pose-free reconstruction baselines, consistently improving with more observations while maintaining constant memory and runtime.

PDF Details

ICML Conference 2025 Conference Paper

Understanding the Statistical Accuracy-Communication Trade-off in Personalized Federated Learning with Minimax Guarantees

Xin Yu
Zelin He
Ying Sun
Lingzhou Xue
Runze Li

Personalized federated learning (PFL) offers a flexible framework for aggregating information across distributed clients with heterogeneous data. This work considers a personalized federated learning setting that simultaneously learns global and local models. While purely local training has no communication cost, collaborative learning among the clients can leverage shared knowledge to improve statistical accuracy, presenting an accuracy-communication trade-off in personalized federated learning. However, the theoretical analysis of how personalization quantitatively influences sample and algorithmic efficiency and their inherent trade-off is largely unexplored. This paper makes a contribution towards filling this gap, by providing a quantitative characterization of the personalization degree on the tradeoff. The results further offer theoretical insights for choosing the personalization degree. As a side contribution, we establish the minimax optimality in terms of statistical accuracy for a widely studied PFL formulation. The theoretical result is validated on both synthetic and real-world datasets and its generalizability is verified in a non-convex setting.

Details

IJCAI Conference 2024 Conference Paper

Anomaly Subgraph Detection through High-Order Sampling Contrastive Learning

Ying Sun
Wenjun Wang
Nannan Wu
Chunlong Bao

Anomaly subgraph detection is a crucial task in various real-world applications, including identifying high-risk areas, detecting river pollution, and monitoring disease outbreaks. Early traditional graph-based methods can obtain high-precision detection results in scenes with small-scale graphs and obvious anomaly features. Most existing anomaly detection methods based on deep learning primarily concentrate on identifying anomalies at the node level, while neglecting to detect anomaly groups in the internal structure. In this paper, we propose a novel end-to-end Graph Neural Network (GNN) based anomaly subgraph detection approach (ASD-HC) in graph-structured data. 1) We propose a high-order neighborhood sampling strategy to construct our node and k-order neighbor-subgraph instance pairs. 2) Anomaly features of nodes are captured through a self-supervised contrastive learning model. 3) Detecting the maximum connected anomaly subgraph is performed by integrating the Non-parameter Graph Scan statistics and a Random Walk module. We evaluate ASD-HC against five state-of-the-art baselines using five benchmark datasets. ASD-HC outperforms the baselines by over 13.01% in AUC score. Various experiments demonstrate that our approach effectively detects anomaly subgraphs within large-scale graphs.

PDF Details DOI

EAAI Journal 2024 Journal Article

Domain-knowledge Inspired Pseudo Supervision (DIPS) for unsupervised image-to-image translation models to support cross-domain classification

Firas Al-Hindawi
Md Mahfuzur Rahman Siddiquee
Teresa Wu
Han Hu
Ying Sun

Details DOI

IJCAI Conference 2024 Conference Paper

Implicit Anomaly Subgraph Detection (IASD) in Multi-Domain Attribute Networks

Ying Sun

Anomaly subgraph detection is a vital task in various real applications. However, with the advancement of AI technology, it faces new challenges: 1) Anomaly features are often deeply hidden within large datasets, and 2) Anomaly detection approaches are required to unveil the mechanisms behind anomaly generation. Our study focuses on detecting hidden anomaly subgraphs within big data and offering improved explanations for the root cause of anomalies by integrating multi-domain datasets.

PDF Details DOI

NeurIPS Conference 2024 Conference Paper

Implicit Regularization of Decentralized Gradient Descent for Sparse Regression

Tongle Wu
Ying Sun

We consider learning a sparse model from linear measurements taken by a network of agents. Different from existing decentralized methods designed based on the LASSO regression with explicit $\ell_1$ norm regularization, we exploit the implicit regularization of a decentralized optimization method applied to an over-parameterized nonconvex least squares formulation without penalization. Our first result shows that despite nonconvexity, if the network connectivity is good, the well-known decentralized gradient descent algorithm (DGD) with small initialization and early stopping can compute the statistically optimal solution. Sufficient conditions on the initialization scale, choice of step size, network connectivity, and stopping time are further provided to achieve convergence. Our result recovers the convergence rate of gradient descent in the centralized setting, showing its tightness. Based on the analysis of DGD, we further propose a communication-efficient version, termed T-DGD, by truncating the iterates before transmission. In the high signal-to-noise ratio (SNR) regime, we show that T-DGD achieves comparable statistical accuracy to DGD, while the communication cost is logarithmic in the number of parameters. Numerical results are provided to validate the effectiveness of DGD and T-DGD for sparse learning through implicit regularization.

PDF Details DOI

JBHI Journal 2024 Journal Article

Model Generalizability Investigation for GFCE-MRI Synthesis in NPC Radiotherapy Using Multi-Institutional Patient-Based Data Normalization

Wen Li
Saikit Lam
Yinghui Wang
Chenyang Liu
Tian Li
Jens Kleesiek
Andy Lai-Yin Cheung
Ying Sun

Recently, deep learning has been demonstrated to be feasible in eliminating the use of gadolinium-based contrast agents (GBCAs) through synthesizing gadolinium-free contrast-enhanced MRI (GFCE-MRI) from contrast-free MRI sequences, providing the community with an alternative to get rid of GBCAs-associated safety issues in patients. Nevertheless, generalizability assessment of the GFCE-MRI model has been largely challenged by the high inter-institutional heterogeneity of MRI data, on top of the scarcity of multi-institutional data itself. Although various data normalization methods have been adopted to address the heterogeneity issue, it has been limited to single-institutional investigation and there is no standard normalization approach presently. In this study, we aimed at investigating generalizability of GFCE-MRI model using data from seven institutions by manipulating heterogeneity of MRI data under five popular normalization approaches. Three state-of-the-art neural networks were applied to map from T1-weighted and T2-weighted MRI to contrast-enhanced MRI (CE-MRI) for GFCE-MRI synthesis in patients with nasopharyngeal carcinoma. MRI data from three institutions were used separately to generate three uni-institution models and jointly for a tri-institution model. The five normalization methods were applied to normalize the data of each model. MRI data from the remaining four institutions served as external cohorts for model generalizability assessment. Quality of GFCE-MRI was quantitatively evaluated against ground-truth CE-MRI using mean absolute error (MAE) and peak signal-to-noise ratio (PSNR). Results showed that performance of all uni-institution models remarkably dropped on the external cohorts. By contrast, model trained using multi-institutional data with Z-Score normalization yielded the best model generalizability improvement.

Details DOI
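The normalization and evaluation metrics named in the abstract above follow standard definitions, sketched below on flat lists of intensities. Treating each patient's volume as one list, and the `data_range` argument for PSNR, are assumptions for illustration; the study's exact preprocessing is not given in this listing.

```python
import math

def z_score(values):
    """Z-score normalization using the statistics of one patient's own data.
    Assumes the values are not all identical (non-zero standard deviation)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

def mae(pred, ref):
    """Mean absolute error between a synthetic and a ground-truth image."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)

def psnr(pred, ref, data_range):
    """Peak signal-to-noise ratio in dB; data_range is the maximum possible
    intensity span of the reference image. Undefined for identical inputs."""
    mse = sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)
    return 10.0 * math.log10(data_range ** 2 / mse)

normalized = z_score([100.0, 200.0, 300.0, 400.0])
error = mae([0.0, 0.9], [0.0, 1.0])
quality = psnr([0.0, 0.9], [0.0, 1.0], data_range=1.0)
```

Lower MAE and higher PSNR both indicate a synthetic image closer to the ground truth.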

NeurIPS Conference 2024 Conference Paper

Plan-on-Graph: Self-Correcting Adaptive Planning of Large Language Model on Knowledge Graphs

Liyi Chen
Panrong Tong
Zhongming Jin
Ying Sun
Jieping Ye
Hui Xiong

Large Language Models (LLMs) have shown remarkable reasoning capabilities on complex tasks, but they still suffer from out-of-date knowledge, hallucinations, and opaque decision-making. In contrast, Knowledge Graphs (KGs) can provide explicit and editable knowledge for LLMs to alleviate these issues. The existing paradigm of KG-augmented LLMs manually predefines the breadth of the exploration space and requires flawless navigation in KGs. However, this paradigm cannot adaptively explore reasoning paths in KGs based on the question semantics or self-correct erroneous reasoning paths, resulting in a bottleneck in efficiency and effectiveness. To address these limitations, we propose a novel self-correcting adaptive planning paradigm for KG-augmented LLMs named Plan-on-Graph (PoG), which first decomposes the question into several sub-objectives and then repeats the process of adaptively exploring reasoning paths, updating memory, and reflecting on the need to self-correct erroneous reasoning paths until arriving at the answer. Specifically, three important mechanisms of Guidance, Memory, and Reflection are designed to work together, to guarantee the adaptive breadth of self-correcting planning for graph reasoning. Finally, extensive experiments on three real-world datasets demonstrate the effectiveness and efficiency of PoG.

PDF Details DOI

IROS Conference 2024 Conference Paper

Risk-Aware Non-Myopic Motion Planner for Large-Scale Robotic Swarm Using CVaR Constraints

Xuru Yang
Yunze Hu
Han Gao
Kang Ding
Zhaoyang Li
Pingping Zhu
Ying Sun
Chang Liu

Swarm robotics has garnered significant attention due to its ability to accomplish elaborate and synchronized tasks. Existing methodologies for motion planning of swarm robotic systems mainly encounter difficulties in scalability and safety guarantee. To address these limitations, we propose a Risk-aware swarm mOtion planner using conditional ValuE-at-Risk (ROVER) that systematically navigates large-scale swarms through cluttered environments while ensuring safety. ROVER formulates a finite-time model predictive control (FTMPC) problem predicated upon the macroscopic state of the robot swarm represented by a Gaussian Mixture Model (GMM) and integrates conditional value-at-risk (CVaR) to ensure collision avoidance. The key component of ROVER is imposing a CVaR constraint on the distribution of the Signed Distance Function between the swarm GMM and obstacles in the FTMPC to enforce collision avoidance. Utilizing the analytical expression of CVaR of a GMM derived in this work, we develop a computationally efficient solution to solve the non-linear constrained FTMPC through sequential linear programming. Simulations and comparisons with representative benchmark approaches demonstrate the effectiveness of ROVER in flexibility, scalability, and safety guarantee.

Details
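For readers unfamiliar with the risk measure in the entry above, the sketch below estimates conditional value-at-risk (CVaR) from samples as the mean of the worst tail. This is a generic empirical estimator, not the analytical CVaR of a Gaussian mixture derived in the paper. Using larger values to mean worse outcomes (for example, negated signed distances to obstacles) is an assumption of this example.

```python
import math

def empirical_cvar(losses, alpha):
    """Mean of the worst (1 - alpha) fraction of sampled losses.

    Larger loss means worse; alpha in [0, 1) is the confidence level.
    """
    ordered = sorted(losses)
    # Number of tail samples; the small offset guards against float round-up.
    tail = max(1, math.ceil((1.0 - alpha) * len(ordered) - 1e-12))
    worst = ordered[-tail:]
    return sum(worst) / len(worst)

samples = [float(v) for v in range(1, 11)]   # losses 1.0 .. 10.0
risk_80 = empirical_cvar(samples, alpha=0.8)  # averages the two largest losses
risk_0 = empirical_cvar(samples, alpha=0.0)   # reduces to the plain mean
```

A constraint of the form `empirical_cvar(losses, alpha) <= threshold` bounds the average severity of the worst cases rather than only their probability, which is why CVaR is a common choice for safety constraints.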

NeurIPS Conference 2024 Conference Paper

SpGesture: Source-Free Domain-adaptive sEMG-based Gesture Recognition with Jaccard Attentive Spiking Neural Network

Weiyu Guo
Ying Sun
Yijie Xu
Ziyue Qiao
Yongkui Yang
Hui Xiong

Surface electromyography (sEMG) based gesture recognition offers a natural and intuitive interaction modality for wearable devices. Despite significant advancements in sEMG-based gesture recognition models, existing methods often suffer from high computational latency and increased energy consumption. Additionally, the inherent instability of sEMG signals, combined with their sensitivity to distribution shifts in real-world settings, compromises model robustness. To tackle these challenges, we propose a novel SpGesture framework based on Spiking Neural Networks, which possesses several unique merits compared with existing methods: (1) Robustness: By utilizing membrane potential as a memory list, we pioneer the introduction of Source-Free Domain Adaptation into SNN for the first time. This enables SpGesture to mitigate the accuracy degradation caused by distribution shifts. (2) High Accuracy: With a novel Spiking Jaccard Attention, SpGesture enhances the SNNs' ability to represent sEMG features, leading to a notable rise in system accuracy. To validate SpGesture's performance, we collected a new sEMG gesture dataset which has different forearm postures, where SpGesture achieved the highest accuracy among the baselines ($89.26\%$). Moreover, the actual deployment on the CPU demonstrated a latency below 100ms, well within real-time requirements. This impressive performance showcases SpGesture's potential to enhance the applicability of sEMG in real-world scenarios. The code is available at https://github.com/guoweiyu/SpGesture/.

PDF Details DOI

AIIM Journal 2024 Journal Article

Stacked deep learning approach for efficient SARS-CoV-2 detection in blood samples

Wu Wang
Fouzi Harrou
Abdelkader Dairi
Ying Sun

Details DOI

EAAI Journal 2024 Journal Article

Surface defect detection methods for industrial products with imbalanced samples: A review of progress in the 2020s

Dongxu Bai
Gongfa Li
Du Jiang
Juntong Yun
Bo Tao
Guozhang Jiang
Ying Sun
Zhaojie Ju

Details DOI

NeurIPS Conference 2024 Conference Paper

Tackling Uncertain Correspondences for Multi-Modal Entity Alignment

Liyi Chen
Ying Sun
Shengzhe Zhang
Yuyang Ye
Wei Wu
Hui Xiong

Recently, multi-modal entity alignment has emerged as a pivotal endeavor for the integration of Multi-Modal Knowledge Graphs (MMKGs) originating from diverse data sources. Existing works primarily focus on fully depicting entity features by designing various modality encoders or fusion approaches. However, uncertain correspondences between inter-modal or intra-modal cues, such as weak inter-modal associations, description diversity, and modality absence, still severely hinder the effective exploration of aligned entity similarities. To this end, in this paper, we propose a novel Tackling uncertain correspondences method for Multi-modal Entity Alignment (TMEA). Specifically, to handle diverse attribute knowledge descriptions, we design alignment-augmented abstract representation that incorporates the large language model and in-context learning into attribute alignment and filtering for generating and embedding the attribute abstract. In order to mitigate the influence of the modality absence, we propose to unify all modality features into a shared latent subspace and generate pseudo features via variational autoencoders according to existing modal features. Then, we develop an inter-modal commonality enhancement mechanism based on cross-attention with orthogonal constraints, to address weak semantic associations between modalities. Extensive experiments on two real-world datasets validate the effectiveness of TMEA with a clear improvement over competitive baselines.

PDF Details DOI

IJCAI Conference 2023 Conference Paper

Beyond Homophily: Robust Graph Anomaly Detection via Neural Sparsification

Zheng Gong
Guifeng Wang
Ying Sun
Qi Liu
Yuting Ning
Hui Xiong
Jingyu Peng

Recently, graph-based anomaly detection (GAD) has attracted rising attention due to its effectiveness in identifying anomalies in relational and structured data. Unfortunately, the performance of most existing GAD methods suffers from the inherent structural noises of graphs induced by hidden anomalies connected with considerable benign nodes. In this work, we propose SparseGAD, a novel GAD framework that sparsifies the structures of target graphs to effectively reduce noises and collaboratively learns node representations. It then robustly detects anomalies by uncovering the underlying dependency among node pairs in terms of homophily and heterophily, two essential connection properties of GAD. Extensive experiments on real-world datasets of GAD demonstrate that the proposed framework achieves significantly better detection quality compared with the state-of-the-art methods, even when the graph is heavily attacked. Code will be available at https://github.com/KellyGong/SparseGAD.git.

PDF Details DOI

AAAI Conference 2023 Conference Paper

Counterfactual Dynamics Forecasting – a New Setting of Quantitative Reasoning

Yanzhu Liu
Ying Sun
Joo-Hwee Lim

Rethinking and introspection are important elements of human intelligence. To mimic these capabilities, counterfactual reasoning, which aims to forecast the alternative outcomes for hypothetical scenarios (“what-if”), has recently attracted the attention of AI researchers. However, most existing approaches focus on qualitative reasoning (e.g., cause-effect relationships). They lack a well-defined description of the differences between counterfactuals and facts, as well as how these differences evolve over time. This paper defines a new problem formulation, counterfactual dynamics forecasting, which is described in middle-level abstraction under the structural causal models (SCM) framework and derived as ordinary differential equations (ODEs) as low-level quantitative computation. Based on it, we propose a method to infer counterfactual dynamics considering the factual dynamics as demonstration. Moreover, the evolution of differences between facts and counterfactuals is modelled by an explicit temporal component. The experimental results on two dynamical systems demonstrate the effectiveness of the proposed method.

PDF Details DOI

JMLR Journal 2023 Journal Article

Distributed Sparse Regression via Penalization

Yao Ji
Gesualdo Scutari
Ying Sun
Harsha Honnappa

We study sparse linear regression over a network of agents, modeled as an undirected graph (with no centralized node). The estimation problem is formulated as the minimization of the sum of the local LASSO loss functions plus a quadratic penalty of the consensus constraint, the latter being instrumental to obtain distributed solution methods. While penalty-based consensus methods have been extensively studied in the optimization literature, their statistical and computational guarantees in the high dimensional setting remain unclear. This work provides an answer to this open problem. Our contribution is two-fold. First, we establish statistical consistency of the estimator: under a suitable choice of the penalty parameter, the optimal solution of the penalized problem achieves near optimal minimax rate $O(s \log d/N)$ in $\ell_2$-loss, where $s$ is the sparsity value, $d$ is the ambient dimension, and $N$ is the total sample size in the network; this matches centralized sample rates. Second, we show that the proximal-gradient algorithm applied to the penalized problem, which naturally leads to distributed implementations, converges linearly up to a tolerance of the order of the centralized statistical error; the rate scales as $O(d)$, revealing an unavoidable speed-accuracy dilemma. Numerical results demonstrate the tightness of the derived sample rate and convergence rate scalings.
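As a rough illustration of the formulation in this abstract, the sketch below runs proximal gradient on a penalized consensus LASSO. It is not the authors' implementation: the function names, the graph-Laplacian form of the quadratic consensus penalty, the parameter names (`lam`, `rho`, `step`), and the fixed step size are all assumptions made for this example.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def objective(X, A_list, y_list, adjacency, lam, rho):
    """Sum of local LASSO losses plus a quadratic consensus penalty.

    Assumed penalty: (rho / 2) * sum over edges of ||x_i - x_j||^2,
    written as (rho / 2) * trace(X^T L X) with L the graph Laplacian.
    """
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    fit = sum(
        0.5 * np.sum((A @ x - y) ** 2) / A.shape[0]
        for A, x, y in zip(A_list, X, y_list)
    )
    l1 = lam * np.abs(X).sum()
    consensus = 0.5 * rho * np.sum(X * (laplacian @ X))
    return fit + l1 + consensus


def penalized_consensus_lasso(A_list, y_list, adjacency, lam, rho, step, iters=500):
    """Proximal gradient on the penalized problem (hypothetical sketch).

    Row i of X is agent i's local estimate. Row i of `laplacian @ X`
    depends only on agent i and its neighbors, so each update could be
    carried out with neighbor-to-neighbor communication only.
    """
    m = len(A_list)
    d = A_list[0].shape[1]
    X = np.zeros((m, d))
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    for _ in range(iters):
        # Gradient of the smooth part: local least squares + consensus penalty.
        grads = np.stack(
            [A.T @ (A @ x - y) / A.shape[0] for A, x, y in zip(A_list, X, y_list)]
        )
        grads += rho * (laplacian @ X)
        # Proximal step handles the l1 term.
        X = soft_threshold(X - step * grads, step * lam)
    return X
```

The step size must stay below the inverse of the smoothness constant of the smooth part, which grows with `rho`; a larger penalty brings the agents closer to consensus but forces smaller steps.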

PDF Details

ICLR Conference 2022 Conference Paper

Hybrid Local SGD for Federated Learning with Heterogeneous Communications

Yuanxiong Guo
Ying Sun
Rui Hu 0005
Yanmin Gong 0001

Communication is a key bottleneck in federated learning (FL), where a large number of edge devices collaboratively learn a model under the orchestration of a central server without sharing their own training data. While local SGD has been proposed to reduce the number of FL rounds and has become the algorithm of choice for FL, its total communication cost is still prohibitive when each device needs to communicate with the remote server repeatedly over bandwidth-limited networks. In light of both device-to-device (D2D) and device-to-server (D2S) cooperation opportunities in modern communication networks, this paper proposes a new federated optimization algorithm dubbed hybrid local SGD (HL-SGD) in FL settings where devices are grouped into a set of disjoint clusters with high D2D communication bandwidth. HL-SGD subsumes previously proposed algorithms such as local SGD and gossip SGD and enables us to strike the best balance between model accuracy and runtime. We analyze the convergence of HL-SGD in the presence of heterogeneous data for general nonconvex settings. We also perform extensive experiments and show that the use of hybrid model aggregation via D2D and D2S communications in HL-SGD can largely speed up the training time of federated learning.

Details

IJCAI Conference 2022 Conference Paper

Monitoring Vegetation From Space at Extremely Fine Resolutions via Coarsely-Supervised Smooth U-Net

Joshua Fan
Di Chen
Jiaming Wen
Ying Sun
Carla Gomes

Monitoring vegetation productivity at extremely fine resolutions is valuable for real-world agricultural applications, such as detecting crop stress and providing early warning of food insecurity. Solar-Induced Chlorophyll Fluorescence (SIF) provides a promising way to directly measure plant productivity from space. However, satellite SIF observations are only available at a coarse spatial resolution, making it impossible to monitor how individual crop types or farms are doing. This poses a challenging coarsely-supervised regression (or downscaling) task; at training time, we only have SIF labels at a coarse resolution (3km), but we want to predict SIF at much finer spatial resolutions (e.g., 30m, a 100x increase). We also have additional fine-resolution input features, but the relationship between these features and SIF is unknown. To address this, we propose Coarsely-Supervised Smooth U-Net (CS-SUNet), a novel method for this coarse supervision setting. CS-SUNet combines the expressive power of deep convolutional networks with novel regularization methods based on prior knowledge (such as a smoothness loss) that are crucial for preventing overfitting. Experiments show that CS-SUNet resolves fine-grained variations in SIF more accurately than existing methods.

PDF Details DOI

NeurIPS Conference 2021 Conference Paper

Discerning Decision-Making Process of Deep Neural Networks with Hierarchical Voting Transformation

Ying Sun
Hengshu Zhu
Chuan Qin
Fuzhen Zhuang
Qing He
Hui Xiong

Neural network based deep learning techniques have shown great success for numerous applications. While it is expected to understand their intrinsic decision-making processes, these deep neural networks often work in a black-box way. To this end, in this paper, we aim to discern the decision-making processes of neural networks through a hierarchical voting strategy by developing an explainable deep learning model, namely Voting Transformation-based Explainable Neural Network (VOTEN). Specifically, instead of relying on massive feature combinations, VOTEN creatively models expressive single-valued voting functions between explicitly modeled latent concepts to achieve high fitting ability. Along this line, we first theoretically analyze the major components of VOTEN and prove the relationship and advantages of VOTEN compared with Multi-Layer Perceptron (MLP), the basic structure of deep neural networks. Moreover, we design efficient algorithms to improve the model usability by explicitly showing the decision processes of VOTEN. Finally, extensive experiments on multiple real-world datasets clearly validate the performances and explainability of VOTEN.

PDF Details

TIST Journal 2021 Journal Article

Exploring the Risky Travel Area and Behavior of Car-hailing Service

Hongting Niu
Hengshu Zhu
Ying Sun
Xinjiang Lu
Jing Sun
Zhiyuan Zhao
Hui Xiong
Bo Lang

Recent years have witnessed the rapid development of car-hailing services, which provide a convenient approach for connecting passengers and local drivers using their personal vehicles. At the same time, the concern on passenger safety has gradually emerged and attracted more and more attention. While car-hailing service providers have made considerable efforts on developing real-time trajectory tracking systems and alarm mechanisms, most of them only focus on providing rescue-supporting information rather than preventing potential crimes. Recently, the newly available large-scale car-hailing order data have provided an unparalleled chance for researchers to explore the risky travel area and behavior of car-hailing services, which can be used for building an intelligent crime early warning system. To this end, in this article, we propose a Risky Area and Risky Behavior Evaluation System (RARBEs) based on the real-world car-hailing order data. In RARBEs, we first mine massive multi-source urban data and train an effective area risk prediction model, which estimates area risk at the urban block level. Then, we propose a transverse and longitudinal double detection method, which estimates behavior risk based on two aspects, including fraud trajectory recognition and fraud patterns mining. In particular, we creatively propose a bipartite graph-based algorithm to model the implicit relationship between areas and behaviors, which collaboratively adjusts area risk and behavior risk estimation based on random walk regularization. Finally, extensive experiments on multi-source real-world urban data clearly validate the effectiveness and efficiency of our system.

Details DOI

AAAI Conference 2021 System Paper

TAILOR: Teaching with Active and Incremental Learning for Object Registration

Qianli Xu
Nicolas Gauthier
Wenyu Liang
Fen Fang
Hui Li Tan
Ying Sun
Yan Wu
Liyuan Li

When deploying a robot to a new task, one often has to train it to detect novel objects, which is time-consuming and labor-intensive. We present TAILOR, a method and system for object registration with active and incremental learning. When instructed by a human teacher to register an object, TAILOR automatically selects viewpoints to capture informative images by actively exploring viewpoints, and employs a fast incremental learning algorithm to learn new objects without potential forgetting of previously learned objects. We demonstrate the effectiveness of our method with a KUKA robot that learns novel objects used in a real-world gearbox assembly task through natural interactions.

PDF Details

AIIM Journal 2019 Journal Article

A deep survival analysis method based on ranking

Bingzhong Jing
Tao Zhang
Zixian Wang
Ying Jin
Kuiyuan Liu
Wenze Qiu
Liangru Ke
Ying Sun

Details DOI

JMLR Journal 2019 Journal Article

Decentralized Dictionary Learning Over Time-Varying Digraphs

Amir Daneshmand
Ying Sun
Gesualdo Scutari
Francisco Facchinei
Brian M. Sadler

This paper studies Dictionary Learning problems wherein the learning task is distributed over a multi-agent network, modeled as a time-varying directed graph. This formulation is relevant, for instance, in Big Data scenarios where massive amounts of data are collected/stored in different locations (e.g., sensors, clouds) and aggregating and/or processing all data in a fusion center might be inefficient or unfeasible, due to resource limitations, communication overheads or privacy issues. We develop a unified decentralized algorithmic framework for this class of nonconvex problems, which is proved to converge to stationary solutions at a sublinear rate. The new method hinges on Successive Convex Approximation techniques, coupled with a decentralized tracking mechanism aiming at locally estimating the gradient of the smooth part of the sum-utility. To the best of our knowledge, this is the first provably convergent decentralized algorithm for Dictionary Learning and, more generally, bi-convex problems over (time-varying) (di)graphs.

PDF Details

AAAI Conference 1993 Conference Paper

A Framework for Model-Based Repair

Ying Sun

PDF Details