Arrow Research search

Author name cluster

Ying Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

48 papers
2 author rows

Possible papers

48

AAAI Conference 2026 Conference Paper

Learning Whom to Align With: Progressive Anomaly Combination Detection for Partially View-Aligned Clustering

  • Hang Gao
  • Zuosong Cai
  • Yuze Li
  • Cheng Liu
  • Gaoyang Li
  • Ying Li
  • Wei Du
  • You Zhou

Partially View-aligned Clustering (PVC) addresses the challenge of partial view alignment in multi-view learning by leveraging complementary and consistent information. While existing PVC methods show promise, most rely on distance-based strategies that are sensitive to view-specific details and noise, limiting their robustness. In this work, we propose a novel view alignment strategy that reformulates the alignment task as an anomaly detection problem. Rather than learning a view-alignment matrix that enforces strict one-to-one correspondences across views, we adopt a progressive approach to identify well-aligned samples. Specifically, we sample subsets of data by generating random view combinations from unaligned samples and propose an anomaly combination detection module to evaluate the alignment consistency of these combinations. In addition, our progressive training framework alternates between updating model parameters and selecting high-confidence view combinations for subsequent optimization. By reformulating view alignment as an anomaly detection task, our approach provides a more robust and effective solution to partial view alignment. Experiments on benchmark datasets demonstrate that our method outperforms state-of-the-art approaches in the PVC problem.

AAAI Conference 2026 Conference Paper

ManipDreamer3D: Synthesizing Plausible Robotic Manipulation Video with Occupancy-aware 3D Trajectory

  • Ying Li
  • Xiaobao Wei
  • Xiaowei Chi
  • Yuming Li
  • Zhongyu Zhao
  • Hao Wang
  • Ningning Ma
  • Ming Lu

Data scarcity continues to be a critical bottleneck in the field of robotic manipulation, limiting the ability to train robust and generalizable models. While diffusion models provide a promising approach to synthesizing realistic robotic manipulation videos, their effectiveness hinges on the availability of precise and reasonable control instructions. Current methods primarily rely on 2D trajectories as instruction prompts, which inherently face issues with 3D spatial ambiguity. In this work, we present a novel framework named ManipDreamer3D for generating plausible 3D-aware robotic manipulation videos from an input image and a text instruction. Our method combines 3D trajectory planning with a reconstructed 3D occupancy map created from a third-person perspective, along with a novel trajectory-to-video diffusion model. Specifically, ManipDreamer3D first reconstructs the 3D occupancy representation from the input image and then computes an optimized 3D end-effector trajectory that minimizes path length, avoids collisions, and retimes the motion. Next, we employ a latent editing technique to create video sequences from the initial image latent, the text instruction, and the optimized 3D trajectory. This process conditions our specially trained trajectory-to-video diffusion model to produce robotic pick-and-place videos. Our method significantly reduces human intervention requirements by autonomously planning plausible 3D trajectories. Experimental results demonstrate its superior visual quality and precision.

AAAI Conference 2026 Conference Paper

Pansharpening for Thin-Cloud Contaminated Remote Sensing Images: A Unified Framework and Benchmark Dataset

  • Songcheng Du
  • Yang Zou
  • Jiaxin Li
  • Mingxuan Liu
  • Ying Li
  • Changjing Shang
  • Qiang Shen

Pansharpening under thin cloudy conditions is a practically significant yet rarely addressed task, challenged by simultaneous spatial resolution degradation and cloud-induced spectral distortions. Existing methods often address cloud removal and pansharpening sequentially, leading to cumulative errors and suboptimal performance due to the lack of joint degradation modeling. To address these challenges, we propose a Unified Pansharpening Model with Thin Cloud Removal (Pan-TCR), an end-to-end framework that integrates physical priors. Motivated by theoretical analysis in the frequency domain, we design a frequency-decoupled restoration (FDR) block that disentangles the restoration of multispectral image (MSI) features into amplitude and phase components, each guided by complementary degradation-robust prompts: the near-infrared (NIR) band amplitude for cloud-resilient restoration, and the panchromatic (PAN) phase for high-resolution structural enhancement. To ensure coherence between the two components, we further introduce an interactive inter-frequency consistency (IFC) module, enabling cross-modal refinement that enforces consistency and robustness across frequency cues. Furthermore, we introduce the first real-world thin-cloud contaminated pansharpening dataset (PanTCR-GF2), comprising paired clean and cloudy PAN-MSI images, to enable robust benchmarking under realistic conditions. Extensive experiments on real-world and synthetic datasets demonstrate the superiority and robustness of Pan-TCR, establishing a new benchmark for pansharpening under realistic atmospheric degradations.

AAAI Conference 2026 Conference Paper

UVLM: Benchmarking Video Language Model for Underwater World Understanding

  • Xizhe Xue
  • Yang Zhou
  • Dawei Yan
  • Lijie Tao
  • Junjie Li
  • Ying Li
  • Haokui Zhang
  • Rong Xiao

Recently, video-language models (VidLMs) have gained widespread attention and adoption. However, existing works primarily focus on terrestrial scenarios, overlooking the highly demanding application needs of underwater observation. To bridge this gap, we introduce UVLM, an underwater observation benchmark built through a collaborative approach combining human expertise and AI models. To ensure data quality, we have conducted in-depth considerations from multiple perspectives. First, to address the unique challenges of underwater environments, we selected videos that represent typical underwater challenges, including light variations, water turbidity, and diverse viewing angles, to construct the dataset. Second, to ensure data diversity, the dataset covers a wide range of frame rates, resolutions, 419 classes of marine animals, and various static plants and terrains. Next, for task diversity, we adopted a structured design where observation targets are categorized into two major classes: biological and environmental. Each category includes content observation and change/action observation, totaling 20 subtask types. Finally, we designed several challenging evaluation metrics to enable quantitative comparison and analysis of different methods. Experiments on two representative VidLMs demonstrate that fine-tuning VidLMs on UVLM significantly improves underwater world understanding while also showing potential for slight improvements on existing in-air VidLM benchmarks.

AAAI Conference 2025 Conference Paper

Contrastive Auxiliary Learning with Structure Transformation for Heterogeneous Graphs

  • Wei Du
  • Hongmin Sun
  • Hang Gao
  • Gaoyang Li
  • Ying Li

In recent years, methods based on heterogeneous graph neural networks (HGNNs) have been widely used for embedding heterogeneous graphs (HGs) due to their ability to effectively encode the rich information from HGs into low-dimensional node embeddings. Existing HGNNs focus on neighbor aggregation and semantic fusion while neglecting the HG structure and learning paradigms. However, the original HG data might lack node features, which existing models may not effectively account for. Additionally, exclusively relying on a single supervised learning approach may only partially leverage the invariant information in graph data. To address these challenges, we introduce the Contrastive Auxiliary Learning Model for Heterogeneous Graphs (CALHG). This model combines edge perturbation and graph diffusion to enhance graph data, allowing it to capture the inherent structural information within heterogeneous graphs fully. Additionally, we employ a category-guided multi-view contrastive learning approach, which does not rely on positive and negative samples for model training, enabling us to capture the intrinsic invariances in heterogeneous graph data. Extensive experiments and analyses on five benchmark datasets without node features and three benchmark datasets with node features demonstrate the effectiveness and efficiency of our novel method compared with several state-of-the-art methods.

AAAI Conference 2025 Conference Paper

Dynamic Syntactic Feature Filtering and Injecting Networks for Cross-lingual Dependency Parsing

  • Jianjian Liu
  • Zhengtao Yu
  • Ying Li
  • Yuxin Huang
  • Shengxiang Gao

Parsers enhanced with pre-trained language models have achieved outstanding performance on rich-resource languages. Cross-lingual dependency parsing aims to learn useful knowledge from high-resource languages to alleviate data scarcity in low-resource languages. However, effectively reducing the syntactic structure distributional bias and uncovering the commonalities among languages is the key challenge for cross-lingual dependency parsing. To address this issue, we propose novel dynamic syntactic feature filtering and injecting networks based on the typical shared-private model that employs one shared and two private encoders to separate source and target language features. Concretely, a Language-Specific Filtering Network (LSFN) on the private encoders emphasizes helpful information from the source language and filters out its irrelevant or harmful parts. Meanwhile, a Language-Invariant Injecting Network (LIIN) on the shared encoder integrates the advantages of BiLSTM and improved Transformer encoders to transcend language boundaries, thus amplifying syntactic commonalities across languages. Experiments on seven benchmark datasets show that our model achieves an average absolute gain of 1.84 UAS and 3.43 LAS compared with the shared-private model. Comparative experiments validate that the LSFN and LIIN components are complementary in transferring beneficial knowledge from source to target languages. Detailed analyses highlight that our model can effectively capture linguistic commonalities and mitigate the effect of distributional bias, showcasing its robustness and efficacy.

NeurIPS Conference 2025 Conference Paper

FreqExit: Enabling Early-Exit Inference for Visual Autoregressive Models via Frequency-Aware Guidance

  • Ying Li
  • Chengfei Lyu
  • Huan Wang

Visual AutoRegressive (VAR) modeling employs a next-scale decoding paradigm that progresses from coarse structures to fine details. While enhancing fidelity and scalability, this approach challenges two fundamental assumptions of conventional dynamic inference: semantic stability (intermediate outputs approximating final results) and monotonic locality (smooth representation evolution across layers), which renders existing dynamic inference methods ineffective for VAR models. To address this challenge, we propose FreqExit, a unified training framework that enables dynamic inference in VAR without altering its architecture or compromising output quality. FreqExit is based on a key insight: high-frequency details are crucial for perceptual quality and tend to emerge only in later decoding stages. Leveraging this insight, we design targeted mechanisms that guide the model to learn more effectively through frequency-aware supervision. The proposed framework consists of three components: (1) a curriculum-based supervision strategy with progressive layer dropout and early exit loss; (2) a wavelet-domain high-frequency consistency loss that aligns spectral content across different generation steps; and (3) a lightweight self-supervised frequency-gated module that guides adaptive learning of both structural and detailed spectral components. On ImageNet 256×256, FreqExit achieves up to 2× speedup with only minor degradation, and delivers 1.3× acceleration without perceptible quality loss. This enables runtime-adaptive acceleration within a unified model, offering a favorable trade-off between efficiency and fidelity for practical and flexible deployment.

IJCAI Conference 2025 Conference Paper

FreqMoE: Dynamic Frequency Enhancement for Neural PDE Solvers

  • Tianyu Chen
  • Haoyi Zhou
  • Ying Li
  • Hao Wang
  • Zhenzhe Zhang
  • Tianchen Zhu
  • Shanghang Zhang
  • Jianxin Li

Fourier Neural Operators (FNO) have emerged as promising solutions for efficiently solving partial differential equations (PDEs) by learning infinite-dimensional function mappings through frequency domain transformations. However, the sparsity of high-frequency signals limits computational efficiency for high-dimensional inputs, and fixed-pattern truncation often causes high-frequency signal loss, reducing performance in scenarios such as high-resolution inputs or long-term predictions. To address these challenges, we propose FreqMoE, an efficient and progressive training framework that exploits the dependency of high-frequency signals on low-frequency components. The model first learns low-frequency weights and then applies a sparse upward-cycling strategy to construct a mixture of experts (MoE) in the frequency domain, effectively extending the learned weights to high-frequency regions. Experiments on both regular and irregular grid PDEs demonstrate that FreqMoE achieves up to 16.6% accuracy improvement while using merely 2.1% of the parameters (a 47.32× reduction) compared to dense FNO. Furthermore, the approach demonstrates remarkable stability in long-term predictions and generalizes seamlessly to various FNO variants and grid structures, establishing a new "Low-frequency Pretraining, High-frequency Fine-tuning" paradigm for solving PDEs.
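The fixed-pattern truncation the abstract identifies as FNO's weakness can be illustrated with a toy 1D example: keeping only the lowest few Fourier modes reconstructs low-frequency content exactly while discarding high-frequency detail. This sketch is illustrative only and is not the FreqMoE implementation; all names and parameters are assumptions.

```python
import numpy as np

def truncate_low_freq(signal, modes):
    """Keep only the lowest `modes` Fourier coefficients (fixed pattern)."""
    coeffs = np.fft.rfft(signal)
    kept = np.zeros_like(coeffs)
    kept[:modes] = coeffs[:modes]          # fixed low-frequency mask
    return np.fft.irfft(kept, n=len(signal))

t = np.linspace(0.0, 1.0, 256, endpoint=False)
low = np.sin(2 * np.pi * 3 * t)            # low-frequency component (mode 3)
high = 0.5 * np.sin(2 * np.pi * 40 * t)    # high-frequency detail (mode 40)
sig = low + high

recon = truncate_low_freq(sig, modes=8)    # keeps modes 0..7 only
err_low = np.max(np.abs(recon - low))      # low band survives intact
err_sig = np.max(np.abs(recon - sig))      # high band is lost entirely
```

The reconstruction matches the low-frequency component to machine precision while the mode-40 detail vanishes, which is exactly the loss that motivates extending learned weights into high-frequency regions.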

JBHI Journal 2025 Journal Article

Localized Intra- and Inter-Tumoral Heterogeneity for Predicting Treatment Response to Neoadjuvant Chemotherapy in Breast Cancer

  • Yinhao Liang
  • Wenjie Tang
  • Qingcong Kong
  • Ting Wang
  • Jianjun Zhang
  • Wing W. Y. Ng
  • Siyi Chen
  • Ying Li

This study proposes a novel method for extracting breast cancer tumor heterogeneity descriptors to non-invasively predict whether pathological complete response (pCR) can be achieved after neoadjuvant chemotherapy (NAC). These localized descriptors extract corresponding heterogeneity features for different radiomic features and are able to capture tumor characteristics at various localization levels. These descriptors also capture tumor heterogeneity both at the individual tumor level and across the whole dataset, providing decision-making models with features that are both more effective and interpretable. We validated the effectiveness of the proposed features with the Kolmogorov-Arnold network (KAN) across multiple centers, yielding an AUC of 0.92 when combined with pathological features and demonstrating good performance in external datasets (AUCs of 0.84 and 0.81). Additionally, we transform the best model into a symbolic formula to intuitively explain the machine learning model's prediction process, showing how factors such as age, HER2, Ki-67 and heterogeneity influence the prediction. The symbolized model is consistent with the experience of clinical experts, which enhances users' confidence in deep models. The experimental results show that our proposed features and method outperform classical heterogeneity features and end-to-end neural networks with a small additional computational cost.

ICLR Conference 2025 Conference Paper

LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation

  • Can Jin
  • Ying Li
  • Mingyu Zhao
  • Shiyu Zhao 0001
  • Zhenting Wang
  • Xiaoxiao He
  • Ligong Han
  • Tong Che

Visual prompting has gained popularity as a method for adapting pre-trained models to specific tasks, particularly in the realm of parameter-efficient tuning. However, existing visual prompting techniques often pad the prompt parameters around the image, limiting the interaction between the visual prompts and the original image to a small set of patches while neglecting the inductive bias present in shared information across different patches. In this study, we conduct a thorough preliminary investigation to identify and address these limitations. We propose a novel visual prompt design, introducing Low-Rank matrix multiplication for Visual Prompting (LoR-VP), which enables shared and patch-specific information across rows and columns of image pixels. Extensive experiments across seven network architectures and four datasets demonstrate significant improvements in both performance and efficiency compared to state-of-the-art visual prompting methods, achieving up to 6× faster training times, utilizing 18× fewer visual prompt parameters, and delivering a 3.1% improvement in performance.
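The low-rank construction described above can be sketched in a few lines: the prompt is the product of two thin factors, so every pixel position shares parameters along its row and column instead of learning independent padded pixels. The shapes, rank, and variable names below are illustrative assumptions, not the LoR-VP release.

```python
import numpy as np

H, W, C, r = 224, 224, 3, 4                # image size, channels, assumed rank
rng = np.random.default_rng(0)

# Two learnable low-rank factors stand in for a padded pixel prompt.
B = rng.normal(scale=0.01, size=(H, r))    # column factor (per-row sharing)
A = rng.normal(scale=0.01, size=(r, W))    # row factor (per-column sharing)

prompt = B @ A                             # (H, W) rank-r prompt map
image = rng.random((H, W, C))              # dummy input image

prompted = image + prompt[..., None]       # broadcast the prompt over channels

# Parameter comparison: H*r + r*W factors vs. a dense H*W pixel prompt.
n_lowrank, n_dense = H * r + r * W, H * W
```

With rank 4 this uses 1,792 prompt parameters instead of 50,176 for a dense per-pixel map, which is the kind of reduction the abstract's parameter-efficiency claim points at.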

NeurIPS Conference 2025 Conference Paper

Multi-View Oriented GPLVM: Expressiveness and Efficiency

  • Zi Yang
  • Ying Li
  • Zhidi Lin
  • Michael Minyi Zhang
  • Pablo Olmos

The multi-view Gaussian process latent variable model (MV-GPLVM) aims to learn a unified representation from multi-view data but is hindered by challenges such as limited kernel expressiveness and low computational efficiency. To overcome these issues, we first introduce a new duality between the spectral density and the kernel function. By modeling the spectral density with a bivariate Gaussian mixture, we then derive a generic and expressive kernel termed Next-Gen Spectral Mixture (NG-SM) for MV-GPLVMs. To address the inherent computational inefficiency of the NG-SM kernel, we propose a random Fourier feature approximation. Combined with a tailored reparameterization trick, this approximation enables scalable variational inference for both the model and the unified latent representations. Numerical evaluations across a diverse range of multi-view datasets demonstrate that our proposed method consistently outperforms state-of-the-art models in learning meaningful latent representations.
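The random Fourier feature (RFF) approximation the abstract relies on can be illustrated for the standard RBF kernel, where Bochner's theorem lets sampled spectral frequencies approximate the kernel as an inner product of explicit features. This is a generic sketch under that standard setup; the NG-SM kernel itself and all names here are not reproduced from the paper.

```python
import numpy as np

def rff_features(X, D=2000, lengthscale=1.0, seed=0):
    """Map X (n, d) to D random Fourier features approximating an RBF kernel."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=1.0 / lengthscale, size=(d, D))  # spectral samples
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)             # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))

Z = rff_features(X)
K_approx = Z @ Z.T                                        # O(n D) features

# Exact RBF kernel for comparison (lengthscale 1).
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-0.5 * sq)
err = np.max(np.abs(K_approx - K_exact))
```

Because the kernel becomes an inner product of finite features, downstream inference scales linearly in the number of points rather than cubically, which is the efficiency gain the abstract's variational scheme builds on.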

AAAI Conference 2025 Conference Paper

Safe Online Convex Optimization with Heavy-Tailed Observation Noises

  • Yunhao Yang
  • Bo Xue
  • Yunzhi Hao
  • Ying Li
  • Yuanyu Wan

We investigate safe online convex optimization (SOCO), where each decision must satisfy a set of unknown linear constraints. Assuming that the unknown constraints can be observed with a sub-Gaussian noise for each chosen decision, previous studies have established a high-probability regret bound of O(T^{2/3}). However, this assumption may not hold in many practical scenarios. To address this limitation, in this paper, we relax the assumption to allow any noise that admits finite (1+ε)-th moments for some ε∈(0,1], and propose two algorithms that enjoy an O(T^{c_ε}) regret bound with high probability, where T is the time horizon and c_ε=(1+ε)/(1+2ε). The key idea of our two algorithms is to respectively utilize the median-of-means and truncation techniques to achieve accurate estimation under heavy-tailed noises. To the best of our knowledge, these are the first algorithms designed to handle SOCO with heavy-tailed observation noises.
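The median-of-means technique named above is a standard robust-estimation primitive: split the samples into blocks, average each block, and take the median of the block means, which stays near the true mean even when a plain average is dragged away by heavy-tailed observations. The block count and toy contaminated data below are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def median_of_means(samples, k=8):
    """Median of k per-block means: robust to heavy-tailed observation noise."""
    samples = np.asarray(samples, dtype=float)
    blocks = np.array_split(samples, k)          # k (nearly) equal blocks
    block_means = [b.mean() for b in blocks]
    return float(np.median(block_means))         # robust aggregate

rng = np.random.default_rng(0)
clean = rng.normal(loc=5.0, scale=1.0, size=2000)  # well-behaved observations
outliers = np.full(5, 1e6)                         # heavy-tail contamination
x = np.concatenate([clean, outliers])

est = median_of_means(x, k=20)                     # stays near the true mean 5
naive = x.mean()                                   # dragged far off by outliers
```

Only the few blocks containing outliers are corrupted, so the median over blocks ignores them; the plain mean, by contrast, is pulled thousands of units away.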

IS Journal 2025 Journal Article

SNNL: A Programming Language for SNN Development

  • Qinghui Xing
  • Zirun Li
  • Ying Li
  • Schahram Dustdar
  • Xin Du
  • Gang Pan
  • Shuiguang Deng

Spiking Neural Networks (SNNs) are gaining attention for their biological plausibility and energy efficiency. Advances in neuromorphic systems, which integrate hardware and software tools, are accelerating SNN implementation. Yet deploying SNNs on such platforms remains challenging due to model complexity and system heterogeneity, requiring flexible frameworks. Existing tools (e.g., PyNN, Brian2) show limited expressiveness for neuromorphic applications or poor cross-platform support. This paper proposes SNNL, a flexible domain-specific language for SNN development and deployment on neuromorphic hardware. SNNL decouples neuronal dynamics modeling from network topology specification: equation-based representations handle diverse neuron/synapse models, while hierarchical constructs define complex connectivity patterns. We present a Darwin3-targeted compiler with efficient code generation. Evaluations confirm SNNL achieves precise neuronal dynamic descriptions and flexible network configurations. This work bridges algorithm-hardware gaps in neuromorphic computing by enhancing programmability. Experimental results have demonstrated the feasibility of SNNL for developing SNNs on neuromorphic systems.

NeurIPS Conference 2025 Conference Paper

URDF-Anything: Constructing Articulated Objects with 3D Multimodal Language Model

  • Zhe Li
  • Xiang Bai
  • Jieyu Zhang
  • Zhuangzhe Wu
  • Che Xu
  • Ying Li
  • Chengkai Hou
  • Shanghang Zhang

Constructing accurate digital twins of articulated objects is essential for robotic simulation training and embodied AI world model building, yet historically requires painstaking manual modeling or multi-stage pipelines. In this work, we propose URDF-Anything, an end-to-end automatic reconstruction framework based on a 3D multimodal large language model (MLLM). URDF-Anything utilizes an autoregressive prediction framework based on point-cloud and text multimodal input to jointly optimize geometric segmentation and kinematic parameter prediction. It implements a specialized [SEG] token mechanism that interacts directly with point cloud features, enabling fine-grained part-level segmentation while maintaining consistency with the kinematic parameter predictions. Experiments on both simulated and real-world datasets demonstrate that our method significantly outperforms existing approaches regarding geometric segmentation (mIoU 17% improvement), kinematic parameter prediction (average error reduction of 29%), and physical executability (surpassing baselines by 50%). Notably, our method exhibits excellent generalization ability, performing well even on objects outside the training set. This work provides an efficient solution for constructing digital twins for robotic simulation, significantly enhancing the sim-to-real transfer capability.

AAAI Conference 2024 Conference Paper

Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images

  • Qingping Zheng
  • Yuanfan Guo
  • Jiankang Deng
  • Jianhua Han
  • Ying Li
  • Songcen Xu
  • Hang Xu

Stable diffusion, a generative model used in text-to-image synthesis, frequently encounters resolution-induced composition problems when generating images of varying sizes. This issue primarily stems from the model being trained on pairs of single-scale images and their corresponding text descriptions. Moreover, direct training on images of unlimited sizes is unfeasible, as it would require an immense number of text-image pairs and entail substantial computational expenses. To overcome these challenges, we propose a two-stage pipeline named Any-Size-Diffusion (ASD), designed to efficiently generate well-composed HD images of any size, while minimizing the need for high-memory GPU resources. Specifically, the initial stage, dubbed Any Ratio Adaptability Diffusion (ARAD), leverages a selected set of images with a restricted range of ratios to optimize the text-conditional diffusion model, thereby improving its ability to adjust composition to accommodate diverse image sizes. To support the creation of images at any desired size, we further introduce a technique called Fast Seamless Tiled Diffusion (FSTD) at the subsequent stage. This method allows for the rapid enlargement of the ASD output to any high-resolution size, avoiding seaming artifacts or memory overloads. Experimental results on the LAION-COCO and MM-CelebA-HQ benchmarks demonstrate that ASD can produce well-structured images of arbitrary sizes, cutting down the inference time by 2X compared to the traditional tiled algorithm. The source code is available at https://github.com/ProAirVerse/Any-Size-Diffusion.

IJCAI Conference 2024 Conference Paper

FBLG: A Local Graph Based Approach for Handling Dual Skewed Non-IID Data in Federated Learning

  • Yi Xu
  • Ying Li
  • Haoyu Luo
  • Xiaoliang Fan
  • Xiao Liu

In real-world situations, federated learning often needs to process non-IID (non-independent and identically distributed) data with multiple skews, causing inadequate model performance. Existing federated learning methods mainly focus on addressing a single skew of non-IID data, and hence the performance of global models can degrade when faced with dual skewed non-IID data caused by heterogeneous label distributions and sample sizes among clients. To address the problem of dual skewed non-IID data, in this paper we propose a federated learning algorithm based on local graphs, named FBLG. Specifically, to address the label distribution skew, we first construct a local graph based on clients' local losses and Jensen-Shannon (JS) divergence, so that similar clients can be selected for aggregation to ensure a highly consistent global model. Afterwards, to address the sample size skew, we design the objective function to favor clients with more samples, as models trained with more samples tend to carry more useful information. Experiments on four datasets with dual skewed non-IID data demonstrate that FBLG outperforms nine baseline methods and achieves up to 9% improvement in accuracy. Moreover, both theoretical analysis and experiments show that FBLG converges quickly.
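The Jensen-Shannon divergence used above as a client-similarity signal is a symmetric, bounded measure between two label distributions. A minimal sketch, with made-up client label histograms (the paper's graph construction also involves local losses, which are omitted here):

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions."""
    p = np.asarray(p, dtype=float) / np.sum(p)
    q = np.asarray(q, dtype=float) / np.sum(q)
    m = 0.5 * (p + q)                              # mixture distribution
    kl = lambda a, b: np.sum(a * np.log((a + eps) / (b + eps)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)         # symmetric, bounded by ln 2

# Hypothetical per-client label histograms over 3 classes.
client_a = [90, 5, 5]      # skewed toward class 0
client_b = [10, 45, 45]    # skewed away from class 0
client_c = [85, 10, 5]     # similar to client_a

d_ab = js_divergence(client_a, client_b)           # large: dissimilar clients
d_ac = js_divergence(client_a, client_c)           # small: similar clients
```

Clients with small pairwise JS divergence get linked in the local graph, so aggregation draws on clients whose label distributions actually resemble each other.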

AAAI Conference 2023 Conference Paper

Semi-attention Partition for Occluded Person Re-identification

  • Mengxi Jia
  • Yifan Sun
  • Yunpeng Zhai
  • Xinhua Cheng
  • Yi Yang
  • Ying Li

This paper proposes a Semi-Attention Partition (SAP) method to learn well-aligned part features for occluded person re-identification (re-ID). Currently, the mainstream methods employ either external semantic partition or attention-based partition, and the latter manner is usually better than the former one. Under this background, this paper explores a potential that the weak semantic partition can be a good teacher for the strong attention-based partition. In other words, the attention-based student can substantially surpass its noisy semantic-based teacher, contradicting the common sense that the student usually achieves inferior (or comparable) accuracy. A key to this effect is: the proposed SAP encourages the attention-based partition of the (transformer) student to be partially consistent with the semantic-based teacher partition through knowledge distillation, yielding the so-called semi-attention. Such partial consistency allows the student to have both consistency and reasonable conflict with the noisy teacher. More specifically, on the one hand, the attention is guided by the semantic partition from the teacher. On the other hand, the attention mechanism itself still has some degree of freedom to comply with the inherent similarity between different patches, thus gaining resistance against noisy supervision. Moreover, we integrate a battery of well-engineered designs into SAP to reinforce their cooperation (e.g., multiple forms of teacher-student consistency), as well as to promote reasonable conflict (e.g., mutual absorbing partition refinement and a supervision signal dropout strategy). Experimental results confirm that the transformer student achieves substantial improvement after this semi-attention learning scheme, and produces new state-of-the-art accuracy on several standard re-ID benchmarks.

IS Journal 2022 Journal Article

Xsickness in Intelligent Mobile Spaces and Metaverses

  • Ruichen Tan
  • Ruiyang Gao
  • Wenbo Li
  • Kai Cao
  • Ying Li
  • Chen Lv
  • Fei-Yue Wang
  • Dongpu Cao

Motion sickness is known to be a common problem that influences the comfort and work efficiency of human beings during their daily lives. With the proliferation of increasingly intelligent systems, the detection and mitigation of motion sickness will face more opportunities along with bigger challenges. On the one hand, integrated sensor technology in intelligent systems will provide more accurate and efficient methods for motion sickness detection. On the other hand, as cyber-physical systems have attracted increasing attention over the past two decades, cyber-physical-social systems have introduced and amplified the social characteristics of such systems. The interactions between physical space and cyber space increase the chance of sensory conflicts when people use intelligent systems, such as traveling in intelligent cockpits or using metaverse-related virtual reality devices. Multimodal interaction methods and larger screens will cause more sensory conflicts, and the symptoms will be more severe than those of traditional motion sickness. In this article, classifications are first introduced based on the causes of motion sickness. A new type of multifactorial motion sickness (Xsickness) is discussed, which can be expected to become common as intelligent systems develop. Then, the current state-of-the-art detection methods for motion sickness and cybersickness are summarized, and theoretical methods for Xsickness detection are discussed. Finally, mitigation methods based on motion reduction and four means of human perception are discussed, and innovative mitigation methods based on intelligent systems are also introduced.

JBHI Journal 2021 Journal Article

GCSBA-Net: Gabor-Based and Cascade Squeeze Bi-Attention Network for Gland Segmentation

  • Zhijie Wen
  • Ru Feng
  • Jingxin Liu
  • Ying Li
  • Shihui Ying

Colorectal cancer is the second and the third most common cancer in women and men, respectively. Pathological diagnosis is the “gold standard” for tumor diagnosis. Accurate segmentation of glands from tissue images is a crucial step in assisting pathologists in their diagnosis. Typical methods for gland segmentation form a dense image representation, ignoring its texture and multi-scale attention information. Therefore, we utilize a Gabor-based module to extract texture information at different scales and directions in histopathology images. This paper also designs a Cascade Squeeze Bi-Attention (CSBA) module. Specifically, we add an Atrous Cascade Spatial Pyramid (ACSP), a Squeeze Position Attention (SPA) module, and a Squeeze Channel Attention (SCA) module to model semantic correlation and maintain multi-level aggregation on the spatial pyramid with different dilations. Besides, to address the imbalance of data distribution and boundary blur, we propose a hybrid loss function that better delineates object boundaries. The experimental results show that the proposed method achieves state-of-the-art performance on the GlaS challenge dataset and the CRAG colorectal adenocarcinoma dataset, respectively.

IJCAI Conference 2021 Conference Paper

Knowledge-Aware Dialogue Generation via Hierarchical Infobox Accessing and Infobox-Dialogue Interaction Graph Network

  • Sixing Wu
  • Minghui Wang
  • Dawei Zhang
  • Yang Zhou
  • Ying Li
  • Zhonghai Wu

Due to limited knowledge carried by queries, traditional dialogue systems often face the dilemma of generating boring responses, leading to poor user experience. To alleviate this issue, this paper proposes a novel infobox knowledge-aware dialogue generation approach, HITA-Graph, with three unique features. First, open-domain infobox tables that describe entities with relevant attributes are adopted as the knowledge source. An order-irrelevance Hierarchical Infobox Table Encoder is proposed to represent an infobox table at three levels of granularity. In addition, an Infobox-Dialogue Interaction Graph Network is built to effectively integrate the infobox context and the dialogue context into a unified infobox representation. Second, a Hierarchical Infobox Attribute Attention mechanism is developed to access the encoded infobox knowledge at different levels of granularity. Last but not least, a Dynamic Mode Fusion strategy is designed to allow the Decoder to select a vocabulary word or copy a word from the given infobox/query. We extract infobox tables from Chinese Wikipedia and construct an infobox knowledge base. Extensive evaluation on an open-released Chinese corpus demonstrates the superior performance of our approach against several representative methods.

IJCAI Conference 2020 Conference Paper

TopicKA: Generating Commonsense Knowledge-Aware Dialogue Responses Towards the Recommended Topic Fact

  • Sixing Wu
  • Ying Li
  • Dawei Zhang
  • Yang Zhou
  • Zhonghai Wu

Insufficient semantic understanding of dialogue often leads to generic responses in generative dialogue systems. Recently, high-quality knowledge bases have been introduced to enhance dialogue understanding, as well as to reduce the prevalence of boring responses. Although such knowledge-aware approaches have shown tremendous potential, they typically utilize the knowledge in a black-box fashion. As a result, the generation process is neither controllable nor interpretable. In this paper, we introduce a topic fact-based commonsense knowledge-aware approach, TopicKA. Different from previous works, TopicKA generates responses conditioned not only on the query message but also on a topic fact with an explicit semantic meaning, which also controls the direction of generation. Topic facts are recommended by a recommendation network trained under the Teacher-Student framework. To integrate the recommendation network and the generation network, this paper designs four schemes: two non-sampling schemes and two sampling schemes. We collected and constructed a large-scale Chinese commonsense knowledge graph. Experimental results on an open Chinese benchmark dataset indicate that our model outperforms baselines in terms of both objective and subjective metrics.
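The split between non-sampling and sampling schemes for picking a topic fact can be illustrated with a toy recommender: score candidates against the query, then either take the argmax (non-sampling) or sample from the softmax (sampling). Dot-product scoring here is an illustrative stand-in for the paper's trained recommendation network.

```python
import numpy as np

def recommend_topic_fact(query_vec, fact_vecs, sample=False, rng=None):
    """Score candidate topic facts against the query; pick the argmax
    (a non-sampling scheme) or draw from the softmax (a sampling scheme)."""
    scores = fact_vecs @ query_vec
    probs = np.exp(scores - scores.max())   # stable softmax
    probs /= probs.sum()
    if sample:
        rng = rng or np.random.default_rng()
        return int(rng.choice(len(probs), p=probs)), probs
    return int(np.argmax(probs)), probs

# Toy example: 3 candidate facts, the second most similar to the query.
query = np.array([1.0, 0.0])
facts = np.array([[0.1, 0.9], [0.9, 0.1], [0.5, 0.5]])
idx, probs = recommend_topic_fact(query, facts)
```

The selected fact's embedding would then be fed to the generator alongside the query, making the generation direction explicit rather than black-box.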

IJCAI Conference 2019 Conference Paper

Self-attentive Biaffine Dependency Parsing

  • Ying Li
  • Zhenghua Li
  • Min Zhang
  • Rui Wang
  • Sheng Li
  • Luo Si

The current state-of-the-art dependency parsing approaches employ BiLSTMs to encode input sentences. Motivated by the success of Transformer-based machine translation, this work applies, for the first time, the self-attention mechanism to dependency parsing as a replacement for BiLSTM-based encoders, leading to competitive performance on both English and Chinese benchmark data. Based on a detailed error analysis, we then combine the power of both BiLSTM and self-attention via model ensembles, demonstrating their complementary capability of capturing contextual information. Finally, we explore the recently proposed contextualized word representations as extra input features, and further improve the parsing performance.
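The biaffine arc scorer that sits on top of either encoder (BiLSTM or self-attention) can be sketched in a few lines. Dimensions and the random features standing in for encoder output are placeholders, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 8                      # sentence length, encoder hidden size

H = rng.normal(size=(n, d))      # stand-in for encoder output (BiLSTM or
                                 # self-attention); random for illustration

# Biaffine arc scoring: score[i, j] = h_i^T U h_j + b^T h_j,
# i.e. how well token j serves as the head of dependent token i.
U = rng.normal(size=(d, d))
b = rng.normal(size=(d,))
scores = H @ U @ H.T + H @ b     # (n, n); the head bias broadcasts over rows
heads = scores.argmax(axis=1)    # greedy head choice for each token
```

In a real parser the greedy argmax is typically replaced by a maximum-spanning-tree decoder so the predicted heads form a valid tree.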

AAAI Conference 2018 Conference Paper

Early Prediction of Diabetes Complications from Electronic Health Records: A Multi-Task Survival Analysis Approach

  • Bin Liu
  • Ying Li
  • Zhaonan Sun
  • Soumya Ghosh
  • Kenney Ng

Type 2 diabetes mellitus (T2DM) is a chronic disease that usually results in multiple complications. Early identification of individuals at risk for complications after being diagnosed with T2DM is of significant clinical value. In this paper, we present a new data-driven predictive approach to predict when a patient will develop complications after the initial T2DM diagnosis. We propose a novel survival analysis method to model the time-to-event of T2DM complications designed to simultaneously achieve two important metrics: 1) accurate prediction of event times, and 2) good ranking of the relative risks of two patients. Moreover, to better capture the correlations of time-to-events of the multiple complications, we further develop a multi-task version of the survival model. To assess the performance of these approaches, we perform extensive experiments on patient-level data extracted from a large electronic health record claims database. The results show that our proposed survival analysis approach consistently outperforms traditional survival models and demonstrate the effectiveness of the multi-task framework over modeling each complication independently.
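The two objectives named above, accurate event times and correct relative risk ranking, can be combined into one loss. The toy function below uses a squared-error term plus a pairwise hinge ranking term; it is a sketch of the general idea, not the paper's exact objective, and it ignores censoring subtleties beyond restricting comparisons to observed events.

```python
import numpy as np

def hybrid_survival_loss(pred_time, time, event, alpha=0.5):
    """Toy loss combining (a) accurate event-time prediction and
    (b) correct relative ranking of patients (hinge surrogate).
    Illustrative sketch only, not the paper's objective."""
    pred_time = np.asarray(pred_time, float)
    time = np.asarray(time, float)
    event = np.asarray(event, bool)

    # (a) squared error on uncensored (observed-event) patients
    mse = float(np.mean((pred_time[event] - time[event]) ** 2)) if event.any() else 0.0

    # (b) pairwise ranking: if patient i's event occurs before time j,
    # the model should also predict an earlier time for i.
    hinge, pairs = 0.0, 0
    for i in np.flatnonzero(event):
        for j in range(len(time)):
            if time[i] < time[j]:
                hinge += max(0.0, 1.0 - (pred_time[j] - pred_time[i]))
                pairs += 1
    rank = hinge / pairs if pairs else 0.0
    return alpha * mse + (1 - alpha) * rank
```

A perfect prediction drives both terms to zero; swapping two patients' predicted times hurts both the error term and the ranking term.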

ICRA Conference 2002 Conference Paper

An Analytical Grasp Planning on Given Object with Multifingered Hand

  • Ying Li
  • Yong Yu 0003
  • Showzow Tsujio

In this paper, an analytical approach is proposed for planning the finger positions for grasping an object with a multifingered hand. First, a method is given to determine which combinations of the object's edges can be used for grasping. Then, a graspable finger position region (GFPR) on a combination of edges is defined, within which the object can be held successfully. It is shown that the region is bounded by several boundary hyperplanes. By combining these boundary hyperplanes, two propositions for analytically and exactly obtaining the GFPR are proposed. An algorithm is then proposed to find the stable GFPR that contains the largest inscribed hypersphere and has the largest volume. Finally, a numerical example demonstrates the effectiveness of the proposed grasp planning approach.
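A region bounded by boundary hyperplanes, as described above, is the intersection of half-spaces {x : Ax ≤ b}, so membership reduces to a single matrix-vector check. The 2-D unit square below is a toy illustration, not a grasp region from the paper.

```python
import numpy as np

def in_region(point, A, b, tol=1e-9):
    """A region bounded by hyperplanes is the intersection of half-spaces
    {x : A x <= b}; membership is one matrix-vector inequality check."""
    return bool(np.all(A @ np.asarray(point, float) <= b + tol))

# Toy 2-D 'region': the unit square written as four half-planes.
A = np.array([[ 1.0,  0.0],   #  x <= 1
              [-1.0,  0.0],   # -x <= 0  (i.e. x >= 0)
              [ 0.0,  1.0],   #  y <= 1
              [ 0.0, -1.0]])  # -y <= 0  (i.e. y >= 0)
b = np.array([1.0, 0.0, 1.0, 0.0])
```

Finding the largest inscribed hypersphere of such a region (the Chebyshev center) is a standard linear program over the same (A, b) data.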

IROS Conference 2001 Conference Paper

A novel analytical method for finger position regions on grasped object

  • Yong Yu 0003
  • Ying Li
  • Showzow Tsujio

An analytical approach is proposed for obtaining finger position regions on an object grasped with a multi-fingered hand. First, a method is given to determine which combinations of the object's edges can be used for grasping. Then, a graspable finger position region on a combination of edges is defined, within which the object can be held successfully. It is shown that the region is bounded by several boundary hyperplanes. By combining these boundary hyperplanes, two propositions for exactly obtaining the graspable finger position region analytically are proposed. Finally, numerical examples demonstrate the effectiveness of the proposed approach.