Arrow Research search

Author name cluster

Yan Zhao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

24 papers
2 author rows

Possible papers


AAAI Conference 2026 Conference Paper

A Brain-Inspired Saliency Prediction Framework for Human-AI Cognitive Consistency in AIGC Content via Multi-Region Liquid Neurons

  • Shibo Wang
  • Yan Zhao
  • Shigang Wang
  • Jian Wei
  • Shuo Li

In recent years, human-AI cognitive consistency has emerged as a crucial perspective for evaluating the perceptual quality and interpretability of AIGC (Artificial Intelligence Generated Content). This paper proposes a biologically inspired saliency prediction framework that models six core regions of the human visual system—namely V1, V2, V4, MT, LIP, and FEF—using liquid neurons to capture the dynamic saliency features aligned with human gaze behavior. To enable effective alignment between AIGC models and human cognitive mechanisms, we introduce a cross-domain dual-teacher distillation strategy and construct a large-scale multimodal dataset comprising natural images, eye-tracking data, AIGC-generated images, and their corresponding cross-attention maps. Furthermore, we propose HAMCI (Human-AI Mutual Cognitive Index), a novel metric designed to quantitatively assess the spatial and semantic alignment between predicted saliency maps and model attention distributions. The proposed method demonstrates promising performance across various saliency prediction and cognitive alignment tasks, with results comparable to or surpassing recent state-of-the-art methods in several benchmarks. The code and dataset will be released upon acceptance to facilitate future research on cognitively aligned AIGC evaluation.

AAAI Conference 2026 Conference Paper

D-FCGS: Feedforward Compression of Dynamic Gaussian Splatting for Free-Viewpoint Videos

  • Wenkang Zhang
  • Yan Zhao
  • Qiang Wang
  • Zhixin Xu
  • Li Song
  • Zhengxue Cheng

Free-Viewpoint Video (FVV) enables immersive 3D experiences, but efficient compression of dynamic 3D representation remains a major challenge. Existing dynamic 3D Gaussian Splatting methods couple reconstruction with optimization-dependent compression and customized motion formats, limiting generalization and standardization. To address this, we propose D-FCGS, a novel Feedforward Compression framework for Dynamic Gaussian Splatting. Key innovations include: (1) a standardized Group-of-Frames (GoF) structure with I-P coding, leveraging sparse control points to extract inter-frame motion tensors; (2) a dual prior-aware entropy model that fuses hyperprior and spatial-temporal priors for accurate rate estimation; (3) a control-point-guided motion compensation mechanism and refinement network to enhance view-consistent fidelity. Trained on Gaussian frames derived from multi-view videos, D-FCGS generalizes across diverse scenes in a zero-shot fashion. Experiments show that it matches the rate-distortion performance of optimization-based methods, achieving over 40 times compression compared to the baseline while preserving visual quality across viewpoints. This work advances feedforward compression of dynamic 3DGS, facilitating scalable FVV transmission and storage for immersive applications.

EAAI Journal 2026 Journal Article

Single-cell distillation discriminative clustering based on asymmetric autoencoder

  • Junliang Shang
  • Aitian Fan
  • Baojuan Qin
  • Yan Zhao
  • Xiaohan Zhang
  • Shoujia Jiang
  • Feng Li
  • Jin-Xing Liu

Single-cell ribonucleic acid sequencing (scRNA-seq) technology enables the analysis of tissue heterogeneity at the single-cell level, providing essential tools for tasks such as cell type identification and trajectory inference. As a core step in single-cell data analysis, cell clustering is crucial for identifying cell types, detecting subpopulations, and understanding cellular functional states. However, traditional clustering methods often fail to capture the true cellular structure due to the lack of prior knowledge, while supervised clustering methods are prone to domain distribution mismatches. To address these challenges, this paper proposes a single-cell distillation-based discriminative clustering method (scAADC), which leverages an asymmetric autoencoder to enhance feature extraction and reconstruction capabilities. Additionally, a contrastive learning strategy is incorporated to capture the most representative features of cells. Furthermore, a distillation-based discriminative clustering module is designed to utilize source domain labels and feature distance constraints, ensuring that similar cells cluster together while distinct cell types remain well separated. This allows the model to extract discriminative information from labeled source data. Finally, based on the extracted discriminative features, target data are clustered, implicitly aligning domain distributions and reducing cluster overlap. We evaluate scAADC on both simulated and real datasets. Experimental results demonstrate that scAADC achieves accuracy (ACC) and Adjusted Rand Index (ARI) values as high as 0.9976 and 0.9936 on simulated datasets, and 0.9728 and 0.9325 on real datasets, outperforming other state-of-the-art single-cell clustering methods. By integrating Artificial Intelligence-driven feature learning with cross-domain knowledge distillation, scAADC provides an efficient and robust solution for single-cell data analysis.
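For readers unfamiliar with the Adjusted Rand Index reported in this abstract, the following is a minimal sketch of the standard ARI definition in plain Python. It is not taken from the scAADC paper; the function name and the toy labels are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard ARI: pair-counting agreement corrected for chance."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: trivially identical partitions

    # Pairs that fall in the same cell of the contingency table.
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    # Pairs that share a true label, and pairs that share a predicted label.
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / total_pairs
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. every sample in one cluster
    return (sum_cells - expected) / (max_index - expected)
```

ARI is invariant to how clusters are named, so a prediction that swaps cluster ids but keeps the grouping still scores 1.0. Clustering accuracy (ACC) additionally needs an optimal matching between cluster ids and class labels, which this sketch does not cover.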

EAAI Journal 2025 Journal Article

A novel wasserstein autoencoder-enhanced thermo-mechanical coupled reduced-order model for high pressure turbine blades life monitoring

  • Rongqiao Wang
  • Ruoqi Chen
  • Yan Zhao
  • Tianbao Shen
  • Gaoxiang Chen
  • Dianyin Hu
  • Zhimin Jiang
  • Xuemin Wang

To facilitate effective assessment of loads (e.g., temperature and stress) and life management of high-pressure turbine (HPT) blades, a Wasserstein autoencoder (WAE)-enhanced thermodynamically coupled reduced-order model (ROM) is proposed in this paper. The advanced ROM for the nonlinear thermomechanical coupling fields is developed by introducing the deep learning model of WAE in the proper orthogonal decomposition (POD) method. The proposed method improves the prediction accuracy of loads in locally focused regions and generalization performance. The accuracy and efficiency of this method are validated through 30 sets of validation conditions. Results indicate that the proposed approach achieves higher accuracy and better generalization performance than traditional POD-based methods, with errors maintained within 10. Additionally, computational speed is improved by nearly 1400 times compared to conventional numerical methods. The WAE-enhanced ROM is applied for load and life assessment of the HPT blades throughout their service life. The evaluation time for a single aeroengine performance parameter is 1.7 s, and for a single flight evaluation, it is 67 s, which highlights the effectiveness of the proposed method in enabling the assessment of the loads and remaining life of HPT blades.

JBHI Journal 2025 Journal Article

Cell-Level Free Cervical Lesion Detection in Cytology Images Via Weakly Supervised Self-Correction

  • Jiayi Wu
  • Yan Zhao
  • Chinmay Chakraborty
  • Sandeep Kumar Thota
  • Jingmin Xin
  • Keping Yu

Cervical cancer remains the fourth most common cancer among women worldwide. Early detection of cervical lesions in cytology images can prevent disease progression, but current deep learning methods for cell- or patch-level analysis in whole slide images (WSI) face significant challenges due to limited, noisy, or incomplete annotations. To address these limitations, weakly supervised learning methods, particularly multiple instance learning (MIL), have been explored. However, traditional MIL methods often suffer from label noise, leading to inaccurate feature extraction, which in turn restricts their robustness and generalization. In this paper, we propose Self-Correcting Instance Learning (SCIL), a novel two-stage instance-based MIL framework designed to enhance instance-level cervical lesion detection under bag-level supervision. SCIL incorporates a weakly supervised self-correction mechanism within a teacher-student architecture to mitigate the effects of noisy pseudo labels. This process involves a contrastive dynamic weighting strategy to adjust instance-level loss and enhance feature representation in stage one, followed by an uncertainty-based self-correction strategy in stage two to retain only high-confidence data with reassigned labels. Extensive evaluations of a slide cervical cytology image dataset demonstrate that SCIL significantly improves the detection of cervical lesions at both the patch and slide levels, highlighting its ability to overcome the limitations of imperfect data in cervical lesion detection.

AAAI Conference 2025 Conference Paper

L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression

  • Junxuan Zhang
  • Zhengxue Cheng
  • Yan Zhao
  • Shihao Wang
  • Dajiang Zhou
  • Guo Lu
  • Li Song

Learning-based probabilistic models can be combined with an entropy coder for data compression. However, due to the high complexity of learning-based models, their practical application as text compressors has been largely overlooked. To address this issue, our work focuses on a low-complexity design while maintaining compression performance. We introduce a novel Learned Lossless Low-complexity Text Compression method (L3TC). First, we conduct extensive experiments demonstrating that RWKV models achieve the fastest decoding speed with a moderate compression ratio, making them the most suitable backbone for our method. Second, we propose an outlier-aware tokenizer that uses a limited vocabulary to cover frequent tokens while allowing outliers to bypass the prediction and encoding. Third, we propose a novel high-rank reparameterization strategy that enhances the learning capability during training without increasing complexity during inference. Experimental results validate that our method achieves 48% bit saving compared to the gzip compressor. Besides, L3TC offers compression performance comparable to other learned compressors, with a 50x reduction in model parameters. More importantly, L3TC is the fastest among all learned compressors, providing real-time decoding speeds up to megabytes per second.
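The opening sentence of this abstract rests on a general fact: an entropy coder driven by a probabilistic model spends roughly -log2 p bits on a symbol the model predicted with probability p. The sketch below illustrates that relationship only. The two toy models (a uniform model and a Laplace-smoothed adaptive frequency model) are hypothetical stand-ins and have nothing to do with RWKV or the L3TC tokenizer.

```python
import math
from collections import Counter


def ideal_code_length_bits(symbols, predict):
    """Sum of -log2 p(symbol | history): the cost an ideal entropy coder approaches."""
    total_bits = 0.0
    history = []
    for symbol in symbols:
        probability = predict(history)[symbol]
        total_bits += -math.log2(probability)
        history.append(symbol)
    return total_bits


def uniform_model(alphabet):
    """Hypothetical baseline: every symbol is equally likely at every step."""
    p = 1.0 / len(alphabet)
    return lambda history: {s: p for s in alphabet}


def adaptive_model(alphabet):
    """Hypothetical order-0 model: Laplace-smoothed counts of the symbols seen so far."""
    def predict(history):
        counts = Counter(history)
        total = len(history) + len(alphabet)
        return {s: (counts[s] + 1) / total for s in alphabet}
    return predict
```

On a repetitive input such as `"aaaaaaaa"` over the alphabet `"ab"`, the adaptive model needs fewer ideal bits than the uniform one, which is the sense in which a better predictor yields a better compressor.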

NeurIPS Conference 2025 Conference Paper

MuRating: A High Quality Data Selecting Approach to Multilingual Large Language Model Pretraining

  • Zhixun Chen
  • Ping Guo
  • Wenhan Han
  • Yifan Zhang
  • Binbin Liu
  • Haobin Lin
  • Fengze Liu
  • Yan Zhao

Data quality is a critical driver of large language model performance, yet existing model-based selection methods focus almost exclusively on English, neglecting other languages that are essential in the training mix for multilingual LLMs. We introduce MuRating, a scalable framework that transfers high-quality English data-quality signals into a multilingual autorater, capable of handling 17 languages. MuRating aggregates multiple English autoraters via pairwise comparisons to learn unified document quality scores, then projects these judgments through translation to train a multilingual evaluator on monolingual, cross-lingual, and parallel text pairs. Applied to web data, MuRating selects balanced subsets of English and multilingual content to pretrain LLaMA-architecture models of 1.2B and 7B parameters. Compared to strong baselines, including QuRater, FineWeb2-HQ, AskLLM, and DCLM, our approach increases average accuracy on both English benchmarks and multilingual evaluations. Extensive analyses further validate that pairwise training provides greater stability and robustness than pointwise scoring, underscoring the effectiveness of MuRating as a general multilingual data-selection framework.
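One common way to turn pairwise comparisons into a single score per item is a Bradley-Terry model. The abstract does not say which model MuRating uses, so the sketch below is an assumption chosen for illustration; the function name, learning rate, and epoch count are all hypothetical.

```python
import math


def fit_bradley_terry(num_items, comparisons, lr=0.1, epochs=500):
    """Fit one score per item from (winner, loser) index pairs.

    Assumed model: P(i beats j) = sigmoid(score_i - score_j),
    fitted by plain gradient ascent on the log-likelihood.
    """
    scores = [0.0] * num_items
    for _ in range(epochs):
        grads = [0.0] * num_items
        for winner, loser in comparisons:
            p_win = 1.0 / (1.0 + math.exp(-(scores[winner] - scores[loser])))
            grads[winner] += 1.0 - p_win
            grads[loser] -= 1.0 - p_win
        for i in range(num_items):
            scores[i] += lr * grads[i] / len(comparisons)
        # Scores are only defined up to a shift, so keep them centered at zero.
        mean = sum(scores) / num_items
        scores = [s - mean for s in scores]
    return scores
```

Each `(winner, loser)` pair could come from a different rater, so the fitted scores act as an aggregate over raters whose raw scales are not directly comparable.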

JBHI Journal 2025 Journal Article

pscAdapt: Pre-Trained Domain Adaptation Network Based on Structural Similarity for Cell Type Annotation in Single Cell RNA-seq Data

  • Yan Zhao
  • Junliang Shang
  • Baojuan Qin
  • Limin Zhang
  • Xin He
  • Daohui Ge
  • Qianqian Ren
  • Jin-Xing Liu

Cell type annotation refers to the process of categorizing and labeling cells to identify their specific cell types, which is crucial for understanding cell functions and biological processes. Although many methods have been developed for automated cell type annotation, they often encounter challenges such as batch effects due to variations in data distribution across platforms and species, thereby compromising their performance. To address batch effects, in this study, a pre-trained domain adaptation model based on structural similarity, named pscAdapt, is proposed for cell type annotation. Specifically, a pre-training strategy is employed to initialize model parameters to learn the data distribution of the source domain. This strategy is also combined with an adversarial learning strategy to train the domain adaptation network for achieving domain-level alignment and reducing domain discrepancy. Furthermore, to better distinguish different types of cells, a structural similarity loss is designed, aiming to shorten distances between cells of the same type and increase distances between cells of different types in feature space, thus achieving cell-level alignment and enhancing the discriminability of cell types. Comprehensive experiments were conducted on simulated datasets, cross-platform datasets, and cross-species datasets to validate the effectiveness of pscAdapt; the results demonstrate that pscAdapt outperforms several popular cell type annotation methods.

NeurIPS Conference 2025 Conference Paper

SPOT-Trip: Dual-Preference Driven Out-of-Town Trip Recommendation

  • Yinghui Liu
  • Hao Miao
  • Guojiang Shen
  • Yan Zhao
  • Xiangjie Kong
  • Ivan Lee

Out-of-town trip recommendation aims to generate a sequence of Points of Interest (POIs) for users traveling from their hometowns to previously unvisited regions based on personalized itineraries, e.g., origin, destination, and trip duration. Modeling the complex user preferences, which often exhibit a two-fold nature of static and dynamic interests, is critical for effective recommendations. However, the sparsity of out-of-town check-in data presents significant challenges in capturing such user preferences. Meanwhile, existing methods often conflate the static and dynamic preferences, resulting in suboptimal performance. In this paper, we for the first time systematically study the problem of out-of-town trip recommendation. A novel framework, SPOT-Trip, is proposed to explicitly learn the dual static-dynamic user preferences. Specifically, to handle scarce data, we construct a POI attribute knowledge graph to enrich the semantic modeling of users’ hometown and out-of-town check-ins, enabling the static preference modeling through attribute relation-aware aggregation. Then, we employ neural ordinary differential equations (ODEs) to capture the continuous evolution of latent dynamic user preferences and innovatively combine a temporal point process to describe the instantaneous probability of each preference behavior. Further, a static-dynamic fusion module is proposed to merge the learned static and dynamic user preferences. Extensive experiments on real data offer insight into the effectiveness of the proposed solutions, showing that SPOT-Trip achieves performance improvement by up to 17.01%.

EAAI Journal 2024 Journal Article

A study on attention-based fine-grained image recognition: Towards musical instrument performing hand shape assessment

  • Wenting Zhao
  • Shigang Wang
  • Yan Zhao
  • Jian Wei
  • Yecheng Liang
  • Jiehua Lin
  • Yu Tang

Automatic identification and professional evaluation make musical instrument learning more intelligent. Since a proper hand shape is the basis of fingering when playing instruments, this paper explores integrating intelligent recognition techniques into the hand shape assessment of instrument players, taking the Chinese zither (Zheng) as an example. Fine-grained image recognition is applied in a novel way to automatically assess basic hand shapes, as a tentative exploration of interdisciplinary research. First, this paper formulates an assessment scale by combining fine-grained image features with hand shape evaluation indicators in musical instrument learning. Then, an image dataset for hand shapes of Chinese zither performance (CZ-Dataset V2) is established based on free multi-view acquisition. Finally, we propose a fine-grained hand shape image recognition method using an attention mechanism. Experimental results show that the basic instrumental hand shapes can be effectively recognized and reasonable suggestions for hand shape assessment can be provided.

EAAI Journal 2024 Journal Article

A user review data-driven supplier ranking model using aspect-based sentiment analysis and fuzzy theory

  • Bingli Sun
  • Xiao Song
  • Wenxin Li
  • Lu Liu
  • Guanghong Gong
  • Yan Zhao

Background: The supplier selection problem is a sophisticated decision-making process that involves evaluating multiple factors. While previous research has primarily focused on objective attributes, such as supplier qualifications, product quality, and price, the subjective opinions of users have often been overlooked. However, with the growing importance of user reviews and sentiment analysis in e-commerce, incorporating users’ opinions on supplier products can provide valuable insights. Purpose: This study aims to address the limitations of existing supplier selection approaches by proposing a comprehensive framework that integrates aspect-level sentiment analysis and a fuzzy multi-attribute decision model. The goal is to enhance the decision-making process by considering both objective attributes and subjective opinions. Methods: To achieve this, we develop a novel convolutional neural network (CNN) model with a gating mechanism to perform aspect-level sentiment analysis. Furthermore, we propose a fuzzy multi-attribute decision model that combines the predefined sentiment aspects with traditional evaluation criteria. The model is applied to a dataset specifically designed for automotive component supplier selection. Results: Experimental results demonstrate the superior performance of our approach compared to existing methods and datasets. A case study demonstrates that the combination of aspect-level sentiment analysis and the fuzzy decision model allows for a more comprehensive evaluation of suppliers. Conclusion: By integrating aspect-level sentiment analysis and the fuzzy multi-attribute decision model, our proposed framework offers a novel perspective on supplier selection problems. The results highlight the feasibility and superiority of our approach, providing valuable insights for management in making informed decisions. This research contributes to the fields of supplier selection, sentiment analysis, and decision-making, with potential applications in various industries beyond the automotive sector.

ECAI Conference 2023 Conference Paper

FedCoop: Cooperative Federated Learning for Noisy Labels

  • Kahou Tam
  • Li Li 0064
  • Yan Zhao
  • ChengZhong Xu 0001

Federated Learning coordinates multiple clients to collaboratively train a shared model while preserving data privacy. However, the training data with noisy labels located on the participating clients severely harm the model performance. In this paper, we propose FedCoop, a cooperative Federated Learning framework for noisy labels. FedCoop mainly contains three components and conducts robust training in two phases, data selection and model training. In the data selection phase, in order to mitigate the confirmation bias caused by a single client, the Loss Transformer intelligently estimates the probability of each sample’s label being clean through cooperating with the helper clients, which have high data trustability and similarity. After that, the Feature Comparator evaluates the label quality for each sample in terms of latent feature space in order to further improve the robustness of noisy label detection. In the model training phase, the Feature Matcher trains the model on both the noisy and clean data in a semi-supervised manner to fully utilize the training data and exploits the feature of global class to increase the consistency of pseudo labeling across the clients. The experimental results show FedCoop outperforms the baselines on various datasets with different noise settings. It effectively improves the model accuracy by up to 62%, and by 27% on average, compared with the baselines.

JBHI Journal 2023 Journal Article

FSTIF-UNet: A Deep Learning-Based Method Towards Automatic Segmentation of Intracranial Aneurysms in Un-Reconstructed 3D-RA

  • Jianhua Zhang
  • Yan Zhao
  • Xinke Liu
  • Jincheng Jiang
  • Youxiang Li

Segmentation of intracranial aneurysms (IAs) is an important step for the diagnosis and treatment of IAs. However, the process by which clinicians manually recognize and localize IAs is overly labor intensive. This study aims to develop a deep-learning-based framework (defined as FSTIF-UNet) towards IAs segmentation in un-reconstructed 3D Rotational Angiography (3D-RA) images. 3D-RA sequences from 300 patients with IAs from Beijing Tiantan Hospital are enrolled. Inspired by radiologists' clinical skills, a Skip-Review attention mechanism is proposed to repeatedly fuse the long-term spatiotemporal features of several images with the most obvious IA's features (selected by a pre-detection network). Then, a Conv-LSTM is used to fuse the short-term spatiotemporal features of the selected 15 3D-RA images from the equally-spaced viewing angles. The combination of the two modules realizes the full-scale spatiotemporal information fusion of the 3D-RA sequence. FSTIF-UNet achieves DSC, IoU, Sens, Haus, and F1-Score of 0.9109, 0.8586, 0.9314, 1.358 and 0.8883, respectively, and time taken for network segmentation is 0.89 s/case. The results show significant improvement in IA segmentation performance with FSTIF-UNet compared with baseline networks (with DSC from 0.8486 to 0.8794). The proposed FSTIF-UNet establishes a practical method to assist the radiologists in clinical diagnosis.
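The DSC (Dice similarity coefficient) and IoU figures quoted in this abstract are standard overlap metrics for segmentation masks. As a minimal sketch of their textbook definitions, assuming masks are given as flat sequences of 0/1 values (the function name and input layout are illustrative, not from the paper):

```python
def dice_and_iou(pred_mask, true_mask):
    """Return (Dice, IoU) for two binary masks given as flat 0/1 sequences."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    pred_size = sum(1 for p in pred_mask if p)
    true_size = sum(1 for t in true_mask if t)
    union = pred_size + true_size - intersection
    if union == 0:
        return 1.0, 1.0  # both masks empty: treat as perfect agreement
    dice = 2 * intersection / (pred_size + true_size)
    iou = intersection / union
    return dice, iou
```

For a single pair of masks the two metrics are tied by Dice = 2·IoU / (1 + IoU). Averages over many cases, as benchmark tables typically report, need not satisfy that identity.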

NeurIPS Conference 2023 Conference Paper

Learning Environment-Aware Affordance for 3D Articulated Object Manipulation under Occlusions

  • Ruihai Wu
  • Kai Cheng
  • Yan Zhao
  • Chuanruo Ning
  • Guanqi Zhan
  • Hao Dong

Perceiving and manipulating 3D articulated objects in diverse environments is essential for home-assistant robots. Recent studies have shown that point-level affordance provides actionable priors for downstream manipulation tasks. However, existing works primarily focus on single-object scenarios with homogeneous agents, overlooking the realistic constraints imposed by the environment and the agent's morphology, e.g., occlusions and physical limitations. In this paper, we propose an environment-aware affordance framework that incorporates both object-level actionable priors and environment constraints. Unlike object-centric affordance approaches, learning environment-aware affordance faces the challenge of combinatorial explosion due to the complexity of various occlusions, characterized by their quantities, geometries, positions and poses. To address this and enhance data efficiency, we introduce a novel contrastive affordance learning framework capable of training on scenes containing a single occluder and generalizing to scenes with complex occluder combinations. Experiments demonstrate the effectiveness of our proposed approach in learning affordance considering environment constraints.

EAAI Journal 2023 Journal Article

Optimal production scheduling with multi-round information interaction for demander-dominated decentralized scheduling problem

  • Like Zhang
  • Qianwang Deng
  • Xiaoyu Wen
  • Yan Zhao
  • Guiliang Gong

Demander-dominated market scenarios are becoming increasingly common owing to the emergence of alternative service providers and competitive market environment. However, these scenarios have not been considered in previous studies pertaining to decentralized scheduling problems. Thus, service providers cannot formulate optimal production scheduling schemes for such scenarios. In this study, we investigate a demander-dominated decentralized scheduling problem in which the demander can adopt the private-strategic behavior of transferring partial orders towards its alternative service providers. The aim of this study is to provide guidance to service providers for developing high-quality production scheduling solutions under asymmetric information. First, we design a multi-round information interaction mechanism with a learning strategy to realize information interaction. Subsequently, a metaheuristic algorithm termed MAM is developed based on the multi-round information interaction mechanism to solve the proposed problem. A problem-dependent initialization method and a solution generation method integrating the learning strategy are developed to improve the search efficiency. Experimental results indicate the usefulness of the initialization method and learning strategy. Based on a comparison with two well-established adapted algorithms, the effectiveness of the proposed algorithm is confirmed, particularly for instances with loose due dates. Furthermore, we analyze the effect of MAM on both the service provider and the demander by comparing it with traditional centralized approaches. Statistical results show that the proposed algorithm yields high-quality solutions for the service provider and that maintaining the confidentiality of private information is conducive to the demander, particularly when the due dates are tight.

AAAI Conference 2022 Conference Paper

Adaptive Logit Adjustment Loss for Long-Tailed Visual Recognition

  • Yan Zhao
  • Weicong Chen
  • Xu Tan
  • Kai Huang
  • Jihong Zhu

Data in the real world tends to exhibit a long-tailed label distribution, which poses great challenges for the training of neural networks in visual recognition. Existing methods tackle this problem mainly from the perspective of data quantity, i.e., the number of samples in each class. To be specific, they pay more attention to tail classes, like applying larger adjustments to the logit. However, in the training process, the quantity and difficulty of data are two intertwined and equally crucial problems. For some tail classes, the features of their instances are distinct and discriminative, which can also bring satisfactory accuracy; for some head classes, although with sufficient samples, the high semantic similarity with other classes and lack of discriminative features will bring bad accuracy. Based on these observations, we propose Adaptive Logit Adjustment Loss (ALA Loss) to apply an adaptive adjusting term to the logit. The adaptive adjusting term is composed of two complementary factors: 1) quantity factor, which pays more attention to tail classes, and 2) difficulty factor, which adaptively pays more attention to hard instances in the training process. The difficulty factor can alleviate the over-optimization on tail yet easy instances and under-optimization on head yet hard instances. The synergy of the two factors can not only advance the performance on tail classes even further, but also promote the accuracy on head classes. Unlike previous logit adjusting methods that are only concerned with data quantity, ALA Loss tackles the long-tailed problem from a more comprehensive, fine-grained and adaptive perspective. Extensive experimental results show that our method achieves the state-of-the-art performance on challenging recognition benchmarks, including ImageNet-LT, iNaturalist 2018, and Places-LT.
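The abstract describes an adjusting term built from a quantity factor and a difficulty factor but gives no formulas. The sketch below shows only the general shape of such a loss. The specific factor definitions (quantity as 1 / log(count + 1), difficulty as 1 minus the predicted target probability), their combination by multiplication, and the `scale` parameter are assumptions made for illustration and should not be read as the ALA Loss itself.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Plain cross-entropy for one sample."""
    return -math.log(softmax(logits)[target])


def adjusted_loss(logits, target, class_counts, scale=1.0):
    """Cross-entropy after subtracting a per-sample margin from the target logit.

    Hypothetical factors (not from the paper):
      difficulty = 1 - p_target            -> larger for hard instances
      quantity   = 1 / log(count + 1)      -> larger for rare (tail) classes
    """
    p_target = softmax(logits)[target]
    difficulty = 1.0 - p_target
    quantity = 1.0 / math.log(class_counts[target] + 1.0)
    margin = scale * difficulty * quantity
    adjusted = list(logits)
    adjusted[target] -= margin  # make the target harder to satisfy
    return cross_entropy(adjusted, target)
```

Under these assumed factors, an easy instance of a tail class receives only a small margin, while a hard instance of a head class still receives a non-zero one, which is the qualitative behavior the abstract attributes to combining the two factors.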

EAAI Journal 2022 Journal Article

An improved brain storm optimization algorithm with new solution generation strategies for classification

  • Yu Xue
  • Qi Zhang
  • Yan Zhao

In recent years, the brain storm optimization (BSO) algorithm has received much attention in solving classical optimization problems and is used to implement evolutionary classification models. However, in practical applications, large-scale datasets complicate the structure of the classification model, which can have a great impact on the classification performance. In the optimization process, the traditional single-strategy BSO cannot preserve the information of the dominant solution well, and its generation strategy is inefficient in solving various complex practical problems. To solve this problem, we introduce feature selection to improve the optimization model structure. Meanwhile, in order to enhance the search capability of BSO, three new generation strategies are embedded in the BSO algorithm in this paper. With the three generation methods of global optimal, local optimal and nearest neighbor, the information of the dominant solution can be better preserved and the search efficiency can be improved. The performance of the proposed generation strategies in solving classification problems is demonstrated on ten datasets with different sizes and dimensions. The experimental results reveal that the new generation strategies can enhance the performance of the BSO algorithm for solving classification problems.

TIST Journal 2022 Journal Article

Efficient and Effective Similar Subtrajectory Search: A Spatial-aware Comprehension Approach

  • Liwei Deng
  • Hao Sun
  • Rui Sun
  • Yan Zhao
  • Han Su

Although many applications take subtrajectories as basic units for analysis, there is little research on the similar subtrajectory search problem aiming to return a portion of a trajectory (i.e., subtrajectory), which is the most similar to a query trajectory. We find that in some special cases, when a grid-based metric is used, this problem can be formulated as a reading comprehension problem, which has been studied extensively in the field of natural language processing (NLP). By this formulation, we can obtain faster models with better performance than existing methods. However, due to the difference between natural language and trajectory (e.g., spatial relationship), it is impossible to directly apply NLP models to this problem. Therefore, we propose a Similar Subtrajectory Search with a Graph Neural Networks framework. This framework contains four modules including a spatial-aware grid embedding module, a trajectory embedding module, a query-context trajectory fusion module, and a span prediction module. Specifically, in the spatial-aware grid embedding module, the spatial-based grid adjacency is constructed and delivered to the graph neural network to learn spatial-aware grid embedding. The trajectory embedding module aims to model the sequential information of trajectories. The purpose of the query-context trajectory fusion module is to fuse the information of the query trajectory to each grid of the context trajectories. Finally, the span prediction module aims to predict the start and the end of a subtrajectory for the context trajectory, which is the most similar to the query trajectory. We conduct comprehensive experiments on two real world datasets, where the proposed framework outperforms the state-of-the-art baselines consistently and significantly.
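The last module described in this abstract predicts a start and an end position, as in reading-comprehension models. How the paper decodes the final span is not stated; a common approach, sketched here as an assumption, is to pick the pair (start, end) with start ≤ end that maximizes the sum of a per-position start score and end score. The function name and the toy scores are hypothetical.

```python
def best_span(start_scores, end_scores):
    """Return (start, end) with start <= end maximizing start_scores[start] + end_scores[end].

    Runs in one pass by remembering the best start index seen so far.
    """
    if not start_scores or len(start_scores) != len(end_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    best_start = 0
    best = None  # (total score, start index, end index)
    for end in range(len(end_scores)):
        if start_scores[end] > start_scores[best_start]:
            best_start = end
        total = start_scores[best_start] + end_scores[end]
        if best is None or total > best[0]:
            best = (total, best_start, end)
    return best[1], best[2]
```

In the subtrajectory setting, the indices would refer to positions (for example grid cells) along the context trajectory, and the returned pair delimits the candidate subtrajectory.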

YNIMG Journal 2021 Journal Article

Decomposition of individual-specific and individual-shared components from resting-state functional connectivity using a multi-task machine learning method

  • Xuetong Wang
  • Qiongling Li
  • Yan Zhao
  • Yirong He
  • Baoqiang Ma
  • Zhenrong Fu
  • Shuyu Li

Resting-state functional connectivity (RSFC) can be used for mapping large-scale human brain networks during rest. There is considerable interest in distinguishing the individual-shared and individual-specific components in RSFC for the better identification of individuals and prediction of behavior. Therefore, we propose a multi-task learning based sparse convex alternating structure optimization (MTL-sCASO) method to decompose RSFC into individual-specific connectivity and individual-shared connectivity. We used synthetic data to validate the efficacy of the MTL-sCASO method. In addition, we verified that individual-specific connectivity achieves higher identification rates than the Pearson correlation (PC) method, and the individual-specific components observed in 886 individuals from the Human Connectome Project (HCP) examined in two sessions over two consecutive days might serve as individual fingerprints. Individual-specific connectivity has low inter-subject similarity (-0.005±0.023), while individual-shared connectivity has high inter-subject similarity (0.822±0.061). We also determined the anatomical locations (region or subsystem) related to individual attributes and common features. We find that individual-specific connectivity exhibits low degree centrality in the sensorimotor processing system but high degree centrality in the control system. Importantly, the individual-specific connectivity estimated by the MTL-sCASO method accurately predicts behavioral scores (improved by 9.4% compared to the PC method) in the cognitive dimension. The decomposition of individual-specific and individual-shared components from RSFC provides a new approach for tracing individual traits and group analysis using functional brain networks.
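The inter-subject similarity values quoted in this abstract can be read as average pairwise similarities between subjects' connectivity patterns. The abstract does not name the similarity measure; the sketch below assumes Pearson correlation between flattened connectivity vectors, and the function names and toy vectors are illustrative only.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_inter_subject_similarity(subject_vectors):
    """Average Pearson correlation over all unordered pairs of subjects."""
    pairs = list(combinations(subject_vectors, 2))
    if not pairs:
        raise ValueError("need at least two subjects")
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)
```

A value near zero means subjects' vectors are nearly uncorrelated with one another, and a value near one means they are largely shared, matching how the abstract contrasts individual-specific and individual-shared components.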

JBHI Journal 2021 Journal Article

Prediction of Alzheimer's Disease Progression with Multi-Information Generative Adversarial Network

  • Yan Zhao
  • Baoqiang Ma
  • Pengbo Jiang
  • Debin Zeng
  • Xuetong Wang
  • Shuyu Li

Alzheimer's disease (AD) is a chronic neurodegenerative disease, and predicting its long-term progression is important. Structural Magnetic Resonance Imaging (sMRI) can be used to characterize the cortical atrophy that is closely coupled with clinical symptoms in AD and its prodromal stages. Many existing methods have focused on predicting the cognitive scores at future time-points using a set of morphological features derived from sMRI. The 3D sMRI can provide far more information than the cognitive scores. However, very few works consider predicting an individual brain MRI image at future time-points. In this article, we propose a disease progression prediction framework that comprises a 3D multi-information generative adversarial network (mi-GAN) to predict what one's whole brain will look like after a given interval, and a 3D DenseNet based multi-class classification network optimized with a focal loss to determine the clinical stage of the estimated brain. The mi-GAN can generate high-quality individual 3D brain MRI images conditioned on the individual 3D brain sMRI and multi-information at the baseline time-point. Experiments are conducted on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. Our mi-GAN shows state-of-the-art performance, with a structural similarity index (SSIM) of 0.943 between the real MRI images at the fourth year and the generated ones. With mi-GAN and focal loss, the pMCI vs. sMCI accuracy achieves a 6.04% improvement in comparison with conditional GAN and cross entropy loss.
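To make the contrast between focal loss and cross entropy loss concrete, here is a minimal sketch of the standard per-sample focal loss, FL = -alpha * (1 - p_t)^gamma * log(p_t). The values of `gamma` and `alpha` and the toy probabilities are assumptions for illustration; the abstract does not give the settings used with the 3D DenseNet classifier.

```python
from math import log


def cross_entropy(probs, target):
    """Cross entropy for one sample: -log of the true-class probability."""
    return -log(probs[target])


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample.

    The (1 - p_t) ** gamma factor shrinks the loss of well-classified
    samples, so hard samples dominate training. gamma and alpha here
    are illustrative defaults, not values taken from the paper.
    """
    p_t = probs[target]
    return -alpha * (1.0 - p_t) ** gamma * log(p_t)


# Toy three-class predictions; the true class is index 0 in both cases.
easy = [0.90, 0.05, 0.05]  # confidently correct
hard = [0.10, 0.60, 0.30]  # badly misclassified

easy_ratio = focal_loss(easy, 0) / cross_entropy(easy, 0)
hard_ratio = focal_loss(hard, 0) / cross_entropy(hard, 0)
```

With `gamma=0` the focal loss reduces to cross entropy. With larger `gamma`, the easy sample is down-weighted far more strongly than the hard one, which is the usual motivation for using it on imbalanced classes.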

ICRA Conference 2020 Conference Paper

VALID: A Comprehensive Virtual Aerial Image Dataset

  • Lyujie Chen
  • Feng Liu
  • Yan Zhao
  • Wufan Wang
  • Xiaming Yuan
  • Jihong Zhu 0001

Aerial imagery plays an important role in land-use planning, population analysis, precision agriculture, and unmanned aerial vehicle tasks. However, existing aerial image datasets generally suffer from the problem of inaccurate labeling, single ground truth type, and few category numbers. In this work, we implement a simulator that can simultaneously acquire diverse visual ground truth data in the virtual environment. Based on that, we collect a comprehensive Virtual AeriaL Image Dataset named VALID, consisting of 6690 high-resolution images, all annotated with panoptic segmentation on 30 categories, object detection with oriented bounding box, and binocular depth maps, collected in 6 different virtual scenes and 5 various ambient conditions (sunny, dusk, night, snow and fog). To our knowledge, VALID is the first aerial image dataset that can provide panoptic level segmentation and complete dense depth maps. We analyze the characteristics of VALID and evaluate state-of-the-art methods for multiple tasks to provide reference baselines. The experiment results demonstrate that VALID is well presented and challenging. The dataset is available at https://sites.google.com/view/valid-dataset/.

AAAI Conference 2019 Conference Paper

Preference-Aware Task Assignment in Spatial Crowdsourcing

  • Yan Zhao
  • Jinfu Xia
  • Guanfeng Liu
  • Han Su
  • Defu Lian
  • Shuo Shang
  • Kai Zheng

With the ubiquity of smart devices, Spatial Crowdsourcing (SC) has emerged as a new transformative platform that engages mobile users to perform spatio-temporal tasks by physically traveling to specified locations. Thus, various SC techniques have been studied for performance optimization, among which one of the major challenges is how to assign workers the tasks that they are really interested in and willing to perform. In this paper, we propose a novel preference-aware spatial task assignment system based on workers’ temporal preferences, which consists of two components: History-based Context-aware Tensor Decomposition (HCTD) for modeling workers’ temporal preferences, and preference-aware task assignment. We model worker preferences with a three-dimensional tensor (worker-task-time). By filling in the missing entries of the tensor through HCTD with the assistance of historical data and two other context matrices, we recover worker preferences for different categories of tasks in different time slots. Several preference-aware task assignment algorithms are then devised, aiming to maximize the total number of task assignments at every time instance, in which we give higher priority to the workers who are more interested in the tasks. We conduct extensive experiments using a real dataset, verifying the practicability of our proposed methods.

IJCAI Conference 2019 Conference Paper

Profit-driven Task Assignment in Spatial Crowdsourcing

  • Jinfu Xia
  • Yan Zhao
  • Guanfeng Liu
  • Jiajie Xu
  • Min Zhang
  • Kai Zheng

In Spatial Crowdsourcing (SC) systems, mobile users are enabled to perform spatio-temporal tasks by physically traveling to specified locations with the SC platforms. SC platforms manage the systems and recruit mobile users to contribute to the SC systems, whose commercial success depends on the profit attained from the task requesters. In order to maximize its profit, an SC platform needs an online management mechanism to assign the tasks to suitable workers. How to assign the tasks to workers more cost-effectively with the spatio-temporal constraints is one of the most difficult problems in SC. To deal with this challenge, we propose a novel Profit-driven Task Assignment (PTA) problem, which aims to maximize the profit of the platform. Specifically, we first establish a task reward pricing model with tasks' temporal constraints (i.e., expected completion time and deadline). Then we adopt an optimal algorithm based on tree decomposition to achieve the optimal task assignment and propose greedy algorithms to improve the computational efficiency. Finally, we conduct extensive experiments using real and synthetic datasets, verifying the practicability of our proposed methods.
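The general idea of greedy, profit-driven assignment can be sketched as follows. Everything specific here is an assumption made for illustration and is not taken from the paper: the reward stays at its maximum until the expected completion time and then decays linearly to zero at the deadline, the platform's cost is proportional to travel time, and the names `reward` and `greedy_assign` are hypothetical.

```python
def reward(max_reward, expected, deadline, finish):
    """Hypothetical pricing: full reward up to the expected completion
    time, linear decay afterwards, nothing at or after the deadline."""
    if finish <= expected:
        return max_reward
    if finish >= deadline:
        return 0.0
    return max_reward * (deadline - finish) / (deadline - expected)


def greedy_assign(travel_time, tasks, cost_per_unit=1.0):
    """Greedily match workers to tasks by descending profit.

    travel_time[w][t] is the time worker w needs to reach task t;
    tasks[t] is a (max_reward, expected, deadline) tuple.
    Assumed profit = reward at arrival time - cost_per_unit * travel time.
    """
    candidates = []
    for w, row in travel_time.items():
        for t, tt in row.items():
            profit = reward(*tasks[t], finish=tt) - cost_per_unit * tt
            if profit > 0:  # skip pairs that would lose money
                candidates.append((profit, w, t))
    candidates.sort(reverse=True)  # most profitable pair first

    plan, used_tasks, total = {}, set(), 0.0
    for profit, w, t in candidates:
        if w not in plan and t not in used_tasks:
            plan[w] = t
            used_tasks.add(t)
            total += profit
    return plan, total


# Toy instance with two workers and two tasks (made-up numbers).
tasks = {"t1": (10.0, 5.0, 10.0), "t2": (6.0, 4.0, 8.0)}
travel_time = {
    "w1": {"t1": 2.0, "t2": 1.0},
    "w2": {"t1": 7.0, "t2": 3.0},
}
plan, total = greedy_assign(travel_time, tasks)
```

A greedy pass like this is fast but is not guaranteed to find the assignment with the highest total profit, which is why the abstract pairs the greedy algorithms with an optimal tree-decomposition-based algorithm.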

ICRA Conference 2010 Conference Paper

A micromanipulation system for single cell deposition

  • Zhe Lu
  • Christopher Moraes
  • Yan Zhao
  • LiDan You
  • Craig A. Simmons
  • Yu Sun 0001

Many microfabricated devices have been developed to quantify cellular response to a multitude of stimuli at a single-cell level in a high throughput manner. These single-cell studies require cells to be individually positioned at defined locations on a microdevice. This paper presents a micromanipulation system for automated pick-place of single cells. Integrating computer vision and motion control algorithms, the system visually tracks a cell in real time and controls multiple motion devices coordinately. Via fine manipulation of picoliter fluids and pressure of a few Pascals, the system accurately picks up a single cell, transfers the cell, and deposits it at a target location at a speed of 15-30 sec/cell. The micromanipulation system has the advantages of non-invasiveness, high specificity, and high precision. It is suitable to pick-place both non-labeled and labeled cells and applicable to standard cell culture substrates and microdevices with an open top.