Author name cluster

Junjie Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

20 papers

2 author rows

EAAI Journal 2026 Journal Article

Exercise quality assessment in monocular video streaming

Yongchang Zhang
Boxuan Xu
Zhaowen Lin
Junjie Li
Anlong Ming

The recent proliferation of home-based exercise content has garnered significant attention. This has led to an increasing demand for Artificial Intelligence (AI) devices capable of automatically assessing exercise quality and providing guidance. However, existing real-time exercise quality assessment algorithms require instructors and learners to share similar camera views. Furthermore, these methods often rely on pre-labeled data, support a limited number of exercise actions, and offer restricted feedback. Among the numerous videos where data is hard to pre-label, instructors and learners may have unrestricted camera views and inconsistent body shape, while instructors may demonstrate unpredicted actions. To address the aforementioned challenges, we propose a method for Exercise Quality Assessment in Monocular Videos (MV-EQA), which incorporates a Skeleton Mapping and View Aligning (SMVA) module, a Multi-Feature Dynamic Time Warping (MF-DTW) module, and online/offline Exercise Quality Assessment (EQA) modules. Specifically, SMVA utilizes a lightweight encoder–decoder network based on transformer architecture that effectively handles differences in view and skeleton between learners and instructors while preserving inherent variations in their movements; MF-DTW utilizes multiple body information for temporal alignment; online/offline EQA modules enable online feedback (scoring with visual comparison) and offline feedback (reviews with comments). Extensive experiments indicate the superiority of our approach over other methods in EQA tasks. The code is available at link.


AAAI Conference 2026 Conference Paper

Training-free Boosting for Few-shot Segmentation via Generalizing Semantic Mining

Kangyu Xiao
Zilei Wang
Yixin Zhang
Junjie Li

Few-shot Semantic Segmentation (FSS) aims to segment novel target objects with the guidance of minimal annotated reference examples. The affinity-based method has great advantages in the FSS inference stage for both specialist models and foundation models. However, current affinity calculation relies only on support-query matching, without considering the query-specific semantics or the semantic correlation among support samples, which limits the representation ability of the affinity map. In this paper, we propose Generalizing Semantic Mining (GSM), which focuses on exploiting generalizing semantics to improve the affinity calculation. Concretely, we first organize the affinity-based inference into three main steps to reveal the crucial role of the affinity map. To address the low-data problem, the Target Semantic Reusing module considers the query sample as a proxy reference and assigns it a proxy mask identifying its most generalizing semantic regions. Then, to generate a high-fidelity proxy mask, the Query-specific Semantic Modeling module pinpoints the most generalizing regions through prior semantic analysis. Finally, the Representative Re-weighting module explicitly modulates affinity calculation via generalization-aware weighting. Experiments on FSS benchmarks demonstrate that our GSM can serve as a plug-and-play free lunch for both specialist models and foundation models.


AAAI Conference 2026 Conference Paper

UVLM: Benchmarking Video Language Model for Underwater World Understanding

Xizhe Xue
Yang Zhou
Dawei Yan
Lijie Tao
Junjie Li
Ying Li
Haokui Zhang
Rong Xiao

Recently, video-language models (VidLMs) have gained widespread attention and adoption. However, existing works primarily focus on terrestrial scenarios, overlooking the highly demanding application needs of underwater observation. To overcome this gap, we introduce UVLM, an underwater observation benchmark built through a collaborative approach combining human expertise and AI models. To ensure data quality, we approached the design from multiple perspectives. First, to address the unique challenges of underwater environments, we selected videos that represent typical underwater challenges including light variations, water turbidity, and diverse viewing angles to construct the dataset. Second, to ensure data diversity, the dataset covers a wide range of frame rates, resolutions, 419 classes of marine animals, and various static plants and terrains. Next, for task diversity, we adopted a structured design where observation targets are categorized into two major classes: biological and environmental. Each category includes content observation and change/action observation, totaling 20 subtask types. Finally, we designed several challenging evaluation metrics to enable quantitative comparison and analysis of different methods. Experiments on two representative VidLMs demonstrate that fine-tuning VidLMs on UVLM significantly improves underwater world understanding while also showing potential for slight improvements on existing in-air VidLM benchmarks.


EAAI Journal 2025 Journal Article

A hyperparameter-fusion neural networks for deposition prediction

Li Ding
Kun Pang
Junjie Li
Hua Shao
Nan Liu
Rui Chen
Zhiqiang Li
Zhenjie Yao

As integrated circuit manufacturing processes develop into the nanometer scale, precise control and prediction of the deposition process have become crucial. Nanoscale manufacturing imposes unprecedentedly high demands on film quality, uniformity, and consistency, presenting significant challenges to traditional control and prediction methodologies. This study proposes a novel approach that, for the first time, formulates the thin-film deposition process as a video prediction task, enabling the use of deep learning for morphological forecasting under varying process conditions, and introduces a novel hyperparameter-fusion neural network, referred to as DepositionNet (DepoNet). Unlike conventional video prediction models, DepoNet specifically accounts for the influence of deposition parameters on the entire simulation process. We have incorporated a novel Hyper Projector that allows the model to flexibly adapt to varying deposition conditions and material characteristics. Through comprehensive comparative experimental analyses, we demonstrate that DepoNet significantly outperforms existing deep-learning models and achieves a mean squared error of 17.34, representing a 3.67% improvement over the second best model and a 1,435× speedup over physics-based methods, thereby validating its exceptional generalization capability. Extensive experiments reveal that the model maintains high performance even under conditions of limited training data, for instance, achieving a peak signal-to-noise ratio (PSNR) of 41.516 decibels (dB) when trained with only 20% of the available data.


EAAI Journal 2025 Journal Article

Dynamic mask network based on spiking neural convolutional model for missing modality brain tumor segmentation

Junjie Li
Rui Cai
Bing Li
Hong Peng

Brain tumor segmentation is a medical image processing task aimed at accurately locating and isolating tumor regions from brain scan images (e.g., Magnetic Resonance Imaging, MRI) in order to help doctors in diagnosis, treatment planning and surgical navigation. Automatic brain tumor segmentation is extremely challenging due to incomplete feature representation in the case of missing modalities and insufficient inter-modal information interaction. To this end, this paper proposes a novel dynamic threshold mask Transformer network for the missing modality brain tumor segmentation task, which is designed based on the nonlinear spiking neural convolutional model. The network consists of four independent encoders and a shared decoder to extract the features of each modality and perform shared representation learning. Among them, the dynamic threshold mask Transformer introduces learnable embedding vectors, generates dynamic masks on top of static masks to achieve fine-grained feature filtering, and enhances the ability of inter-modal information interaction. The adaptive gating weighting module and the channel cross spiking neural P attention module fuse modal features layer by layer in both spatial and channel dimensions to strengthen the modeling capability of local and global features. We conducted extensive comparative experiments on different missing modal cases in the BraTS2020 and BraTS2018 datasets. The experimental results show that the method effectively improves the robustness of missing modalities and the performance of brain tumor segmentation while maintaining the computational efficiency, and has good generalization ability and practicality.


ICRA Conference 2025 Conference Paper

Enhancing Multi-Agent Systems via Reinforcement Learning with LLM-Based Planner and Graph-Based Policy

Ziqi Jia
Junjie Li
Xiaoyang Qu
Jianzong Wang

Multi-agent systems (MAS) have shown great potential in executing complex tasks, but coordination and safety remain significant challenges. Multi-Agent Reinforcement Learning (MARL) offers a promising framework for agent collaboration, but it faces difficulties in handling complex tasks and designing reward functions. The introduction of Large Language Models (LLMs) has brought stronger reasoning and cognitive abilities to MAS, but existing LLM-based systems struggle to respond quickly and accurately in dynamic environments. To address these challenges, we propose LLM-based Graph Collaboration MARL (LGC-MARL), a framework that efficiently combines LLMs and MARL. This framework decomposes complex tasks into executable subtasks and achieves efficient collaboration among multiple agents through graph-based coordination. Specifically, LGC-MARL consists of two main components: an LLM planner and a graph-based collaboration meta policy. The LLM planner transforms complex task instructions into a series of executable subtasks, evaluates the rationality of these subtasks using a critic model, and generates an action dependency graph. The graph-based collaboration meta policy facilitates communication and collaboration among agents based on the action dependency graph, and adapts to new task environments through meta-learning. Experimental results on the AI2-THOR simulation platform demonstrate the superior performance and scalability of LGC-MARL in completing various complex tasks.


EAAI Journal 2025 Journal Article

Etching process prediction based on cascade recurrent neural network

Zhenjie Yao
Ziyi Hu
Panpan Lai
Fengling Qin
Wenrui Wang
Zhicheng Wu
Lingfei Wang
Hua Shao

Etching is one of the most critical processes in semiconductor manufacturing. Etch models have been developed to reveal the underlying etch mechanisms, and they employ rigorous physical and chemical process simulation. Traditional simulation is very time consuming. The data-driven artificial intelligence model provides an alternative modeling approach. In this paper, a Cascade Recurrent Neural Network (CRNN) is proposed to model and predict etching profiles. The etching profile is represented in polar coordinates and modeled by recurrent neural networks, and the corresponding etching parameters (e.g., pressure, power, temperature, and voltage) are integrated into the network through cascade combination layers. Experimental results on a dataset of 10,000 simulated etching profiles demonstrate the effectiveness of our method: compared with traditional etching simulation methods, CRNN achieves a 21,000× speedup with an average error of less than 0.7 nm for 1-step prediction. Furthermore, compared to simple deep neural networks, the Mean Absolute Error (MAE) is reduced from 1.7329 nm to 1.3845 nm for 10-step prediction. Finally, the effectiveness and accuracy of the CRNN etching predictor are validated through fine-tuning on experimental data.


ICLR Conference 2025 Conference Paper

MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model

Junjie Li
Yang Liu 0278
Weiqing Liu
Shikai Fang
Lewen Wang
Chang Xu 0008
Jiang Bian 0002

Generative models aim to simulate realistic effects of various actions across different contexts, from text generation to visual effects. Despite significant efforts to build real-world simulators, the application of generative models to virtual worlds, like financial markets, remains under-explored. In financial markets, generative models can simulate complex market effects of participants with various behaviors, enabling interaction under different market conditions, and training strategies without financial risk. This simulation relies on the finest structured data in financial market like orders thus building the finest realistic simulation. We propose Large Market Model (LMM), an order-level generative foundation model, for financial market simulation, akin to language modeling in the digital world. Our financial Market Simulation engine (MarS), powered by LMM, addresses the domain-specific need for realistic, interactive and controllable order generation. Key observations include LMM's strong scalability across data size and model complexity, and MarS's robust and practicable realism in controlled generation with market impact. We showcase MarS as a forecast tool, detection system, analysis platform, and agent training environment, thus demonstrating MarS's "paradigm shift" potential for a variety of financial applications. We release the code of MarS at https://github.com/microsoft/MarS/.


JBHI Journal 2025 Journal Article

Pathological Image Segmentation of Breast Cancer via Template Matching

Kaixiang Yan
Yanyu Liu
Junjie Li
Jinde Cao
Dongming Zhou

Accurate pathological image segmentation is crucial for the clinical diagnosis of breast cancer. However, existing methods of pathological segmentation face challenges due to the variability and complexity of breast cancer on pathological images. To address these issues, we propose a novel segmentation network called template-matching pathological segmentation network. Our method incorporates an innovative template matching strategy inspired by the diagnostic process of pathologists. The template matching strategy utilizes a visual transformer to establish a correlative relationship between cancer lesions and corresponding templates. To improve feature utilization of pathological images, PSVTNet introduces detailed information attention and information entropy attention. Detailed information attention aims to exploit detailed information by serving as the path connecting shallow-layer and deep-layer features. Meanwhile, information entropy attention can redistribute feature weights to high-entropy regions according to the information-entropy attention map. Additionally, this work releases a comprehensive pathological dataset that comprises labeled pathological images. These images are collected from breast and stomach cancers with hematoxylin & eosin and human epidermal growth factor receptor-2 staining. Extensive experiments demonstrate that PSVTNet significantly outperforms state-of-the-art methods on pathological images of breast cancer, and can also process pathological images of stomach cancer that carry the same diagnostic features as breast cancer.


JBHI Journal 2025 Journal Article

Swallow-PPG: Photoplethysmography Templates for Comprehensive Temporal Analysis of Swallowing Anatomical Actions

Ying Zhang
Junjie Li
Ping Wang
Huaiyu Zhu
Bo Wang
Wei Luo
Yun Pan

In clinical practice, Videofluoroscopic Swallowing Study (VFSS) is commonly used to monitor the activity of anatomical structures during swallowing. However, it is limited by ionizing radiation exposure, adverse effects of barium contrast agents, and the high cost of specialized equipment. In this study, we propose a framework for analyzing swallowing behaviors in photoplethysmography (PPG) waveforms, which includes generalizing the manifestation of swallowing in PPG (i.e., swallowing templates generation) and conducting comprehensive temporal analysis of swallowing anatomical actions (TASAA). For swallowing templates generation, we cluster and average the samples to obtain waveforms of templates, followed by conducting shape-based mapping and averaging on 28 time indicators to derive template unified time indicators (TUTIs). For comprehensive TASAA, we leverage templates waveforms and TUTIs to estimate time indicators based on the mapping relationship between samples and their respective templates. We evaluate the proposed framework on 357 swallowing PPG samples from 41 elderly subjects. The average relative error across all time indicators is 0.123, and 6 indicators notably excel with errors below 0.1. The proposed template-based swallowing analysis framework is expected to become a low-cost and non-ionizing alternative to VFSS for comprehensive TASAA.


IROS Conference 2024 Conference Paper

A Piecewise-weighted RANSAC Method Utilizing Abandoned Hypothesis Model Information with a New Application on Robot Self-calibration

Jianhui He
Yiyang Feng
Guilin Yang
Wenjun Shen
Si-Lu Chen 0001
Tianjiang Zheng
Junjie Li

Industrial robots and collaborative robots are widely employed in industry and are progressively being utilized to assist individuals in their daily routines. To improve their absolute accuracy, self-calibration methods using portable local measurement devices are cost-effective solutions. However, compared with the conventional external calibration methods, self-calibration methods employing two configurations as a calibration sample introduce more non-kinematic errors to the robot. Therefore, noise reduction is essential in self-calibration. A novel Piecewise-weighted Random Sample Consensus (RANSAC) method is proposed in this paper. Instead of choosing an optimal model with all inliers, the proposed method employs a general weight considering both the sample and hypothesis model qualities to generate a new model with the Weighted Least Square (WLS) method. Besides, the proposed method turns the target of finding an uncontaminated set of inliers into the training of the proper weight coefficient for WLS, which not only improves the accuracy but also greatly enhances the speed. The self-calibration experiment on a 6 degree-of-freedom (DOF) robot CR10 shows that the proposed Piecewise-weighted RANSAC method achieves a 27.7% accuracy improvement over the Least Square method, a 20.0% accuracy improvement over the standard RANSAC method, and a 5.5% accuracy improvement over the LO-RANSAC method. Besides, the proposed method is also over 10.9 times faster than the standard RANSAC method and 18.6 times faster than the LO-RANSAC method.


EAAI Journal 2024 Journal Article

Multi-objective optimization of a laterally perforated-finned heat sink with computational fluid dynamics method and statistical modeling using response surface methodology

Junjie Li
Dheyaa J. Jasim
Dler Hussein Kadir
Hamid Maleki
Navid Nasajpour Esfahani
Mahmoud Shamsborhan
Davood Toghraie

The present paper examines turbulent flow and heat transfer characteristics of a three-dimensional laterally perforated finned heat sink (LPFHS) using response surface methodology (RSM) and computational fluid dynamics (CFD) approaches. The effects of the Reynolds number and perforations geometry, such as porosity (size) and cross-sectional shape (square, circular, hexagonal, and triangular), were investigated on the average friction factor and average Nusselt number. RSM is an applicable approach to improve process conditions by analyzing the impact of different factors and their interdependence on the measured outcomes during a specific technological operation. Three accurate models for the percentage of heat transfer enhancement (PHTE), the percentage of friction factor reduction (PFCR), and the percentage of weight reduction (PWR) as the most important design objectives for heat sinks were proposed using the RSM technique. RSM models served as the foundation for two- and three-objective optimizations. Results demonstrate that fins with square perforations exhibited the highest PHTE value. Additionally, fins with square, circular, and hexagonal perforations experienced increased PFCR with rising porosity at different Reynolds numbers. Fins with square perforations were particularly effective in PWR because the perforations matched well with the lateral surfaces of the fin. The optimal conditions for the LPFHS were achieved using square perforations with Reynolds number and porosity values of 2101 and 0.494, respectively. In the optimal case, PHTE, PFCR, and PWR increased by 7.965%, 24.198%, and 49.367%, respectively, compared to the base case (solid fin). In addition, the optimization strategy implemented in the present study allows weighting of the objectives, which leads to flexible designs. By assigning importance coefficients to each objective, the designer can obtain a set of optimal designs that meet different requirements or trade-offs.


IJCAI Conference 2024 Conference Paper

Probabilistic Contrastive Learning for Domain Adaptation

Junjie Li
Yixin Zhang
Zilei Wang
Saihui Hou
Keyu Tu
Man Zhang

Contrastive learning has shown impressive success in enhancing feature discriminability for various visual tasks in a self-supervised manner, but the standard contrastive paradigm (features + l2 normalization) has limited benefits when applied in domain adaptation. We find that this is mainly because the class weights (weights of the final fully connected layer) are ignored in the domain adaptation optimization process, which makes it difficult for features to cluster around the corresponding class weights. To solve this problem, we propose the simple but powerful Probabilistic Contrastive Learning (PCL), which moves beyond the standard paradigm by removing l2 normalization and replacing the features with probabilities. PCL can guide the probability distribution towards a one-hot configuration, thus minimizing the discrepancy between features and class weights. We conduct extensive experiments to validate the effectiveness of PCL and observe consistent performance gains on five tasks, i.e., Unsupervised/Semi-Supervised Domain Adaptation (UDA/SSDA), Semi-Supervised Learning (SSL), UDA Detection and Semantic Segmentation. Notably, for UDA Semantic Segmentation on SYNTHIA, PCL surpasses the sophisticated CPSL-D by 2% in terms of mean IoU with a much lower training cost (PCL: 1*3090, 5 days vs. CPSL-D: 4*V100, 11 days). Code is available at https://github.com/ljjcoder/Probabilistic-Contrastive-Learning.


AAAI Conference 2023 Conference Paper

Exploit Domain-Robust Optical Flow in Domain Adaptive Video Semantic Segmentation

Yuan Gao
Zilei Wang
Jiafan Zhuang
Yixin Zhang
Junjie Li

Domain adaptive semantic segmentation aims to exploit the pixel-level annotated samples on source domain to assist the segmentation of unlabeled samples on target domain. For such a task, the key is to construct reliable supervision signals on target domain. However, existing methods can only provide unreliable supervision signals constructed by segmentation model (SegNet) that are generally domain-sensitive. In this work, we try to find a domain-robust clue to construct more reliable supervision signals. Particularly, we experimentally observe the domain-robustness of optical flow in video tasks as it mainly represents the motion characteristics of scenes. However, optical flow cannot be directly used as supervision signals of semantic segmentation since both of them essentially represent different information. To tackle this issue, we first propose a novel Segmentation-to-Flow Module (SFM) that converts semantic segmentation maps to optical flows, named the segmentation-based flow (SF), and then propose a Segmentation-based Flow Consistency (SFC) method to impose consistency between SF and optical flow, which can implicitly supervise the training of segmentation model. The extensive experiments on two challenging benchmarks demonstrate the effectiveness of our method, and it outperforms previous state-of-the-art methods with considerable performance improvement. Our code is available at https://github.com/EdenHazardan/SFC.


NeurIPS Conference 2023 Conference Paper

GAIA: Delving into Gradient-based Attribution Abnormality for Out-of-distribution Detection

Jinggang Chen
Junjie Li
Xiaoyang Qu
Jianzong Wang
Jiguang Wan
Jing Xiao

Detecting out-of-distribution (OOD) examples is crucial to guarantee the reliability and safety of deep neural networks in real-world settings. In this paper, we offer an innovative perspective on quantifying the disparities between in-distribution (ID) and OOD data: analyzing the uncertainty that arises when models attempt to explain their predictive decisions. This perspective is motivated by our observation that gradient-based attribution methods encounter challenges in assigning feature importance to OOD data, thereby yielding divergent explanation patterns. Consequently, we investigate how attribution gradients lead to uncertain explanation outcomes and introduce two forms of abnormalities for OOD detection: the zero-deflation abnormality and the channel-wise average abnormality. We then propose GAIA, a simple and effective approach that incorporates Gradient Abnormality Inspection and Aggregation. The effectiveness of GAIA is validated on both commonly utilized (CIFAR) and large-scale (ImageNet-1k) benchmarks. Specifically, GAIA reduces the average FPR95 by 23.10% on CIFAR10 and by 45.41% on CIFAR100 compared to advanced post-hoc methods.


IJCAI Conference 2022 Conference Paper

Targeted Multimodal Sentiment Classification based on Coarse-to-Fine Grained Image-Target Matching

Jianfei Yu
Jieming Wang
Rui Xia
Junjie Li

Targeted Multimodal Sentiment Classification (TMSC) aims to identify the sentiment polarities over each target mentioned in a pair of sentence and image. Existing methods to TMSC failed to explicitly capture both coarse-grained and fine-grained image-target matching, including 1) the relevance between the image and the target and 2) the alignment between visual objects and the target. To tackle this issue, we propose a new multi-task learning architecture named coarse-to-fine grained Image-Target Matching network (ITM), which jointly performs image-target relevance classification, object-target alignment, and targeted sentiment classification. We further construct an Image-Target Matching dataset by manually annotating the image-target relevance and the visual object aligned with the input target. Experiments on two benchmark TMSC datasets show that our model consistently outperforms the baselines, achieves state-of-the-art results, and presents interpretable visualizations.


YNICL Journal 2022 Journal Article

Thalamic-insomnia phenotype in E200K Creutzfeldt–Jakob disease: A PET/MRI study

Hong Ye
Min Chu
Zhongyun Chen
Kexin Xie
Li Liu
Haitian Nan
Yue Cui
Jing Zhang

BACKGROUND: Insomnia and thalamic involvement were frequently reported in patients with genetic Creutzfeldt-Jakob disease (gCJD) with E200K mutations, suggesting that E200K gCJD might differ from typical sporadic CJD (sCJD). The study aimed to explore the clinical and neuroimaging characteristics of genetic E200K CJD patients by comprehensive neuroimaging analysis. METHODS: Six patients with gCJD carrying the E200K mutation in the prion protein (PRNP) gene, 13 patients with sporadic CJD, and 22 age- and sex-matched normal controls were enrolled in the study. All participants completed a hybrid positron emission tomography/magnetic resonance imaging (PET/MRI) examination. Signal intensity on diffusion-weighted imaging (DWI) and metabolism on PET were analyzed by visual rating, and statistical parametric mapping analysis was performed on PET and 3D-T1 images. Clinical and imaging characteristics were compared between the E200K, sCJD, and control groups. RESULTS: There was no group difference in age or gender among the E200K, sCJD, and control groups. Insomnia was a primary complaint in patients with E200K gCJD (4/2 versus 1/12, p = 0.007). Hyperintensity on DWI and hypometabolism on PET of the thalamus were observed during visual rating analysis of images in patients with E200K gCJD. Gray matter atrophy (uncorrected p < 0.001) and hypometabolism (uncorrected p < 0.001) of the thalamus were more pronounced in patients with E200K gCJD. CONCLUSION: The clinical and imaging characteristics of patients with gCJD with PRNP E200K mutations manifested as a thalamic-insomnia phenotype. PET is a sensitive approach to help identify the functional changes in the thalamus in prion disease.


AAAI Conference 2021 Conference Paper

Learning Intact Features by Erasing-Inpainting for Few-shot Classification

Junjie Li
Zilei Wang
Xiaoming Hu

Few-shot classification aims to categorize samples from unseen classes with only a few labeled samples. To address such a challenge, many methods exploit a base set consisting of massive labeled samples to learn an instance embedding function, i.e., an image feature extractor, which is expected to possess good transferability among different tasks. Such characteristics of few-shot learning are essentially different from those of traditional image classification, which only pursues discriminative image representations. In this paper, we propose to learn intact features by erasing-inpainting for few-shot classification. Specifically, we argue that extracting intact features of target objects is more transferable, and then propose a novel cross-set erasing-inpainting (CSEI) method. CSEI processes the images in the support set using erasing and inpainting, and then uses them to augment the query set of the same task. Consequently, the feature embedding produced by our proposed method can contain more complete information of target objects. In addition, we propose task-specific feature modulation to make the features adaptive to the current task. Extensive experiments on two widely used benchmarks demonstrate the effectiveness of our proposed method, which consistently yields considerable performance gains for different baseline methods.


YNIMG Journal 2021 Journal Article

Tau-related white-matter alterations along spatially selective pathways

Qiuting Wen
Shannon L. Risacher
Linhui Xie
Junjie Li
Jaroslaw Harezlak
Martin R. Farlow
Frederick W. Unverzagt
Sujuan Gao

Progressive accumulation of tau neurofibrillary tangles in the brain is a defining pathologic feature of Alzheimer's disease (AD). Tau pathology exhibits a predictable spatiotemporal spreading pattern, but the underlying mechanisms of this spread are poorly understood. Although AD is conventionally considered a disease of the gray matter, it is also associated with pronounced and progressive deterioration of the white matter (WM). A link between abnormal tau and WM degeneration is suggested by findings from both animal and postmortem studies, but few studies demonstrated their interplay in vivo. Recent advances in diffusion magnetic resonance imaging and the availability of tau positron emission tomography (PET) have made it possible to evaluate the association of tau and WM degeneration (tau-WM) in vivo. In this study, we explored the spatial pattern of tau-WM associations across the whole brain to evaluate the hypothesis that tau deposition is associated with WM microstructural alterations not only in isolated tracts, but in continuous structural connections in a stereotypic pattern. Sixty-two participants, including 22 cognitively normal subjects, 22 individuals with subjective cognitive decline, and 18 with mild cognitive impairment were included in the study. WM characteristics were inferred by classic diffusion tensor imaging (DTI) and a complementary diffusion compartment model - neurite orientation dispersion and density imaging (NODDI) that provides a proxy for axonal density. A data-driven iterative searching (DDIS) approach, coupled with whole-brain graph theory analyses, was developed to continuously track tau-WM association patterns. Without applying prior knowledge of the tau spread, we observed a distinct spatial pattern that resembled the typical propagation of tau pathology in AD. Such association pattern was not observed between diffusion and amyloid-β PET signal. 
Tau-related WM degeneration is characterized by an increase in the mean diffusivity (with a dominant change in the radial direction) and a decrease in the intra-axonal volume fraction. These findings suggest that cortical tau deposition (as measured in tau PET) is associated with a lower axonal packing density and greater diffusion freedom. In conclusion, our in vivo findings using a data-driven method on cross-sectional data underline the important role of WM alterations in the AD pathological cascade with an association pattern similar to the postmortem Braak staging of AD. Future studies will focus on longitudinal analyses to provide in vivo evidence of tau pathology spreads along neuroanatomically connected brain areas.


AAAI Conference 2019 Conference Paper

Towards Personalized Review Summarization via User-Aware Sequence Network

Junjie Li
Haoran Li
Chengqing Zong

We address personalized review summarization, which generates a condensed summary for a user’s review, accounting for his preference on different aspects or his writing style. We propose a novel personalized review summarization model named User-aware Sequence Network (USN) to consider the aforementioned users’ characteristics when generating summaries, which contains a user-aware encoder and a user-aware decoder. Specifically, the user-aware encoder adopts a user-based selective mechanism to select the important information of a review, and the user-aware decoder incorporates user characteristic and user-specific word-using habits into word prediction process to generate personalized summaries. To validate our model, we collected a new dataset Trip, comprising 536,255 reviews from 19,400 users. With quantitative and human evaluation, we show that USN achieves state-of-the-art performance on personalized review summarization.
