Arrow Research

Author name cluster

Ke Yan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

26 papers
2 author rows

Possible papers (26)

AAAI Conference 2026 Conference Paper

D²Pruner: Debiased Importance and Structural Diversity for MLLM Token Pruning

  • Evelyn Zhang
  • Fufu Yu
  • Aoqi Wu
  • Zichen Wen
  • Ke Yan
  • Shouhong Ding
  • Biqing Qi
  • Linfeng Zhang

Processing long visual token sequences poses a significant computational burden on Multimodal Large Language Models (MLLMs). While token pruning offers a path to acceleration, we find that current methods, while adequate for general understanding, catastrophically fail on fine-grained localization tasks. We attribute this failure to the inherent flaws of the two prevailing strategies: importance-based methods suffer from a strong positional bias, an inherent model artifact that distracts from semantic content, while diversity-based methods exhibit structural blindness, disregarding the user's prompt and spatial redundancy. To address this, we introduce D²Pruner, a framework that rectifies these issues by uniquely combining debiased importance with a structural pruning mechanism. Our method first secures a core set of the most critical tokens as pivots based on a debiased attention score. It then performs a Maximal Independent Set (MIS) selection on the remaining tokens, which are modeled on a hybrid graph where edges signify spatial proximity and semantic similarity. This process iteratively preserves the most important available token while removing its neighbors, ensuring that the supplementary tokens are chosen to maximize both importance and diversity. Extensive experiments demonstrate that D²Pruner achieves exceptional efficiency and fidelity.
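
A minimal sketch of the MIS-style selection step, assuming per-token importance scores, grid coordinates, and features are already computed; the pivot stage is folded into the greedy loop, and the thresholds and names are illustrative rather than the paper's implementation:

```python
import numpy as np

def mis_prune(tokens, importance, pos, keep, sim_thresh=0.8, dist_thresh=1.5):
    """Greedy maximal-independent-set token selection (illustrative only)."""
    feats = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    sim = feats @ feats.T                                   # semantic similarity
    dist = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
    # Hybrid graph: an edge links tokens that are spatially close OR similar.
    adj = (dist < dist_thresh) | (sim > sim_thresh)
    np.fill_diagonal(adj, False)

    alive = np.ones(len(tokens), dtype=bool)
    selected = []
    for i in np.argsort(-importance):                       # most important first
        if alive[i]:
            selected.append(int(i))
            alive[i] = False
            alive[adj[i]] = False                           # remove its neighbors
        if len(selected) == keep:
            break
    return np.array(selected)
```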

AAAI Conference 2026 Conference Paper

Extracting Events Like Code: A Multi-Agent Programming Framework for Zero-Shot Event Extraction

  • Quanjiang Guo
  • Sijie Wang
  • Jinchuan Zhang
  • Ben Zhang
  • Zhao Kang
  • Ling Tian
  • Ke Yan

Zero-shot event extraction (ZSEE) remains a significant challenge for large language models (LLMs) due to the need for complex reasoning and domain-specific understanding. Direct prompting often yields incomplete or structurally invalid outputs—such as misclassified triggers, missing arguments, and schema violations. To address these limitations, we present Agent-Event-Coder (AEC), a novel multi-agent framework that treats event extraction like software engineering: as a structured, iterative code-generation process. AEC decomposes ZSEE into specialized subtasks—retrieval, planning, coding, and verification—each handled by a dedicated LLM agent. Event schemas are represented as executable class definitions, enabling deterministic validation and precise feedback via a verification agent. This programming-inspired approach allows for systematic disambiguation and schema enforcement through iterative refinement. By leveraging collaborative agent workflows, AEC enables LLMs to produce precise, complete, and schema-consistent extractions in zero-shot settings. Experiments across five diverse domains and six LLMs demonstrate that AEC consistently outperforms prior zero-shot baselines, showcasing the power of treating event extraction like code generation.
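
To make the schema-as-code idea concrete, here is a hedged sketch: a hypothetical event schema written as an executable class plus a verifier that returns deterministic feedback. The schema, field names, and rules are invented for illustration; the paper's agents and prompts are more elaborate.

```python
from dataclasses import dataclass

@dataclass
class TransportEvent:                    # hypothetical schema, for illustration
    """Event schema expressed as an executable class definition."""
    trigger: str
    agent: str | None = None
    origin: str | None = None
    destination: str | None = None

def verify(event: TransportEvent) -> list[str]:
    """Deterministic validation; the returned feedback drives refinement."""
    problems = []
    if not event.trigger:
        problems.append("missing trigger")
    if event.agent is None and event.destination is None:
        problems.append("expected at least one argument (agent/destination)")
    return problems

# A verification agent would feed `problems` back to the coding agent, which
# revises its extraction until verify() returns an empty list.
print(verify(TransportEvent(trigger="moved", destination="Paris")))  # []
```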

AAAI Conference 2026 Conference Paper

ITPP: Learning Disentangled Event Dynamics in Marked Temporal Point Processes

  • Wang-Tao Zhou
  • Zhao Kang
  • Ke Yan
  • Ling Tian

Marked Temporal Point Processes (MTPPs) provide a principled framework for modeling asynchronous event sequences by conditioning on the history of past events. However, most existing MTPP models rely on channel-mixing strategies that encode information from different event types into a single, fixed-size latent representation. This entanglement can obscure type-specific dynamics, leading to performance degradation and increased risk of overfitting. In this work, we introduce ITPP, a novel channel-independent architecture for MTPP modeling that decouples event type information using an encoder-decoder framework with an ODE-based backbone. Central to ITPP is a type-aware inverted self-attention mechanism, designed to explicitly model inter-channel correlations among heterogeneous event types. This architecture enhances effectiveness and robustness while reducing overfitting. Comprehensive experiments on multiple real-world and synthetic datasets demonstrate that ITPP consistently outperforms state-of-the-art MTPP models in both predictive accuracy and generalization.

AAAI Conference 2026 Conference Paper

MUSE: Multi-Scale Dense Self-Distillation for Nucleus Detection and Classification

  • Zijiang Yang
  • Hanqing Chao
  • Bokai Zhao
  • Yelin Yang
  • Yunshuo Zhang
  • Dongmei Fu
  • Junping Zhang
  • Le Lu

Nucleus detection and classification (NDC) in histopathology analysis is a fundamental task that underpins a wide range of high-level pathology applications. However, existing methods heavily rely on labor-intensive nucleus-level annotations and struggle to fully exploit large-scale unlabeled data for learning discriminative nucleus representations. In this work, we propose MUSE (MUlti-scale denSE self-distillation), a novel self-supervised learning method tailored for NDC. At its core is NuLo (Nucleus-based Local self-distillation), a coordinate-guided mechanism that enables flexible local self-distillation based on predicted nucleus positions. By removing the need for strict spatial alignment between augmented views, NuLo allows critical cross-scale alignment, thus unlocking the capacity of models for fine-grained nucleus-level representation. To support MUSE, we design a simple yet effective encoder-decoder architecture and a large field-of-view semi-supervised fine-tuning strategy that together maximize the value of unlabeled pathology images. Extensive experiments on three widely used benchmarks demonstrate that MUSE effectively addresses the core challenges of histopathological NDC. The resulting models not only surpass state-of-the-art supervised baselines but also outperform generic pathology foundation models.
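
As a rough illustration of coordinate-guided local distillation, the sketch below samples feature vectors at predicted nucleus positions in two views and penalizes their cosine disagreement. Nearest-pixel lookup stands in for the paper's alignment machinery, and the correspondence between views is assumed given:

```python
import numpy as np

def nucleus_distill_loss(feat_s, feat_t, coords_s, coords_t):
    """Toy coordinate-guided local distillation loss.

    feat_s, feat_t: (H, W, D) feature maps from student / teacher views
    coords_s/_t:    (N, 2) integer (row, col) positions of the same predicted
                    nuclei in each view (correspondence assumed given here)
    """
    zs = feat_s[coords_s[:, 0], coords_s[:, 1]]             # (N, D) student
    zt = feat_t[coords_t[:, 0], coords_t[:, 1]]             # (N, D) teacher
    zs = zs / np.linalg.norm(zs, axis=1, keepdims=True)
    zt = zt / np.linalg.norm(zt, axis=1, keepdims=True)
    return float(np.mean(1.0 - np.sum(zs * zt, axis=1)))    # 1 - cosine sim
```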

JBHI Journal 2025 Journal Article

Domain Anchored Features for Classification of OCT Images

  • Zhiyu Ning
  • Ke Yan
  • Zhiyuan Ning
  • Changyang Li
  • Cong Liu
  • Xun Xu
  • Kun Liu
  • Yupeng Xu

Optical coherence tomography is a crucial imaging technique for the detection and analysis of retinal diseases. Precise classification of optical coherence tomography images helps ophthalmologists and healthcare providers design personalized treatment plans in clinical practice. In this paper, we focus on optical coherence tomography image classification for seven types of retinal diseases and the normal retina. Although existing deep neural networks could be applied to optical coherence tomography images for classification, their features were extracted within the same hyperspace, causing “feature congestion”. Moreover, the normal retina was treated as a “type of retinal disease”, impeding the extraction of the true imaging structures of retinal diseases. To address these two issues, we design a deep neural network module that enhances imaging features so that the enhanced features are more distinct for classification. Consistent with medical findings, we propose two domains of retinal diseases and anchor imaging features across these domains. We tested and evaluated our model on two datasets for eight-class and four-class classification, respectively. Our experimental results demonstrate that the proposed module outperforms state-of-the-art methods. We also conducted ablation studies and sensitivity tests for a comprehensive evaluation of our method.

JBHI Journal 2025 Journal Article

Med-Query: Steerable Parsing of 9-DoF Medical Anatomies With Query Embedding

  • Heng Guo
  • Jianfeng Zhang
  • Ke Yan
  • Le Lu
  • Minfeng Xu

Automatic parsing of human anatomies at the instance-level from 3D computed tomography (CT) is a prerequisite step for many clinical applications. The presence of pathologies, broken structures or limited field-of-view (FOV) can all make anatomy parsing algorithms vulnerable. In this work, we explore how to leverage and implement the successful detection-then-segmentation paradigm for 3D medical data, and propose a steerable, robust, and efficient computing framework for detection, identification, and segmentation of anatomies in CT scans. Considering the complicated shapes, sizes, and orientations of anatomies, without loss of generality, we present a nine degrees of freedom (9-DoF) pose estimation solution in full 3D space using a novel single-stage, non-hierarchical representation. Our whole framework is executed in a steerable manner where any anatomy of interest can be directly retrieved to further boost inference efficiency. We have validated our method on three medical imaging parsing tasks: ribs, spine, and abdominal organs. For rib parsing, CT scans have been annotated at the rib instance-level for quantitative evaluation, similarly for spine vertebrae and abdominal organs. Extensive experiments on 9-DoF box detection and rib instance segmentation demonstrate the high efficiency and effectiveness of our framework (with an identification rate of 97.0% and a segmentation Dice score of 90.9%), compared favorably against several strong baselines (e.g., CenterNet, FCOS, and nnU-Net). For spine parsing and abdominal multi-organ segmentation, our method achieves competitive results on par with state-of-the-art methods on the public CTSpine1K dataset and FLARE22 competition, respectively.
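
A hedged sketch of what a 9-DoF parameterization could look like when decoded from a per-anatomy query embedding (3 translation + 3 size + 3 rotation as Euler angles); the linear head, dimensions, and activations below are assumptions, not the paper's design:

```python
import numpy as np

rng = np.random.default_rng(0)

def decode_9dof(query_emb, W, b):
    """Decode one anatomy query embedding into a 9-DoF box (sketch).

    9-DoF = 3 translation + 3 size + 3 rotation (Euler angles)."""
    out = query_emb @ W + b                  # learned linear regression head
    center = out[:3]                         # box center, normalized coords
    size = np.exp(out[3:6])                  # strictly positive box extents
    angles = np.tanh(out[6:9]) * np.pi       # bounded Euler angles
    return center, size, angles

# Steerable retrieval: each anatomy owns one query row, so a single structure
# (say, one rib) can be decoded without running every other query.
queries = rng.normal(size=(107, 256))        # 107 anatomies, 256-d embeddings
W, b = rng.normal(size=(256, 9)) * 0.01, np.zeros(9)
print(decode_9dof(queries[3], W, b))
```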

AAAI Conference 2025 Conference Paper

SGTC: Semantic-Guided Triplet Co-training for Sparsely Annotated Semi-Supervised Medical Image Segmentation

  • Ke Yan
  • Qing Cai
  • Fan Zhang
  • Ziyan Cao
  • Zhi Liu

Although semi-supervised learning has made significant advances in the field of medical image segmentation, fully annotating a volumetric sample slice by slice remains a costly and time-consuming task. Even worse, most of the existing approaches pay much attention to image-level information and ignore semantic features, resulting in the inability to perceive weak boundaries. To address these issues, we propose a novel Semantic-Guided Triplet Co-training (SGTC) framework, which achieves high-quality medical image segmentation by annotating only three orthogonal slices of a few volumetric samples, significantly alleviating the burden on radiologists. Our method consists of two main components. Specifically, to enable semantic-aware, fine-grained segmentation and enhance the quality of pseudo-labels, a novel semantic-guided auxiliary learning mechanism is proposed based on the pretrained CLIP. In addition, focusing on a more challenging but clinically realistic scenario, a new triple-view disparity training strategy is proposed, which uses sparse annotations (i.e., only three labeled slices of a few volumes) to perform co-training between three sub-networks, significantly improving robustness. Extensive experiments on three public medical datasets demonstrate that our method outperforms most state-of-the-art semi-supervised counterparts under sparse annotation settings.
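
The sparse-annotation setting is easy to picture in code: a sketch of building a supervision mask in which only three orthogonal slices of a volume carry labels. Shapes and slice indices are arbitrary examples:

```python
import numpy as np

def orthogonal_slice_mask(volume_shape, z, y, x):
    """Supervision mask for sparse annotation: only one axial, one coronal,
    and one sagittal slice of the volume carry ground-truth labels."""
    mask = np.zeros(volume_shape, dtype=bool)
    mask[z, :, :] = True                     # axial slice
    mask[:, y, :] = True                     # coronal slice
    mask[:, :, x] = True                     # sagittal slice
    return mask

mask = orthogonal_slice_mask((96, 160, 160), z=48, y=80, x=80)
print(f"labeled voxels: {mask.mean():.2%}")  # supervised loss applies here only
```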

ICLR Conference 2025 Conference Paper

ToVE: Efficient Vision-Language Learning via Knowledge Transfer from Vision Experts

  • Yuanchen Wu
  • Junlong Du
  • Ke Yan
  • Shouhong Ding
  • Xiaoqiang Li 0002

Vision-language (VL) learning requires extensive visual perception capabilities, such as fine-grained object recognition and spatial perception. Recent works typically rely on training huge models on massive datasets to develop these capabilities. As a more efficient alternative, this paper proposes a new framework that Transfers the knowledge from a hub of Vision Experts (ToVE) for efficient VL learning, leveraging pre-trained vision expert models to promote visual perception capability. Specifically, building on a frozen CLIP image encoder that provides vision tokens for image-conditioned language generation, ToVE introduces a hub of multiple vision experts and a token-aware gating network that dynamically routes expert knowledge to vision tokens. In the transfer phase, we propose a "residual knowledge transfer" strategy, which not only preserves the generalizability of the vision tokens but also allows selective detachment of low-contributing experts to improve inference efficiency. Further, we explore merging this expert knowledge into a single CLIP encoder, creating a knowledge-merged CLIP that produces more informative vision tokens without expert inference during deployment. Experimental results across various VL tasks demonstrate that the proposed ToVE achieves competitive performance with two orders of magnitude less training data.
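
A minimal sketch of token-aware gating with residual transfer, assuming token-aligned expert features and a single linear gating layer; this is a toy stand-in for ToVE's learned gating network, not its actual code:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def route_experts(tokens, expert_feats, Wg):
    """Token-aware gating with residual transfer (sketch).

    tokens:       (N, D) frozen CLIP vision tokens
    expert_feats: (E, N, D) token-aligned features from E vision experts
    Wg:           (D, E) gating weights (a single linear layer here)
    """
    gates = softmax(tokens @ Wg)                   # (N, E) per-token routing
    mixed = np.einsum("ne,end->nd", gates, expert_feats)
    return tokens + mixed                          # residual knowledge transfer
```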

AAAI Conference 2025 Conference Paper

Towards a Comprehensive, Efficient and Promptable Anatomic Structure Segmentation Model Using 3D Whole-Body CT Scans

  • Heng Guo
  • Jianfeng Zhang
  • Jiaxing Huang
  • Tony C. W. Mok
  • Dazhou Guo
  • Ke Yan
  • Le Lu
  • Dakai Jin

The Segment Anything Model (SAM) demonstrates strong generalization ability on natural image segmentation. However, its direct adaptation to medical image segmentation tasks shows significant performance drops. It also requires an excessive number of prompt points to obtain reasonable accuracy. Although quite a few studies explore adapting SAM to medical image volumes, the efficiency of 2D adaptation methods is unsatisfactory, and 3D adaptation methods can only segment specific organs/tumors. In this work, we propose a comprehensive and scalable 3D SAM model for whole-body CT segmentation, named CT-SAM3D. Instead of adapting SAM, we propose a 3D promptable segmentation model trained on a (nearly) fully labeled CT dataset. To train CT-SAM3D effectively, ensuring the model's accurate responses to higher-dimensional spatial prompts is crucial, and 3D patch-wise training is required due to GPU memory constraints. Therefore, we propose two key technical developments: 1) a progressively and spatially aligned prompt encoding method to effectively encode click prompts in local 3D space; and 2) a cross-patch prompt scheme to capture more 3D spatial context, which is beneficial for reducing the editing workload when interactively prompting on large organs. CT-SAM3D is trained on a curated dataset of 1204 CT scans containing 107 whole-body anatomies and extensively validated on five datasets, achieving significantly better results than all previous SAM-derived models.

ICML Conference 2025 Conference Paper

Towards Rationale-Answer Alignment of LVLMs via Self-Rationale Calibration

  • Yuanchen Wu
  • Ke Yan
  • Shouhong Ding
  • Ziyin Zhou
  • Xiaoqiang Li 0002

Large Vision-Language Models (LVLMs) have manifested strong visual question answering capability. However, they still struggle to align the rationale with the generated answer, leading to inconsistent reasoning and incorrect responses. To this end, this paper introduces the Self-Rationale Calibration (SRC) framework to iteratively calibrate the alignment between rationales and answers. SRC begins by employing a lightweight “rationale fine-tuning” approach, which modifies the model’s response format to require a rationale before deriving the answer, without explicit prompts. Next, SRC samples a diverse set of candidate responses from the fine-tuned LVLM for each sample, followed by a proposed pairwise scoring strategy using a tailored scoring model, R-Scorer, to evaluate both the rationale quality and the factual consistency of candidates. Based on a confidence-weighted preference curation process, SRC casts the alignment calibration as preference fine-tuning, leading to significant improvements of LVLMs in perception, reasoning, and generalization across multiple benchmarks. Our results emphasize the importance of rationale-oriented alignment in exploring the potential of LVLMs.
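
The pairwise scoring and confidence-weighted curation step can be sketched as follows; `scorer` stands in for the R-Scorer model, and the margin-based weighting is an assumption made for illustration:

```python
import itertools

def curate_preferences(candidates, scorer, min_margin=0.1):
    """Build confidence-weighted preference pairs from sampled responses.

    candidates: list of (rationale + answer) strings sampled from the LVLM
    scorer:     callable(a, b) -> probability that a beats b; stands in for
                the pairwise R-Scorer model
    """
    pairs = []
    for a, b in itertools.combinations(candidates, 2):
        p = scorer(a, b)
        confidence = abs(p - 0.5) * 2            # how decisive the comparison is
        if confidence >= min_margin:             # drop near-ties
            chosen, rejected = (a, b) if p > 0.5 else (b, a)
            pairs.append({"chosen": chosen, "rejected": rejected,
                          "weight": confidence})
    return pairs                                 # input to preference fine-tuning
```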

AAAI Conference 2024 Conference Paper

MmAP: Multi-Modal Alignment Prompt for Cross-Domain Multi-Task Learning

  • Yi Xin
  • Junlong Du
  • Qiang Wang
  • Ke Yan
  • Shouhong Ding

Multi-Task Learning (MTL) is designed to train multiple correlated tasks simultaneously, thereby enhancing the performance of individual tasks. Typically, a multi-task network structure consists of a shared backbone and task-specific decoders. However, the complexity of the decoders increases with the number of tasks. To tackle this challenge, we integrate the decoder-free vision-language model CLIP, which exhibits robust zero-shot generalization capability. Recently, parameter-efficient transfer learning methods have been extensively explored with CLIP for adapting to downstream tasks, where prompt tuning showcases strong potential. Nevertheless, these methods solely fine-tune a single modality (text or visual), disrupting the modality structure of CLIP. In this paper, we first propose a Multi-modal Alignment Prompt (MmAP) for CLIP, which aligns the text and visual modalities during the fine-tuning process. Building upon MmAP, we develop an innovative multi-task prompt learning framework. On the one hand, to maximize the complementarity of tasks with high similarity, we utilize a gradient-driven task grouping method that partitions tasks into several disjoint groups and assigns a group-shared MmAP to each group. On the other hand, to preserve the unique characteristics of each task, we assign a task-specific MmAP to each task. Comprehensive experiments on two large multi-task learning datasets demonstrate that our method achieves significant performance improvements compared to full fine-tuning while utilizing only approximately 0.09% of the trainable parameters.
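
A hedged sketch of one way a multi-modal alignment prompt could work: both modality prompts are generated from a single shared trainable source, so tuning one modality cannot drift independently of the other. Dimensions and projections below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d_text, d_vis, d_src, n_prompt = 512, 768, 64, 4

source = rng.normal(size=(n_prompt, d_src)) * 0.02  # shared trainable source
W_text = rng.normal(size=(d_src, d_text)) * 0.02    # projection to text space
W_vis = rng.normal(size=(d_src, d_vis)) * 0.02      # projection to vision space

text_prompt = source @ W_text     # prepended to CLIP text-encoder tokens
visual_prompt = source @ W_vis    # prepended to CLIP image-encoder tokens
# Gradients from either branch flow into `source`, keeping the two modalities
# coupled; a group-shared and a task-specific prompt would each be one such
# source in the full framework.
```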

ICML Conference 2024 Conference Paper

Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models

  • Didi Zhu
  • Zhongyi Sun 0002
  • Zexi Li 0001
  • Tao Shen 0002
  • Ke Yan
  • Shouhong Ding
  • Chao Wu 0001
  • Kun Kuang 0001

Catastrophic forgetting emerges as a critical challenge when fine-tuning multi-modal large language models (MLLMs), where improving performance on unseen tasks often leads to a significant performance drop on the original tasks. This paper presents a comprehensive analysis of catastrophic forgetting in MLLMs and introduces a post-training adjustment method called Model Tailor. Our method primarily preserves the pre-trained parameters while replacing a small number (≤10%) of fine-tuned parameters, maintaining ~99% effectiveness on original tasks versus pre-training, and achieving ~97% on new tasks compared to standard fine-tuning. Specifically, we derive a sparse mask to identify the model patch, based on a fusion strategy that integrates salience and sensitivity analysis. Subsequently, a compensation mechanism is introduced to decorate the patch, enhancing the model’s performance on both target and original tasks. Additionally, our method is adaptable to multi-task scenarios. Through extensive experiments on InstructBLIP and LLaVA-1.5 in both image captioning and visual question answering tasks, our approach demonstrates significant task adaptability while preserving inherent pre-trained capabilities.
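
A toy version of the mask-and-graft idea, assuming a fused per-parameter score of the form |delta| times salience; the paper's actual fusion of salience and sensitivity, and its compensation mechanism, are omitted here:

```python
import numpy as np

def tailor(pretrained, finetuned, salience, frac=0.10):
    """Keep pre-trained weights; graft in only the top `frac` fine-tuned ones.

    pretrained, finetuned: flat parameter vectors
    salience:              per-parameter importance for the new task
    """
    delta = finetuned - pretrained
    score = np.abs(delta) * salience              # toy fused importance score
    k = int(frac * score.size)
    mask = np.zeros(score.size, dtype=bool)
    mask[np.argsort(-score)[:k]] = True           # the "model patch"
    return np.where(mask, finetuned, pretrained)  # patched model parameters
```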

NeurIPS Conference 2024 Conference Paper

SAFE: Slow and Fast Parameter-Efficient Tuning for Continual Learning with Pre-Trained Models

  • Linglan Zhao
  • Xuerui Zhang
  • Ke Yan
  • Shouhong Ding
  • Weiran Huang

Continual learning aims to incrementally acquire new concepts in data streams while resisting forgetting previous knowledge. With the rise of powerful pre-trained models (PTMs), there is a growing interest in training incremental learning systems using these foundation models, rather than learning from scratch. Existing works often view PTMs as a strong initial point and directly apply parameter-efficient tuning (PET) in the first session for adapting to downstream tasks. In the following sessions, most methods freeze model parameters for tackling forgetting issues. However, applying PET directly to downstream data cannot fully explore the inherent knowledge in PTMs. Additionally, freezing the parameters in incremental sessions hinders models' plasticity to novel concepts not covered in the first session. To solve the above issues, we propose a Slow And Fast parameter-Efficient tuning (SAFE) framework. In particular, to inherit general knowledge from foundation models, we include a transfer loss function by measuring the correlation between the PTM and the PET-applied model. After calibrating in the first session, the slow efficient tuning parameters can capture more informative features, improving generalization to incoming classes. Moreover, to further incorporate novel concepts, we strike a balance between stability and plasticity by fixing slow efficient tuning parameters and continuously updating the fast ones. Specifically, a cross-classification loss with feature alignment is proposed to circumvent catastrophic forgetting. During inference, we introduce an entropy-based aggregation strategy to dynamically utilize the complementarity in the slow and fast learners. Extensive experiments on seven benchmark datasets verify the effectiveness of our method by significantly surpassing the state-of-the-art.
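
The entropy-based aggregation at inference is straightforward to sketch: whichever learner is more certain (lower predictive entropy) receives the larger weight. The exponential weighting below is an illustrative choice, not necessarily the paper's:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def entropy(p):
    return -np.sum(p * np.log(p + 1e-12))

def aggregate(slow_logits, fast_logits):
    """Entropy-based aggregation of the slow and fast learners (sketch)."""
    ps, pf = softmax(slow_logits), softmax(fast_logits)
    ws, wf = np.exp(-entropy(ps)), np.exp(-entropy(pf))   # certain => heavier
    return (ws * ps + wf * pf) / (ws + wf)                # fused prediction
```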

AAAI Conference 2024 Conference Paper

VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding

  • Yi Xin
  • Junlong Du
  • Qiang Wang
  • Zhiwen Lin
  • Ke Yan

Large-scale pre-trained models have achieved remarkable success in various computer vision tasks. A standard approach to leveraging these models is to fine-tune all model parameters for downstream tasks, which poses challenges in terms of computational and storage costs. Recently, inspired by Natural Language Processing (NLP), parameter-efficient transfer learning has been successfully applied to vision tasks. However, most existing techniques primarily focus on single-task adaptation, and despite limited research on multi-task adaptation, these methods often exhibit suboptimal training/inference efficiency. In this paper, we first propose a once-for-all Vision Multi-Task Adapter (VMT-Adapter), which achieves approximately O(1) training and inference efficiency w.r.t. the number of tasks. Concretely, VMT-Adapter shares knowledge across multiple tasks to enhance cross-task interaction while preserving task-specific knowledge via independent knowledge extraction modules. Notably, since the task-specific modules require few parameters, VMT-Adapter can handle an arbitrary number of tasks with a negligible increase in trainable parameters. We also propose VMT-Adapter-Lite, which further reduces the trainable parameters by learning shared parameters between the down- and up-projections. Extensive experiments on four dense scene understanding tasks demonstrate the superiority of VMT-Adapter(-Lite), achieving a 3.96% (1.34%) relative improvement compared to single-task full fine-tuning while utilizing merely ~1% (0.36%) of the pre-trained model's trainable parameters.
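
A minimal sketch of the sharing pattern described above, assuming the task-specific knowledge-extraction module is a per-task channel scaling (the paper's modules may differ); the point is that the shared projections dominate the parameter count:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_bottleneck, n_tasks = 768, 48, 4

W_down = rng.normal(size=(d_model, d_bottleneck)) * 0.02  # shared by all tasks
W_up = rng.normal(size=(d_bottleneck, d_model)) * 0.02    # shared by all tasks
# One tiny module per task; adding a task adds only d_bottleneck parameters.
task_scale = [rng.normal(size=d_bottleneck) * 0.02 for _ in range(n_tasks)]

def vmt_adapter(x, task_id):
    """Shared projections carry cross-task knowledge; the per-task scale
    keeps task-specific knowledge. One forward pass serves any task."""
    h = np.maximum(x @ W_down, 0.0)          # shared down-projection + ReLU
    h = h * task_scale[task_id]              # task-specific modulation
    return x + h @ W_up                      # residual adapter output

x = rng.normal(size=(16, d_model))           # a batch of 16 tokens
print(vmt_adapter(x, task_id=2).shape)       # (16, 768)
```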

AAAI Conference 2023 Conference Paper

Few-Shot Object Detection via Variational Feature Aggregation

  • Jiaming Han
  • Yuqiang Ren
  • Jian Ding
  • Ke Yan
  • Gui-Song Xia

As few-shot object detectors are often trained with abundant base samples and fine-tuned on few-shot novel examples, the learned models are usually biased to base classes and sensitive to the variance of novel examples. To address this issue, we propose a meta-learning framework with two novel feature aggregation schemes. More precisely, we first present a Class-Agnostic Aggregation (CAA) method, where the query and support features can be aggregated regardless of their categories. The interactions between different classes encourage class-agnostic representations and reduce confusion between base and novel classes. Based on the CAA, we then propose a Variational Feature Aggregation (VFA) method, which encodes support examples into class-level support features for robust feature aggregation. We use a variational autoencoder to estimate class distributions and sample variational features from distributions that are more robust to the variance of support examples. Besides, we decouple the classification and regression tasks so that VFA is performed on the classification branch without affecting object localization. Extensive experiments on PASCAL VOC and COCO demonstrate that our method significantly outperforms a strong baseline (up to 16%) and previous state-of-the-art methods (4% on average).
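
As a rough sketch of variational aggregation, the snippet below summarizes a class's support features as a Gaussian and samples a class-level feature from it; the paper fits this distribution with a variational autoencoder, whereas empirical moments are used here for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

def variational_aggregate(support_feats):
    """Summarize a class's support features as a Gaussian, then sample a
    class-level feature; less sensitive to any single support example than
    picking or averaging raw features."""
    mu = support_feats.mean(axis=0)
    sigma = support_feats.std(axis=0) + 1e-6
    eps = rng.normal(size=mu.shape)          # reparameterization-style sample
    return mu + sigma * eps                  # sampled class-level feature

support = rng.normal(loc=1.0, size=(5, 128)) # 5-shot, 128-d support features
z = variational_aggregate(support)           # aggregated with the query later
```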

AAAI Conference 2023 Conference Paper

Inter-image Contrastive Consistency for Multi-Person Pose Estimation

  • Xixia Xu
  • Yingguo Gao
  • Xingjia Pan
  • Ke Yan
  • Xiaoyu Chen
  • Qi Zou

Multi-person pose estimation (MPPE) has achieved impressive progress in recent years. However, due to the large variance of appearances among images and occlusions, the model can hardly learn sufficiently consistent patterns, which leads to severe location jitter and missing-keypoint issues. In this study, we propose a novel framework, termed Inter-image Contrastive consistency (ICON), to strengthen keypoint consistency among images for MPPE. Concretely, we consider two-fold consistency constraints: single keypoint contrastive consistency (SKCC) and pair relation contrastive consistency (PRCC). SKCC learns to strengthen the consistency of individual keypoints across images in the same category to improve category-specific robustness. With SKCC alone, the model can effectively reduce location errors caused by large appearance variations, but it still struggles with extreme postures (e.g., occlusions) due to the lack of relational guidance. Therefore, PRCC is proposed to strengthen the consistency of pair-wise joint relations between images to preserve the instructive relation. Cooperating with SKCC, PRCC further improves structure-aware robustness in handling extreme postures. Extensive experiments with various architectures across three datasets (i.e., MS-COCO, MPII, CrowdPose) show that the proposed ICON achieves substantial improvements over baselines. Furthermore, ICON under the semi-supervised setup can obtain results comparable to fully-supervised methods using only 30% labeled data.
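
A hedged sketch of an SKCC-style objective: keypoint embeddings of the same category gathered across images are treated as positives in an InfoNCE form, everything else as negatives. The pair-relation term (PRCC) and all hyperparameters are omitted:

```python
import numpy as np

def skcc_loss(kp_feats, labels, tau=0.1):
    """Toy single-keypoint contrastive consistency loss.

    kp_feats: (N, D) keypoint embeddings gathered across a batch of images
    labels:   (N,) keypoint category ids (e.g. left_wrist)
    """
    z = kp_feats / np.linalg.norm(kp_feats, axis=1, keepdims=True)
    sim = z @ z.T / tau
    np.fill_diagonal(sim, -np.inf)                 # exclude self-similarity
    logp = sim - np.log(np.sum(np.exp(sim), axis=1, keepdims=True))
    pos = labels[:, None] == labels[None, :]
    np.fill_diagonal(pos, False)
    losses = [-logp[i][pos[i]].mean() for i in range(len(z)) if pos[i].any()]
    return float(np.mean(losses))
```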

JBHI Journal 2022 Journal Article

An Improved Neural Network Based on SENet for Sleep Stage Classification

  • Jing Huang
  • Lifeng Ren
  • Xiaokang Zhou
  • Ke Yan

Sleep staging is an important step in analyzing sleep quality. Traditional manual analysis by psychologists is time-consuming. In this paper, we propose an automatic sleep staging model with an improved attention module and a hidden Markov model (HMM). The model is driven by single-channel electroencephalogram (EEG) data. It automatically extracts features through two convolution kernels with different scales. Subsequently, an improved attention module based on Squeeze-and-Excitation Networks (SENet) performs feature fusion. The neural network gives a preliminary sleep stage based on the learned features. Finally, an HMM applies sleep transition rules to refine the classification. The proposed method is tested on the sleep-EDFx dataset and achieves excellent performance. The accuracy on the Fpz-Cz channel is 84.6%, and the kappa coefficient is 0.79. For the Pz-Oz channel, the accuracy is 82.3% and the kappa is 0.76. The experimental results show that the attention mechanism plays a positive role in feature fusion and that our improved attention module improves classification performance. In addition, applying sleep transition rules through the HMM helps to improve performance, especially for stage N1, which is difficult to identify.
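
The HMM refinement step amounts to decoding the most likely stage sequence under learned transition rules; a standard Viterbi pass over the network's per-epoch log-probabilities, sketched below with assumed inputs, captures the idea:

```python
import numpy as np

def viterbi_smooth(log_emit, log_trans, log_prior):
    """Refine per-epoch network outputs with sleep-transition rules.

    log_emit:  (T, S) log P(stage | EEG epoch) from the neural network
    log_trans: (S, S) log transition probabilities between sleep stages
    log_prior: (S,)   log initial stage distribution
    """
    T, S = log_emit.shape
    dp = log_prior + log_emit[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = dp[:, None] + log_trans             # (prev stage, next stage)
        back[t] = np.argmax(cand, axis=0)          # best predecessor per stage
        dp = cand[back[t], np.arange(S)] + log_emit[t]
    path = [int(np.argmax(dp))]                    # backtrack the best path
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]                              # most likely stage sequence
```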

AAAI Conference 2022 Conference Paper

Lifelong Person Re-identification by Pseudo Task Knowledge Preservation

  • Wenhang Ge
  • Junlong Du
  • Ancong Wu
  • Yuqiao Xian
  • Ke Yan
  • Feiyue Huang
  • Wei-Shi Zheng

In the real world, training data for person re-identification (Re-ID) is collected discretely with spatial and temporal variations, which requires a model to incrementally learn new knowledge without forgetting old knowledge. This problem is called lifelong person re-identification (LReID). Variations of illumination and background in the images of each task exhibit task-specific image style and lead to a task-wise domain gap. In addition to missing data from the old tasks, the task-wise domain gap is a key factor in catastrophic forgetting in LReID, which is ignored by existing approaches. The model tends to learn task-specific knowledge under the task-wise domain gap, which results in a stability-plasticity dilemma. To overcome this problem, we cast LReID as a domain adaptation problem and propose a pseudo task knowledge preservation framework to alleviate the domain gap. Our framework is based on a pseudo task transformation module which maps the features of the new task into the feature space of the old tasks to complement the limited saved exemplars of the old tasks. With the extra transformed features in the task-specific feature space, we propose a task-specific domain consistency loss to implicitly alleviate the task-wise domain gap for learning task-shared knowledge instead of task-specific knowledge. Furthermore, to guide knowledge preservation with the feature distributions of the old tasks, we propose to preserve knowledge on extra pseudo tasks which jointly distill knowledge and discriminate identity, in order to achieve a better tradeoff between stability and plasticity for lifelong learning with a task-wise domain gap. Extensive experiments demonstrate the superiority of our method compared with state-of-the-art lifelong learning and LReID methods.

AAMAS Conference 2018 Conference Paper

Information Design in Crowdfunding under Thresholding Policies

  • Wen Shen
  • Jacob W. Crandall
  • Ke Yan
  • Cristina V. Lopes

Crowdfunding has emerged as a prominent way for entrepreneurs to secure funding without sophisticated intermediation. In crowdfunding, an entrepreneur often has to decide how to disclose the campaign status in order to collect as many contributions as possible. Such decisions are difficult to make primarily due to incomplete information. We propose information design as a tool to help the entrepreneur to improve revenue by influencing backers’ beliefs. We introduce a heuristic algorithm to dynamically compute information-disclosure policies for the entrepreneur, followed by an empirical evaluation to demonstrate its competitiveness over the widely-adopted immediate-disclosure policy. Our results demonstrate that the immediate-disclosure policy is not optimal when backers follow thresholding policies despite its ease of implementation. With appropriate heuristics, an entrepreneur can benefit from dynamic information disclosure. Our work sheds light on information design in a dynamic setting where agents make decisions using thresholding policies.
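
A toy simulation of why the disclosure policy matters under thresholding backers: each backer contributes only if the reported progress clears their private threshold, so an immediate-disclosure campaign can stall while a slightly optimistic report seeds a cascade. This is purely illustrative; the paper's heuristic computes disclosure policies dynamically and the policies below are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

def run_campaign(thresholds, policy, goal=100.0, pledge=1.0):
    """Backer i contributes iff the *reported* progress clears their private
    threshold. `policy` maps true progress to disclosed progress; immediate
    disclosure is the identity function."""
    raised = 0.0
    for thr in thresholds:                    # backers arrive one at a time
        if policy(raised / goal) >= thr:
            raised += pledge
    return raised

thresholds = rng.uniform(0.0, 0.5, size=200)  # heterogeneous private thresholds
immediate = run_campaign(thresholds, policy=lambda p: p)
optimistic = run_campaign(thresholds, policy=lambda p: min(1.0, p + 0.05))
print(immediate, optimistic)  # a small shift in what is reported can seed
                              # a contribution cascade
```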