Arrow Research search

Author name cluster

Kun Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

19 papers
2 author rows

Possible papers

19

JBHI Journal 2026 Journal Article

FourierMask: Explain EEG-Based End-to-End Deep Learning Models in the Frequency Domain

  • Hanqi Wang
  • Jingyu Zhang
  • Kun Yang
  • Jichuan Xiong
  • Xuefeng Liu
  • Tao Chen
  • Liang Song

The rise of EEG-based end-to-end deep learning models has underscored the need to elucidate how these models process raw time-series EEG signals to generate predictions. The frequency domain provides a more suitable perspective for this task due to two key advantages: its strong correlation with cognitive states and its inherent capacity to model long-range temporal dependencies. However, this perspective remains underexplored in existing research. To bridge this gap, we propose FourierMask, the first mask perturbation framework specifically designed for frequency-domain explanation of EEG-based end-to-end models. Our method introduces three key innovations. First, a Fourier-based domain transformation enables direct manipulation of spectral components. Second, a learnable mask mechanism jointly models spectral-spatial couplings for EEG explanation. Third, a perturbation generator constrained by a target alignment loss ensures natural perturbations by minimizing distribution shift via cluster-aware regularization. We validate our method through experiments on an EEG benchmark dataset across EEGNet, TSCeption, and DeepConvNet models. Our method reaches a 36.0% average accuracy drop gap (vs. 8.6% for LIME and 6.6% for easyPEASI) at the group level, and a 17.8% average accuracy drop gap (vs. 8.9% for LIME and 9.9% for easyPEASI) at the instance level. Our model-agnostic framework provides a plug-and-play solution for enhancing the transparency of EEG-based end-to-end deep learning models. It links model decisions to frequency biomarkers, with potential applications in neuromedicine and brain-computer interfaces.
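The core frequency-domain perturbation can be illustrated with a minimal sketch (not the authors' implementation: the learnable mask and the perturbation generator are omitted): transform each EEG channel with an FFT, attenuate spectral components with a mask, and invert.

```python
import numpy as np

def frequency_mask_perturb(eeg, mask):
    """Perturb a multichannel EEG signal by attenuating spectral components.

    eeg:  (channels, samples) real array
    mask: (channels, freq_bins) array in [0, 1]; 1 keeps, 0 removes a component
    """
    spectrum = np.fft.rfft(eeg, axis=-1)  # to the frequency domain
    return np.fft.irfft(spectrum * mask, n=eeg.shape[-1], axis=-1)

# Illustrative example: suppress everything above 12 Hz on a 2-channel,
# 256 Hz synthetic signal (one 10 Hz tone, one 20 Hz tone)
fs, n = 256, 512
t = np.arange(n) / fs
eeg = np.stack([np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 20 * t)])
freqs = np.fft.rfftfreq(n, d=1 / fs)
mask = (freqs <= 12).astype(float)[None, :].repeat(2, axis=0)
low_passed = frequency_mask_perturb(eeg, mask)  # 20 Hz channel is zeroed out
```

In FourierMask the mask values are learned, so measuring how the model's prediction degrades under such perturbations attributes the decision to specific frequency bands.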

AAAI Conference 2026 Conference Paper

MEML-GRPO: Heterogeneous Multi-Expert Mutual Learning for RLVR Advancement

  • Weitao Jia
  • Jinghui Lu
  • Haiyang Yu
  • Siqi Wang
  • Guozhi Tang
  • An-Lan Wang
  • Weijie Yin
  • Dingkang Yang

Recent advances demonstrate that reinforcement learning with verifiable rewards (RLVR) significantly enhances the reasoning capabilities of large language models (LLMs). However, standard RLVR faces challenges with reward sparsity, where zero rewards from consistently incorrect candidate answers provide no learning signal, particularly in challenging tasks. To address this, we propose Multi-Expert Mutual Learning GRPO (MEML-GRPO), an innovative framework that utilizes diverse expert prompts as system prompts to generate a broader range of responses, substantially increasing the likelihood of identifying correct solutions. Additionally, we introduce an inter-expert mutual learning mechanism that facilitates knowledge sharing and transfer among experts, further boosting the model's performance through RLVR. Extensive experiments across multiple reasoning benchmarks show that MEML-GRPO delivers significant improvements, achieving an average performance gain of 4.89% with Qwen and 11.33% with Llama, effectively overcoming the core limitations of traditional RLVR methods.
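The mutual-learning mechanism is specific to the paper, but the group-relative credit assignment that GRPO builds on can be sketched generically (reward values are illustrative): each sampled response is scored against the other responses drawn for the same question, so one correct answer in the group is enough to produce a nonzero learning signal.

```python
import numpy as np

def group_relative_advantages(rewards):
    """GRPO-style advantage: normalize each response's reward against the
    group of responses sampled for the same question (no value network)."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

# Responses pooled from several expert prompts; with verifiable rewards,
# a single correct answer (reward 1.0) yields a positive advantage while
# the incorrect ones are pushed down.
adv = group_relative_advantages([0.0, 0.0, 1.0, 0.0])
```

If every response in the group is wrong (all rewards zero), the advantages are all zero, which is exactly the sparsity problem that sampling under diverse expert prompts is meant to alleviate.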

IROS Conference 2025 Conference Paper

Autonomous Suturing Method for Robot-Assisted Minimally Invasive Surgery

  • Mei Feng
  • Haoju Li
  • Yao Li
  • Kun Yang
  • Dong He
  • Xiuquan Lu

Robot-assisted minimally invasive surgery is widely used because of its superior postoperative recovery outcomes. However, the workload for surgeons remains high. The development of autonomous suturing capabilities in surgical robots is poised to significantly reduce surgeon workload. In this study, we present a novel method for autonomous suturing using a minimally invasive surgical robot. We quantify the surgical suturing requirements and propose corresponding metrics for evaluating suturing quality. We also use dynamic adjustment of stitch position to optimize the surgical robot's autonomous suturing scheme. Furthermore, we employ particle swarm algorithms to optimize the grasping posture of surgical instruments, enabling the robot to achieve optimal suture needle clamping. Our method matches the performance of expert operators on the parametric suturing indices when suturing two types of wounds: gauze and egg membrane. The experimental results show that the stitching quality of our proposed autonomous suturing method is already close to that of surgeons using the same robot, and it maintains good consistency across multiple sets of experiments. The method is currently deployed on our own surgical robot and can be generalized to other surgical robots, laying the foundation for fully autonomous surgery.

AAAI Conference 2024 Conference Paper

A Unified Self-Distillation Framework for Multimodal Sentiment Analysis with Uncertain Missing Modalities

  • Mingcheng Li
  • Dingkang Yang
  • Yuxuan Lei
  • Shunli Wang
  • Shuaibing Wang
  • Liuzhen Su
  • Kun Yang
  • Yuzheng Wang

Multimodal Sentiment Analysis (MSA) has attracted widespread research attention recently. Most MSA studies are based on the assumption of modality completeness. However, many inevitable factors in real-world scenarios lead to uncertain missing modalities, which invalidate the fixed multimodal fusion approaches. To this end, we propose a Unified multimodal Missing modality self-Distillation Framework (UMDF) to handle the problem of uncertain missing modalities in MSA. Specifically, a unified self-distillation mechanism in UMDF drives a single network to automatically learn robust inherent representations from the consistent distribution of multimodal data. Moreover, we present a multi-grained crossmodal interaction module to deeply mine the complementary semantics among modalities through coarse- and fine-grained crossmodal attention. Eventually, a dynamic feature integration module is introduced to enhance the beneficial semantics in incomplete modalities while filtering the redundant information therein to obtain a refined and robust multimodal representation. Comprehensive experiments on three datasets demonstrate that our framework significantly improves MSA performance under both uncertain missing-modality and complete-modality testing conditions.

JBHI Journal 2024 Journal Article

Automatically Extracting and Utilizing EEG Channel Importance Based on Graph Convolutional Network for Emotion Recognition

  • Kun Yang
  • Zhenning Yao
  • Keze Zhang
  • Jing Xu
  • Li Zhu
  • Shichao Cheng
  • Jianhai Zhang

Graph convolutional network (GCN) models based on the brain network have been widely used for EEG emotion recognition. However, most studies train their models directly without considering network dimensionality reduction beforehand. In fact, some nodes and edges carry information that is invalid, or even interferes, for the current task, so it is necessary to reduce the network dimension and extract the core network. To address the problem of extracting and utilizing the core network, we propose a core network extraction model (CWGCN) based on channel weighting and graph convolution, together with an emotion recognition model (CCSR-GCN) based on channel convolution and style-based recalibration. The CWGCN model automatically extracts the core network and the channel importance parameters in a data-driven manner. The CCSR-GCN model innovatively uses the output of the CWGCN model to identify the emotional state. The experimental results on SEED show that: 1) core network extraction helps improve the performance of the GCN model; 2) CWGCN and CCSR-GCN achieve better results than currently popular methods. The idea and its implementation in this paper provide a novel and successful perspective for applying GCNs to brain network analysis in other specific tasks.

JBHI Journal 2024 Journal Article

DSFE: Decoding EEG-Based Finger Motor Imagery Using Feature-Dependent Frequency, Feature Fusion and Ensemble Learning

  • Kun Yang
  • Ruochen Li
  • Jing Xu
  • Li Zhu
  • Wanzeng Kong
  • Jianhai Zhang

Accurately decoding finger motor imagery is essential for fine motor control using EEG signals. However, decoding finger motor imagery is particularly challenging compared with ordinary motor imagery. This paper proposes a novel EEG decoding method based on feature-dependent frequency band selection, feature fusion, and ensemble learning (DSFE) for finger motor imagery. First, a feature-dependent frequency band selection method based on the correlation coefficient (FDCC) is proposed to select feature-specific effective bands. Second, a feature fusion method is proposed to fuse different types of candidate features into multiple refined sets of decoding features. Finally, an ensemble model using a weighted voting strategy is proposed to make full use of these diverse sets of final features. The results on a public EEG dataset of five-finger motor imagery show that DSFE is effective and achieves the highest decoding accuracy of 50.64%, which is 7.64% higher than existing studies using exactly the same data. The experiments further reveal that in finger motor imagery, the effective frequency bands differ both across subjects and across feature types. Furthermore, compared with two-hand motor imagery, the effective decoding information of finger motor imagery shifts to lower frequencies. The idea and findings in this paper provide a valuable perspective for understanding fine motor imagery in depth.
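The band-selection step can be sketched as follows (a simplification of the paper's FDCC method; per-band feature extraction is assumed already done): score each candidate frequency band by the absolute Pearson correlation between its feature and the class labels, and keep the top-scoring bands.

```python
import numpy as np

def select_bands_by_correlation(band_features, labels, k=3):
    """Rank candidate frequency bands by |Pearson correlation| between each
    band's feature and the labels, and keep the top-k bands.

    band_features: (n_trials, n_bands) array, one feature per band per trial
    labels:        (n_trials,) numeric class labels
    """
    scores = []
    for b in range(band_features.shape[1]):
        r = np.corrcoef(band_features[:, b], labels)[0, 1]
        scores.append(abs(r))
    order = np.argsort(scores)[::-1]  # best band first
    return order[:k], np.array(scores)

# Toy example: band 2 is made label-dependent, the other bands are noise
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 5))
X[:, 2] += 2.0 * labels
top, scores = select_bands_by_correlation(X, labels, k=2)
```

Because the scoring is done per feature type, different feature families can end up with different effective bands, which matches the paper's observation.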

NeurIPS Conference 2024 Conference Paper

Efficient Prompt Optimization Through the Lens of Best Arm Identification

  • Chengshuai Shi
  • Kun Yang
  • Zihan Chen
  • Jundong Li
  • Jing Yang
  • Cong Shen

The remarkable instruction-following capability of large language models (LLMs) has sparked a growing interest in automatically finding good prompts, i.e., prompt optimization. Most existing works follow the scheme of selecting from a pre-generated pool of candidate prompts. However, these designs mainly focus on the generation strategy, while limited attention has been paid to the selection method. Especially, the cost incurred during the selection (e.g., accessing LLM and evaluating the responses) is rarely explicitly considered. To overcome this limitation, this work provides a principled framework, TRIPLE, to efficiently perform prompt selection under an explicit budget constraint. TRIPLE is built on a novel connection established between prompt optimization and fixed-budget best arm identification (BAI-FB) in multi-armed bandits (MAB); thus, it is capable of leveraging the rich toolbox from BAI-FB systematically and also incorporating unique characteristics of prompt optimization. Extensive experiments on multiple well-adopted tasks using various LLMs demonstrate the remarkable performance improvement of TRIPLE over baselines while satisfying the limited budget constraints. As an extension, variants of TRIPLE are proposed to efficiently select examples for few-shot prompts, also achieving superior empirical performance.
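TRIPLE's specific algorithms are not given here, but a standard fixed-budget BAI routine such as successive halving conveys the idea: spend the evaluation budget in rounds, discarding the worse half of the candidate prompts after each round. The `evaluate` callback is a stand-in for one LLM call plus response scoring.

```python
import math, random

def successive_halving(prompts, evaluate, budget):
    """Fixed-budget best-arm identification over candidate prompts.

    prompts:  list of candidate prompt strings (the "arms")
    evaluate: evaluate(prompt) -> noisy score in [0, 1] (one LLM call)
    budget:   total number of evaluations allowed
    """
    survivors = list(prompts)
    rounds = max(1, math.ceil(math.log2(len(prompts))))
    per_round = budget // rounds
    for _ in range(rounds):
        if len(survivors) == 1:
            break
        pulls = max(1, per_round // len(survivors))
        means = {p: sum(evaluate(p) for _ in range(pulls)) / pulls
                 for p in survivors}
        survivors.sort(key=lambda p: means[p], reverse=True)
        survivors = survivors[: max(1, len(survivors) // 2)]  # drop worse half
    return survivors[0]

# Toy check with a simulated scorer: each prompt has a hidden mean quality
random.seed(0)
true_means = {"p0": 0.2, "p1": 0.5, "p2": 0.8, "p3": 0.4}
noisy = lambda p: min(1.0, max(0.0, random.gauss(true_means[p], 0.1)))
best = successive_halving(list(true_means), noisy, budget=400)
```

The appeal of the fixed-budget framing is exactly this structure: the total number of (costly) LLM evaluations is decided up front, and the allocation across arms is what the algorithm optimizes.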

TMLR Journal 2024 Journal Article

Harnessing the Power of Federated Learning in Federated Contextual Bandits

  • Chengshuai Shi
  • Ruida Zhou
  • Kun Yang
  • Cong Shen

Federated learning (FL) has demonstrated great potential in revolutionizing distributed machine learning, and tremendous efforts have been made to extend it beyond the original focus on supervised learning. Among many directions, federated contextual bandits (FCB), a pivotal integration of FL and sequential decision-making, has garnered significant attention in recent years. Despite substantial progress, existing FCB approaches have largely employed their tailored FL components, often deviating from the canonical FL framework. Consequently, even renowned algorithms like FedAvg remain under-utilized in FCB, let alone other FL advancements. Motivated by this disconnection, this work takes one step towards building a tighter relationship between the canonical FL study and the investigations on FCB. In particular, a novel FCB design, termed FedIGW, is proposed to leverage a regression-based CB algorithm, i.e., inverse gap weighting. Compared with existing FCB approaches, the proposed FedIGW design can better harness the entire spectrum of FL innovations, which is concretely reflected as (1) flexible incorporation of (both existing and forthcoming) FL protocols; (2) modularized plug-in of FL analyses in performance guarantees; (3) seamless integration of FL appendages (such as personalization, robustness, and privacy). We substantiate these claims through rigorous theoretical analyses and empirical evaluations.
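Inverse gap weighting itself is a standard construction: given a reward regressor's predictions for K actions, each non-greedy action receives probability inversely proportional to its predicted gap from the best action. A minimal sketch (the exploration parameter `gamma` and the reward values are illustrative):

```python
import numpy as np

def igw_distribution(predicted_rewards, gamma):
    """Inverse gap weighting: turn reward predictions into an action
    distribution. Actions with a larger gap to the best prediction get
    proportionally less probability; larger gamma means less exploration.
    """
    y = np.asarray(predicted_rewards, dtype=float)
    k = len(y)
    best = int(np.argmax(y))
    p = 1.0 / (k + gamma * (y[best] - y))  # gap is 0 for the best action
    p[best] = 0.0
    p[best] = 1.0 - p.sum()                # remaining mass to the greedy arm
    return p

probs = igw_distribution([0.9, 0.5, 0.1], gamma=10.0)
```

In a federated design along the lines the abstract describes, the reward regressor is what the FL protocol (e.g. FedAvg) trains across clients, while each client samples its actions locally from this distribution.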

NeurIPS Conference 2024 Conference Paper

Transformers as Game Players: Provable In-context Game-playing Capabilities of Pre-trained Models

  • Chengshuai Shi
  • Kun Yang
  • Jing Yang
  • Cong Shen

The in-context learning (ICL) capability of pre-trained models based on the transformer architecture has received growing interest in recent years. While theoretical understanding has been obtained for ICL in reinforcement learning (RL), the previous results are largely confined to the single-agent setting. This work proposes to further explore the in-context learning capabilities of pre-trained transformer models in competitive multi-agent games, i.e., in-context game-playing (ICGP). Focusing on the classical two-player zero-sum games, theoretical guarantees are provided to demonstrate that pre-trained transformers can provably learn to approximate Nash equilibrium in an in-context manner for both decentralized and centralized learning settings. As a key part of the proof, constructional results are established to demonstrate that the transformer architecture is sufficiently rich to realize celebrated multi-agent game-playing algorithms, in particular, decentralized V-learning and centralized VI-ULCB.

NeurIPS Conference 2023 Conference Paper

How2comm: Communication-Efficient and Collaboration-Pragmatic Multi-Agent Perception

  • Dingkang Yang
  • Kun Yang
  • Yuzheng Wang
  • Jing Liu
  • Zhi Xu
  • Rongbin Yin
  • Peng Zhai
  • Lihua Zhang

Multi-agent collaborative perception has recently received widespread attention as an emerging application in driving scenarios. Despite the advancements in previous efforts, challenges remain due to various noises in the perception procedure, including communication redundancy, transmission delay, and collaboration heterogeneity. To tackle these issues, we propose How2comm, a collaborative perception framework that seeks a trade-off between perception performance and communication bandwidth. Our novelties lie in three aspects. First, we devise a mutual information-aware communication mechanism to maximally sustain the informative features shared by collaborators. The spatial-channel filtering is adopted to perform effective feature sparsification for efficient communication. Second, we present a flow-guided delay compensation strategy to predict future characteristics from collaborators and eliminate feature misalignment due to temporal asynchrony. Ultimately, a pragmatic collaboration transformer is introduced to integrate holistic spatial semantics and temporal context clues among agents. Our framework is thoroughly evaluated on several LiDAR-based collaborative detection datasets in real-world and simulated scenarios. Comprehensive experiments demonstrate the superiority of How2comm and the effectiveness of all its vital components. The code will be released at https://github.com/ydk122024/How2comm.

TMLR Journal 2022 Journal Article

An Efficient One-Class SVM for Novelty Detection in IoT

  • Kun Yang
  • Samory Kpotufe
  • Nick Feamster

One-Class Support Vector Machines (OCSVM) are a common approach for novelty detection, due to their flexibility in fitting complex nonlinear boundaries between normal and novel data. Novelty detection is important in the Internet of Things ("IoT") due to the threats these devices can present, and OCSVM often performs well in these environments due to the variety of devices, traffic patterns, and anomalies that IoT devices present. Unfortunately, conventional OCSVMs can introduce prohibitive memory and computational overhead at detection time. This work designs, implements and evaluates an efficient OCSVM for such practical settings. We extend Nyström and (Gaussian) sketching approaches to OCSVM, combining these methods with clustering and Gaussian mixture models to achieve 15-30x speedup in prediction time and 30-40x reduction in memory requirements without sacrificing detection accuracy. Here, the very nature of IoT devices is crucial: they tend to admit few modes of normal operation, allowing for efficient pattern compression.
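The paper's OCSVM-specific extensions are not reproduced here, but the underlying Nyström idea can be sketched generically: approximate the RBF kernel with features built from a small set of landmark points, so that a linear model on m-dimensional features stands in for the full kernel machine at detection time.

```python
import numpy as np

def nystrom_features(X, landmarks, gamma):
    """Nyström approximation of an RBF kernel feature map.

    Maps X to a low-dimensional space where inner products approximate
    k(x, x') = exp(-gamma * ||x - x'||^2); prediction cost then scales
    with the number of landmarks, not the training set size.
    """
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    K_mm = rbf(landmarks, landmarks)        # landmark-landmark kernel
    eigval, eigvec = np.linalg.eigh(K_mm)
    eigval = np.clip(eigval, 1e-12, None)   # guard against tiny eigenvalues
    whiten = eigvec / np.sqrt(eigval)       # acts as K_mm^{-1/2}
    return rbf(X, landmarks) @ whiten

# Toy usage: 200 samples mapped onto 20 landmark-based features
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
Z = nystrom_features(X, landmarks=X[:20], gamma=0.5)
```

Picking landmarks by clustering (rather than the first rows, as in this toy) is one natural refinement, and matches the abstract's point that IoT traffic concentrates in a few modes of normal operation.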

ICRA Conference 2020 Conference Paper

An Autonomous Intercept Drone with Image-based Visual Servo

  • Kun Yang
  • Quan Quan

For most people on the ground, facing an unwanted drone buzzing around overhead, there is not much they can do, especially if it is out of range of a gun (radio-wave gun or shotgun). A solution to this is to use intercept drones that seek out and bring down other drones. In order to make the interception autonomous, an image-based visual servo algorithm is designed with a forward-looking monocular camera. The control command, namely the angular velocity and thrust, is generated for intercept drones to implement accurate and fast interception. The proposed method is demonstrated in both hardware-in-the-loop simulation and demonstrative flight experiments.
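The paper's controller outputs angular velocity and thrust from monocular images; as a heavily simplified illustration of the image-based visual servo principle (the gains and the detection step are hypothetical, not from the paper), a proportional command can be derived from the target's pixel offset from the image center:

```python
def ibvs_command(target_px, image_size, k_yaw=0.002, k_pitch=0.002):
    """Minimal image-based visual servo: steer so the detected target drifts
    toward the image center. Returns (yaw_rate, pitch_rate) in rad/s.

    target_px:  (u, v) pixel position of the detected target
    image_size: (width, height) of the image in pixels
    """
    u, v = target_px
    cx, cy = image_size[0] / 2.0, image_size[1] / 2.0
    yaw_rate = k_yaw * (u - cx)      # target right of center -> yaw right
    pitch_rate = k_pitch * (v - cy)  # target below center -> pitch down
    return yaw_rate, pitch_rate

# Target detected at (400, 240) in a 640x480 image: yaw right, no pitch
cmd = ibvs_command((400, 240), (640, 480))
```

Working directly in the image plane like this avoids estimating the target's 3D position, which is the practical advantage of IBVS with a single forward-looking camera.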

NeurIPS Conference 2016 Conference Paper

Density Estimation via Discrepancy Based Adaptive Sequential Partition

  • Dangna Li
  • Kun Yang
  • Wing Hung Wong

Given i.i.d. observations from an unknown continuous distribution defined on some domain $\Omega$, we propose a nonparametric method to learn a piecewise constant function that approximates the underlying probability density function. Our density estimate is a piecewise constant function defined on a binary partition of $\Omega$. The key ingredient of the algorithm is to use discrepancy, a concept that originates from Quasi-Monte Carlo analysis, to control the partition process. The resulting algorithm is simple, efficient, and has a provable convergence rate. We demonstrate empirically its efficiency as a density estimation method. We also show how it can be utilized to find good initializations for k-means.
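The discrepancy criterion can be illustrated in one dimension (the paper partitions a multi-dimensional domain): rescale the points in a cell to [0, 1], compute their star discrepancy, and split the cell when the points look far from uniform. The split threshold below is an arbitrary choice for illustration.

```python
import numpy as np

def star_discrepancy_1d(points):
    """Exact star discrepancy of points in [0, 1] (Niederreiter's formula):
    D* = 1/(2n) + max_i |x_(i) - (2i - 1)/(2n)| over the sorted sample.
    A small value means the sample looks uniform on the interval."""
    x = np.sort(np.asarray(points, dtype=float))
    n = len(x)
    centers = (2 * np.arange(1, n + 1) - 1) / (2 * n)
    return 1.0 / (2 * n) + np.max(np.abs(x - centers))

def should_split(points, lo, hi, threshold=0.2):
    """Split the cell [lo, hi] when its points, rescaled to [0, 1], have
    high discrepancy (i.e. the density varies inside the cell)."""
    rescaled = (np.asarray(points, dtype=float) - lo) / (hi - lo)
    return star_discrepancy_1d(rescaled) > threshold

# A midpoint grid is maximally uniform; a clustered sample is not
uniform = (2 * np.arange(1, 11) - 1) / 20
clustered = np.full(10, 0.9)
flag_uniform = should_split(uniform, 0.0, 1.0)      # False: keep the cell
flag_clustered = should_split(clustered, 0.0, 1.0)  # True: split it
```

Recursing this test on each half of a split cell, and estimating the density in each terminal cell as (point fraction) / (cell length), yields a piecewise constant estimate in the spirit of the paper's method.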