Author name cluster

Kai Zhou

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers

2 author rows

AAAI Conference 2026 Conference Paper

Action-and-object Aware Alignment for Partially Relevant Video Retrieval

Chuanshen Chen
Kai Zhou
Zhiquan Wen
Zeng You
Yirui Li
Tianhang Xiang
Mingkui Tan

Partially Relevant Video Retrieval (PRVR) aims to retrieve untrimmed videos containing relevant moments for a given text query. This task is extremely challenging, as untrimmed videos often include numerous actions and objects unrelated to the query. However, existing methods usually struggle with fine-grained action-object modeling, limiting their retrieval performance. To tackle this challenge, we introduce Action-and-object Aware Alignment for Partially Relevant Video Retrieval (A3PRVR), a dual-branch framework designed to enhance retrieval by improving the modeling of action-object relationships. Specifically, we propose a Query-specific Deformable Temporal Attention (Q-DTA) module to effectively capture action-relevant object information in video features, while filtering out irrelevant content. Additionally, we propose an action-and-object aware alignment module to enable fine-grained textual understanding and video-text alignment. It uses action- and object-aware contrastive losses to enhance the model's sensitivity to action-object distinctions in the text query. Compared to state-of-the-art methods, A3PRVR achieves an average relative gain of 6.5% in SumR across the Charades-STA, ActivityNet-Caption, and TVR datasets.

AAAI Conference 2026 Conference Paper

RoSE: A Role Correlation Structure-Enhanced Model for Multi-Event Argument Extraction

Geting Huang
Jilong Zhang
Kai Zhou
Zhang Yi
Xiuyuan Xu

Event co-occurrences have been proven effective for event argument extraction (EAE) in previous studies; however, few have considered intra- and inter-event role correlations. Since roles vary across event types, event structure heterogeneity and overlap pose significant challenges to EAE. To address this issue, we propose a Role Correlation Structure-Enhanced model for Multi-Event Argument Extraction (RoSE), capable of capturing both heterogeneity and overlap of event structures through modeling role correlations. The proposed RoSE model employs a joint context-prompts input, a role-centric graph-guided encoder (RoGE), and role-specific information fusion (RoIF). The RoGE is designed to enhance the intra- and inter-event role correlation between prompts and their corresponding event contexts. The RoIF module utilizes intra-event role information to improve multi-event argument extraction. Extensive experiments on four widely used benchmarks (RAMS, WikiEvents, MLEE, and ACE05) demonstrate that our proposed approach achieves state-of-the-art performance, validating the effectiveness of incorporating both intra- and inter-event role correlations.

IJCAI Conference 2025 Conference Paper

Efficient Dynamic Ensembling for Multiple LLM Experts

Jinwu Hu
Yufeng Wang
Shuhai Zhang
Kai Zhou
Guohao Chen
Yu Hu
Bin Xiao
Mingkui Tan

LLMs have demonstrated impressive performance across various language tasks. However, the strengths of LLMs can vary due to different architectures, model sizes, areas of training data, etc. Therefore, ensemble reasoning over the strengths of different LLM experts is critical to achieving consistent and satisfactory performance on diverse inputs across a wide range of tasks. However, existing LLM ensemble methods are either computationally intensive or incapable of leveraging complementary knowledge among LLM experts for various inputs. In this paper, we propose an efficient Dynamic Ensemble Reasoning paradigm, called DER, to integrate the strengths of multiple LLM experts conditioned on dynamic inputs. Specifically, we model the LLM ensemble reasoning problem as a Markov Decision Process, wherein an agent sequentially takes inputs to request knowledge from an LLM candidate and passes the output to a subsequent LLM candidate. Moreover, we devise a reward function to train a DER-Agent to dynamically select an optimal answering route given the input questions, aiming to achieve the highest performance with as few computational resources as possible. Last, to fully transfer the expert knowledge from the prior LLMs, we develop a Knowledge Transfer Prompt that enables the subsequent LLM candidates to transfer complementary knowledge effectively. Experiments demonstrate that our method uses fewer computational resources to achieve better performance compared to state-of-the-art baselines. Code and appendix are available at https://github.com/Fhujinwu/DER.

EAAI Journal 2025 Journal Article

Epidemiology-informed Spatiotemporal Graph Neural Network for heterogeneity-driven interpretable epidemic forecasting

Shuai Han
Lukas Stelz
Thomas R. Sokolowski
Kai Zhou
Horst Stöcker

Accurate epidemic forecasting is crucial for effective disease control and prevention. Traditional mechanistic models often struggle to estimate epidemiological parameters that vary spatiotemporally, while deep learning-based approaches typically disregard intrinsic transmission dynamics and lack interpretability. To overcome these limitations, we propose a novel Epidemiology-informed Spatiotemporal Graph Neural Network (EISTGNN): a hybrid framework integrating the proposed Spatio-Contact Susceptible–Infectious–Recovered (SCSIR) model with a spatiotemporal graph neural network to model epidemic transmission dynamics across regions. The inherently smooth and continuous nature of interregional epidemic transmission indicates that consecutive graph structures share underlying latent patterns. To leverage this property, we employ an adaptive graph to model stable contact patterns, a temporal module to capture their fluctuating interactions, and fuse both into a spatiotemporal contact graph. To address the limitations of recurrent structures, we introduce a temporal decomposition module to extract long-term trends and short-term variations, which is then integrated with a spatiotemporal graph convolutional network to simultaneously identify epidemiological parameters and forecast outbreaks. We validate EISTGNN on real-world datasets at the provincial level in China and at the state level in Germany. Experimental results demonstrate that our method effectively models the spatiotemporal dynamics of infectious diseases, offering a valuable tool for epidemic modeling and forecasting. Furthermore, by analyzing the learned parameters through the effective reproduction number (R_t), we derive valuable insights into transmission mechanisms and enhance both interpretability and practical utility of the underlying model.
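
For readers unfamiliar with the quantities this abstract names, the following is a minimal sketch of the classical single-region SIR model and its effective reproduction number, R_t = (beta / gamma) * S(t) / N. It is not the paper's SCSIR model or EISTGNN: the function names, the forward-Euler update, and all parameter values are assumptions chosen only for illustration.

```python
def sir_step(s, i, r, beta, gamma, dt=1.0):
    """Advance a basic SIR model by one forward-Euler step."""
    n = s + i + r
    new_infections = beta * s * i / n * dt
    new_recoveries = gamma * i * dt
    return s - new_infections, i + new_infections - new_recoveries, r + new_recoveries


def effective_reproduction_number(s, n, beta, gamma):
    """R_t = (beta / gamma) * S(t) / N for the basic SIR model."""
    return beta / gamma * s / n


# Hypothetical parameters, chosen only for illustration.
beta, gamma = 0.3, 0.1
s, i, r = 990.0, 10.0, 0.0
n = s + i + r

rt_start = effective_reproduction_number(s, n, beta, gamma)
for _ in range(100):
    s, i, r = sir_step(s, i, r, beta, gamma)
rt_end = effective_reproduction_number(s, n, beta, gamma)
```

Because the susceptible pool only shrinks in this simple model, `rt_end` is smaller than `rt_start`; an outbreak recedes once R_t drops below 1.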

IJCAI Conference 2025 Conference Paper

GraphProt: Certified Black-Box Shielding Against Backdoored Graph Models

Xiao Yang
Yuni Lai
Kai Zhou
Gaolei Li
Jianhua Li
Hang Zhang

Graph learning models have been empirically proven to be vulnerable to backdoor threats, wherein adversaries submit trigger-embedded inputs to manipulate the model predictions. Current graph backdoor defenses manifest several limitations: 1) dependence on model-related details, 2) necessitation of additional fine-tuning, and 3) reliance on extra explainability tools, all of which are infeasible under stringent privacy policies. To address those limitations, we propose GraphProt, a certified black-box defense method to suppress backdoor attacks on GNN-based graph classifiers. Our GraphProt operates in a model-agnostic manner and solely leverages graph input. Specifically, GraphProt first introduces designed topology-feature-filtration to mitigate graph anomalies. Subsequently, subgraphs are sampled via a formulated strategy integrating topology and features, followed by a robust model inference through a majority vote-based subgraph prediction ensemble. Our results across benchmark attacks and datasets show GraphProt effectively reduces attack success rates while preserving regular graph classification accuracy.
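
The final stage described above, a majority vote over predictions on sampled subgraphs, can be sketched roughly as follows. This is not GraphProt itself: uniform random node sampling stands in for the paper's topology- and feature-aware strategy, the filtration step is omitted, and `classifier`, `sample_subgraph`, and `majority_vote_predict` are hypothetical names. The classifier is treated as a black box that only receives graph input.

```python
import random
from collections import Counter


def sample_subgraph(nodes, edges, keep_ratio, rng):
    """Keep a uniformly random subset of nodes and the edges among them."""
    k = max(1, int(len(nodes) * keep_ratio))
    kept = set(rng.sample(sorted(nodes), k))
    kept_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept, kept_edges


def majority_vote_predict(classifier, nodes, edges,
                          num_samples=11, keep_ratio=0.5, seed=0):
    """Query the black-box classifier on sampled subgraphs and take a vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(num_samples):
        sub_nodes, sub_edges = sample_subgraph(nodes, edges, keep_ratio, rng)
        votes[classifier(sub_nodes, sub_edges)] += 1
    label, _ = votes.most_common(1)[0]
    return label, votes


# Toy stand-in for a graph classifier (an assumption, not a real GNN).
def toy_classifier(sub_nodes, sub_edges):
    return "dense" if len(sub_edges) >= 1 else "sparse"


# Complete graph on 6 nodes: every 3-node sample contains 3 edges.
nodes = list(range(6))
edges = [(u, v) for u in nodes for v in nodes if u < v]
label, votes = majority_vote_predict(toy_classifier, nodes, edges)
```

The intuition is that a localized trigger appears in only some of the sampled subgraphs, so it has to sway a majority of votes to change the final label.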

AIJ Journal 2024 Journal Article

Adversarial analysis of similarity-based sign prediction

Michał T. Godziszewski
Marcin Waniek
Yulin Zhu
Kai Zhou
Talal Rahwan
Tomasz P. Michalak

Adversarial social network analysis explores how social links can be altered or otherwise manipulated to hinder unwanted information collection. To date, however, problems of this kind have not been studied in the context of signed networks in which links have positive and negative labels. Such formalism is often used to model social networks with positive links indicating friendship or support and negative links indicating antagonism or opposition. In this work, we present a computational analysis of the problem of attacking sign prediction in signed networks, whereby the aim of the attacker (a network member) is to hide from the defender (an analyst) the signs of a target set of links by removing the signs of some other, non-target, links. While the problem turns out to be NP-hard if either local or global similarity measures are used for sign prediction, we provide a number of positive computational results, including an FPT-algorithm for eliminating common signed neighborhood and heuristic algorithms for evading local similarity-based link prediction in signed networks.

EAAI Journal 2022 Journal Article

A multi-task learning for cavitation detection and cavitation intensity recognition of valve acoustic signals

Yu Sha
Johannes Faber
Shuiping Gou
Bo Liu
Wei Li
Stefan Schramm
Horst Stoecker
Thomas Steckenreiter

With the rapid development of smart manufacturing, data-driven machinery health management has received growing attention. As one of the most popular methods in machinery health management, deep learning (DL) has achieved remarkable successes. However, limited samples and poor separability of the different cavitation states of acoustic signals greatly hinder the eventual performance of DL models for cavitation intensity recognition and cavitation detection. In addition, these tasks have conventionally been performed separately. In this work, a novel multi-task learning framework for simultaneous cavitation detection and cavitation intensity recognition using 1-D double hierarchical residual networks (1-D DHRN) is proposed for analyzing valve acoustic signals. Firstly, a data augmentation method based on a sliding window with fast Fourier transform (Swin-FFT) is developed to alleviate the small-sample issue confronted in this study. Secondly, a 1-D double hierarchical residual block (1-D DHRB) is constructed to capture sensitive features from the frequency-domain acoustic signals of the valve. Then, a new structure of 1-D DHRN is proposed. Finally, the devised 1-D DHRN is evaluated on two datasets of valve acoustic signals without noise (Dataset 1 and Dataset 2) and one dataset of valve acoustic signals with realistic surrounding noise (Dataset 3) provided by SAMSON AG (Frankfurt). Our method has achieved state-of-the-art results. The prediction accuracies of 1-D DHRN for cavitation intensity recognition are as high as 93.75%, 94.31% and 100%, which indicates that 1-D DHRN outperforms other DL models and conventional methods. At the same time, the testing accuracies of 1-D DHRN for cavitation detection are as high as 97.02%, 97.64% and 100%. In addition, 1-D DHRN has also been tested at different sampling frequencies and shows excellent results at sampling frequencies that mobile phones can accommodate.

AAMAS Conference 2021 Conference Paper

Strategic Evasion of Centrality Measures

Marcin Waniek
Jan Woźnica
Kai Zhou
Yevgeniy Vorobeychik
Talal Rahwan
Tomasz P. Michalak

Among the most fundamental tools for social network analysis are centrality measures, which quantify the importance of every node in the network. This centrality analysis typically disregards the possibility that the network may have been deliberately manipulated to mislead the analysis. To solve this problem, a recent study attempted to understand how a member of a social network could rewire the connections therein to avoid being identified as a leader of that network. However, the study was based on the assumption that the network analyzer—the seeker—is oblivious to any evasion attempts by the evader. In this paper, we relax this assumption by modelling the seeker and evader as strategic players in a Bayesian Stackelberg game. In this context, we study the complexity of various optimization problems, and analyze the equilibria of the game under different assumptions, thereby drawing the first conclusions in the literature regarding which centralities the seeker should use to maximize the chances of detecting a strategic evader.

AAAI Conference 2020 Conference Paper

Computing Equilibria in Binary Networked Public Goods Games

Sixie Yu
Kai Zhou
Jeffrey Brantingham
Yevgeniy Vorobeychik

Public goods games study the incentives of individuals to contribute to a public good and their behaviors in equilibria. In this paper, we examine a specific type of public goods game where players are networked and each has binary actions, and focus on the algorithmic aspects of such games. First, we show that checking the existence of a pure-strategy Nash equilibrium is NP-complete. We then identify tractable instances based on restrictions of either utility functions or of the underlying graphical structure. In certain cases, we also show that we can efficiently compute a socially optimal Nash equilibrium. Finally, we propose a heuristic approach for computing approximate equilibria in general binary networked public goods games, and experimentally demonstrate its effectiveness. Due to space limitation, some proofs are deferred to the extended version.
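
To make the equilibrium-checking problem concrete, here is a brute-force sketch that enumerates all 2^n action profiles of a small instance. It assumes, purely for illustration, that player i receives utility g(x_i + number of investing neighbors) - c_i * x_i; the abstract does not state the utility form, and the helper names and the toy instance are hypothetical. The exponential enumeration is only workable for tiny games, in line with the NP-completeness result.

```python
from itertools import product


def utility(i, profile, adj, g, cost):
    """Assumed utility: benefit from total local investment minus own cost."""
    invested_neighbors = sum(profile[j] for j in adj[i])
    return g(profile[i] + invested_neighbors) - cost[i] * profile[i]


def is_pure_nash(profile, adj, g, cost):
    """True if no player gains by unilaterally flipping their binary action."""
    for i in range(len(profile)):
        deviation = list(profile)
        deviation[i] = 1 - profile[i]
        if utility(i, deviation, adj, g, cost) > utility(i, profile, adj, g, cost):
            return False
    return True


def pure_nash_equilibria(adj, g, cost):
    """Enumerate all 2^n profiles and keep the pure-strategy equilibria."""
    n = len(adj)
    return [p for p in product((0, 1), repeat=n) if is_pure_nash(p, adj, g, cost)]


# Hypothetical instance: three players on the path 0 - 1 - 2, where one
# investment in a player's closed neighborhood already yields full benefit.
adj = {0: [1], 1: [0, 2], 2: [1]}
g = lambda k: 1.0 if k >= 1 else 0.0
cost = [0.5, 0.5, 0.5]

equilibria = pure_nash_equilibria(adj, g, cost)
```

In this toy instance the equilibria are the profiles in which the investors form a maximal independent set of the path: either the middle player invests alone, or both endpoints do.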

AAMAS Conference 2019 Conference Paper

Attacking Similarity-Based Link Prediction in Social Networks

Kai Zhou
Tomasz P. Michalak
Marcin Waniek
Talal Rahwan
Yevgeniy Vorobeychik

Link prediction is one of the fundamental problems in computational social science. A particularly common means to predict existence of unobserved links is via structural similarity metrics, such as the number of common neighbors; node pairs with higher similarity are thus deemed more likely to be linked. However, a number of applications of link prediction, such as predicting links in gang or terrorist networks, are adversarial, with another party incentivized to minimize its effectiveness by manipulating observed information about the network. We offer a comprehensive algorithmic investigation of the problem of attacking similarity-based link prediction through link deletion, focusing on two broad classes of such approaches, one which uses only local information about target links, and another which uses global network information. While we show several variations of the general problem to be NP-Hard for both local and global metrics, we exhibit a number of well-motivated special cases which are tractable. Additionally, we provide principled and empirically effective algorithms for the intractable cases, in some cases proving worst-case approximation guarantees.
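
As a small illustration of the setting, the sketch below computes the common-neighbors score for an unlinked target pair and shows how deleting a single observed link lowers it. The toy graph and helper names are assumptions; the paper's attack algorithms are not reproduced here.

```python
def common_neighbors(adj, u, v):
    """Local similarity score: number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def remove_edge(adj, a, b):
    """Return a copy of the graph with the undirected edge (a, b) deleted."""
    new_adj = {node: set(neighbors) for node, neighbors in adj.items()}
    new_adj[a].discard(b)
    new_adj[b].discard(a)
    return new_adj


# Toy undirected graph (hypothetical data). The target pair (0, 1) is not
# linked in the observed network but shares the neighbors 2 and 3.
adj = {
    0: {2, 3},
    1: {2, 3},
    2: {0, 1},
    3: {0, 1},
}

score_before = common_neighbors(adj, 0, 1)

# The attacker hides the observed link (0, 2) to lower the target's score.
attacked = remove_edge(adj, 0, 2)
score_after = common_neighbors(attacked, 0, 1)
```

Here the score of the target pair drops from 2 to 1 after one deletion, which is the kind of manipulation whose optimal choice the paper studies.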

AAMAS Conference 2019 Conference Paper

Multi-agent Path Planning with Non-constant Velocity Motion

Ngai Meng Kou
Cheng Peng
Xiaowei Yan
Zhiyuan Yang
Heng Liu
Kai Zhou
Haibing Zhao
Lijun Zhu

Multi-agent path planning has wide application in fields such as robotics, transportation, logistics, and computer games. To formulate multi-agent path finding as a concisely discretized problem, most previous works did not construct a detailed motion model of each agent. While many elegant algorithms have been proposed in the literature, a method to efficiently plan paths for multiple agents with non-constant velocity is still lacking. In this paper, we propose two methods, CRISE and COB, to extend the existing algorithms to path planning with non-constant velocity motion.

IROS Conference 2010 Conference Paper

Mechanical support as a spatial abstraction for mobile robots

Kristoffer Sjöö
Alper Aydemir
Thomas Morwald
Kai Zhou
Patric Jensfelt

Motivated by functional interpretations of spatial language terms, and the need for cognitively plausible and practical abstractions for mobile service robots, we present a spatial representation based on the physical support of one object by another, corresponding to the preposition “on”. A perceptual model for evaluating this relation is suggested, and experiments - simulated as well as using a real robot - are presented. We indicate how this model can be used for important tasks such as communication of spatial knowledge, abstract reasoning and learning, taking as an example direct and indirect visual search. We also demonstrate the model experimentally and show that it produces intuitively feasible results from visual scene analysis as well as synthetic distributions that can be put to a number of uses.
