Arrow Research search

Author name cluster

Simeng Qin

Papers possibly associated with this exact author name in Arrow. This page groups case-insensitive exact-name matches and is not a full identity-disambiguation profile.

6 papers
1 author row

Possible papers (6)

AAAI Conference 2026 · Conference Paper

GeoShield: Safeguarding Geolocation Privacy from Vision-Language Models via Adversarial Perturbations

  • Xinwei Liu
  • Xiaojun Jia
  • Yuan Xun
  • Simeng Qin
  • Xiaochun Cao

Vision-Language Models (VLMs) such as GPT-4o now demonstrate a remarkable ability to infer users' locations from publicly shared images, posing a substantial risk to geoprivacy. Although adversarial perturbations offer a potential defense, current methods are ill-suited for this scenario: they often perform poorly on high-resolution images and under low perturbation budgets, and may introduce irrelevant semantic content. To address these limitations, we propose GeoShield, a novel adversarial framework designed for robust geoprivacy protection in real-world scenarios. GeoShield comprises three key modules: a feature disentanglement module that separates geographical from non-geographical information, an exposure element identification module that pinpoints geo-revealing regions within an image, and a scale-adaptive enhancement module that jointly optimizes perturbations at both global and local levels to ensure effectiveness across resolutions. Extensive experiments on challenging benchmarks show that GeoShield consistently surpasses prior methods in black-box settings, achieving strong privacy protection with minimal impact on visual or semantic quality. To our knowledge, this work is the first to explore adversarial perturbations as a defense against geolocation inference by advanced VLMs, providing a practical solution to escalating privacy concerns.
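
To make the joint global/local objective concrete, here is a minimal PGD-style sketch of scale-adaptive perturbation optimization under an L-infinity budget. The surrogate geo-feature extractor `feat_fn`, the box format for geo-revealing regions, and the loss weighting are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def optimize_perturbation(image, geo_regions, feat_fn, steps=100,
                          epsilon=8 / 255, alpha=1 / 255, lam=0.5):
    """image: (1, 3, H, W) in [0, 1]; geo_regions: (y0, y1, x0, x1) boxes
    flagged as geo-revealing; feat_fn: surrogate geo-feature extractor."""
    delta = torch.zeros_like(image, requires_grad=True)
    clean_feat = feat_fn(image).detach()
    for _ in range(steps):
        adv = (image + delta).clamp(0, 1)
        # Global term: make whole-image geo features dissimilar to the clean ones.
        loss = F.cosine_similarity(feat_fn(adv).flatten(1),
                                   clean_feat.flatten(1)).mean()
        # Local term: repeat the objective on each geo-revealing crop, upsampled
        # so the surrogate sees it at full resolution (the scale-adaptive part).
        for (y0, y1, x0, x1) in geo_regions:
            crop = F.interpolate(adv[:, :, y0:y1, x0:x1],
                                 size=image.shape[-2:], mode="bilinear")
            ref = feat_fn(F.interpolate(image[:, :, y0:y1, x0:x1],
                                        size=image.shape[-2:], mode="bilinear"))
            loss = loss + lam * F.cosine_similarity(
                feat_fn(crop).flatten(1), ref.detach().flatten(1)).mean()
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()  # descend: minimize similarity
            delta.clamp_(-epsilon, epsilon)     # keep the perturbation budget
            delta.grad.zero_()
    return (image + delta).clamp(0, 1).detach()
```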

AAAI Conference 2026 · Conference Paper

MPAS: Breaking Sequential Constraints of Multi-Agent Communication Topologies via Individual-Epistemic Message Propagation

  • Jingxuan Yu
  • Ju Jia
  • Simeng Qin
  • Xiaojun Jia
  • Siqi Ma
  • Yihao Huang
  • Yali Yuan
  • Guang Cheng

Large language model (LLM)-driven agents are designed to handle a wide range of tasks autonomously. As tasks become increasingly composite, integrating multiple agents into a graph-structured system offers a promising solution. Recent advances mainly encode the communication order among agents as a specified directed acyclic graph, from which a one-by-one execution order can be determined by topological sort. However, sequential architectures restrict the diversity of information flow, hinder parallel computation, and are vulnerable to potential backdoor threats. To overcome the underlying shortcomings of sequential structures, we propose a node-wise multi-agent scheme named the message passing agent system (MPAS). Specifically, to parallelize communication across agents, we extend the message propagation mechanism from graph representation learning to multi-agent scenarios and introduce individual-epistemic message propagation. To further enhance expressiveness and robustness, we investigate three self-driven message aggregators. To achieve the desired workflows, collaborative connections can be optimized without topological constraints. The experimental results reveal that, compared to state-of-the-art sequential designs, MPAS can architect more advanced algorithms in 93.8% of the evaluations, reduce the average communication time from 84.6 seconds to 14.2 seconds per round on AQuA, and improve resilience against backdoor misinformation injection in 94.4% of tests.
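
As a rough illustration of node-wise, parallel propagation (versus one-by-one DAG execution), the sketch below runs one synchronous message round over an agent graph with a thread pool. The `respond` callable stands in for an LLM agent invocation, and the plain inbox hand-off is a deliberately simplified stand-in for the paper's individual-epistemic propagation and self-driven aggregators.

```python
from concurrent.futures import ThreadPoolExecutor

def propagation_round(graph, states, respond):
    """graph: {agent: set_of_neighbor_agents}; states: {agent: last message};
    respond(agent, inbox) -> new message (e.g., an LLM call)."""
    def step(agent):
        # Each node aggregates all incoming neighbor messages at once,
        # instead of waiting for a predecessor in a fixed DAG order.
        inbox = [states[nb] for nb in graph[agent]]
        return agent, respond(agent, inbox)

    with ThreadPoolExecutor() as pool:  # every agent acts in parallel
        return dict(pool.map(step, graph))

# Usage: run a few synchronous rounds, then read off the final states.
# states = {agent: task_prompt for agent in graph}
# for _ in range(3):
#     states = propagation_round(graph, states, respond)
```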

AAAI Conference 2026 · Conference Paper

PhysPatch: A Physically Realizable and Transferable Adversarial Patch Attack for Multimodal Large Language Models-based Autonomous Driving Systems

  • Qi Guo
  • Xiaojun Jia
  • Shanmin Pang
  • Simeng Qin
  • Lin Wang
  • Ju Jia
  • Yang Liu
  • Qing Guo

Multimodal Large Language Models (MLLMs) are becoming integral to autonomous driving (AD) systems due to their strong vision-language reasoning capabilities. However, MLLMs are vulnerable to adversarial attacks—particularly adversarial patch attacks—which can pose serious threats in real-world scenarios. Existing patch-based attack methods are primarily designed for object detection models. Due to the more complex architectures and strong reasoning capabilities of MLLMs, these approaches perform poorly when transferred to MLLM-based systems. To address these limitations, we propose PhysPatch, a physically realizable and transferable adversarial patch framework tailored for MLLM-based AD systems. PhysPatch jointly optimizes patch location, shape, and content to enhance attack effectiveness and real-world applicability. It introduces a semantic-based mask initialization strategy for realistic placement, an SVD-based local alignment loss with patch-guided crop-resize to improve transferability, and a potential field-based mask refinement method. Extensive experiments across open-source, commercial, and reasoning-capable MLLMs demonstrate that PhysPatch significantly outperforms state-of-the-art (SOTA) methods in steering MLLM-based AD systems toward target-aligned perception and planning outputs. Moreover, PhysPatch consistently places adversarial patches in physically feasible regions of AD scenes, ensuring strong real-world applicability and deployability.
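
One plausible reading of the SVD-based local alignment loss is sketched below: project the adversarial image's local (patch-token) features onto the dominant singular directions of the target's local features and reward the energy captured there. This is a reconstruction from the abstract, not the paper's exact loss.

```python
import torch

def svd_local_align_loss(adv_feats, tgt_feats, k=8):
    """adv_feats, tgt_feats: (N, C) local patch-token features from a
    surrogate vision encoder for the adversarial and target images."""
    # The top-k right singular vectors of the target features span the
    # local subspace that matters most for the target's representation.
    _, _, Vh = torch.linalg.svd(tgt_feats, full_matrices=False)
    basis = Vh[:k].T                                 # (C, k), orthonormal
    # Encourage adversarial local features to concentrate their energy in
    # that subspace; negate the captured-energy ratio to obtain a loss.
    proj = adv_feats @ basis                         # (N, k)
    return -proj.pow(2).sum() / adv_feats.pow(2).sum().clamp_min(1e-8)
```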

AAAI Conference 2026 · Conference Paper

The Emotional Baby Is Truly Deadly: Does Your Multimodal Large Reasoning Model Have Emotional Flattery Towards Humans?

  • Yuan Xun
  • Xiaojun Jia
  • Xinwei Liu
  • Simeng Qin
  • Hua Zhang

Multimodal large reasoning models (MLRMs) have advanced visual-textual integration, enabling sophisticated human-AI interaction. While prior work has exposed MLRMs to visual jailbreaks, it remains underexplored how their reasoning capabilities reshape the security landscape under adversarial inputs. To fill this gap, we conduct a systematic security assessment of MLRMs and uncover a security-reasoning paradox: although deeper reasoning boosts cross-modal risk recognition, it also creates cognitive blind spots that adversaries can exploit. We observe that MLRMs oriented toward human-centric service are highly susceptible to users' emotional cues during the deep-thinking stage, often overriding safety protocols or built-in safety checks under high emotional intensity. Inspired by this key insight, we propose EmoAgent, an autonomous adversarial emotion agent that orchestrates exaggerated affective prompts to hijack reasoning pathways. Even when visual risks are correctly identified, models can still produce harmful completions through emotional misalignment. We further identify persistent high-risk failure modes in transparent deep-thinking scenarios, such as MLRMs generating harmful reasoning masked behind seemingly safe responses. These failures expose misalignments between internal inference and surface-level behavior, eluding existing content-based safeguards. To quantify these risks, we introduce three metrics: (1) Risk-Reasoning Stealth Score (RRSS) for harmful reasoning beneath benign outputs; (2) Risk-Visual Neglect Rate (RVNR) for unsafe completions despite visual risk recognition; and (3) Refusal Attitude Inconsistency (RAIC) for evaluating refusal instability under prompt variants. Extensive experiments on advanced MLRMs demonstrate the effectiveness of EmoAgent and reveal deeper emotional cognitive misalignments in model safety.
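
The three metrics read naturally as rates over per-case evaluation records. The counting formulas below are a plausible reconstruction from the abstract; the paper's exact definitions may weight or normalize differently.

```python
from dataclasses import dataclass

@dataclass
class Record:
    harmful_reasoning: bool   # hidden reasoning trace contains harmful content
    benign_output: bool       # surface response looks safe
    visual_risk_detected: bool
    unsafe_completion: bool
    refusals: list            # refusal decisions (True/False) across prompt variants

def rrss(records):
    """Risk-Reasoning Stealth Score: harmful reasoning hidden under benign output."""
    hits = [r for r in records if r.harmful_reasoning and r.benign_output]
    return len(hits) / len(records)

def rvnr(records):
    """Risk-Visual Neglect Rate: unsafe completions despite recognized visual risk."""
    flagged = [r for r in records if r.visual_risk_detected]
    return sum(r.unsafe_completion for r in flagged) / max(len(flagged), 1)

def raic(records):
    """Refusal Attitude Inconsistency: fraction of cases whose refusal
    decision flips across prompt variants of the same query."""
    flips = [len(set(r.refusals)) > 1 for r in records]
    return sum(flips) / len(records)
```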

NeurIPS Conference 2025 · Conference Paper

Adversarial Attacks against Closed-Source MLLMs via Feature Optimal Alignment

  • Xiaojun Jia
  • Sensen Gao
  • Simeng Qin
  • Tianyu Pang
  • Chao Du
  • Yihao Huang
  • Xinfeng Li
  • Yiming Li

Multimodal large language models (MLLMs) remain vulnerable to transferable adversarial examples. While existing methods typically achieve targeted attacks by aligning global features—such as CLIP’s [CLS] token—between adversarial and target samples, they often overlook the rich local information encoded in patch tokens. This leads to suboptimal alignment and limited transferability, particularly for closed-source models. To address this limitation, we propose a targeted transferable adversarial attack method based on feature optimal alignment, called FOA-Attack, to improve adversarial transfer capability. Specifically, at the global level, we introduce a global feature loss based on cosine similarity to align the coarse-grained features of adversarial samples with those of target samples. At the local level, given the rich local representations within Transformers, we leverage clustering techniques to extract compact local patterns and reduce redundancy among local features. We then formulate local feature alignment between adversarial and target samples as an optimal transport (OT) problem and propose a local clustering optimal transport loss to refine fine-grained feature alignment. Additionally, we propose a dynamic ensemble model weighting strategy to adaptively balance the influence of multiple models during adversarial example generation, thereby further improving transferability. Extensive experiments across various models demonstrate the superiority of the proposed method, outperforming state-of-the-art methods, especially in transferring to closed-source MLLMs.
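
A compact sketch of the local half of the recipe: cluster each image's patch tokens into a few centroids, then score their mismatch with an entropic (Sinkhorn) optimal transport cost. The k-means reduction, cosine cost, and uniform cluster weights are standard choices assumed here, not necessarily FOA-Attack's exact formulation.

```python
import torch
import torch.nn.functional as F

def cluster_centroids(tokens, k=8, iters=10):
    """tokens: (N, C) patch-token features -> (k, C) compact local patterns.
    Assignments use detached features; gradients flow only through the
    final per-cluster means."""
    pts = tokens.detach()
    cent = pts[torch.randperm(pts.size(0))[:k]]
    for _ in range(iters):
        assign = torch.cdist(pts, cent).argmin(dim=1)
        cent = torch.stack([pts[assign == j].mean(0) if (assign == j).any()
                            else cent[j] for j in range(k)])
    return torch.stack([tokens[assign == j].mean(0) if (assign == j).any()
                        else tokens.mean(0) for j in range(k)])

def local_ot_loss(adv_tokens, tgt_tokens, k=8, eps=0.05, iters=50):
    a_cent = cluster_centroids(adv_tokens, k)
    t_cent = cluster_centroids(tgt_tokens, k)
    # Pairwise cosine distance between adversarial and target centroids.
    cost = 1 - F.cosine_similarity(a_cent.unsqueeze(1),
                                   t_cent.unsqueeze(0), dim=-1)  # (k, k)
    # Entropic-regularized OT (Sinkhorn) between uniform cluster weights.
    K = torch.exp(-cost / eps)
    r = c = torch.full((k,), 1.0 / k)
    u, v = torch.ones(k) / k, torch.ones(k) / k
    for _ in range(iters):
        u = r / (K @ v)
        v = c / (K.T @ u)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)
    return (plan * cost).sum()  # transport cost = fine-grained misalignment
```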

NeurIPS Conference 2025 · Conference Paper

SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG

  • Xiaonan Si
  • Meilin Zhu
  • Simeng Qin
  • Lijia Yu
  • Lijun Zhang
  • Shuaitong Liu
  • Xinfeng Li
  • Ranjie Duan

Retrieval-augmented generation (RAG) systems enhance large language models (LLMs) with external knowledge but are vulnerable to corpus poisoning and contamination attacks, which can compromise output integrity. Existing defenses often apply aggressive filtering, leading to unnecessary loss of valuable information and reduced reliability in generation. To address this problem, we propose SeCon-RAG, a two-stage semantic filtering and conflict-free framework for trustworthy RAG. In the first stage, we perform joint semantic and cluster-based filtering guided by an entity-intent-relation extractor (EIRE). EIRE extracts entities, latent objectives, and entity relations from both the user query and the filtered documents, scores their semantic relevance, and selectively adds valuable documents to a clean retrieval database. In the second stage, we propose an EIRE-guided conflict-aware filtering module, which analyzes semantic consistency between the query, candidate answers, and retrieved knowledge before final answer generation, filtering out internal and external contradictions that could mislead the model. Through this two-stage process, SeCon-RAG effectively preserves useful knowledge while mitigating conflict contamination, achieving significant improvements in both generation robustness and output trustworthiness. Extensive experiments across various LLMs and datasets demonstrate that SeCon-RAG markedly outperforms state-of-the-art defense methods.
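
A skeleton of the two-stage flow as described, assuming the EIRE extractor, relevance scorer, and consistency predicates are supplied; in the paper these are learned or LLM-driven components, not the placeholders used here.

```python
def stage1_filter(query, documents, eire_extract, relevance, threshold=0.6):
    """Stage 1: keep documents whose extracted entities, intents, and
    relations are semantically relevant to the query; the survivors form
    the clean retrieval database."""
    q_graph = eire_extract(query)
    return [doc for doc in documents
            if relevance(q_graph, eire_extract(doc)) >= threshold]

def stage2_filter(query, answer, clean_docs, answer_ok, pair_ok):
    """Stage 2: conflict-aware filtering. Drop documents that contradict
    the query/candidate-answer context (internal conflicts), then those
    that contradict the other retained documents (external conflicts)."""
    kept = [d for d in clean_docs if answer_ok(query, answer, d)]
    return [d for d in kept
            if all(pair_ok(d, other) for other in kept if other is not d)]
```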