Arrow Research search

Author name cluster

Siyu Lu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

Unveiling the Attribute Misbinding Threat in Identity-Preserving Models

  • Junming Fu
  • Jishen Zeng
  • Yi Jiang
  • Peiyu Zhuang
  • Baoying Chen
  • Siyu Lu
  • Jianquan Yang

Identity-preserving models have led to notable progress in generating personalized content. Unfortunately, such models also exacerbate risks when misused, for instance, by generating threatening content targeting specific individuals. This paper introduces the Attribute Misbinding Attack, a novel method that poses a threat to identity-preserving models by inducing them to produce Not-Safe-For-Work (NSFW) content. The attack's core idea involves crafting benign-looking textual prompts to circumvent text-filter safeguards and leverage a key model vulnerability: flawed attribute binding that stems from its internal attention bias. This results in misattributing harmful descriptions to a target identity and generating NSFW outputs. To facilitate the study of this attack, we present the Misbinding Prompt evaluation set, which examines the content generation risks of current state-of-the-art identity-preserving models across four risk dimensions: pornography, violence, discrimination, and illegality. Additionally, we introduce the Attribute Binding Safety Score (ABSS), a metric for concurrently assessing both content fidelity and safety compliance. Experimental results show that our Misbinding Prompt evaluation set achieves a 5.28% higher success rate in bypassing five leading text filters (including GPT-4o) compared to existing mainstream evaluation sets, while also demonstrating the highest proportion of NSFW content generation. The proposed ABSS metric enables a more comprehensive evaluation of identity-preserving models by concurrently assessing both content fidelity and safety compliance.

EAAI Journal 2023 Journal Article

Scaling-up medical vision-and-language representation learning with federated learning

  • Siyu Lu
  • Zheng Liu
  • Tianlin Liu
  • Wangchunshu Zhou

Medical Vision-and-Language Pre-training (MedVLP), which learns generic vision-language representations from medical images and texts to benefit various downstream medical tasks, has drawn remarkable attention in both artificial intelligence and clinical medicine. However, existing works ignore the privacy issues and the heavy computation burden in MedVLP. In this study, we propose a FedMedVLP model, which adopts federated learning to unify the datasets from different clients, e.g., centers and hospitals, to form a large-scale pre-training dataset. As a result, the unified large-scale pre-training dataset can be used to pre-train the MedVLP to achieve strong performance. Overall, our FedMedVLP can improve the performance of MedVLP while preventing data leakage. Extensive experiments prove that the proposed model sets new state-of-the-art results on five benchmark datasets across three mainstream medical tasks, i.e., medical image-text retrieval, medical text-image retrieval, and medical visual question answering. Besides, we further evaluate our method on our curated well-balanced medical dataset COVID-Fed.
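
The abstract does not state which aggregation rule FedMedVLP uses. As a minimal sketch, assuming standard federated averaging (each client trains locally and a server averages parameters weighted by local dataset size), the idea of combining clients without sharing raw data might look like the following. The function name `federated_average`, the `encoder` parameter key, and the toy values are hypothetical and not taken from the paper.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its local dataset size.

    Assumption: plain federated averaging; the paper's actual rule may differ.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one dataset size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total dataset size must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Hypothetical toy parameters from two hospitals; only weights leave each site.
hospital_a = {"encoder": [1.0, 2.0]}
hospital_b = {"encoder": [3.0, 4.0]}
global_weights = federated_average([hospital_a, hospital_b], [100, 300])
print(global_weights)  # {'encoder': [2.5, 3.5]}
```

Only model parameters are exchanged in this sketch, which is the property the abstract relies on when it refers to preventing data leakage.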

ICRA Conference 2021 Conference Paper

Lywal: a Leg-Wheel Transformable Quadruped Robot with Picking up and Transport Functions

  • Yongjiang Xue
  • Xichen Yuan
  • Yuhai Wang
  • Yang Yang
  • Siyu Lu
  • Bo Zhang
  • Juezhu Lai
  • Jianming Wang

This paper introduces a leg-wheel transformable quadruped robot named Lywal which can switch to the leg-mode and the wheel-mode for locomotion, and the claw-mode for picking up and transport functions. First, the mechanical structure of Lywal is designed by using an innovative 2-DoF transformable mechanism. Second, the calculation of kinematics is analyzed in detail. Then, the switching-mode strategy and the mobile control strategies in different modes are designed. Finally, the prototype of Lywal is built. The properties of the mobile modes are analyzed, and the picking-up and transport functions of the claw-mode are verified through physical experiments.

AAAI Conference 2018 Conference Paper

Deep Multi-View Spatial-Temporal Network for Taxi Demand Prediction

  • Huaxiu Yao
  • Fei Wu
  • Jintao Ke
  • Xianfeng Tang
  • Yitian Jia
  • Siyu Lu
  • Pinghua Gong
  • Jieping Ye

Taxi demand prediction is an important building block for enabling intelligent transportation systems in a smart city. An accurate prediction model can help the city pre-allocate resources to meet travel demand and reduce the number of empty taxis on streets, which waste energy and worsen traffic congestion. With the increasing popularity of taxi requesting services such as Uber and Didi Chuxing (in China), we are able to collect large-scale taxi demand data continuously. How to utilize such big data to improve demand prediction is an interesting and critical real-world problem. Traditional demand prediction methods mostly rely on time series forecasting techniques, which fail to model the complex non-linear spatial and temporal relations. Recent advances in deep learning have shown superior performance on traditionally challenging tasks such as image classification by learning complex features and correlations from large-scale data. This breakthrough has inspired researchers to explore deep learning techniques on traffic prediction problems. However, existing methods on traffic prediction have only considered spatial relation (e.g., using CNN) or temporal relation (e.g., using LSTM) independently. We propose a Deep Multi-View Spatial-Temporal Network (DMVST-Net) framework to model both spatial and temporal relations. Specifically, our proposed model consists of three views: temporal view (modeling correlations between future demand values with near time points via LSTM), spatial view (modeling local spatial correlation via local CNN), and semantic view (modeling correlations among regions sharing similar temporal patterns). Experiments on large-scale real taxi demand data demonstrate the effectiveness of our approach over state-of-the-art methods.
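
The abstract does not say how "similar temporal patterns" are measured in the semantic view. As a rough sketch, assuming Pearson correlation between per-region demand series and an arbitrary threshold (both assumptions, not details from the paper), regions could be linked like this. The names `pearson` and `similar_regions` and the toy demand values are hypothetical.

```python
from math import sqrt
from typing import List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length demand series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no pattern to compare
    return cov / sqrt(var_x * var_y)


def similar_regions(demand: List[List[float]], threshold: float = 0.9) -> List[Tuple[int, int]]:
    """Return index pairs of regions whose demand patterns correlate above the threshold."""
    pairs = []
    for i in range(len(demand)):
        for j in range(i + 1, len(demand)):
            if pearson(demand[i], demand[j]) >= threshold:
                pairs.append((i, j))
    return pairs


# Toy demand per region over four time slots (hypothetical values).
demand = [
    [10, 20, 30, 20],  # region 0
    [5, 10, 15, 10],   # region 1: same shape as region 0 at half the volume
    [30, 20, 10, 20],  # region 2: opposite pattern
]
edges = similar_regions(demand)
print(edges)
```

Regions 0 and 1 are linked even though they need not be spatially adjacent, which is the kind of relation the semantic view is described as capturing alongside the temporal and spatial views.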