Arrow Research search

Author name cluster

Rui Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

45 papers
2 author rows

Possible papers

45

AAAI Conference 2026 Conference Paper

Codebook-Centric Deep Hashing: End-to-End Joint Learning of Semantic Hash Centers and Neural Hash Function

  • Shuo Yin
  • Zhiyuan Yin
  • Yuqing Hou
  • Rui Liu
  • Yong Chen
  • Dell Zhang

Hash center-based deep hashing methods improve upon pairwise or triplet-based approaches by assigning fixed hash centers to each class as learning targets, thereby avoiding the inefficiency of local similarity optimization. However, random center initialization often disregards inter-class semantic relationships. While existing two-stage methods mitigate this by first refining hash centers with semantics and then training the hash function, they introduce additional complexity, computational overhead, and suboptimal performance due to stage-wise discrepancies. To address these limitations, we propose Center-Reassigned Hashing (CRH), an end-to-end framework that dynamically reassigns hash centers from a preset codebook while jointly optimizing the hash function. Unlike previous methods, CRH adapts hash centers to the data distribution without explicit center optimization phases, enabling seamless integration of semantic relationships into the learning process. Furthermore, a multi-head mechanism enhances the representational capacity of hash centers, capturing richer semantic structures. Extensive experiments on three benchmarks demonstrate that CRH learns semantically meaningful hash centers and outperforms state-of-the-art deep hashing methods in retrieval tasks.
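The codebook-reassignment idea can be sketched in a few lines. Everything below is an illustrative stand-in, not the CRH implementation: the random codebook, the affinity scores, and the greedy assignment rule are all assumptions for demonstration.

```python
import random

random.seed(0)
BITS, NUM_CODES, NUM_CLASSES = 16, 8, 4

# Hypothetical preset codebook of binary (+/-1) codes; CRH's actual
# codebook construction is not shown here.
codebook = [[random.choice([-1, 1]) for _ in range(BITS)]
            for _ in range(NUM_CODES)]

def hamming(a, b):
    """Number of positions where two +/-1 codes disagree."""
    return sum(x != y for x, y in zip(a, b))

# Toy class-to-code affinity scores (in a real system these might come
# from similarities between class embeddings and code vectors).
affinity = [[random.random() for _ in range(NUM_CODES)]
            for _ in range(NUM_CLASSES)]

# Greedy reassignment: each class claims the unclaimed codebook entry it
# prefers most, so every class gets a distinct hash center from the codebook.
assignment, taken = {}, set()
for c in range(NUM_CLASSES):
    best = max((j for j in range(NUM_CODES) if j not in taken),
               key=lambda j: affinity[c][j])
    assignment[c] = best
    taken.add(best)

centers = [codebook[assignment[c]] for c in range(NUM_CLASSES)]
```

In the paper the reassignment happens jointly with hash-function training rather than as a one-shot greedy pass; this sketch only shows the codebook-to-class mapping step.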

YNIMG Journal 2026 Journal Article

Frequency-dependent modulation of human reward circuitry: A comparative study of theta, gamma, and high-frequency temporal interference

  • Yongxi Zhang
  • Zhenxiang Zang
  • Rui Liu
  • Ke Liu
  • Gang Wang
  • Zhi Yang

BACKGROUND: Temporal interference (TI) stimulation offers a noninvasive neuromodulation technique for targeting deep brain structures while sparing overlying cortical tissue. While early applications have validated TI's capacity to engage subcortical targets such as the hippocampus and striatum, the frequency-dependent mechanisms governing its efficacy remain poorly understood. This is particularly critical for the nucleus accumbens (NAc), a key hub in reward circuitry where invasive deep brain stimulation (DBS) typically operates at high frequencies (∼130 Hz). METHODS: In this study, we investigated whether TI stimulation induces frequency-specific modulation of NAc activity and its functional coupling with the prefrontal cortex. Using a within-subject, counterbalanced design, we applied individualized NAc-targeted TI stimulation at three distinct envelope frequencies (5 Hz, 40 Hz, and 130 Hz) in 24 healthy adults. Resting-state fMRI was acquired pre- and post-stimulation. RESULTS: Results revealed a distinct dissociation between local and circuit-level effects: TI stimulation induced no statistically significant changes in local spontaneous activity within the NAc across any frequency condition. In contrast, 130 Hz stimulation selectively reduced functional connectivity between the NAc and the medial prefrontal cortex (mPFC), whereas 5 Hz and 40 Hz conditions produced no such effect. Notably, despite the absence of significant group-level local modulation, the magnitude of individual NAc activity reduction under 130 Hz stimulation was significantly correlated with the extent of NAc-mPFC decoupling (r = -0.53). Exploratory analyses further revealed increased activity in the adjacent dorsal striatum (right putamen), consistent with a conduction-block model at the target core. 
CONCLUSIONS: These findings suggest that high-frequency TI mimics the network-disrupting effects of high-frequency DBS, offering evidence that TI can noninvasively modulate deep reward circuits in a parameter-specific manner for potential clinical application.

EAAI Journal 2026 Journal Article

Information gain-based diffusion model for group recommendation

  • Lijin Mu
  • Nan Wang
  • Rui Liu
  • Ziqi Liu

With the rapid development of social networks and their technologies, group recommendation systems have received increasing attention. However, traditional deep learning-based group recommendation methods still face two key challenges: vulnerability to noise, and sparse user-item interactions that hinder accurate preference modeling. To tackle these issues, we propose a group recommendation model based on information gain-driven diffusion. From the artificial intelligence perspective, our contribution lies in the engineering-oriented adaptation and enhancement of artificial intelligence techniques: we design a self-iterative diffusion denoising module to suppress noise in representation learning and an adaptive edge entropy algorithm based on information gain to guide graph augmentation under sparse interactions. These modules demonstrate how diffusion models and entropy principles can be structurally adapted to solve practical recommendation challenges. From the engineering application perspective, we further construct a Term Frequency–Inverse Document Frequency (TF-IDF) based weighted group graph to capture inter-group similarity and propose a multi-view fusion strategy to comprehensively model group preferences. Extensive experiments on real-world public datasets show that the proposed method effectively enhances denoising capability. It also significantly improves group recommendation accuracy, demonstrating strong practical value in scenarios such as social platforms and recommendation services.
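The TF-IDF weighted group graph component can be illustrated with a toy sketch. The group interaction lists and the use of cosine similarity as the edge weight are illustrative assumptions, not the paper's exact pipeline:

```python
import math
from collections import Counter

# Hypothetical groups, each a list of interacted item ids.
groups = {
    "g1": ["i1", "i2", "i3", "i2"],
    "g2": ["i2", "i3", "i4"],
    "g3": ["i5", "i6"],
}

def tfidf(groups):
    """Treat each group as a 'document' of items and compute TF-IDF vectors."""
    n = len(groups)
    df = Counter(item for items in groups.values() for item in set(items))
    vecs = {}
    for g, items in groups.items():
        tf = Counter(items)
        total = len(items)
        vecs[g] = {i: (c / total) * math.log(n / df[i]) for i, c in tf.items()}
    return vecs

def cosine(u, v):
    """Cosine similarity of two sparse TF-IDF vectors."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = tfidf(groups)
# Edge weight between two groups = TF-IDF cosine similarity.
w12 = cosine(vecs["g1"], vecs["g2"])  # share items i2, i3 -> positive weight
w13 = cosine(vecs["g1"], vecs["g3"])  # no shared items -> zero weight
```

Items shared by groups receive down-weighted contributions through the IDF term, so ubiquitous items do not inflate inter-group similarity.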

AAAI Conference 2026 Conference Paper

MetaGDPO: Alleviating Catastrophic Forgetting with Metacognitive Knowledge Through Group Direct Preference Optimization

  • Lanxue Zhang
  • Yuqiang Xie
  • Fang Fang
  • Fanglong Dong
  • Rui Liu
  • Yanan Cao

Large Language Models demonstrate strong reasoning capabilities, which can be effectively compressed into smaller models. However, existing datasets and fine-tuning approaches still face challenges that lead to catastrophic forgetting, particularly for models smaller than 8B. First, most datasets typically ignore the relationship between training data knowledge and the model's inherent abilities, making it difficult to preserve prior knowledge. Second, conventional training objectives often fail to constrain inherent knowledge preservation, which can result in forgetting of previously learned skills. To address these issues, we propose a comprehensive solution that alleviates catastrophic forgetting from both the data and fine-tuning approach perspectives. On the data side, we construct a dataset of 5K instances that covers multiple reasoning tasks and incorporates metacognitive knowledge, making it more tolerant and effective for distillation into smaller models. We annotate the metacognitive knowledge required to solve each question and filter the data based on task knowledge and the model's inherent skills. On the training side, we introduce GDPO (Group Direct Preference Optimization), which is better suited for resource-limited scenarios and can efficiently approximate the performance of GRPO. Guided by the large model and by implicitly constraining the optimization path through a reference model, GDPO enables more effective knowledge transfer from the large model and constrains excessive parameter drift. Extensive experiments demonstrate that our approach significantly alleviates catastrophic forgetting and improves reasoning performance on smaller models.

TIST Journal 2026 Journal Article

Mitigating Data Redundancy to Revitalize Transformer-Based Long-Term Time Series Forecasting System

  • Mingjie Li
  • Rui Liu
  • Guangsi Shi
  • Mingfei Han
  • Changlin Li
  • Lina Yao
  • Xiaojun Chang
  • Ling Chen

Long-term time series forecasting (LTSF) is fundamental to various real-world applications, where Transformer-based models have become the dominant framework due to their ability to capture long-range dependencies. However, these models often experience overfitting due to data redundancy in rolling forecasting settings, limiting their generalization ability; the problem is particularly evident in longer sequences with highly similar adjacent data. In this work, we introduce CLMFormer, a novel framework that mitigates redundancy through curriculum learning and a memory-driven decoder. Specifically, we progressively introduce Bernoulli noise to the training samples, which effectively breaks the high similarity between adjacent data points. This curriculum-driven noise injection supplies more diverse and representative training data. Complementing it, the memory-driven decoder captures seasonal tendencies and dependencies in the time series and leverages temporal relationships to facilitate the forecasting process. Extensive experiments on six real-world LTSF benchmarks show that CLMFormer consistently improves Transformer-based models by up to 30%, demonstrating its effectiveness in long-horizon forecasting.
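The curriculum-driven Bernoulli noise idea is easy to sketch. The linear schedule and the `p_max` value below are illustrative assumptions; the paper's actual schedule may differ:

```python
import random

random.seed(0)

def curriculum_p(epoch, total_epochs, p_max=0.3):
    """Noise probability grows linearly over training (assumed schedule)."""
    return p_max * min(1.0, epoch / total_epochs)

def bernoulli_corrupt(series, p):
    """Independently zero out each point with probability p, breaking the
    high similarity between adjacent rolling-window training samples."""
    return [0.0 if random.random() < p else x for x in series]

series = [float(i) for i in range(1, 11)]
early = bernoulli_corrupt(series, curriculum_p(0, 10))  # p = 0: unchanged
late = bernoulli_corrupt(series, curriculum_p(8, 10))   # p = 0.24: noisier
```

Early epochs see clean data, and later epochs see progressively heavier corruption, which is the curriculum aspect described in the abstract.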

AAAI Conference 2026 Conference Paper

MMMamba: A Versatile Cross-Modal in Context Fusion Framework for Pan-Sharpening and Zero-Shot Image Enhancement

  • Yingying Wang
  • Xuanhua He
  • Chen Wu
  • Jialing Huang
  • Suiyun Zhang
  • Rui Liu
  • Xinghao Ding
  • Haoxuan Che

Pan-sharpening aims to generate high-resolution multispectral (HRMS) images by integrating a high-resolution panchromatic (PAN) image with its corresponding low-resolution multispectral (MS) image. To achieve effective fusion, it is crucial to fully exploit the complementary information between the two modalities. Traditional CNN-based methods typically rely on channel-wise concatenation with fixed convolutional operators, which limits their adaptability to diverse spatial and spectral variations. While cross-attention mechanisms enable global interactions, they are computationally inefficient and may dilute fine-grained correspondences, making it difficult to capture complex semantic relationships. Recent advances in the Multimodal Diffusion Transformer (MMDiT) architecture have demonstrated impressive success in image generation and editing tasks. Unlike cross-attention, MMDiT employs in-context conditioning to facilitate more direct and efficient cross-modal information exchange. In this paper, we propose MMMamba, a cross-modal in-context fusion framework for pan-sharpening, with the flexibility to support image super-resolution in a zero-shot manner. Built upon the Mamba architecture, our design ensures linear computational complexity while maintaining strong cross-modal interaction capacity. Furthermore, we introduce a novel multimodal interleaved (MI) scanning mechanism that facilitates effective information exchange between the PAN and MS modalities. Extensive experiments demonstrate the superior performance of our method compared to existing state-of-the-art (SOTA) techniques across multiple tasks and benchmarks.

AAAI Conference 2026 Conference Paper

Towards Authentic Movie Dubbing with Retrieve-Augmented Director-Actor Interaction Learning

  • Rui Liu
  • Yuan Zhao
  • Zhenqi Jia

Automatic movie dubbing models generate vivid speech from given scripts, replicating a speaker's timbre from a brief timbre prompt while ensuring lip-sync with the silent video. Existing approaches simulate a simplified workflow where actors dub directly without preparation, overlooking the critical director–actor interaction. In contrast, authentic workflows involve a dynamic collaboration: directors actively engage with actors, guiding them to internalize the context cues, specifically emotion, before performance. To address this issue, we propose a new Retrieve-Augmented Director-Actor Interaction Learning scheme to achieve authentic movie dubbing, termed Authentic-Dubber, which contains three novel mechanisms: (1) We construct a multimodal Reference Footage library to simulate the learning footage provided by directors. Note that we integrate Large Language Models (LLMs) to achieve deep comprehension of emotional representations across multimodal signals. (2) To emulate how actors efficiently and comprehensively internalize director-provided footage during dubbing, we propose an Emotion-Similarity-based Retrieval-Augmentation strategy. This strategy retrieves the most relevant multimodal information that aligns with the target silent video. (3) We develop a Progressive Graph-based speech generation approach that incrementally incorporates the retrieved multimodal emotional knowledge, thereby simulating the actor's final dubbing process. The above mechanisms enable Authentic-Dubber to faithfully replicate the authentic dubbing workflow, achieving comprehensive improvements in emotional expressiveness. Both subjective and objective evaluations on the V2C-Animation benchmark dataset validate the effectiveness of the approach.
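The emotion-similarity-based retrieval step can be sketched as nearest-neighbour search over emotion embeddings. The 3-d embeddings and reference names below are hypothetical; the paper's embeddings are multimodal and LLM-derived:

```python
import math

def cosine(u, v):
    """Cosine similarity between two emotion embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical emotion embeddings (e.g. happy / neutral / sad axes).
target = [0.9, 0.1, 0.0]            # emotion of the target silent video
footage = {
    "ref_a": [0.8, 0.2, 0.1],
    "ref_b": [0.0, 0.1, 0.9],
    "ref_c": [0.7, 0.0, 0.2],
}

# Retrieve the references whose emotion best matches the target clip.
top2 = sorted(footage, key=lambda k: cosine(target, footage[k]),
              reverse=True)[:2]
```

The retrieved entries would then feed the progressive graph-based speech generator rather than being used directly.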

AAAI Conference 2026 Conference Paper

VBF++: Variational Bayesian Fusion with Context-Aware Priors and Recommendation-Guided Adversarial Refinement for Multimodal Video Recommendation

  • Ziyi Cao
  • Rui Liu
  • Yong Chen

Multimodal video recommendation systems face fundamental challenges in determining optimal fusion strategies across diverse content types and user preferences. Existing methods suffer from two critical limitations: (1) their fusion strategies are guided by context-agnostic priors that ignore the semantic structure of content, assuming the same simple distribution (typically a standard multivariate Gaussian prior) governs optimal fusion for all video types, and (2) their optimization objectives, particularly the Evidence Lower Bound (ELBO), are misaligned with the final recommendation goal, optimizing for feature reconstruction rather than ranking performance. To address these fundamental issues, this work proposes VBF++, a novel framework that introduces context-aware structured priors and recommendation-guided adversarial refinement. First, the method designs context-aware priors that learn cluster-specific distributions based on video semantic categories, replacing uninformative priors with structured, content-aware prior distributions. Second, it introduces a Recommendation-Guided Adversarial Refinement (RAR) paradigm that explicitly steers the learning process towards generating recommendation-optimal fusion strategies, resolving the objective misalignment inherent in variational learning. Further enhanced with domain-adaptive meta-learning, VBF++ achieves consistent improvements of 4.7-8.3 percent in Precision@10 over state-of-the-art methods in extensive experiments on three real-world datasets. Analysis reveals that learned fusion strategies exhibit semantically meaningful patterns, prioritizing visual features for action content, acoustic information for music videos, and textual descriptions for documentary material.

AAAI Conference 2026 Conference Paper

What You See Is What You Reach: Towards Spatial Navigation with High-Level Human Instructions

  • Lingfeng Zhang
  • Haoxiang Fu
  • Xiaoshuai Hao
  • Shuyi Zhang
  • Qiang Zhang
  • Rui Liu
  • Long Chen
  • Wenbo Ding

Embodied navigation is a fundamental capability that enables embodied agents to effectively interact with the physical world in various complex environments. However, a significant gap remains between current embodied navigation tasks and real-world requirements, as existing methods often struggle to integrate high-level human instructions with spatial understanding. To address this gap, we propose a new task of embodied navigation called spatial navigation, which encompasses two key components: spatial object navigation (SpON) for object-specific guidance and spatial area navigation (SpAN) for navigating to designated areas. Specifically, SpON guides agents to specific objects by leveraging spatial relationships and contextual understanding, while SpAN focuses on navigating to defined areas within complex environments. Together, these components significantly enhance agents' navigation capabilities, enabling more effective interactions in real-world scenarios. To support this task, we have generated a spatial navigation dataset consisting of 10K trajectories within the simulator. This dataset includes high-level human instructions, detailed observations, and corresponding navigation actions, providing a comprehensive resource to enhance agent training and performance. Building on the spatial navigation dataset, we introduce SpNav, a hierarchical navigation framework. Specifically, SpNav employs a vision-language model (VLM) to interpret high-level human instructions and accurately identify goal objects or areas within the observation range; it then performs precise point-to-point navigation using a map, bridging the gap between perception and action and enhancing the agent's ability to operate effectively in complex environments. Extensive experiments show that SpNav achieves state-of-the-art (SOTA) performance in spatial navigation tasks across both simulated and real-world environments, validating the effectiveness of our method.

TIST Journal 2026 Journal Article

You Can Only Tune Normalization: A Simple and Effective Approach to Parameter-Efficient Fine-Tuning

  • Lingyun Huang
  • JianXu Mao
  • Junfei Yi
  • Ziming Tao
  • Ziyang Peng
  • Wei He
  • Rui Liu
  • Yaonan Wang

To tackle the issue of excessive parameter volumes during fine-tuning of large-scale pre-trained models with full parameters, Parameter-Efficient Fine-Tuning (PEFT) methods have been introduced. The core concept involves freezing the backbone network of the model and updating only a small subset of parameters. This strategy not only decreases the number of parameters needed for training but also delivers performance comparable to Full-Tuning, even surpassing it on certain datasets. However, most popular PEFT methods introduce extra parameters or modules for fine-tuning, which come with inherent limitations. In response, we propose a straightforward and efficient PEFT method called You Can Only Tune Normalization (YONO). YONO focuses solely on tuning the normalization layer and the final classification layer of the model. This method avoids adding extra modules, making it easily applicable to any model without causing inference delays. We extensively tested YONO on 28 benchmark datasets, and the results indicate that it requires significantly fewer parameters compared to other advanced PEFT methods. Additionally, we validated YONO’s efficiency and generalizability across various vision models. Finally, we further explore the essence of PEFT methods, whether they learn new knowledge or expose the capabilities that a model has already learned. Our findings suggest that YONO is more sensitive to improvements in dataset quality, making it a promising candidate for future scaling to larger models.
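The YONO selection rule is simple enough to sketch as a filter over parameter names. The parameter names below are hypothetical examples in the style of a vision transformer checkpoint, and the substring rule is an assumed approximation of the method:

```python
# Hypothetical parameter names from a vision transformer checkpoint.
param_names = [
    "patch_embed.proj.weight",
    "blocks.0.norm1.weight", "blocks.0.norm1.bias",
    "blocks.0.attn.qkv.weight",
    "blocks.0.norm2.weight", "blocks.0.norm2.bias",
    "blocks.0.mlp.fc1.weight",
    "head.weight", "head.bias",
]

def yono_trainable(name):
    """YONO-style rule: tune only normalization layers and the classifier."""
    return "norm" in name or name.startswith("head.")

trainable = [n for n in param_names if yono_trainable(n)]
frozen = [n for n in param_names if not yono_trainable(n)]
```

In a real framework the same rule would set `requires_grad` per parameter; because no extra modules are added, the frozen backbone runs unchanged at inference time, which is the "no inference delay" property the abstract claims.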

EAAI Journal 2025 Journal Article

A lightweight model based on multi-scale feature fusion for ultrasonic welding surface defect detection

  • Rui Liu
  • Lun Zhao
  • Yu Ren
  • Zhonghua Shen
  • Liya Li
  • Jianfeng Luo
  • Zeshan Abbas

Ultrasonic welding technology is crucial in industrial and medical fields, relying on precise surface defect detection for quality assurance. Traditional methods suffer from low accuracy, low efficiency, high costs, and complex implementation. Additionally, current neural networks for ultrasonic surface defect detection struggle to balance parameter optimization with detection accuracy. To solve this problem, we propose a lightweight model based on multi-scale feature fusion, the Ultrasonic Weld Surface Defect Detection Network (UWSDNet). First, a feature extraction module with reparameterization technology (FRT) and efficient multi-scale attention (EMA) is proposed to reduce the redundant parameters and computational overhead introduced by the welding background. Secondly, the multi-core feature enhancement module (MCM) is introduced; it enhances multi-scale object detection with fewer parameters to support edge deployment in real ultrasonic welding settings. Finally, the lightweight asymmetric detection head (LADH) and contextual and spatial feature calibration network (CSFCN) are introduced into the network to improve multi-scale feature capture and address the large size span of ultrasonic welding surface defects. Experimental evaluations on a self-built ultrasonic welding wire harness defect dataset show that UWSDNet achieves a mean average precision (mAP) of 88.9% and a precision of 95.6% with 12.7M parameters. In addition, UWSDNet achieves excellent performance on the publicly available NEU-DET dataset, demonstrating strong generalization and application potential in industrial defect detection.

NeurIPS Conference 2025 Conference Paper

Ada-R1: Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization

  • Haotian Luo
  • Haiying He
  • Yibo Wang
  • Jinluan Yang
  • Rui Liu
  • Naiqiang Tan
  • Xiaochun Cao
  • Dacheng Tao

Recently, long-thought reasoning models have achieved strong performance on complex reasoning tasks, but often incur substantial inference overhead, making efficiency a critical concern. Our empirical analysis reveals that the benefit of using Long-CoT varies across problems: while some problems require elaborate reasoning, others show no improvement, or even degraded accuracy. This motivates adaptive reasoning strategies that tailor reasoning depth to the input. However, prior work primarily reduces redundancy within long reasoning paths, limiting exploration of more efficient strategies beyond the Long-CoT paradigm. To address this, we propose a novel two-stage framework for adaptive and efficient reasoning. First, we construct a hybrid reasoning model by merging long and short CoT models to enable diverse reasoning styles. Second, we apply bi-level preference training to guide the model to select suitable reasoning styles (group-level), and prefer concise and correct reasoning within each style group (instance-level). Experiments demonstrate that our method significantly reduces inference costs compared to other baseline approaches, while maintaining performance. Notably, on five mathematical datasets, the average length of reasoning is reduced by more than 50%, highlighting the potential of adaptive strategies to optimize reasoning efficiency in large language models.

NeurIPS Conference 2025 Conference Paper

CAML: Collaborative Auxiliary Modality Learning for Multi-Agent Systems

  • Rui Liu
  • Yu Shen
  • Peng Gao
  • Pratap Tokekar
  • Ming C. Lin

Multi-modal learning has emerged as a key technique for improving performance across domains such as autonomous driving, robotics, and reasoning. However, in certain scenarios, particularly in resource-constrained environments, some modalities available during training may be absent during inference. While existing frameworks effectively utilize multiple data sources during training and enable inference with reduced modalities, they are primarily designed for single-agent settings. This poses a critical limitation in dynamic environments such as connected autonomous vehicles (CAV), where incomplete data coverage can lead to decision-making blind spots. Conversely, some works explore multi-agent collaboration but without addressing missing modality at test time. To overcome these limitations, we propose Collaborative Auxiliary Modality Learning (CAML), a novel multi-modal multi-agent framework that enables agents to collaborate and share multi-modal data during training, while allowing inference with reduced modalities during testing. Experimental results in collaborative decision-making for CAV in accident-prone scenarios demonstrate that CAML achieves up to a 58.1% improvement in accident detection. Additionally, we validate CAML on real-world aerial-ground robot data for collaborative semantic segmentation, achieving up to a 10.6% improvement in mIoU.

JBHI Journal 2025 Journal Article

EEG-Deformer: A Dense Convolutional Transformer for Brain-Computer Interfaces

  • Yi Ding
  • Yong Li
  • Hao Sun
  • Rui Liu
  • Chengxuan Tong
  • Chenyu Liu
  • Xinliang Zhou
  • Cuntai Guan

Effectively learning the temporal dynamics in electroencephalogram (EEG) signals is challenging yet essential for decoding brain activities using brain-computer interfaces (BCIs). Although Transformers are popular for their long-term sequential learning ability in the BCI field, most methods combining Transformers with convolutional neural networks (CNNs) fail to capture the coarse-to-fine temporal dynamics of EEG signals. To overcome this limitation, we introduce EEG-Deformer, which incorporates two main novel components into a CNN-Transformer: (1) a Hierarchical Coarse-to-Fine Transformer (HCT) block that integrates a Fine-grained Temporal Learning (FTL) branch into Transformers, effectively discerning coarse-to-fine temporal patterns; and (2) a Dense Information Purification (DIP) module, which utilizes multi-level, purified temporal information to enhance decoding accuracy. Comprehensive experiments on three representative cognitive tasks (cognitive attention, driving fatigue, and mental workload detection) consistently confirm the generalizability of our proposed EEG-Deformer, demonstrating that it either outperforms or performs comparably to existing state-of-the-art methods. Visualization results show that EEG-Deformer learns from neurophysiologically meaningful brain regions for the corresponding cognitive tasks.

JBHI Journal 2025 Journal Article

FIND: A Framework for Iterative to Non-Iterative Distillation for Lightweight Deformable Registration

  • Yongtai Zhuo
  • Mingkang Liu
  • Jie Liu
  • Zhikai Yang
  • Rui Liu
  • Peng Xue
  • Lixu Gu

Deformable image registration is crucial for medical image analysis, yet the complexity of deep learning networks often limits their deployment on resource-limited devices. Current distillation methods in registration tasks fail to effectively transfer complex deformation handling capabilities to non-iterative lightweight networks, leading to insignificant performance improvement. To address this, we propose the Framework for Iterative to Non-iterative Distillation (FIND), which efficiently transfers these capabilities to a Non-Iterative Lightweight (NIL) network. FIND employs a dual-step process: first, using recurrent distillation to derive a high-performance non-iterative teacher assistant from an iterative network; second, using advanced feature distillation from the assistant to the lightweight network. This enables NIL to perform rapid, effective registration on resource-limited devices. Experiments across four datasets show that NIL runs up to 60 times faster on CPU and 89 times faster on GPU than the compared deep learning methods, with superior registration accuracy improvements of up to 3.5 points in Dice scores.

AAAI Conference 2025 Conference Paper

Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech

  • Rui Liu
  • Shuwei He
  • Yifan Hu
  • Haizhou Li

Visual Text-to-Speech (VTTS) aims to take the environmental image as the prompt to synthesize the reverberant speech for the spoken content. The challenge of this task lies in understanding the spatial environment from the image. Many attempts have been made to extract global spatial visual information from the RGB space of a spatial image. However, local and depth image information are crucial for understanding the spatial environment, which previous works have ignored. To address these issues, we propose a novel multi-modal and multi-scale spatial environment understanding scheme to achieve immersive VTTS, termed M2SE-VTTS. The multi-modal branch takes both the RGB and Depth spaces of the spatial image to learn more comprehensive spatial information, and the multi-scale branch models the local and global spatial knowledge simultaneously. Specifically, we first split the RGB and Depth images into patches and adopt the Gemini-generated environment captions to guide the local spatial understanding. After that, the multi-modal and multi-scale features are integrated by the local-aware global spatial understanding. In this way, M2SE-VTTS effectively models the interactions between local and global spatial contexts in the multi-modal spatial environment. Objective and subjective evaluations suggest that our model outperforms the advanced baselines in environmental speech generation.

NeurIPS Conference 2025 Conference Paper

Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation

  • Yibo Wang
  • Tiansheng Huang
  • Li Shen
  • Huanjin Yao
  • Haotian Luo
  • Rui Liu
  • Naiqiang Tan
  • Jiaxing Huang

Harmful fine-tuning attacks introduce significant security risks to fine-tuning services. Mainstream defenses aim to vaccinate the model such that a later harmful fine-tuning attack is less effective. However, our evaluation results show that such defenses are fragile: with a few fine-tuning steps, the model can still learn the harmful knowledge. To this end, we conduct further experiments and find that an embarrassingly simple solution, adding purely random perturbations to the fine-tuned model, can recover the model from harmful behaviors, though it leads to a degradation in the model's fine-tuning performance. To address this degradation, we further propose Panacea, which optimizes an adaptive perturbation that will be applied to the model after fine-tuning. Panacea maintains the model's safety alignment performance without compromising downstream fine-tuning performance. Comprehensive experiments are conducted on different harmful ratios, fine-tuning tasks and mainstream LLMs, where the average harmful scores are reduced by up to 21.2%, while maintaining fine-tuning performance. As a by-product, we analyze the adaptive perturbation and show that different layers in various LLMs have distinct safety coefficients. Source code available at https://github.com/w-yibo/Panacea.
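The "embarrassingly simple" baseline the abstract starts from (purely random perturbation of the fine-tuned weights) can be sketched directly; the noise scale is an illustrative assumption, and Panacea itself replaces this with a learned, adaptive perturbation:

```python
import random

random.seed(0)

def perturb(weights, scale=0.01):
    """Add zero-mean Gaussian noise to every weight after fine-tuning.
    `scale` is an assumed value; the adaptive version is learned per layer."""
    return [w + random.gauss(0.0, scale) for w in weights]

finetuned = [0.5, -1.2, 0.3, 0.8]
recovered = perturb(finetuned)
```

The finding that different layers have distinct safety coefficients suggests the learned perturbation effectively varies `scale` across layers instead of using one global value.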

YNICL Journal 2025 Journal Article

Role of baseline resting-state functional connectivity of the nucleus accumbens subregions in antidepressant treatment in major depressive disorder

  • Yun Wang
  • Jingjing Zhou
  • Xiongying Chen
  • Rui Liu
  • Zhifang Zhang
  • Yuan Feng
  • Yuan Zhou
  • Gang Wang

The nucleus accumbens (NAc) plays a crucial role in the pathophysiology of major depressive disorder (MDD), and abnormal resting-state functional connectivity (rsFC) of NAc subregions has been found in MDD. However, it is unclear whether the altered rsFC of NAc subregions can predict the efficacy of antidepressant treatment, and whether antidepressants are capable of restoring the altered rsFC of NAc subregions in MDD. The purpose of this study was to investigate the role of rsFC of the NAc subregions in antidepressant treatment for MDD. Resting-state functional magnetic resonance imaging (fMRI) data were collected from 46 unmedicated MDD patients at baseline and after 12 weeks of escitalopram treatment, along with fMRI data from 58 healthy controls (HCs). We examined group differences in rsFC of the NAc subregions between MDD patients and HCs, explored whether the altered rsFC at baseline was associated with treatment efficacy, and evaluated whether antidepressant treatment could normalize rsFC abnormalities in the NAc subregions in MDD. Compared to HCs, MDD patients exhibited decreased rsFC between the NAc subregions and the middle cingulate cortex (MCC). Lower levels of rsFC between the NAc subregions and the MCC at baseline predicted greater improvement in depressive symptoms. Furthermore, rsFC between the NAc subregions and the MCC increased following antidepressant treatment in MDD. Our findings suggest that rsFC alterations between the NAc subregions and the MCC may serve as a potential biomarker for predicting antidepressant treatment efficacy, and that dysfunction in the frontal-ventral striatum circuitry may represent a key therapeutic target for MDD.

JBHI Journal 2024 Journal Article

Deep Learning-Based Microscopic Cell Detection Using Inverse Distance Transform and Auxiliary Counting

  • Rui Liu
  • Wei Dai
  • Cong Wu
  • Tianyi Wu
  • Min Wang
  • Junxian Zhou
  • Xiaozhen Zhang
  • Wen Jung Li

Microscopic cell detection is a challenging task due to significant inter-cell occlusions in dense clusters and diverse cell morphologies. This paper introduces a novel framework designed to enhance automated cell detection. The proposed approach integrates a deep learning model that produces an inverse distance transform-based detection map from the given image, accompanied by a secondary network designed to regress a cell density map from the same input. The inverse distance transform-based map effectively highlights each cell instance in the densely populated areas, while the density map accurately estimates the total cell count in the image. Then, a custom counting-aided cell center extraction strategy leverages the cell count obtained by integrating over the density map to refine the detection process, significantly reducing false responses and thereby boosting overall accuracy. The proposed framework demonstrated superior performance with F-scores of 96.93%, 91.21%, and 92.00% on the VGG, MBM, and ADI datasets, respectively, surpassing existing state-of-the-art methods. It also achieved the lowest distance error, further validating the effectiveness of the proposed approach. These results demonstrate significant potential for automated cell analysis in biomedical applications.

JBHI Journal 2024 Journal Article

Deeply Supervised Skin Lesions Diagnosis With Stage and Branch Attention

  • Wei Dai
  • Rui Liu
  • Tianyi Wu
  • Min Wang
  • Jianqin Yin
  • Jun Liu

Accurate and unbiased examinations of skin lesions are critical for the early diagnosis and treatment of skin diseases. Visual features of skin lesions vary significantly because the images are collected from patients with different lesion colours and morphologies by using dissimilar imaging equipment. Recent studies have reported that ensembled convolutional neural networks (CNNs) are practical to classify the images for early diagnosis of skin disorders. However, the practical use of these ensembled CNNs is limited as these networks are heavyweight and inadequate for processing contextual information. Although lightweight networks (e.g., MobileNetV3 and EfficientNet) were developed to achieve parameter reduction for implementing deep neural networks on mobile devices, insufficient depth of feature representation restricts the performance. To address the existing limitations, we develop a new lite and effective neural network, namely HierAttn. The HierAttn applies a novel deep supervision strategy to learn the local and global features by using multi-stage and multi-branch attention mechanisms with only one training loss. The efficacy of HierAttn was evaluated by using the dermoscopy images dataset ISIC2019 and smartphone photos dataset PAD-UFES-20 (PAD2020). The experimental results show that HierAttn achieves the best accuracy and area under the curve (AUC) among the state-of-the-art lightweight networks.

AAAI Conference 2024 Conference Paper

Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling

  • Rui Liu
  • Yifan Hu
  • Yi Ren
  • Xiang Yin
  • Haizhou Li

Conversational Speech Synthesis (CSS) aims to accurately express an utterance with the appropriate prosody and emotional inflection within a conversational setting. While recognising the significance of the CSS task, prior studies have not thoroughly investigated the problem of emotional expressiveness, due to the scarcity of emotional conversational datasets and the difficulty of stateful emotion modeling. In this paper, we propose a novel emotional CSS model, termed ECSS, that includes two main components: 1) to enhance emotion understanding, we introduce a heterogeneous graph-based emotional context modeling mechanism, which takes the multi-source dialogue history as input to model the dialogue context and learn the emotion cues from the context; 2) to achieve emotion rendering, we employ a contrastive learning-based emotion renderer module to infer the accurate emotion style for the target utterance. To address the issue of data scarcity, we meticulously create emotional labels in terms of category and intensity, and annotate additional emotional information on the existing conversational dataset (DailyTalk). Both objective and subjective evaluations suggest that our model outperforms the baseline models in understanding and rendering emotions. These evaluations also underscore the importance of comprehensive emotional annotations. Code and audio samples can be found at: https://github.com/walker-hyf/ECSS.

NeurIPS Conference 2024 Conference Paper

Vision-Language Navigation with Energy-Based Policy

  • Rui Liu
  • Wenguan Wang
  • Yi Yang

Vision-language navigation (VLN) requires an agent to execute actions following human instructions. Existing VLN models are optimized through expert demonstrations by supervised behavioural cloning or incorporating manual reward engineering. While straightforward, these efforts overlook the accumulation of errors in the Markov decision process, and struggle to match the distribution of the expert policy. Going beyond this, we propose an Energy-based Navigation Policy (ENP) to model the joint state-action distribution using an energy-based model. At each step, low energy values correspond to the state-action pairs that the expert is most likely to perform, and vice versa. Theoretically, the optimization objective is equivalent to minimizing the forward divergence between the occupancy measure of the expert and ours. Consequently, ENP learns to globally align with the expert policy by maximizing the likelihood of the actions and modeling the dynamics of the navigation states in a collaborative manner. With a variety of VLN architectures, ENP achieves promising performances on R2R, REVERIE, RxR, and R2R-CE, unleashing the power of existing VLN models.

ICLR Conference 2023 Conference Paper

Coverage-centric Coreset Selection for High Pruning Rates

  • Haizhong Zheng
  • Rui Liu
  • Fan Lai 0001
  • Atul Prakash 0001

One-shot coreset selection aims to select a representative subset of the training data, given a pruning rate, that can later be used to train future models while retaining high accuracy. State-of-the-art coreset selection methods pick the highest importance examples based on an importance metric and are found to perform well at low pruning rates. However, at high pruning rates, they suffer from a catastrophic accuracy drop, performing worse than even random sampling. This paper explores the reasons behind this accuracy drop both theoretically and empirically. We first propose a novel metric to measure the coverage of a dataset on a specific distribution by extending the classical geometric set cover problem to a distribution cover problem. This metric helps explain why coresets selected by SOTA methods at high pruning rates perform poorly compared to random sampling because of worse data coverage. We then propose a novel one-shot coreset selection method, Coverage-centric Coreset Selection (CCS), that jointly considers overall data coverage upon a distribution as well as the importance of each example. We evaluate CCS on five datasets and show that, at high pruning rates (e.g., 90%), it achieves significantly better accuracy than previous SOTA methods (e.g., at least 19.56% higher on CIFAR10) as well as random selection (e.g., 7.04% higher on CIFAR10) and comparable accuracy at low pruning rates. We make our code publicly available at https://github.com/haizhongzheng/Coverage-centric-coreset-selection.

YNICL Journal 2022 Journal Article

Altered frequency-specific/universal amplitude characteristics of spontaneous brain oscillations in patients with bipolar disorder

  • Zhi-Fang Zhang
  • Qi-Jing Bo
  • Feng Li
  • Lei Zhao
  • Peng Gao
  • Yun Wang
  • Rui Liu
  • Xiong-Ying Chen

The human brain is a dynamic system with intrinsic oscillations in spontaneous neural activity. Whether the dynamic characteristics of these spontaneous oscillations are differentially altered across different frequency bands in patients with bipolar disorder (BD) remains unclear. This study recruited 65 patients with BD and 85 healthy controls (HCs). The entire frequency range of resting-state fMRI data was decomposed into four frequency intervals. Two-way repeated-measures ANCOVA was employed to detect frequency-specific/universal alterations in the dynamic oscillation amplitude in BD. The patients were then divided into two subgroups according to their mood states to explore whether these alterations were independent of their mood states. Finally, other window sizes, step sizes, and window types were tested to replicate all analyses. Frequency-specific abnormality of the dynamic oscillation amplitude was detected within the posterior medial parietal cortex (centered at the precuneus extending to the posterior cingulate cortex). This specific profile indicates decreased amplitudes in the lower frequency bands (slow-5/4) and no amplitude changes in the higher frequency bands (slow-3/2) compared with HCs. Frequency-universal abnormalities of the dynamic oscillation amplitude were also detectable, indicating increased amplitudes in the thalamus and left cerebellum anterior lobe but decreased amplitudes in the medial superior frontal gyrus. These alterations were independent of the patients' mood states and replicable across multiple analytic and parametric settings. In short, frequency-specific/universal amplitude characteristics of spontaneous oscillations were observed in patients with BD. These abnormal characteristics have important implications for specific functional changes in BD from multiple frequency and dynamic perspectives.

IJCAI Conference 2022 Conference Paper

Neutral Utterances are Also Causes: Enhancing Conversational Causal Emotion Entailment with Social Commonsense Knowledge

  • Jiangnan Li
  • Fandong Meng
  • Zheng Lin
  • Rui Liu
  • Peng Fu
  • Yanan Cao
  • Weiping Wang
  • Jie Zhou

Conversational Causal Emotion Entailment aims to detect causal utterances for a non-neutral targeted utterance from a conversation. In this work, we build conversations as graphs to overcome implicit contextual modelling of the original entailment style. Following the previous work, we further introduce the emotion information into graphs. Emotion information can markedly promote the detection of causal utterances whose emotion is the same as the targeted utterance. However, it is still hard to detect causal utterances with different emotions, especially neutral ones. The reason is that models are limited in reasoning causal clues and passing them between utterances. To alleviate this problem, we introduce social commonsense knowledge (CSK) and propose a Knowledge Enhanced Conversation graph (KEC). KEC propagates the CSK between two utterances. As not all CSK is emotionally suitable for utterances, we therefore propose a sentiment-realized knowledge selecting strategy to filter CSK. To process KEC, we further construct the Knowledge Enhanced Directed Acyclic Graph networks. Experimental results show that our method outperforms baselines and infers more causes with different emotions from the targeted utterance.

AAAI Conference 2022 Conference Paper

Transformer with Memory Replay

  • Rui Liu
  • Barzan Mozafari

Transformers achieve state-of-the-art performance for natural language processing tasks by pre-training on large-scale text corpora. They are extremely compute-intensive and have very high sample complexity. Memory replay is a mechanism that remembers and reuses past examples by saving them to and replaying them from a memory buffer. It has been successfully used in reinforcement learning and GANs due to its better sample efficiency. In this paper, we propose Transformer with Memory Replay (TMR), which integrates memory replay with transformers, making them more sample-efficient. Experiments on the GLUE and SQuAD benchmark datasets show that Transformer with Memory Replay achieves at least a one-percentage-point increase over the baseline transformer model when pretrained with the same number of examples. Further, by adopting a careful design that reduces the wall-clock time overhead of memory replay, we also empirically achieve better runtime efficiency.

YNICL Journal 2021 Journal Article

Anhedonia correlates with functional connectivity of the nucleus accumbens subregions in patients with major depressive disorder

  • Rui Liu
  • Yun Wang
  • Xiongying Chen
  • Zhifang Zhang
  • Le Xiao
  • Yuan Zhou

BACKGROUND: The nucleus accumbens (NAc) is an important region in the reward circuit that has been linked with anhedonia, a characteristic symptom of major depressive disorder (MDD). However, the relationship between the functional connectivity of the NAc subregions and anhedonia in MDD patients remains unclear. METHODS: We acquired resting-state functional magnetic resonance imaging (fMRI) scans from fifty-one subjects (23 MDD patients and 28 healthy controls). We assessed subjects' trait anhedonia with the Temporal Experience of Pleasure Scale (TEPS). Seed-based resting-state functional connectivity (rsFC) was conducted for each of the NAc subregions (bilateral core-like and shell-like subdivisions) separately to identify regions whose rsFCs with the NAc subregions were altered in the MDD patients and regions whose rsFCs with the NAc subregions showed different correlates with anhedonia between the MDD patients and the healthy controls. RESULTS: Compared with the healthy controls, the MDD patients showed decreased rsFCs of the right NAc core-like subdivision with the left mid-anterior orbital prefrontal cortex and the right inferior parietal lobe, as well as decreased rsFC of the left NAc core-like subdivision with the right middle frontal gyrus. Moreover, the anhedonia-severity-by-group interaction was significant for the rsFC of the right NAc shell-like subdivision with the subgenual/pregenual anterior cingulate cortex and the rsFC of the right NAc core-like subdivision with the precuneus. CONCLUSIONS: We found that the neural correlates of anhedonia indicated by the rsFCs of the NAc subregions were modulated by depression, and that this modulation effect was regionally dependent. These findings enrich our understanding of the neural basis of anhedonia in MDD.

NeurIPS Conference 2021 Conference Paper

FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark

  • Mingjie Li
  • Wenjia Cai
  • Rui Liu
  • Yuetian Weng
  • Xiaoyun Zhao
  • Cong Wang
  • Xin Chen
  • Zhong Liu

The automatic generation of long and coherent medical reports given medical images (e.g., Chest X-ray and Fundus Fluorescein Angiography (FFA)) has great potential to support clinical practice. Researchers have explored advanced methods from computer vision and natural language processing to incorporate medical domain knowledge for the generation of readable medical reports. However, existing medical report generation (MRG) benchmarks lack both explainable annotations and reliable evaluation tools, hindering the current research advances from two aspects: firstly, existing methods can only predict reports without accurate explanation, undermining the trustworthiness of the diagnostic methods; secondly, the comparison among the predicted reports from different MRG methods is unreliable using the evaluation metrics of natural-language generation (NLG). To address these issues, in this paper, we propose an explainable and reliable MRG benchmark based on FFA Images and Reports (FFA-IR). Specifically, FFA-IR is large, with 10,790 reports along with 1,048,584 FFA images from clinical practice; it includes explainable annotations, based on a schema of 46 categories of lesions; and it is bilingual, providing both English and Chinese reports for each case. Besides using the widely used NLG metrics, we propose a set of nine human evaluation criteria to evaluate the generated reports. We envision FFA-IR as a testbed for explainable and reliable medical report generation. We also hope that it can broadly accelerate medical imaging research and facilitate interaction between the fields of medical imaging, computer vision, and natural language processing.

AAAI Conference 2021 Conference Paper

HiABP: Hierarchical Initialized ABP for Unsupervised Representation Learning

  • Jiankai Sun
  • Rui Liu
  • Bolei Zhou

Although Markov chain Monte Carlo (MCMC) is useful for generating samples from the posterior distribution, it often suffers from intractability when dealing with large-scale datasets. To address this issue, we propose Hierarchical Initialized Alternating Back-propagation (HiABP) for efficient Bayesian inference. Specifically, we endow the Alternating Back-propagation (ABP) method with a well-designed initializer and a hierarchical structure, composing a pipeline of initializing, improving, and learning back-propagation. It saves much time for the generative model to initialize the latent variable by constraining a sampler to be close to the true posterior distribution. The initialized latent variable is then improved significantly by an MCMC sampler. Thus the proposed method has the strengths of both methods, i.e., the effectiveness of MCMC and the efficiency of variational inference. Experimental results validate that our framework can outperform other popular deep generative models in modeling natural images and learning from incomplete data. We further demonstrate the unsupervised disentanglement of hierarchical latent representation with controllable image synthesis.

IROS Conference 2021 Conference Paper

Meta Preference Learning for Fast User Adaptation in Human-Supervisory Multi-Robot Deployments

  • Chao Huang 0027
  • Wenhao Luo
  • Rui Liu

As multi-robot systems (MRS) are widely used in various tasks such as natural disaster response and social security, people enthusiastically expect an MRS to be ubiquitous and easily operated by a general user without heavy training. However, humans have various preferences for balancing task performance and safety, imposing different requirements on MRS control. Failing to comply with these preferences makes operation feel difficult and decreases human willingness to use an MRS. Therefore, to improve social acceptance as well as performance, there is an urgent need to adjust MRS behaviors according to human preferences before human corrections, which increase cognitive load, are triggered. In this paper, a novel Meta Preference Learning (MPL) method was developed to enable an MRS to quickly adapt to user preferences. MPL, based on a meta-learning mechanism, can quickly assess human preferences from limited instructions; then, a neural-network-based preference model adjusts MRS behaviors for preference adaptation. To validate the method's effectiveness, a task scenario "An MRS searches for victims in an earthquake disaster site" was designed; 20 human users were involved to identify preferences as {"aggressive", "medium", "reserved"}; based on user guidance and domain knowledge, about 20,000 preferences were simulated to cover different operations related to {"task quality", "task progress", "robot safety"}. The effectiveness of MPL in preference adaptation was validated by the reduced duration and frequency of human interventions.

NeurIPS Conference 2021 Conference Paper

SSMF: Shifting Seasonal Matrix Factorization

  • Koki Kawabata
  • Siddharth Bhatia
  • Rui Liu
  • Mohit Wadhwa
  • Bryan Hooi

Given taxi-ride counts information between departure and destination locations, how can we forecast their future demands? In general, given a data stream of events with seasonal patterns that innovate over time, how can we effectively and efficiently forecast future events? In this paper, we propose the Shifting Seasonal Matrix Factorization approach, namely SSMF, that can adaptively learn multiple seasonal patterns (called regimes), as well as switching between them. Our proposed method has the following properties: (a) it accurately forecasts future events by detecting regime shifts in seasonal patterns as the data stream evolves; (b) it works in an online setting, i.e., it processes each observation in constant time and memory; (c) it effectively realizes regime shifts without human intervention by using a lossless data compression scheme. We demonstrate that our algorithm outperforms state-of-the-art baseline methods by accurately forecasting upcoming events on three real-world data streams.

NeurIPS Conference 2020 Conference Paper

Adam with Bandit Sampling for Deep Learning

  • Rui Liu
  • Tianyi Wu
  • Barzan Mozafari

Adam is a widely used optimization method for training deep learning models. It computes individual adaptive learning rates for different parameters. In this paper, we propose a generalization of Adam, called Adambs, that allows us to also adapt to different training examples based on their importance in the model's convergence. To achieve this, we maintain a distribution over all examples, selecting a mini-batch in each iteration by sampling according to this distribution, which we update using a multi-armed bandit algorithm. This ensures that examples that are more beneficial to the model training are sampled with higher probabilities. We theoretically show that Adambs improves the convergence rate of Adam---$O(\sqrt{\frac{\log n}{T}})$ instead of $O(\sqrt{\frac{n}{T}})$ in some cases. Experiments on various models and datasets demonstrate Adambs's fast convergence in practice.

IJCAI Conference 2020 Conference Paper

HyperNews: Simultaneous News Recommendation and Active-Time Prediction via a Double-Task Deep Neural Network

  • Rui Liu
  • Huilin Peng
  • Yong Chen
  • Dell Zhang

Personalized news recommendation can help users stay on top of the current affairs without being overwhelmed by the endless torrents of online news. However, the freshness or timeliness of news has been largely ignored by current news recommendation systems. In this paper, we propose a novel approach dubbed HyperNews which explicitly models the effect of timeliness on news recommendation. Furthermore, we introduce an auxiliary task of predicting the so-called "active-time" that users spend on each news article. Our key finding is that it is beneficial to address the problem of news recommendation together with the related problem of active-time prediction in a multi-task learning framework. Specifically, we train a double-task deep neural network (with a built-in timeliness module) to carry out news recommendation and active-time prediction simultaneously. To the best of our knowledge, such a "kill-two-birds-with-one-stone" solution has seldom been tried in the field of news recommendation before. Our extensive experiments on real-life news datasets have not only confirmed the mutual reinforcement of news recommendation and active-time prediction but also demonstrated significant performance improvements over state-of-the-art news recommendation techniques.

TIST Journal 2020 Journal Article

STARS

  • Rui Liu
  • Runze Liu
  • Andrea Pugliese
  • V. S. Subrahmanian

Customers of virtually all online marketplaces rely upon reviews in order to select the product or service they wish to buy. These marketplaces in turn deploy review fraud detection systems so that the integrity of reviews is preserved. A well-known problem with review fraud detection systems is their underlying assumption that the majority of reviews are honest; this assumption leads to a vulnerability where an attacker can try to generate many fake reviews of a product. In this article, we consider the case where a company wishes to fraudulently promote its product through fake reviews and propose the Sockpuppet-based Targeted Attack on Reviewing Systems (STARS for short). STARS enables an attacker to enter fake reviews for a product from multiple, apparently independent, sockpuppet accounts. We show that the STARS attack enables companies to successfully promote their product against seven recent, well-known review fraud detectors on four datasets (Amazon, Epinions, and the BitcoinAlpha and OTC exchanges) by significant margins. To protect against the STARS attack, we propose a new fraud detection algorithm called RTV. RTV introduces a new class of users (called trusted users) and also considers reviews left by verified users which were not considered in existing review fraud detectors. We show that RTV significantly mitigates the impact of the STARS attack across the four datasets listed above.

AAAI Conference 2019 Conference Paper

A Bandit Approach to Maximum Inner Product Search

  • Rui Liu
  • Tianyi Wu
  • Barzan Mozafari

There has been substantial research on sub-linear time approximate algorithms for Maximum Inner Product Search (MIPS). To achieve fast query time, state-of-the-art techniques require significant preprocessing, which can be a burden when the number of subsequent queries is not sufficiently large to amortize the cost. Furthermore, existing methods do not have the ability to directly control the suboptimality of their approximate results with theoretical guarantees. In this paper, we propose the first approximate algorithm for MIPS that does not require any preprocessing, and allows users to control and bound the suboptimality of the results. We cast MIPS as a Best Arm Identification problem, and introduce a new bandit setting that can fully exploit the special structure of MIPS. Our approach outperforms state-of-the-art methods on both synthetic and real-world datasets.

AAMAS Conference 2019 Conference Paper

Trust-Aware Behavior Reflection for Robot Swarm Self-Healing

  • Rui Liu
  • Fan Jia
  • Wenhao Luo
  • Meghan Chandarana
  • Changjoo Nam
  • Michael Lewis
  • Katia Sycara

The deployment of robot swarms is influenced by real-world factors, such as motor issues, sensor failure, and wind disturbances. These factors cause the appearance of faulty robots. In a decentralized swarm, sharing incorrect information from faulty robots will lead to undesired swarm behaviors, such as swarm disconnection and incorrect heading directions. We envision a system where a human operator is exerting supervisory control over a remote swarm by indicating changes in trust to the swarm via a "trust-signal". By correcting faulty behaviors, trust between the human and the swarm is maintained to facilitate human-swarm cooperation. In this research, a trust-aware behavior reflection method – Trust-R – is designed based on a weighted mean subsequence reduced algorithm (WMSR). By using Trust-R, detected faulty behaviors are automatically corrected by the swarm in a decentralized fashion by referring to the motion status of their trusted neighbors and isolating failed robots from the others. Based on real-world scenarios, three types of robot faults – degraded performance caused by motor wear, abnormal motion caused by system uncertainty and motion deviation caused by an external disturbance such as wind – were simulated to test the effectiveness of Trust-R. Results show that Trust-R is effective in correcting swarm behaviors for swarm self-healing.

IJCAI Conference 2019 Conference Paper

VEST: A System for Vulnerability Exploit Scoring & Timing

  • Haipeng Chen
  • Jing Liu
  • Rui Liu
  • Noseong Park
  • V. S. Subrahmanian

Knowing if/when a cyber-vulnerability will be exploited and how severe the vulnerability is can help enterprise security officers (ESOs) come up with appropriate patching schedules. Today, this ability is severely compromised: our study of data from Mitre and NIST shows that on average there is a 132-day gap between the announcement of a vulnerability by Mitre and the time NIST provides an analysis with severity score estimates and 8 important severity attributes. Many attacks happen during this very 132-day window. We present Vulnerability Exploit Scoring & Timing (VEST), a system for (early) prediction and visualization of if/when a vulnerability will be exploited, and its estimated severity attributes and score.

IJCAI Conference 2018 Conference Paper

Discrete Factorization Machines for Fast Feature-based Recommendation

  • Han Liu
  • Xiangnan He
  • Fuli Feng
  • Liqiang Nie
  • Rui Liu
  • Hanwang Zhang

User and item features of side information are crucial for accurate recommendation. However, the large number of feature dimensions, e.g., usually larger than 10^7, results in expensive storage and computational cost. This prohibits fast recommendation especially on mobile applications where the computational resource is very limited. In this paper, we develop a generic feature-based recommendation model, called Discrete Factorization Machine (DFM), for fast and accurate recommendation. DFM binarizes the real-valued model parameters (e.g., float32) of every feature embedding into binary codes (e.g., boolean), and thus supports efficient storage and fast user-item score computation. To avoid the severe quantization loss of the binarization, we propose a convergent updating rule that resolves the challenging discrete optimization of DFM. Through extensive experiments on two real-world datasets, we show that 1) DFM consistently outperforms state-of-the-art binarized recommendation models, and 2) DFM shows very competitive performance compared to its real-valued version (FM), demonstrating the minimized quantization loss.

IS Journal 2018 Journal Article

UIO-Based Testbed Augmentation for Simulating Cyber-Physical Systems

  • Rui Liu
  • Junbin Liang
  • Jiannong Cao
  • Weiping Zhu
  • Sebastian VanSyckel
  • Christian Becker

Cyber-physical systems (CPSs) are rapidly expanding into our daily lives. The development process for such systems is challenging due to the need of creating and testing numerous and heterogeneous devices in differing environments. Simulation systems can aid the development, but many open challenges remain. First, creating devices is cumbersome due to the lack of proper supporting tools. Second, testing is scenario specific and mostly small scale. Third, it's difficult to integrate physical devices and small systems in order to increase scale and verifiability. We propose a system that addresses these issues, and demonstrate its benefits in augmenting our intelligent transportation system testbed. For this, we present ubiquitous interactive objects (UIOs) as a uniform model for smart devices, and support dynamic adding to, deleting from, and updating of the simulation. The case study shows that our UIO model and simulation system ease the development of CPSs, while facilitating better applications through dependable results.

AAMAS Conference 2016 Conference Paper

An Adaptive Learning Framework for Efficient Emergence of Social Norms (Extended Abstract)

  • Chao Yu
  • Hongtao Lv
  • Sandip Sen
  • Jianye Hao
  • Fenghui Ren
  • Rui Liu

This paper investigates how norm emergence can be facilitated by agents’ adaptive learning behaviors. A general learning framework is proposed, in which agents can dynamically adapt their learning behaviors through social learning of their individual learning experience. Experimental results indicate that the proposed framework outperforms the static learning framework in various comparison criteria.

AIIM Journal 2016 Journal Article

Web-video-mining-supported workflow modeling for laparoscopic surgeries

  • Rui Liu
  • Xiaoli Zhang
  • Hao Zhang

Motivation: As quality assurance is of strong concern in advanced surgeries, intelligent surgical systems are expected to have knowledge such as the knowledge of the surgical workflow model (SWM) to support their intuitive cooperation with surgeons. For generating a robust and reliable SWM, a large amount of training data is required. However, training data collected by physically recording surgery operations is often limited, and data collection is time-consuming and labor-intensive, severely influencing the knowledge scalability of surgical systems. Objective: The objective of this research is to solve the knowledge scalability problem in surgical workflow modeling in a low-cost and labor-efficient way. Methods: A novel web-video-mining-supported surgical workflow modeling (webSWM) method is developed. A novel video quality analysis method based on topic analysis and sentiment analysis techniques is developed to select high-quality videos from abundant and noisy web videos. A statistical learning method is then used to build the workflow model based on the selected videos. To test the effectiveness of the webSWM method, 250 web videos were mined to generate a surgical workflow for the robotic cholecystectomy surgery. The generated workflow was evaluated by 4 web-retrieved videos and 4 operation-room-recorded videos, respectively. Results: The evaluation results (video selection consistency n-index ≥0.60; surgical workflow matching degree ≥0.84) proved the effectiveness of the webSWM method in generating robust and reliable SWM knowledge by mining web videos. Conclusion: With the webSWM method, abundant web videos were selected and a reliable SWM was modeled in a short time with low labor cost. Satisfactory performance in mining web videos and learning surgery-related knowledge shows that the webSWM method is promising for scaling knowledge for intelligent surgical systems.