Arrow Research search

Author name cluster

Tao Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

61 papers
2 author rows

Possible papers

61

AAAI Conference 2026 Conference Paper

Realistic Curriculum Reinforcement Learning for Autonomous and Sustainable Marine Vessel Navigation

  • Xiaocai Zhang
  • Zhe Xiao
  • Maohan Liang
  • Tao Liu
  • Haijiang Li
  • Wenbin Zhang

Sustainability is becoming increasingly critical in maritime transport, encompassing both environmental and social impacts such as Greenhouse Gas (GHG) emissions and navigational safety. Traditional vessel navigation relies heavily on human experience, often lacking autonomy and emission awareness, and is prone to human errors that may compromise safety. In this paper, we propose a Curriculum Reinforcement Learning (CRL) framework integrated with a realistic, data-driven marine simulation environment and a machine learning-based fuel consumption prediction module. The simulation environment is constructed from real-world vessel movement data and enhanced with a Diffusion Model to simulate dynamic maritime conditions. Vessel fuel consumption is estimated using historical operational data and learning-based regression. The surrounding environment is represented as image-based inputs to capture spatial complexity. We design a lightweight, policy-based CRL agent with a comprehensive reward mechanism that considers safety, emissions, timeliness, and goal completion. This framework handles complex tasks progressively while ensuring stable and efficient learning in continuous action spaces. We validate the proposed approach in a sea area of the Indian Ocean, demonstrating its efficacy in enabling sustainable and safe vessel navigation.
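
The multi-term reward described above can be sketched as a weighted sum of penalty and bonus terms; the term definitions, signatures, and weights below are illustrative assumptions, not the paper's actual reward design.

```python
def navigation_reward(collision_risk, fuel_rate, elapsed, budget,
                      reached_goal, w=(1.0, 0.5, 0.2, 10.0)):
    """Combine safety, emission, timeliness, and goal terms into one scalar.

    All terms and weights here are hypothetical illustrations of the
    'safety + emissions + timeliness + goal completion' structure.
    """
    w_safe, w_emit, w_time, w_goal = w
    r = -w_safe * collision_risk                        # penalize proximity to traffic
    r -= w_emit * fuel_rate                             # penalize fuel burn (GHG proxy)
    r -= w_time * max(0.0, elapsed - budget) / budget   # penalize lateness only
    if reached_goal:
        r += w_goal                                     # sparse completion bonus
    return r
```

Keeping each term bounded and separately weighted makes it easy to re-balance safety against emissions during curriculum stages.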

JBHI Journal 2026 Journal Article

Whisperization and Masked CycleGAN-Based Framework for Electrolaryngeal Speech Enhancement

  • Jie Zhou
  • Li Wang
  • Fengji Li
  • Shaochuan Zhang
  • Fan Fan
  • Tao Liu
  • Xiaohong Chen
  • Haijun Niu

The electrolarynx (EL) provides an effective approach to voice rehabilitation for patients with phonation disorders. However, because it relies on an external mechanical source, EL speech suffers from limited acoustic cues, which degrades quality and restricts subsequent modeling and enhancement. This paper proposes a novel EL speech enhancement framework that combines whisperization with a Masked CycleGAN model. The whisperization step removes redundant constant excitation and mechanical noise, generating an intermediate speech form, whisper-like EL (W-EL) speech, whose acoustic and perceptual properties are closer to natural whisper. Subsequently, the Masked CycleGAN employs a frame-level masking strategy to guide the generator in reconstructing missing prosodic and linguistic features. We thus achieve a dual-stage enhancement of "redundancy removal" and "deficiency compensation." Acoustic feature analysis demonstrates that the converted W-EL speech is more similar to normal speech in terms of spectrogram, fundamental frequency (F0) values, and F0 contours, while also compensating for the missing low-frequency energy below 500 Hz. Objective evaluations show significant improvements across multiple metrics. Subjective evaluations confirm that W-EL speech exhibits higher naturalness and intelligibility than original EL speech. Moreover, the combined "whisperization + voice conversion" framework further enhances perceptual quality. This study not only offers a novel pathway for EL speech enhancement but may also provide valuable insights for improving other types of pathological speech.

JBHI Journal 2025 Journal Article

Addressing Multiple Challenges in Early Gait Freezing Prediction for Parkinson's Disease: A Practical Deep Learning Approach

  • Wenan Wang
  • Jingfeng Lin
  • Xinning Le
  • Yaru Li
  • Tao Liu
  • Lunxin Pan
  • Min Li
  • Dezhong Yao

Objective: Freezing of Gait (FOG) significantly impacts the daily activities of Parkinson's disease (PD) patients. Despite the potential of wearable sensors in predicting FOG, challenges persist, including the brief prediction interval before FOG onset, limited generalization across patients, and the inconvenience of multiple sensors. Addressing one issue often aggravates the others, making it difficult to solve all these challenges concurrently. Methods: We introduce the PhysioGait Predictive Network (PhysioGPN), a deep learning framework designed to predict FOG events in PD patients at least 2 seconds prior to onset. The model architecture incorporates four key strategies: 1) detecting progressive motion changes using large convolutional kernels; 2) unraveling the complexity of motion coordination and gait dynamics using multi-dimensional and multi-scale convolution; 3) capturing gait self-similarity and asymmetry with a twin-tower structure; and 4) promoting cross-domain information exchange with multi-domain attention. Furthermore, we propose a framework based on knowledge distillation (KD), reducing the model's dependence on multiple sensors while maintaining prediction accuracy. Results: The model achieves an 85.8% Area Under the Curve (AUC) in FOG prediction. When the number of sensors is reduced, KD mitigates the decline in performance and increases the AUC by 5.1% compared to scenarios without KD. Conclusion: Our research proposes a practical solution to the challenges of FOG prediction, demonstrating the effectiveness of the KD approach for lightweight wearable sensors in rehabilitation engineering. Significance: Our findings offer valuable insights for addressing multiple challenges in the practical application of wearable devices.
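
The knowledge-distillation step (a full-sensor teacher guiding a reduced-sensor student) can be illustrated with a generic Hinton-style soft-label loss; the temperature, interface, and absence of a hard-label term below are assumptions, since the abstract does not specify the exact objective.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax over a list of logits."""
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened outputs.

    A generic KD loss sketch, not the paper's exact objective.
    The T*T factor keeps gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q)) * T * T
```

The loss is zero when the student matches the teacher exactly and grows as their softened distributions diverge.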

IROS Conference 2025 Conference Paper

ContextCache: Task-Aware Lifecycle Management for Memory-Efficient LLM Agent Deployment

  • Tao Liu
  • Ping Guo
  • Dong Feng
  • Peng Wang

LLM-based agents have demonstrated remarkable capabilities in multi-step reasoning and task execution across domains such as robotics and autonomous systems. However, deploying these agents on resource-constrained platforms presents a fundamental challenge: minimizing latency while optimizing memory usage. Existing caching techniques (KVCache, PrefixCache, PromptCache) improve inference speed by reusing cached context but overlook LLM dependency relationships in agent workflows, leading to excessive memory usage or redundant recomputation across LLM calls. To address this, we propose ContextCache, a task-aware lifecycle management framework that optimizes context fragment caching for multi-step LLM agents. ContextCache predicts the lifespan of each context fragment and dynamically allocates and releases GPU memory accordingly. We evaluate our approach on a newly constructed dataset, covering logistics coordination, assembly tasks, and health management. Experimental results demonstrate a 15% reduction in memory usage compared to state-of-the-art caching strategies, with no loss in inference efficiency, making our approach well-suited for real-world deployment in resource-constrained environments.
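
The lifespan-driven allocate-and-release idea can be sketched as a toy cache keyed by context fragment, where each entry carries a predicted last-use step; the `put`/`get`/`advance` interface and the eviction rule are my own illustration, not the actual ContextCache design.

```python
class LifespanCache:
    """Toy lifecycle manager for agent context fragments.

    Each cached fragment carries a predicted last-use step and is
    released once that step has passed, freeing memory that exact-LRU
    caching would hold onto. Purely illustrative of the idea.
    """
    def __init__(self):
        self._store = {}  # fragment_id -> (kv_state, expires_at_step)

    def put(self, fragment_id, kv_state, expires_at_step):
        self._store[fragment_id] = (kv_state, expires_at_step)

    def get(self, fragment_id):
        entry = self._store.get(fragment_id)
        return entry[0] if entry else None  # None signals recomputation

    def advance(self, step):
        """Release fragments whose predicted lifespan ended before `step`."""
        expired = [k for k, (_, exp) in self._store.items() if exp < step]
        for k in expired:
            del self._store[k]
        return expired
```

A workflow planner would call `advance` between LLM calls so short-lived fragments (e.g. a finished subtask's context) stop occupying GPU memory.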

ECAI Conference 2025 Conference Paper

Degree of Staleness-Aware Data Updating in Federated Learning

  • Tao Liu
  • Xuehe Wang

Handling data staleness remains a significant challenge in federated learning with highly time-sensitive tasks, where data is generated continuously and staleness strongly affects model performance. Although recent works attempt to mitigate data staleness by tuning the local data update frequency or the client selection strategy, none of them take both data staleness and data volume into consideration. In this paper, we propose Data Updating in Federated Learning (DUFL), an incentive mechanism featuring an innovative local data update scheme controlled by three knobs: the server's payment, the outdated data conservation rate, and each client's fresh data collection volume, which together coordinate the staleness and volume of local data for optimal utility. To this end, we introduce a novel metric called Degree of Staleness (DoS) to quantify data staleness and conduct a theoretical analysis of the quantitative relationship between DoS and model performance. We model DUFL as a two-stage Stackelberg game with a dynamic constraint, deriving the optimal local data update strategy for each client in closed form and an approximately optimal strategy for the server. Experimental results on real-world datasets demonstrate the strong performance of our approach.

JBHI Journal 2025 Journal Article

Extraction of Fetal ECG by Logarithmic Hyperbolic Secant Adaptive Algorithm in Alpha-Stable Noise

  • Mengjia Wang
  • Deqiu Zhai
  • Jiacheng Zhang
  • Bo Ni
  • Tao Liu

Direct fetal electrocardiogram (FECG) plays a crucial role in assessing fetal health and monitoring pregnancy conditions. Extracting high-quality FECG signals from maternal abdominal electrocardiogram (AECG) recordings remains a significant challenge due to the low amplitude of the FECG, its overlap with the maternal electrocardiogram (MECG), and potential exposure to impulsive noise in the real world. Adaptive filtering (AF) is an essential method for FECG extraction; however, its performance tends to degrade in the presence of impulsive noise, such as instrument interference. To address this limitation, we propose a novel AF algorithm based on a nonlinear logarithmic hyperbolic secant (LHS) cost function. The alpha-stable distribution is adopted to model realistic noise due to its flexibility in capturing heavy-tailed, impulsive behavior. To further enhance extraction accuracy and optimize the preset parameters, we introduce a hyperbolic tangent-like transformation and develop the improved logarithmic hyperbolic secant adaptive filtering (ILHSAF) algorithm. The proposed approach leverages the approximately linear interval of the LHS function to maximize the preservation of original FECG information within the AECG. We use the synthetic dataset FECGSYN as well as two real datasets, Daisy and NI-FECG, to evaluate performance; our methods outperform other existing AF algorithms. The ILHSAF algorithm exhibits commendable performance in R-peak detection and full-wave analysis on both real-world datasets, indicating its effective denoising capability and robustness in FECG extraction. This advancement establishes a foundation for long-term maternal and fetal monitoring using portable devices, as the proposed algorithms are capable of real-time operation.
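
The robustness argument can be made concrete with a minimal LMS-style update under a log-hyperbolic-secant cost: since J(e) = -log(sech(e)) = log(cosh(e)) has the bounded gradient tanh(e), a single impulsive sample cannot produce an unbounded weight jump, unlike squared-error LMS. The recursion below is a simplified sketch under that cost, not the paper's ILHSAF algorithm.

```python
import math

def lhs_lms_step(w, x, d, mu=0.05):
    """One update of an LMS-style filter with a log-hyperbolic-secant cost.

    Minimizing J(e) = log(cosh(e)) gives the bounded gradient tanh(e),
    which caps the influence of impulsive (e.g. alpha-stable) noise
    samples. Simplified sketch, not the actual ILHSAF recursion.
    """
    e = d - sum(wi * xi for wi, xi in zip(w, x))     # a priori error
    g = math.tanh(e)                                  # bounded influence function
    return [wi + mu * g * xi for wi, xi in zip(w, x)], e
```

For small errors tanh(e) ≈ e, so the filter behaves like standard LMS near convergence while clipping the effect of outliers.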

NeurIPS Conference 2025 Conference Paper

From Cradle to Cane: A Two-Pass Framework for High-Fidelity Lifespan Face Aging

  • Tao Liu
  • Dafeng Zhang
  • Gengchen Li
  • Shizhuo Liu
  • Yongqi Song
  • Senmao Li
  • Shiqi Yang
  • Boqian Li

Face aging has become a crucial task in computer vision, with applications ranging from entertainment to healthcare. However, existing methods struggle to achieve a realistic and seamless transformation across the entire lifespan, especially when handling large age gaps or extreme head poses. The core challenge lies in balancing age accuracy and identity preservation, which we refer to as the Age-ID trade-off. Most prior methods either prioritize age transformation at the expense of identity consistency or vice versa. In this work, we address this issue by proposing a two-pass face aging framework, named Cradle2Cane, based on few-step text-to-image (T2I) diffusion models. The first pass focuses on age accuracy by introducing an adaptive noise injection (AdaNI) mechanism, guided by textual prompt descriptions of the given person's age and gender. By adjusting the noise level, we can control the strength of aging while allowing more flexibility in transforming the face; identity preservation is only weakly enforced here to facilitate stronger age transformations. In the second pass, we enhance identity preservation while maintaining age-specific features by conditioning the model on two identity-aware embeddings (IDEmb): SVR-ArcFace and Rotate-CLIP. This pass denoises the transformed image from the first pass, ensuring stronger identity preservation without compromising aging accuracy. Both passes are jointly trained end-to-end. Extensive experiments on the CelebA-HQ test dataset, evaluated through Face++ and Qwen-VL protocols, show that Cradle2Cane outperforms existing face aging methods in age accuracy and identity consistency. Additionally, Cradle2Cane demonstrates superior robustness on in-the-wild human face images, where prior methods often fail, significantly broadening its applicability to diverse and unconstrained real-world scenarios. Code is available at https://github.com/byliutao/Cradle2Cane.

ICRA Conference 2025 Conference Paper

GS-EVT: Cross-Modal Event Camera Tracking Based on Gaussian Splatting

  • Tao Liu
  • Runze Yuan
  • Yi'ang Ju
  • Xun Xu
  • Jiaqi Yang
  • Xiangting Meng
  • Xavier Lagorce
  • Laurent Kneip

Reliable self-localization is a foundational skill for many intelligent mobile platforms. This paper explores the use of event cameras for motion tracking, thereby providing a solution with inherent robustness under difficult dynamics and illumination. To circumvent the challenge of event camera-based mapping, the solution is framed in a cross-modal way: it tracks a map representation that comes directly from frame-based cameras. Specifically, the proposed method operates on top of Gaussian splatting, a state-of-the-art representation that permits highly efficient and realistic novel view synthesis. The key to our approach is a novel pose parametrization that uses a reference pose plus first-order dynamics for local differential image rendering. The latter is then compared against images of integrated events in a staggered coarse-to-fine optimization scheme. As demonstrated by our results, the realistic view rendering ability of Gaussian splatting leads to stable and accurate tracking across a variety of both publicly available and newly recorded data sequences.

NeurIPS Conference 2025 Conference Paper

GUI-Rise: Structured Reasoning and History Summarization for GUI Navigation

  • Tao Liu
  • Chongyu Wang
  • Rongjie Li
  • Yingchen Yu
  • Xuming He
  • Song Bai

While Multimodal Large Language Models (MLLMs) have advanced GUI navigation agents, current approaches face limitations in cross-domain generalization and effective history utilization. We present a reasoning-enhanced framework that systematically integrates structured reasoning, action prediction, and history summarization. The structured reasoning component generates coherent Chain-of-Thought analyses combining progress estimation and decision reasoning, which inform both immediate action predictions and compact history summaries for future steps. Based on this framework, we train a GUI agent, GUI-Rise, through supervised fine-tuning on pseudo-labeled trajectories and reinforcement learning with Group Relative Policy Optimization (GRPO). This framework employs specialized rewards, including a history-aware objective, directly linking summary quality to subsequent action performance. Comprehensive evaluations on standard benchmarks demonstrate state-of-the-art results under identical training data conditions, with particularly strong performance in out-of-domain scenarios. These findings validate our framework's ability to maintain robust reasoning and generalization across diverse GUI navigation tasks.

ICLR Conference 2025 Conference Paper

One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt

  • Tao Liu
  • Kai Wang 0060
  • Senmao Li
  • Joost van de Weijer 0001
  • Fahad Shahbaz Khan
  • Shiqi Yang 0002
  • Yaxing Wang
  • Jian Yang 0003

Text-to-image generation models can create high-quality images from input prompts. However, they struggle to generate identity-consistent characters for storytelling. Existing approaches to this problem typically require extensive training on large datasets or modifications to the original model architecture, which limits their applicability across different domains and diverse diffusion model configurations. In this paper, we first observe an inherent capability of language models, which we coin context consistency: comprehending identity through context within a single prompt. Drawing inspiration from this context consistency, we propose a novel training-free method for consistent text-to-image (T2I) generation, termed "One-Prompt-One-Story" (1Prompt1Story). Our approach concatenates all prompts into a single input for T2I diffusion models, initially preserving character identities. We then refine the generation process using two novel techniques, Singular-Value Reweighting and Identity-Preserving Cross-Attention, ensuring better alignment with the input description for each frame. In our experiments, we compare our method against various existing consistent T2I generation approaches through quantitative metrics and qualitative assessments to demonstrate its effectiveness. Code is available at https://github.com/byliutao/1Prompt1Story.

JBHI Journal 2025 Journal Article

PPA Net: The Pixel Prediction Assisted Net for 3D TOF-MRA Cerebrovascular Segmentation

  • Zhiqi Lee
  • Tao Liu
  • Haonan Zhang
  • Xiang Zhang
  • Xuan Li
  • Yizhen Pan
  • Tingting Wu
  • Jierui Ding

Cerebrovascular segmentation is essential for diagnosing and treating cerebrovascular diseases. However, accurately segmenting cerebral vessels in TOF-MRA remains challenging due to significant interindividual variations in cerebrovascular morphology, low image contrast, and class imbalance. The present study proposes an advanced deep learning model called PPA Net, consisting of VesselMRA Net and VesselConvLSTM components. Firstly, VesselMRA Net utilizes rectangular convolutional blocks to fuse multi-scale features, enhancing feature extraction performance. VesselMRA Net also employs an attention mechanism to boost valuable semantic weighting, addressing segmentation challenges arising from class imbalance and low contrast. Secondly, VesselConvLSTM, a pixel-level prediction model, employs a gating mechanism to learn cerebral vessel morphology across individuals. It reduces individual differences in segmentation and restores inter-voxel correlations disrupted by data slicing, aiding VesselMRA Net in accurately segmenting cerebrovascular pixels. Lastly, integrating VesselMRA Net and VesselConvLSTM results in a modular cerebral vessel segmentation framework, PPA Net, facilitating separate optimization of the backbone network and prediction model components. The model's performance has been extensively validated through experimental evaluations on three publicly available datasets, achieving strong competitiveness compared with state-of-the-art cerebral vessel segmentation models.

AAAI Conference 2025 Conference Paper

Relation-aware Hierarchical Prompt for Open-vocabulary Scene Graph Generation

  • Tao Liu
  • Rongjie Li
  • Chongyu Wang
  • Xuming He

Open-vocabulary Scene Graph Generation (OV-SGG) overcomes the limitations of the closed-set assumption by aligning visual relationship representations with open-vocabulary textual representations. This enables the identification of novel visual relationships, making it applicable to real-world scenarios with diverse relationships. However, existing OV-SGG methods are constrained by fixed text representations, limiting diversity and accuracy in image-text alignment. To address these challenges, we propose the Relation-Aware Hierarchical Prompting (RAHP) framework, which enhances text representation by integrating subject-object and region-specific relation information. Our approach utilizes entity clustering to address the complexity of relation triplet categories, enabling the effective integration of subject-object information. Additionally, we utilize a large language model (LLM) to generate detailed region-aware prompts, capturing fine-grained visual interactions and improving alignment between visual and textual modalities. RAHP also introduces a dynamic selection mechanism within Vision-Language Models (VLMs), which adaptively selects relevant text prompts based on the visual content, reducing noise from irrelevant prompts. Extensive experiments on the Visual Genome and Open Images v6 datasets show that our framework consistently achieves state-of-the-art performance, demonstrating its effectiveness in addressing the challenges of open-vocabulary scene graph generation.

AAAI Conference 2025 Conference Paper

VQTalker: Towards Multilingual Talking Avatars Through Facial Motion Tokenization

  • Tao Liu
  • Ziyang Ma
  • Qi Chen
  • Feilong Chen
  • Shuai Fan
  • Xie Chen
  • Kai Yu

We present VQTalker, a Vector Quantization-based framework for multilingual talking head generation that addresses the challenges of lip synchronization and natural motion across diverse languages. Our approach is grounded in the phonetic principle that human speech comprises a finite set of distinct sound units (phonemes) and corresponding visual articulations (visemes), which often share commonalities across languages. We introduce a facial motion tokenizer based on Group Residual Finite Scalar Quantization (GRFSQ), which creates a discretized representation of facial features. This method enables comprehensive capture of facial movements while improving generalization to multiple languages, even with limited training data. Building on this quantized representation, we implement a coarse-to-fine motion generation process that progressively refines facial animations. Extensive experiments demonstrate that VQTalker achieves state-of-the-art performance in both video-driven and speech-driven scenarios, particularly in multilingual settings. Notably, our method achieves high-quality results at a resolution of 512 × 512 pixels while maintaining a lower bitrate of approximately 11 kbps. Our work opens new possibilities for cross-lingual talking face generation.
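
The scalar-quantization building block can be sketched as plain FSQ (bound each latent dimension, then round it to a small set of uniform levels); GRFSQ's grouping and residual stages are omitted, and the level count here is an arbitrary assumption.

```python
import math

def fsq(z, levels=5):
    """Finite Scalar Quantization of one latent vector.

    Each dimension is squashed into (-1, 1) with tanh, then rounded to
    one of `levels` uniformly spaced values. A minimal FSQ sketch;
    VQTalker's GRFSQ adds grouping and residual stages not shown here.
    """
    half = (levels - 1) / 2
    return [round(math.tanh(v) * half) / half for v in z]
```

Because every dimension uses the whole level grid, the codebook is implicit (levels ** dim entries) and never suffers the dead-code problem of learned VQ codebooks.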

JBHI Journal 2024 Journal Article

Ankle Moment Estimation Based on A Novel Distributed Plantar Pressure Sensing System

  • Mingyu Du
  • Bowen Lv
  • Bingfei Fan
  • Xiaoling Li
  • Junze Yu
  • Fugang Yi
  • Tao Liu
  • Shibo Cai

Ankle moment plays an important role in human gait analysis, monitoring of patients' rehabilitation, and human-machine interaction control of exoskeleton robots. However, current ankle moment estimation methods mainly rely on inverse dynamics (ID) based on an optical motion capture system (OMC) and force plate. These methods depend on fixed laboratory instruments, which are difficult to apply to the control of exoskeleton robots. To solve this problem, this paper develops a new distributed plantar pressure system and proposes an ankle plantar flexion moment estimation method based on it. We integrated eight pressure sensors into each insole to collect pressure data from key areas of the foot and then used the plantar pressure data to train four neural networks to estimate the ankle moment. Model performance was evaluated using the normalized root mean square error (NRMSE) and the cross-correlation coefficient (ρ). During experiments, eight subjects were recruited for overground walking tests, with OMC and a force plate used as the gold standard. The results indicate that the Genetic Algorithm-Gated Recurrent Unit (GA-GRU) estimation algorithm was the best model, achieving the highest accuracy in generalized ankle moment estimation (NRMSE = 7.23%, ρ = 0.85) compared with the other models. The designed distributed plantar pressure system and the proposed method could serve as a joint moment estimation approach for wearable robot control and human motion state monitoring.
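
The two reported evaluation metrics are standard and easy to reproduce; the range normalization used for NRMSE below is an assumption, as the abstract does not state the paper's exact definition.

```python
import math

def nrmse(est, ref):
    """RMSE normalized by the reference's peak-to-peak range, in percent.

    The normalization choice (range of the reference signal) is an
    assumption; other conventions divide by the mean or the std.
    """
    rmse = math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(ref))
    return 100.0 * rmse / (max(ref) - min(ref))

def xcorr(est, ref):
    """Zero-lag Pearson cross-correlation coefficient (rho)."""
    n = len(ref)
    me, mr = sum(est) / n, sum(ref) / n
    num = sum((e - me) * (r - mr) for e, r in zip(est, ref))
    den = math.sqrt(sum((e - me) ** 2 for e in est) *
                    sum((r - mr) ** 2 for r in ref))
    return num / den
```

Note that ρ ignores amplitude scaling (a rescaled estimate still scores 1.0), which is why both metrics are reported together.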

AAAI Conference 2024 Conference Paper

Beyond Traditional Threats: A Persistent Backdoor Attack on Federated Learning

  • Tao Liu
  • Yuhang Zhang
  • Zhu Feng
  • Zhiqin Yang
  • Chen Xu
  • Dapeng Man
  • Wu Yang

Backdoors in federated learning are diluted by subsequent benign updates, reflected in a significant reduction of the attack success rate as iterations increase until the attack ultimately fails. We use a new metric, attack persistence, to quantify the degree of this weakened backdoor effect. Given that improving this persistence has received little attention, we propose a Full Combination Backdoor Attack (FCBA) method. It aggregates more combined trigger information to form a more complete backdoor pattern in the global model. The resulting backdoored global model is more resilient to benign updates, leading to a higher attack success rate on the test set. We test on three datasets and evaluate with two models across various settings. FCBA's persistence outperforms state-of-the-art federated learning backdoor attacks. On GTSRB, 120 rounds after the attack, our attack success rate remained more than 50% above the baseline. The core code of our method is available at https://github.com/PhD-TaoLiu/FCBA.
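
The "full combination" idea can be sketched as enumerating every non-empty subset of a trigger's components, so poisoned training covers all partial patterns rather than only the complete trigger; the function below is my own illustration, and the actual FCBA trigger construction may differ.

```python
from itertools import combinations

def full_trigger_combinations(components):
    """Enumerate every non-empty subset of trigger components.

    Sketch of the 'full combination' idea: a backdoor trained on all
    2^n - 1 subsets of a trigger pattern retains partial patterns even
    as benign updates erode parts of it. Illustrative only.
    """
    subsets = []
    for r in range(1, len(components) + 1):
        subsets.extend(combinations(components, r))
    return subsets
```

For n trigger components this yields 2^n - 1 poisoning patterns, which is why the method is only practical for small, localized trigger pieces.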

IROS Conference 2024 Conference Paper

EVIT: Event-based Visual-Inertial Tracking in Semi-Dense Maps Using Windowed Nonlinear Optimization

  • Runze Yuan
  • Tao Liu
  • Zijia Dai
  • Yi-Fan Zuo
  • Laurent Kneip

Event cameras are interesting visual exteroceptive sensors that react to brightness changes rather than integrating absolute image intensities. Owing to this design, the sensor exhibits strong performance in situations of challenging dynamics and illumination conditions. While event-based simultaneous tracking and mapping remains a challenging problem, a number of recent works have pointed out the sensor's suitability for prior map-based tracking. By making use of cross-modal registration paradigms, the camera's ego-motion can be tracked across a large spectrum of illumination and dynamics conditions on top of accurate maps that have been created a priori by more traditional sensors. The present paper follows up on a recently introduced event-based geometric semi-dense tracking paradigm and proposes the addition of inertial signals in order to robustify the estimation. More specifically, the added signals provide strong cues for pose initialization as well as regularization during windowed, multi-frame tracking. As a result, the proposed framework achieves increased performance under challenging illumination conditions as well as a reduction of the rate at which intermediate event representations need to be registered in order to maintain stable tracking across highly dynamic sequences. Our evaluation focuses on a diverse set of real-world sequences and comprises a comparison of our proposed method against a purely event-based alternative running at different rates.

NeurIPS Conference 2024 Conference Paper

Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference

  • Senmao Li
  • Taihang Hu
  • Joost van de Weijer
  • Fahad S. Khan
  • Tao Liu
  • Linxuan Li
  • Shiqi Yang
  • Yaxing Wang

One of the main drawbacks of diffusion models is their slow inference time for image generation. Among the most successful approaches to addressing this problem are distillation methods; however, these methods require considerable computational resources. In this paper, we take another approach to diffusion model acceleration. We conduct a comprehensive study of the UNet encoder and empirically analyze the encoder features, providing insights into how they change during the inference process. In particular, we find that encoder features change minimally, whereas decoder features exhibit substantial variations across different time-steps. This insight motivates us to omit encoder computation at certain adjacent time-steps and reuse encoder features of previous time-steps as input to the decoder across multiple time-steps. Importantly, this allows us to perform decoder computation in parallel, further accelerating the denoising process. Additionally, we introduce a prior noise injection method to improve the texture details in the generated image. Besides the standard text-to-image task, we also validate our approach on other tasks: text-to-video, personalized generation, and reference-guided generation. Without utilizing any knowledge distillation technique, our approach accelerates sampling of the Stable Diffusion (SD) and DeepFloyd-IF models by 41% and 24% respectively, and of the DiT model by 34%, while maintaining high-quality generation performance. Our code will be publicly released.
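
The feature-reuse schedule can be sketched as a denoising loop that runs the encoder only on selected time-steps and feeds cached features to the decoder otherwise; `encode`/`decode` are stand-ins for the UNet halves, and the fixed every-other-step schedule is an assumption (the paper selects which adjacent steps to skip empirically).

```python
def denoise_with_encoder_reuse(encode, decode, x, timesteps, reuse_every=2):
    """Denoising loop that recomputes the encoder only every
    `reuse_every` steps and reuses cached features in between.

    `encode(x, t)` and `decode(features, x, t)` stand in for the UNet
    halves; the schedule and interface are illustrative assumptions.
    """
    cached = None
    for i, t in enumerate(timesteps):
        if cached is None or i % reuse_every == 0:
            cached = encode(x, t)     # full encoder pass, refresh cache
        x = decode(cached, x, t)      # decoder runs at every step
    return x
```

With `reuse_every=2`, half of the encoder passes are skipped, matching the observation that encoder features barely change between adjacent time-steps.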

NeurIPS Conference 2023 Conference Paper

Natural Actor-Critic for Robust Reinforcement Learning with Function Approximation

  • Ruida Zhou
  • Tao Liu
  • Min Cheng
  • Dileep Kalathil
  • P. R. Kumar
  • Chao Tian

We study robust reinforcement learning (RL) with the goal of determining a well-performing policy that is robust against model mismatch between the training simulator and the testing environment. Previous policy-based robust RL algorithms mainly focus on the tabular setting under uncertainty sets that facilitate robust policy evaluation, but are no longer tractable when the number of states scales up. To this end, we propose two novel uncertainty set formulations, one based on double sampling and the other on an integral probability metric. Both make large-scale robust RL tractable even when one only has access to a simulator. We propose a robust natural actor-critic (RNAC) approach that incorporates the new uncertainty sets and employs function approximation. We provide finite-time convergence guarantees for the proposed RNAC algorithm to the optimal robust policy within the function approximation error. Finally, we demonstrate the robust performance of the policy learned by our proposed RNAC approach in multiple MuJoCo environments and a real-world TurtleBot navigation task.

NeurIPS Conference 2023 Conference Paper

Penguin: Parallel-Packed Homomorphic Encryption for Fast Graph Convolutional Network Inference

  • Ran Ran
  • Nuo Xu
  • Tao Liu
  • Wei Wang
  • Gang Quan
  • Wujie Wen

The marriage of Graph Convolutional Networks (GCN) and Homomorphic Encryption (HE) enables the inference of graph data on the cloud with significantly enhanced client data privacy. However, the tremendous computation and memory overhead associated with HE operations challenges the practicality of HE-based GCN inference. GCN inference involves a sequence of expensive matrix-matrix multiplications, and we observe that directly applying state-of-the-art HE-based secure matrix-matrix multiplication solutions to accelerate HE-GCN inference is far less efficient, as it does not exploit the unique aggregation mechanism of two-dimensional graph node features in GCN layer computation. In this paper, we therefore propose a novel HE-based ciphertext packing technique, i.e., Penguin, that takes advantage of the unique computation pattern during HE-GCN inference to significantly reduce the computation and memory overhead associated with HE operations. Specifically, Penguin employs (i) an effective two-dimensional parallel packing technique for feature ciphertext with optimal graph node partitioning and graph feature interleaving, and (ii) an interleaved assembly technique that effectively makes use of blank slots to merge ciphertexts after feature reduction and significantly reduces the costly rotation operation. We provide theoretical analysis and experimental validation to demonstrate the speedup achieved by Penguin in accelerating GCN inference using popular GCN models and datasets. Our results show that Penguin can achieve up to ~10× speedup and around ~79% reduction in computational memory overhead, significantly outperforming state-of-the-art solutions. To the best of our knowledge, this is the first work to ensure the protection of both graph structure and features when accelerating HE-GCN inference on encrypted data. Our code is publicly available at https://github.com/ranran0523/Penguin.

NeurIPS Conference 2023 Conference Paper

Provably Fast Convergence of Independent Natural Policy Gradient for Markov Potential Games

  • Youbang Sun
  • Tao Liu
  • Ruida Zhou
  • P. R. Kumar
  • Shahin Shahrampour

This work studies an independent natural policy gradient (NPG) algorithm for the multi-agent reinforcement learning problem in Markov potential games. It is shown that, under mild technical assumptions and the introduction of the \textit{suboptimality gap}, the independent NPG method with an oracle providing exact policy evaluation asymptotically reaches an $\epsilon$-Nash Equilibrium (NE) within $\mathcal{O}(1/\epsilon)$ iterations. This improves upon the previous best result of $\mathcal{O}(1/\epsilon^2)$ iterations and is of the same order, $\mathcal{O}(1/\epsilon)$, that is achievable for the single-agent case. Empirical results for a synthetic potential game and a congestion game are presented to verify the theoretical bounds.
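For tabular softmax policies, the natural policy gradient step that each agent applies independently has a well-known multiplicative closed form; a sketch of that update (symbols illustrative, not the paper's exact statement):

```latex
\pi^{(i)}_{t+1}(a \mid s) \;\propto\; \pi^{(i)}_{t}(a \mid s)\,
\exp\!\left(\frac{\eta}{1-\gamma}\, A^{(i)}_{\pi_t}(s,a)\right)
```

where $A^{(i)}_{\pi_t}$ is agent $i$'s advantage function under the joint policy $\pi_t$, $\eta$ is the step size, and $\gamma$ the discount factor.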

NeurIPS Conference 2022 Conference Paper

Anchor-Changing Regularized Natural Policy Gradient for Multi-Objective Reinforcement Learning

  • Ruida Zhou
  • Tao Liu
  • Dileep Kalathil
  • P. R. Kumar
  • Chao Tian

We study policy optimization for Markov decision processes (MDPs) with multiple reward value functions, which are to be jointly optimized according to given criteria such as proportional fairness (smooth concave scalarization), hard constraints (constrained MDP), and max-min trade-off. We propose an Anchor-changing Regularized Natural Policy Gradient (ARNPG) framework, which can systematically incorporate ideas from well-performing first-order methods into the design of policy optimization algorithms for multi-objective MDP problems. Theoretically, the designed algorithms based on the ARNPG framework achieve $\tilde{O}(1/T)$ global convergence with exact gradients. Empirically, the ARNPG-guided algorithms also demonstrate superior performance compared to some existing policy gradient-based approaches in both exact gradients and sample-based scenarios.

NeurIPS Conference 2022 Conference Paper

Falconn++: A Locality-sensitive Filtering Approach for Approximate Nearest Neighbor Search

  • Ninh Pham
  • Tao Liu

We present Falconn++, a novel locality-sensitive filtering (LSF) approach for approximate nearest neighbor search on angular distance. Falconn++ can filter out potentially far-away points in any hash bucket before querying, which results in higher-quality candidates compared to other hashing-based solutions. Theoretically, Falconn++ asymptotically achieves lower query time complexity than Falconn, an optimal locality-sensitive hashing scheme on angular distance. Empirically, Falconn++ achieves a higher recall-speed tradeoff than Falconn on many real-world data sets. Falconn++ is also competitive with HNSW, an efficient representative of graph-based solutions, in high search-recall regimes.
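The filtering idea, pruning bucket members that are weakly aligned with their hash direction before any query arrives, can be sketched with plain random projections (a toy stand-in for the cross-polytope LSH that Falconn uses; all names and parameters are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def build_index(data, n_proj=32, keep_frac=0.5):
    # Hash each unit-norm point to its maximally aligned random direction,
    # then filter: keep only the best-aligned members of each bucket,
    # discarding likely far-away points at index time.
    proj = rng.standard_normal((n_proj, data.shape[1]))
    sims = data @ proj.T
    keys = np.argmax(np.abs(sims), axis=1)
    buckets = {}
    for idx, key in enumerate(keys):
        buckets.setdefault(int(key), []).append((idx, abs(float(sims[idx, key]))))
    for key, members in buckets.items():
        members.sort(key=lambda t: -t[1])
        buckets[key] = members[: max(1, int(len(members) * keep_frac))]
    return proj, buckets

def query(q, proj, buckets, data, top=1):
    # Look up the query's bucket and re-rank the surviving candidates
    # by exact inner product (cosine similarity for unit-norm data).
    key = int(np.argmax(np.abs(proj @ q)))
    cand = buckets.get(key, [])
    ranked = sorted(cand, key=lambda t: -float(data[t[0]] @ q))
    return [idx for idx, _ in ranked[:top]]
```

With `keep_frac < 1`, buckets shrink and re-ranking gets cheaper; the trade-off is that an aggressive filter may discard the true neighbor, which is the recall-speed tradeoff the abstract refers to.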

NeurIPS Conference 2022 Conference Paper

Learning from Few Samples: Transformation-Invariant SVMs with Composition and Locality at Multiple Scales

  • Tao Liu
  • P. R. Kumar
  • Ruida Zhou
  • Xi Liu

Motivated by the problem of learning with small sample sizes, this paper shows how to incorporate into support-vector machines (SVMs) those properties that have made convolutional neural networks (CNNs) successful. Particularly important is the ability to incorporate domain knowledge of invariances, e.g., translational invariance of images. Kernels based on the \textit{maximum} similarity over a group of transformations are not generally positive definite. Perhaps it is for this reason that they have not been studied theoretically. We address this lacuna and show that positive definiteness indeed holds \textit{with high probability} for kernels based on the maximum similarity in the small training sample set regime of interest, and that they do yield the best results in that regime. We also show how additional properties such as their ability to incorporate local features at multiple spatial scales, e.g., as done in CNNs through max pooling, and to provide the benefits of composition through the architecture of multiple layers, can also be embedded into SVMs. We verify through experiments on widely available image sets that the resulting SVMs do provide superior accuracy in comparison to well-established deep neural network benchmarks for small sample sizes.
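A minimal sketch of a maximum-similarity kernel over a transformation group, here the group of cyclic shifts (illustrative only; the paper treats general transformation groups and proves positive definiteness with high probability rather than assuming it):

```python
import numpy as np

def max_translation_kernel(x, y, max_shift=2):
    # Kernel value = maximum inner product between x and shifted copies
    # of y. Taking the max over a transformation group builds in
    # invariance, at the cost of general positive definiteness.
    best = -np.inf
    for s in range(-max_shift, max_shift + 1):
        best = max(best, float(x @ np.roll(y, s)))
    return best
```

Such a kernel can be handed to an SVM solver as a precomputed Gram matrix on the (small) training set, which is where the high-probability positive-definiteness result matters.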

JBHI Journal 2021 Journal Article

IMU-Based Gait Normalcy Index Calculation for Clinical Evaluation of Impaired Gait

  • Lei Wang
  • Yun Sun
  • Qingguo Li
  • Tao Liu
  • Jingang Yi

Inertial measurement units (IMUs) have been used for gait analysis in many clinical studies as a more convenient, lower-cost, and less restrictive alternative to laboratory-based motion capture systems or instrumented walkways. Spatial-temporal gait parameters such as gait cycle duration and stride length calculated from the IMUs were often used in these studies for evaluating impaired gait. However, the spatial-temporal information provided by IMUs is limited and can yield incomplete, less effective evaluations. In this study, we develop a novel IMU-based method for clinical gait evaluation. Nine gait variables, including three spatial-temporal parameters and six kinematic parameters, are extracted from two shank-mounted IMUs to quantify a patient's gait deviations. Based on these parameters, an IMU-based gait normalcy index (INI) is derived to evaluate overall gait performance. Eight inpatient subjects with gait impairments caused by n-hexane neuropathy and ten healthy subjects were recruited. The proposed gait variables and INI were examined on the inpatients at three to five time instants during the rehabilitation process until discharge. A comparison with healthy subjects and statistical analysis of the changes in gait variables and INI demonstrated that the proposed set of gait variables and INI can provide adequate and effective information for quantifying gait abnormalities, and help in understanding gait progress and the effectiveness of therapy during the rehabilitation process.

NeurIPS Conference 2021 Conference Paper

Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs

  • Tao Liu
  • Ruida Zhou
  • Dileep Kalathil
  • Panganamala Kumar
  • Chao Tian

We address the issue of safety in reinforcement learning. We pose the problem in an episodic framework of a constrained Markov decision process. Existing results have shown that it is possible to achieve a reward regret of $\tilde{\mathcal{O}}(\sqrt{K})$ while allowing an $\tilde{\mathcal{O}}(\sqrt{K})$ constraint violation in $K$ episodes. A critical question that arises is whether it is possible to keep the constraint violation even smaller. We show that when a strictly safe policy is known, then one can confine the system to zero constraint violation with arbitrarily high probability while keeping the reward regret of order $\tilde{\mathcal{O}}(\sqrt{K})$. The algorithm which does so employs the principle of optimistic pessimism in the face of uncertainty to achieve safe exploration. When no strictly safe policy is known, though one is known to exist, then it is possible to restrict the system to bounded constraint violation with arbitrarily high probability. This is shown to be realized by a primal-dual algorithm with an optimistic primal estimate and a pessimistic dual update.
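The dual half of such a primal-dual scheme can be sketched as a projected subgradient step on the Lagrange multiplier (a generic illustration; the constraint direction, the symbols, and the pessimistic estimate $\underline{V}_c$ are ours, not the paper's notation):

```latex
\mathcal{L}(\pi,\lambda) = V_r^{\pi} + \lambda\left(V_c^{\pi} - b\right),
\qquad
\lambda_{k+1} = \left[\lambda_k - \eta\left(\underline{V}_c^{\,\pi_k} - b\right)\right]_+
```

where $[\cdot]_+$ denotes clipping at zero and $b$ is the constraint threshold; using a pessimistic (under-)estimate $\underline{V}_c$ keeps the multiplier large enough to deter constraint violation while the optimistic primal estimate drives reward exploration.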

IJCAI Conference 2021 Conference Paper

What If We Could Not See? Counterfactual Analysis for Egocentric Action Anticipation

  • Tianyu Zhang
  • Weiqing Min
  • Jiahao Yang
  • Tao Liu
  • Shuqiang Jiang
  • Yong Rui

Egocentric action anticipation aims at predicting the near future based on past observation in first-person vision. Since future actions may be wrongly predicted due to dataset bias, we present a counterfactual analysis framework for egocentric action anticipation (CA-EAA) to mitigate this effect. In the factual case, we can predict the upcoming action based on visual features and semantic labels from past observation. Imagining a counterfactual situation in which no visual representation had been observed, we would obtain a counterfactual predicted action using only past semantic labels. In this way, we can reduce the side effect caused by semantic labels via a comparison between factual and counterfactual outcomes, which moves a step towards unbiased prediction for egocentric action anticipation. We conduct experiments on two large-scale egocentric video datasets. Qualitative and quantitative results validate the effectiveness of our proposed CA-EAA.

JBHI Journal 2020 Journal Article

Spatially Aware Dense-LinkNet Based Regression Improves Fluorescent Cell Detection in Adaptive Optics Ophthalmic Images

  • Jianfei Liu
  • Yoo-Jean Han
  • Tao Liu
  • Nancy Aguilera
  • Johnny Tam

Retinal pigment epithelial (RPE) cells play an important role in nourishing retinal neurosensory photoreceptor cells, and numerous blinding diseases are associated with RPE defects. Their fluorescence signature can now be visualized in the living human eye using adaptive optics (AO) imaging combined with indocyanine green (ICG), which motivates us to develop an automated RPE detection method to improve the quantitative evaluation of RPE status in patients. This paper proposes a spatially-aware, Dense-LinkNet-based regression approach to improve the detection of in vivo fluorescent cell patterns, achieving precision, recall, and F1-Score of 93.6 $\pm$ 4.3%, 81.4 $\pm$ 9.5%, and 86.7 $\pm$ 5.7%, respectively. These results demonstrate the utility of incorporating spatial inputs into a deep learning-based regression framework for cell detection.

AAAI Conference 2017 Conference Paper

Neural Bag-of-Ngrams

  • Bofang Li
  • Tao Liu
  • Zhe Zhao
  • Puwei Wang
  • Xiaoyong Du

Bag-of-ngrams (BoN) models are commonly used for representing text. One of the main drawbacks of traditional BoN is the ignorance of n-gram semantics. In this paper, we introduce the concept of Neural Bag-of-ngrams (Neural-BoN), which replaces sparse one-hot n-gram representation in traditional BoN with dense and rich-semantic n-gram representations. We first propose context guided n-gram representation by adding n-grams to word embeddings model. However, the context guided learning strategy of word embeddings is likely to miss some semantics for text-level tasks. Text guided n-gram representation and label guided n-gram representation are proposed to capture more semantics like topic or sentiment tendencies. Neural-BoN with the latter two n-gram representations achieve state-of-the-art results on 4 document-level classification datasets and 6 semantic relatedness categories. They are also on par with some sophisticated DNNs on 3 sentence-level classification datasets. Similar to traditional BoN, Neural-BoN is efficient, robust and easy to implement. We expect it to be a strong baseline and be used in more real-world applications.
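The core replacement the abstract describes, one-hot n-gram counts swapped for learned dense vectors, can be sketched in a few lines (a hypothetical minimal version; real Neural-BoN embeddings would be trained with the context/text/label guidance described above, not drawn at random):

```python
import numpy as np

rng = np.random.default_rng(1)

def ngrams(tokens, n_max=2):
    # Enumerate all 1-grams through n_max-grams of a token sequence.
    out = []
    for n in range(1, n_max + 1):
        out += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return out

def text_vector(tokens, emb, dim=8, n_max=2):
    # Represent a text as the sum of dense n-gram embeddings, instead of
    # a sparse one-hot count vector. Unseen n-grams get a fresh random
    # vector here, as a stand-in for learned representations.
    vec = np.zeros(dim)
    for g in ngrams(tokens, n_max):
        if g not in emb:
            emb[g] = rng.standard_normal(dim) * 0.1
        vec += emb[g]
    return vec
```

The resulting fixed-length vector can feed any downstream classifier, which is what makes the scheme as easy to deploy as the traditional BoN it replaces.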

AAAI Conference 2011 Conference Paper

Partially Supervised Text Classification with Multi-Level Examples

  • Tao Liu
  • Xiaoyong Du
  • Yongdong Xu
  • Minghui Li
  • Xiaolong Wang

Partially supervised text classification has received great research attention since it only uses positive and unlabeled examples as training data. This problem can be solved by automatically labeling some negative (and more positive) examples from unlabeled examples before training a text classifier. But it is difficult to guarantee both high quality and quantity of the new labeled examples. In this paper, a multi-level example based learning method for partially supervised text classification is proposed, which can make full use of all unlabeled examples. A heuristic method is proposed to assign possible labels to unlabeled examples and partition them into multiple levels according to their labeling confidence. A text classifier is trained on these multi-level examples using weighted support vector machines. Experiments show that the multi-level example based learning method is effective for partially supervised text classification, and outperforms the existing popular methods such as Biased-SVM, ROC-SVM, S-EM and WL.