Arrow Research search

Author name cluster

Shuo Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

21 papers
2 author rows

Possible papers

21

AAAI Conference 2026 Conference Paper

LLM Collaboration with Multi-Agent Reinforcement Learning

  • Shuo Liu
  • Zeyu Liang
  • Xueguang Lyu
  • Christopher Amato

A large amount of work has been done in Multi-Agent Systems (MAS) for modeling and solving problems with multiple interacting agents. However, most LLMs are pretrained independently and not specifically optimized for coordination. Existing LLM fine-tuning frameworks rely on individual rewards, which require complex reward designs for each agent to encourage collaboration. To address these challenges, we model LLM collaboration as a cooperative Multi-Agent Reinforcement Learning (MARL) problem. We develop a multi-agent, multi-turn algorithm, Multi-Agent Group Relative Policy Optimization (MAGRPO), to solve it, building on current RL approaches for LLMs as well as MARL techniques. Our experiments on LLM writing and coding collaboration demonstrate that fine-tuning MAS with MAGRPO enables agents to generate high-quality responses efficiently through effective cooperation. Our approach opens the door to using MARL methods for LLM collaboration and highlights the associated challenges.
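The abstract does not spell out MAGRPO's update rule, but GRPO-style methods typically score each sampled response relative to its own sampling group. A minimal sketch of that group-relative advantage (the function name and the joint-reward grouping are illustrative, not the paper's code):

```python
from statistics import mean, stdev

def group_relative_advantages(rewards):
    # Normalize each sampled response's reward against the mean and
    # standard deviation of its sampling group (the core of GRPO-style
    # optimization; MAGRPO extends this idea to joint multi-agent rewards).
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 1.0
    sigma = sigma if sigma > 0 else 1.0  # guard against zero variance
    return [(r - mu) / sigma for r in rewards]

# Joint rewards for one group of four jointly sampled responses.
advs = group_relative_advantages([0.2, 0.8, 0.5, 0.5])
```

Because advantages are centered on the group mean, no per-agent reward shaping is needed: responses are only rewarded for being better than their peers in the same group.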

IJCAI Conference 2025 Conference Paper

A Fast-Adaptive Cognitive Diagnosis Framework for Computerized Adaptive Testing Systems

  • Yuanhao Liu
  • Yiya You
  • Shuo Liu
  • Hong Qian
  • Ying Qian
  • Aimin Zhou

Computerized Adaptive Testing (CAT) measures student ability by iteratively selecting informative questions, with core components being the Cognitive Diagnosis Model (CDM) and selection strategy. Current research focuses on optimizing the selection strategy, assuming relatively accurate CDM results. However, existing static CDMs struggle with rapid and accurate diagnosis in the early stage of CAT. To this end, this paper proposes a Fast Adaptive Cognitive Diagnosis (FACD) framework, which incorporates dynamic collaborative and personalized diagnosis modules. Specifically, the collaborative module in FACD uses a dynamic response graph to quickly build student cognitive profiles, while the personalized module leverages each student's response sequence for robust and individualized diagnosis. Extensive experiments on real-world datasets show that, compared with existing static CDMs, FACD not only achieves superior prediction performance across various selection strategies, with an improvement of roughly 5%-10% in the early stage of CAT, but also maintains a commendable inference speed.

AAAI Conference 2025 Conference Paper

Constrained Offline Black-Box Optimization via Risk Evaluation and Management

  • Yiyi Zhu
  • Huakang Lu
  • Yupeng Wu
  • Shuo Liu
  • Jing-Wen Yang
  • Hong Qian

Offline black-box optimization aims to identify the optimal solution of a black-box objective function under the guidance of a surrogate model constructed solely from a pre-collected dataset. It is commonly used in industrial scenarios, which often involve constraints, i.e., constrained offline optimization (COO). Offline optimization has progressed in addressing the out-of-distribution (OOD) issue caused by its inherent inability to interact with the objective function. However, research remains limited on more difficult scenarios that require simultaneously addressing OOD issues and constraint issues to find stable, high-quality (i.e., high-scoring and feasible) solutions. To bridge this gap, this paper proposes a method called constrained offline optimization via risk evaluation and management (COOREM), which is capable of consistently surpassing the offline dataset under the condition of satisfying constraints. Specifically, COOREM employs a dual-energy model to separately evaluate OOD risk and constrained risk. This separation strategy aims to distinguish and address two difficult cases: the infeasible but not OOD solutions and the feasible but OOD solutions. Moreover, COOREM effectively manages OOD risk and constrained risk, ensuring the identification of high-quality solutions. Extensive experiments on real-world tasks, e.g., space missions, process synthesis, and design problems, showcase COOREM's effectiveness in managing both OOD risk and constrained risk. Furthermore, our findings indicate that COOREM could outperform online methods that need to access the objective function in certain space missions.

AAAI Conference 2025 Conference Paper

Effective and Efficient Representation Learning for Flight Trajectories

  • Shuo Liu
  • Wenbin Li
  • Di Yao
  • Jingping Bi

Flight trajectory data plays a vital role in the traffic management community, especially for downstream tasks such as trajectory prediction, flight recognition, and anomaly detection. Existing works often utilize handcrafted features and design models for different tasks individually, which heavily rely on domain expertise and are hard to extend. We argue that different flight analysis tasks share the same useful features of the trajectory. Jointly learning a unified representation for flight trajectories could be beneficial for improving the performance of various tasks. However, flight trajectory representation learning (TRL) faces two primary challenges, i.e., unbalanced behavior density and 3D spatial continuity, which render recent general TRL methods ineffective. In this paper, we propose Flight2Vec, a flight-specific representation learning method to address these challenges. Specifically, a behavior-adaptive patching mechanism is used to inspire the learned representation to pay more attention to behavior-dense segments. Moreover, we introduce a motion trend learning technique that guides the model to memorize not only the precise locations, but also the motion trend to generate better representations. Extensive experimental results demonstrate that Flight2Vec significantly improves performance in downstream tasks such as flight trajectory prediction, flight recognition, and anomaly detection.

TAAS Journal 2025 Journal Article

Evaluate Inference Attacks: Attack and Defense against 2D Semantic Segmentation Models

  • Yihan Liao
  • Jacky Keung
  • Jingyu Zhang
  • Yurou Dai
  • Shuo Liu

Deep learning (DL)-based 2D semantic segmentation (SS) plays a vital role in the perception task of autonomous driving. However, the SS model relies on DL, which makes it vulnerable to inference attacks. Recent research has discovered that SS models are susceptible to the membership inference attack, yet other inference attacks remain underexplored. Our study fills this gap by comprehensively investigating the vulnerabilities of two widely used RGB image-based 2D SS models (DeepLabV3 and DeepLabV3+) against three inference attacks: membership inference, attribute inference, and model inversion. We evaluate the attack effectiveness on three backbones (MobileNetV2, ResNet50, and ResNet101) across three datasets (VOC2012, CityScapes, and ADE20K), where the attack accuracy can reach up to 95% (membership inference), 40% (attribute inference), and 70% (model inversion), revealing that deeper networks are more prone to privacy leakage in inference attacks. Consequently, we introduce differential privacy and model pruning as defensive mechanisms, significantly reducing attack performance, where the average accuracy drops 20% among the three inference attacks. Our findings reveal critical privacy vulnerabilities in SS tasks and offer practical guidance for developing more robust SS models in autonomous driving.
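For readers unfamiliar with the attack family, the simplest membership inference baseline thresholds the model's confidence on each sample. The sketch below is illustrative only; the paper's attacks on segmentation models are substantially more elaborate, and all names and numbers here are fabricated:

```python
def membership_guess(confidences, threshold=0.9):
    # Confidence-thresholding baseline: guess "training member" when
    # the model is unusually confident on a sample.
    return [c >= threshold for c in confidences]

# Training members typically receive higher confidence than non-members.
members    = [0.97, 0.95, 0.99, 0.92]
nonmembers = [0.70, 0.93, 0.60, 0.85]
preds = membership_guess(members + nonmembers)
correct = sum(preds[:4]) + sum(not p for p in preds[4:])
accuracy = correct / 8.0
```

The defenses studied in the paper (differential privacy, pruning) work by shrinking exactly this confidence gap between members and non-members.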

IROS Conference 2025 Conference Paper

Human-Robot Cooperative Heavy Payload Manipulation based on Whole-Body Model Predictive Control

  • Ning Wang
  • Shuo Liu
  • Tin Lun Lam
  • Tianwei Zhang 0002

Human-robot collaborative manipulation with multiple mobile manipulators is crucial for expanding robotic applications, requiring precise handling of coupled force-position constraints between partners. Current systems, however, exhibit end-effector oscillations and instability during dynamic interactions. To overcome these limitations, this work develops a collaborative framework integrating a collaborative controller and a whole-body controller. The collaborative controller employs the object’s center-of-mass dynamics model with real-time contact forces and motion states to predict trajectories while coordinating with an attitude stabilization controller to adjust the desired end-effector poses. The whole-body controller utilizes model predictive control to generate coordinated motions that strictly follow pose commands from the collaborative controller, ensuring stable transportation. Simulation and physical experiments validate the proposed framework’s effectiveness in real-world scenarios.

ECAI Conference 2025 Conference Paper

Measuring Ageism in Large Language Models

  • Shuo Liu
  • Jiaoyun Yang
  • Yulong Li
  • Hongtu Chen
  • Ning An 0001

As large language models gain prominence, there is increasing concern about the potential biases they may perpetuate. While various biases have been studied, ageism in language models remains underexplored. According to the World Health Organization, ageism can significantly impact the physical and mental well-being of older adults, an impact that could grow as the global aging population increases. To address this research gap, we developed AgeismSet, a comprehensive Chinese dataset comprising 6,444 sentences, enhanced with neutral impression options to provide a balanced evaluation framework. Our study then used AgeismSet to investigate ageism in large language models across cognitive, affective, and behavioral dimensions, evaluating models such as GPT-4, GPT-3.5, GLM-3-Turbo, ERNIE Bot, Gemini Pro, and DeepSeek-V3. Our findings, quantified by the Ageism Score (AS), reveal that while some models perform well, there is considerable room for improvement in mitigating ageism. This work underscores the necessity for targeted interventions to ensure more equitable AI systems.

IJCAI Conference 2025 Conference Paper

Relation-Augmented Dueling Bayesian Optimization via Preference Propagation

  • Xiang Xia
  • Xiang Shu
  • Shuo Liu
  • Yiyi Zhu
  • Yijie Zhou
  • Weiye Wang
  • Bingdong Li
  • Hong Qian

In black-box optimization, when directly evaluating the function values of solutions is very costly or infeasible, access to the objective function is often limited to comparing pairs of solutions, which yields dueling black-box optimization. Dueling optimization is solely based on pairwise preferences, and thus notably reduces cost compared with function value based methods. However, the optimization performance of dueling optimization is often limited because most existing dueling optimization methods do not make full use of the pairwise preferences collected. To better utilize these preferences, this paper proposes relation-augmented dueling Bayesian optimization (RADBO) via preference propagation. By considering solution similarity, RADBO aims to uncover the potential dueling relations between solutions within different preferences through the proposed preference propagation technique. Specifically, RADBO first clusters solutions using a Gaussian mixture model. After obtaining the solution set with the highest intra-cluster similarity, RADBO utilizes a directed hypergraph to model the potential dueling relations between solutions, thereby realizing relation augmentation. Extensive experiments are conducted on both synthetic functions and real-world tasks such as motion control, car cab design and spacecraft trajectory optimization. The experimental results disclose the satisfactory accuracy of augmented preferences in RADBO, and show the superiority of RADBO compared with existing dueling optimization methods. Notably, it is verified that, under the same evaluation cost budget, RADBO can be competitive with or even surpass the function value based Bayesian optimization methods with respect to optimization performance.
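To make the propagation idea concrete: if a solution is very similar to a known winner, it plausibly also beats that winner's known losers. A heavily simplified sketch of this inference follows; the paper uses Gaussian-mixture clustering and a directed hypergraph, whereas the similarity function and threshold below are toy assumptions:

```python
def propagate_preferences(prefs, similarity, threshold=0.9):
    # Augment observed pairwise preferences (winner, loser) with inferred
    # ones: if c is highly similar to a known winner, infer that c also
    # beats that winner's known losers.
    augmented = set(prefs)
    solutions = {s for pair in prefs for s in pair}
    for winner, loser in prefs:
        for c in solutions:
            if c not in (winner, loser) and similarity(c, winner) >= threshold:
                augmented.add((c, loser))
    return augmented

sim = lambda x, y: 1.0 - abs(x - y)   # toy similarity on 1-D solutions
prefs = {(0.50, 0.10), (0.52, 0.90)}
aug = propagate_preferences(prefs, sim)
```

Here the two observed duels are augmented with two inferred ones, since 0.50 and 0.52 are nearly identical solutions, which is exactly the kind of extra supervision the surrogate model can then be trained on.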

ICLR Conference 2025 Conference Paper

SOO-Bench: Benchmarks for Evaluating the Stability of Offline Black-Box Optimization

  • Hong Qian
  • Yiyi Zhu
  • Xiang Shu
  • Shuo Liu
  • Yaolin Wen
  • Xin An
  • Huakang Lu
  • Aimin Zhou

Black-box optimization aims to find the optima through building a model close to the black-box objective function based on function value evaluation. However, in many real-world tasks, such as the design of molecular formulas and mechanical structures, it is perilous, costly, or even infeasible to evaluate the objective function value of an actively sampled solution. In this situation, optimization can only be conducted via utilizing offline historical data, which yields offline black-box optimization. Different from the traditional goal of pursuing the optimal solution, this paper emphasizes that the goal of offline optimization is to stably surpass the offline dataset during the optimization procedure. Although a benchmark suite called Design-Bench already exists in this emerging field, it mainly provides real-world offline tasks and the corresponding offline datasets, and can hardly evaluate the stability of offline optimization. To this end, this paper proposes benchmarks named SOO-Bench (i.e., Stable Offline Optimization Benchmarks) for offline black-box optimization algorithms, so as to systematically evaluate the stability of surpassing the offline dataset under different data distributions. Along with SOO-Bench, we also propose a stability indicator to measure the degree of stability. Specifically, SOO-Bench includes various real-world offline optimization tasks and offline datasets under different data distributions, involving the fields of satellites, materials science, structural mechanics, and automobile manufacturing. Empirically, baseline and state-of-the-art algorithms are tested and analyzed on SOO-Bench. Hopefully, SOO-Bench is expected to serve as a catalyst for the rapid developments of more novel and stable offline optimization methods. The code is available at https://github.com/zhuyiyi-123/SOO-Bench.

TIST Journal 2025 Journal Article

The Evaluation Framework and Benchmark for Large Language Models in the Government Affairs Domain

  • Shuo Liu
  • Lin Zhang
  • Weidong Liu
  • Jianfeng Zhang
  • Donghui Gao
  • Xiaofeng Jia

The rapid evolution of AI has driven advancements across numerous sectors. In the domain of government affairs, large language models (LLMs) hold significant potential for applications such as policy analysis, data processing, and decision support. However, their adoption in government settings faces considerable challenges, including data accessibility issues, the absence of standardized evaluation criteria, and concerns regarding model accuracy, reliability, and security. To address these challenges, we propose a comprehensive evaluation framework specifically designed for LLMs in government affairs. Built on modular principles, this framework ensures adaptability across various industries. Additionally, we introduce the Multi-Scenario Government Affairs Benchmark (MSGABench) dataset, a Chinese-language dataset specifically crafted to meet the practical needs of government professionals. Employing the proposed framework and the MSGABench dataset, we conducted an empirical evaluation of 15 prominent LLMs, revealing critical insights: (1) Performance: Many models demonstrated low accuracy and reliability, particularly under minor input variations, with some dropping below 35% accuracy, whereas GPT-4 achieved above 95% reliability; (2) Security and Compliance: Significant concerns were identified, including privacy vulnerabilities, legal compliance risks, and persistent biases, which may hinder secure deployments in government contexts; (3) Task Avoidance: Certain models exhibited excessive caution, often avoiding responses to basic tasks like document classification and government-related inquiries, which restricts their usability. These findings highlight essential limitations and opportunities for improvement, contributing to the safe and effective application of LLMs in the government sector.

IROS Conference 2025 Conference Paper

VIMS: A Visual-Inertial-Magnetic-Sonar SLAM System in Underwater Environments

  • Bingbing Zhang
  • Huan Yin
  • Shuo Liu
  • Fumin Zhang 0001
  • Wen Xu 0004

In this study, we present a novel simultaneous localization and mapping (SLAM) system, VIMS, designed for underwater navigation. Conventional visual-inertial state estimators encounter significant practical challenges in perceptually degraded underwater environments, particularly in scale estimation and loop closing. To address these issues, we first propose leveraging a low-cost single-beam sonar to improve scale estimation. Then, VIMS integrates a high-sampling-rate magnetometer for place recognition by utilizing magnetic signatures generated by an economical magnetic field coil. Building on this, a hierarchical scheme is developed for visual-magnetic place recognition, enabling robust loop closure. Furthermore, VIMS achieves a balance between local feature tracking and descriptor-based loop closing, avoiding additional computational burden on the front end. Experimental results highlight the efficacy of the proposed VIMS, demonstrating significant improvements in both the robustness and accuracy of state estimation within underwater environments.

NeurIPS Conference 2024 Conference Paper

A Simple yet Scalable Granger Causal Structural Learning Approach for Topological Event Sequences

  • Mingjia Li
  • Shuo Liu
  • Hong Qian
  • Aimin Zhou

In modern telecommunication networks, faults manifest as alarms, generating thousands of events daily. Network operators need an efficient method to identify the root causes of these alarms to mitigate potential losses. This task is challenging due to the increasing scale of telecommunication networks and the interconnected nature of devices, where one fault can trigger a cascade of alarms across multiple devices within a topological network. Recent years have seen a growing focus on causal approaches to addressing this problem, emphasizing the importance of learning a Granger causal graph from topological event sequences. Such causal graphs delineate the relations among alarms and can significantly aid engineers in identifying and rectifying faults. However, existing methods either ignore the topological relationships among devices or suffer from relatively low scalability and efficiency, failing to deliver high-quality responses in a timely manner. To this end, this paper proposes S²GCSL, a simple yet scalable Granger causal structural learning approach for topological event sequences. S²GCSL utilizes a linear kernel to model activation interactions among various event types within a topological network, and employs gradient descent to efficiently optimize the likelihood function. Notably, it can seamlessly incorporate expert knowledge as constraints within the optimization process, which enhances the interpretability of the outcomes. Extensive experimental results on both large-scale synthetic and real-world problems verify the scalability and efficacy of S²GCSL.
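For intuition, Granger-style causal discovery asks whether one event series improves next-step prediction of another. A toy stand-in for the linear-kernel, gradient-descent fitting described above (the actual method handles multi-type event sequences on a network topology; the series, coefficients, and learning rate here are fabricated):

```python
# Does series x help predict series y one step ahead?  Fit
# y[t+1] ≈ a*y[t] + b*x[t] by plain gradient descent on squared error.
x = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y = [0, 1, 0, 1, 1, 0, 1, 0, 0, 1]
a, b, lr = 0.0, 0.0, 0.05
for _ in range(2000):
    ga = gb = 0.0
    for t in range(len(y) - 1):
        err = a * y[t] + b * x[t] - y[t + 1]
        ga += err * y[t]
        gb += err * x[t]
    a -= lr * ga / (len(y) - 1)
    b -= lr * gb / (len(y) - 1)
# A clearly non-zero b (with a near zero) suggests that x
# "Granger-causes" y in this toy setup.
```

In this synthetic pair, y simply echoes x with a one-step delay, so the fit drives b toward 1 and a toward 0; an expert-knowledge constraint of the kind the paper mentions would amount to clamping or bounding such coefficients during the descent.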

NeurIPS Conference 2024 Conference Paper

ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Ablation Capability for Large Vision-Language Models

  • Shuo Liu
  • Kaining Ying
  • Hao Zhang
  • Yue Yang
  • Yuqi Lin
  • Tianle Zhang
  • Chuanhao Li
  • Yu Qiao

Multi-turn visual conversation is an important ability of real-world AI assistants. However, a dedicated evaluation benchmark has been missing. This paper presents ConvBench, a multi-turn conversation benchmark with hierarchical capability-ablation evaluation for Large Vision-Language Models (LVLMs). ConvBench comprises 577 curated multi-turn conversations, encompassing 215 tasks. These tasks are broad and open-ended, resembling real-world user behaviors. ConvBench progressively examines the LVLMs' perception, reasoning, and creativity capabilities in each conversation, and can decouple these capabilities in evaluations and thus perform reliable error attribution. Besides, considering the diversity of open-ended questions, we introduce an efficient and reliable automatic evaluation framework. Experimental results reveal that ConvBench is a significant challenge for current LVLMs, even for GPT4V, which achieves only a 39.51% score. Besides, we report some insightful findings, for example that the weak perception of LVLMs inhibits their authentic strengths in reasoning and creation. We believe our design of hierarchical capabilities, decoupled capability evaluation, and multi-turn conversation can blaze a new trail in LVLM evaluation. Code and benchmark are released at https://github.com/shirlyliu64/ConvBench.

ICML Conference 2024 Conference Paper

MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

  • Kaining Ying
  • Fanqing Meng
  • Jin Wang
  • Zhiqian Li
  • Han Lin
  • Yue Yang
  • Hao Zhang 0117
  • Wenbo Zhang 0009

Large Vision-Language Models (LVLMs) show significant strides in general-purpose multimodal applications such as visual dialogue and embodied navigation. However, existing multimodal evaluation benchmarks cover a limited number of multimodal tasks testing rudimentary capabilities, falling short in tracking LVLM development. In this study, we present MMT-Bench, a comprehensive benchmark designed to assess LVLMs across massive multimodal tasks requiring expert knowledge and deliberate visual recognition, localization, and reasoning. MMT-Bench comprises 31,325 meticulously curated multi-choice visual questions from various multimodal scenarios such as vehicle driving and embodied navigation, covering 32 core meta-tasks and 162 subtasks in multimodal understanding. Due to its extensive task coverage, MMT-Bench enables the evaluation of LVLMs using a task map, facilitating the discovery of in- and out-of-domain tasks. Evaluation results involving 20 publicly available LVLMs, such as the proprietary GeminiProVision model, underscore the significant challenges posed by MMT-Bench. We anticipate that MMT-Bench will inspire the community to develop next-generation multimodal foundation models aimed at achieving general-purpose multimodal intelligence.

NeurIPS Conference 2024 Conference Paper

Needle In A Multimodal Haystack

  • Weiyun Wang
  • Shuibo Zhang
  • Yiming Ren
  • Yuchen Duan
  • Tiantong Li
  • Shuo Liu
  • Mengkang Hu
  • Zhe Chen

With the rapid advancement of multimodal large language models (MLLMs), their evaluation has become increasingly comprehensive. However, understanding long multimodal content, as a foundational ability for real-world applications, remains underexplored. In this work, we present Needle In A Multimodal Haystack (MM-NIAH), the first benchmark specifically designed to systematically evaluate the capability of existing MLLMs to comprehend long multimodal documents. Our benchmark includes three types of evaluation tasks: multimodal retrieval, counting, and reasoning. In each task, the model is required to answer the questions according to different key information scattered throughout the given multimodal document. Evaluating the leading MLLMs on MM-NIAH, we observe that existing models still have significant room for improvement on these tasks, especially on vision-centric evaluation. We hope this work can provide a platform for further research on long multimodal document comprehension and contribute to the advancement of MLLMs. Code and benchmark are released at https://github.com/OpenGVLab/MM-NIAH.

NeurIPS Conference 2024 Conference Paper

SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models by Searching Up-to-Date Internet Knowledge

  • Chuanhao Li
  • Zhen Li
  • Chenchen Jing
  • Shuo Liu
  • Wenqi Shao
  • Yuwei Wu
  • Ping Luo
  • Yu Qiao

Large vision-language models (LVLMs), such as the LLaVA series, are unaware of up-to-date knowledge because the large amount of resources required prevents frequent updates, and they therefore fail in many cases. For example, an LVLM released in January 2024 would not know the singer of the theme song for the new Detective Conan movie, which was not released until April 2024. To solve this problem, a promising solution motivated by retrieval-augmented generation (RAG) is to provide LVLMs with up-to-date knowledge via internet search during inference, i.e., internet-augmented generation (IAG), which is already integrated into some closed-source commercial LVLMs such as GPT-4V. However, the specific mechanics underpinning them remain a mystery. In this paper, we propose a plug-and-play framework for augmenting existing LVLMs in handling visual question answering (VQA) about up-to-date knowledge, dubbed SearchLVLMs. A hierarchical filtering model is trained to effectively and efficiently find the most helpful content from the websites returned by a search engine to prompt LVLMs with up-to-date knowledge. To train the model and evaluate our framework's performance, we propose a pipeline to automatically generate news-related VQA samples to construct a dataset, dubbed UDK-VQA. A multi-model voting mechanism is introduced to label the usefulness of website/content for VQA samples to construct the training set. Experimental results demonstrate the effectiveness of our framework, outperforming GPT-4o by roughly 30% in accuracy.

AAMAS Conference 2023 Conference Paper

A Hybrid Framework of Reinforcement Learning and Physics-Informed Deep Learning for Spatiotemporal Mean Field Games

  • Xu Chen
  • Shuo Liu
  • Xuan Di

Mean field games (MFG) are developed to solve equilibria in multiagent systems (MAS) with many agents. The majority of the literature on MFGs is focused on finite states and actions. In many engineering applications such as autonomous driving, however, each agent (e.g., an autonomous vehicle) makes a continuous-time-space (or spatiotemporal dynamic) decision to optimize a nonlinear cumulative reward. In this paper, we focus on a class of generic MFGs with continuous states and actions defined over a spatiotemporal domain for a finite horizon, named "spatiotemporal MFG (ST-MFG)." The mean field equilibria (MFE) for such games are challenging to solve using numerical methods at a satisfactory resolution in time and space, while it is critical to deploy smooth dynamic control in autonomous driving. Thus, we propose two methods: the first is a joint reinforcement learning (RL) and physics-informed deep learning (PIDL) framework, which iteratively solves agents' optimal policies using RL and propagates population density using PIDL. The second is a pure PIDL framework that updates agents' states and population density altogether using deep neural networks. Both proposed methods are mesh-free (i.e., not restricted by mesh granularity) and have been shown to be efficient in learning equilibria in autonomous driving MFGs. The PIDL method alone is faster to train than the RL-PIDL integrated method when the environment dynamic is known.

JBHI Journal 2022 Journal Article

Capturing Time Dynamics From Speech Using Neural Networks for Surgical Mask Detection

  • Shuo Liu
  • Adria Mallol-Ragolta
  • Tianhao Yan
  • Kun Qian
  • Emilia Parada-Cabaleiro
  • Bin Hu
  • Bjorn W. Schuller

The importance of detecting whether a person wears a face mask while speaking has tremendously increased since the outbreak of SARS-CoV-2 (COVID-19), as wearing a mask can help to reduce the spread of the virus and mitigate the public health crisis. Besides affecting human speech characteristics related to frequency, face masks cause temporal interferences in speech, altering the pace, rhythm, and pronunciation speed. In this regard, this paper presents two effective neural network models to detect surgical masks from audio. The proposed architectures are both based on Convolutional Neural Networks (CNNs), chosen as an optimal approach for the spatial processing of the audio signals. One architecture applies a Long Short-Term Memory (LSTM) network to model the time-dependencies. Through an additional attention mechanism, the LSTM-based architecture enables the extraction of more salient temporal information. The other architecture (named ConvTx) retrieves the relative position of a sequence through the positional encoder of a transformer module. In order to assess to what extent both architectures can complement each other when modelling temporal dynamics, we also explore the combination of LSTM and Transformers in three hybrid models. Finally, we also investigate whether data augmentation techniques, such as using transitions between audio frames and considering gender-dependent frameworks, might impact the performance of the proposed architectures. Our experimental results show that one of the hybrid models achieves the best performance, surpassing existing state-of-the-art results for the task at hand.

JBHI Journal 2022 Journal Article

Practical Strategies for Extreme Missing Data Imputation in Dementia Diagnosis

  • Niamh McCombe
  • Shuo Liu
  • Xuemei Ding
  • Girijesh Prasad
  • Magda Bucholc
  • David P. Finn
  • Stephen Todd
  • Paula L. McClean

Accurate computational models for clinical decision support systems require clean and reliable data but, in clinical practice, data are often incomplete. Hence, missing data can arise not only in training datasets but also in test datasets, which may consist of a single undiagnosed case, i.e., an individual. This work addresses the problem of extreme missingness in both training and test data by evaluating multiple imputation and classification workflows based on both diagnostic classification accuracy and computational cost. Extreme missingness is defined as having ∼50% of the total data missing in more than half the data features. In particular, we focus on dementia diagnosis due to long time delays, high variability, high attrition rates and lack of practical data imputation strategies in its diagnostic pathway. We identified and replicated the extreme missingness structure of data from a real-world memory clinic on a larger open dataset, with the original complete data acting as ground truth. Overall, we found that computational cost, but not accuracy, varies widely for various imputation and classification approaches. Particularly, we found that iterative imputation on the training dataset combined with a reduced-feature classification model provides the best approach, in terms of speed and accuracy. Taken together, this work has elucidated important factors to be considered when developing a predictive model for a dementia diagnostic support system.
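For intuition, the refit-and-reimpute loop behind iterative imputation can be sketched as follows; this is a one-column toy under fabricated data, whereas real pipelines (e.g. scikit-learn's IterativeImputer) run the same round-robin idea over every incomplete feature:

```python
def impute_column(x, y, rounds=50):
    # Iterative imputation sketch: missing values (None) in y are
    # mean-filled, then repeatedly re-estimated by an ordinary
    # least-squares regression of y on the fully observed feature x,
    # refitting the regression after each pass.
    missing = [i for i, v in enumerate(y) if v is None]
    obs = [v for v in y if v is not None]
    fill = sum(obs) / len(obs)
    y = [fill if v is None else v for v in y]
    for _ in range(rounds):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        sxx = sum((xi - mx) ** 2 for xi in x) or 1.0
        k = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
        c = my - k * mx
        for i in missing:
            y[i] = k * x[i] + c
    return y

# Observed rows follow y = 2x exactly, so the missing fourth value
# is pulled toward 8.0 as the refit loop converges.
y_imp = impute_column([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, None])
```

The iteration matters because the first fill (the column mean) biases the regression; refitting after each pass lets the imputed value and the fitted line converge together.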

TIST Journal 2020 Journal Article

Deep Learning Thermal Image Translation for Night Vision Perception

  • Shuo Liu
  • Mingliang Gao
  • Vijay John
  • Zheng Liu
  • Erik Blasch

Context enhancement is critical for the environmental perception in night vision applications, especially for the dark night situation without sufficient illumination. In this article, we propose a thermal image translation method, which can translate thermal/infrared (IR) images into color visible (VI) images, called IR2VI. The IR2VI consists of two cascaded steps: translation from nighttime thermal IR images to gray-scale visible images (GVI), which is called IR-GVI; and the translation from GVI to color visible images (CVI), which is known as GVI-CVI in this article. For the first step, we develop the Texture-Net, a novel unsupervised image translation neural network based on generative adversarial networks. Texture-Net can learn the intrinsic characteristics from the GVI and integrate them into the IR image. In comparison with the state-of-the-art unsupervised image translation methods, the proposed Texture-Net is able to address some common challenges, e.g., incorrect mapping and lack of fine details, with a structure connection module and a region-of-interest focal loss. For the second step, we investigate state-of-the-art gray-scale image colorization methods and integrate a deep convolutional neural network into the IR2VI framework. The results of the comprehensive evaluation experiments demonstrate the effectiveness of the proposed IR2VI image translation method. This solution will contribute to the environmental perception and understanding in varied night vision applications.