Arrow Research search

Author name cluster

Wei Peng

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

27 papers
2 author rows

Possible papers

27

JBHI Journal 2026 Journal Article

Biomed-DPT: Dual Modality Prompt Tuning for Biomedical Vision-Language Models

  • Wei Peng
  • Jianchen Hu
  • Kang Liu
  • Meng Zhang

Prompt learning has emerged as one of the most effective paradigms for adapting pre-trained vision language models (VLMs) to biomedical image classification tasks in few-shot scenarios. However, most existing prompt learning methods rely on a single textual prompt, often ignoring the particular visual structures (e.g., the complex anatomical structures and subtle pathological features) in biomedical images. In this work, we propose Biomed-DPT, a knowledge-enhanced dual-modality prompt tuning framework. For text prompts, Biomed-DPT constructs a dual prompt including template-driven ensemble clinical prompts and large language model (LLM)-driven expert domain adapted prompts. These prompts are systematically ranked and their optimal combination is searched for using a neural network. A semantic regularization loss is then applied to extract clinical knowledge while mitigating semantic discrepancies. For visual prompts, Biomed-DPT introduces zero vectors as soft prompts to leverage attention re-weighting so that the focus on non-diagnostic regions and the recognition of non-critical pathological features are avoided. Biomed-DPT achieves an average classification accuracy of 66.28% across 11 biomedical image datasets covering 9 modalities and 10 organs, with performance reaching 79.54% in base classes and 76.91% in novel classes. Our code is available at: https://github.com/pengwei222/Biomed-DPT.

AAAI Conference 2025 Conference Paper

FreeNet: Liberating Depth-Wise Separable Operations for Building Faster Mobile Vision Architectures

  • Hao Yu
  • Haoyu Chen
  • Wei Peng
  • Xu Cheng
  • Guoying Zhao

In the pursuit of efficient vision architectures, substantial efforts have been devoted to optimizing operator efficiency. Depth-wise separable operators, such as DWConv, are found cheap in both FLOPs and parameters. As a result, they are increasingly incorporated into efficient backbones, trading for deeper and wider architectures to enhance performance. However, separable operators are not really fast on devices due to the discontinuous memory access requirements. In this paper, we propose FreeNets, a family of simple and efficient backbones that free the separable operation to further accelerate the running speed. We introduce sparse sampling mixers (S2-Mixer) to supersede existing separable token mixers. The S2-Mixer samples multiple segments of partially continuous signals across spatial and channel dimensions for convolutional processing, achieving extremely fast on-device speed. The sparse sampling also enables S2-Mixer to capture long-range pixel relationships from dynamic receptive fields. Furthermore, we introduce a Shift Feed-Forward Network (ShiftFFN) as a faster alternative to existing channel mixers. It utilizes a shift neck architecture that aggregates global information to shift features, enabling faster channel mixing while incorporating global pixel information. Extensive experiments demonstrate that FreeNet offers a superior accuracy-efficiency tradeoff compared to the latest efficient models. On ImageNet-1k, FreeNet-S2 outperforms the StarNet-S4 by 0.4% in top-1 accuracy, while running around 40% faster on desktop GPU and 15% faster on Mobile GPU.

JBHI Journal 2025 Journal Article

Hierarchical Graph Representation Learning With Multi-Granularity Features for Anti-Cancer Drug Response Prediction

  • Wei Peng
  • Jiangzhen Lin
  • Wei Dai
  • Ning Yu
  • Jianxin Wang

Patients with the same type of cancer often respond differently to identical drug treatments due to unique genomic traits. Accurately predicting a patient's response to drug is crucial in guiding treatment decisions, alleviating patient suffering, and improving cancer prognosis. Current computational methods utilize deep learning models trained on extensive drug screening data to predict anti-cancer drug responses based on features of cell lines and drugs. However, the interaction between cell lines and drugs is a complex biological process involving interactions across various levels, from internal cellular and drug structures to the external interactions among different molecules. To address this complexity, we propose a novel Hierarchical graph representation Learning with Multi-Granularity features (HLMG) algorithm for predicting anti-cancer drug responses. The HLMG algorithm combines features at two granularities: the overall gene expression and pathway substructures of cell lines, and the overall molecular fingerprints and substructures of drugs. Subsequently, it constructs a heterogeneous graph including cell lines, drugs, known cell line-drug responses, and the associations between similar cell lines and similar drugs. Through a graph convolutional network model, the HLMG learns the final cell line and drug representations by aggregating features of their multi-level neighbor in the heterogeneous graph. The multi-level neighbors consist of the node self, directly related drugs/cell lines, and indirectly related similar drugs/cell lines. Finally, a linear correlation coefficient decoder is employed to reconstruct the cell line-drug correlation matrix to predict anti-cancer drug responses. Our model was tested on the Genomics of Drug Sensitivity in Cancer (GDSC) and the Cancer Cell Line Encyclopedia (CCLE) databases. Results indicate that HLMG outperforms other state-of-the-art methods in accurately predicting anti-cancer drug responses.

ICLR Conference 2025 Conference Paper

LASeR: Towards Diversified and Generalizable Robot Design with Large Language Models

  • Junru Song
  • Yang Yang
  • Huan Xiao
  • Wei Peng
  • Wen Yao
  • Feifei Wang

Recent advances in Large Language Models (LLMs) have stimulated a significant paradigm shift in evolutionary optimization, where hand-crafted search heuristics are gradually replaced with LLMs serving as intelligent search operators. However, these studies still bear some notable limitations, including a challenge to balance exploitation with exploration, often leading to inferior solution diversity, as well as poor generalizability of problem solving across different task settings. These unsolved issues render the prowess of LLMs in robot design automation largely untapped. In this work, we present LASeR -- Large Language Model-Aided Evolutionary Search for Robot Design Automation. Leveraging a novel reflection mechanism termed DiRect, we elicit more knowledgeable exploratory behaviors from LLMs based on past search trajectories, reshaping the exploration-exploitation tradeoff with dual improvements in optimization efficiency and solution diversity. Additionally, with evolution fully grounded in task-related background information, we unprecedentedly uncover the inter-task reasoning capabilities of LLMs, facilitating generalizable design processes that effectively inspire zero-shot robot proposals for new applications. Our simulated experiments on voxel-based soft robots showcase distinct advantages of LASeR over competitive baselines. Code at https://github.com/WoodySJR/LASeR.

ICLR Conference 2025 Conference Paper

LeFusion: Controllable Pathology Synthesis via Lesion-Focused Diffusion Models

  • Hantao Zhang
  • Yuhe Liu
  • Jiancheng Yang
  • Shouhong Wan
  • Xinyuan Wang
  • Wei Peng
  • Pascal Fua

Patient data from real-world clinical practice often suffers from data scarcity and long-tail imbalances, leading to biased outcomes or algorithmic unfairness. This study addresses these challenges by generating lesion-containing image-segmentation pairs from lesion-free images. Previous efforts in medical imaging synthesis have struggled with separating lesion information from background, resulting in low-quality backgrounds and limited control over the synthetic output. Inspired by diffusion-based image inpainting, we propose LeFusion, a lesion-focused diffusion model. By redesigning the diffusion learning objectives to focus on lesion areas, we simplify the learning process and improve control over the output while preserving high-fidelity backgrounds by integrating forward-diffused background contexts into the reverse diffusion process. Additionally, we tackle two major challenges in lesion texture synthesis: 1) multi-peak and 2) multi-class lesions. We introduce two effective strategies: histogram-based texture control and multi-channel decomposition, enabling the controlled generation of high-quality lesions in difficult scenarios. Furthermore, we incorporate lesion mask diffusion, allowing control over lesion size, location, and boundary, thus increasing lesion diversity. Validated on 3D cardiac lesion MRI and lung nodule CT datasets, LeFusion-generated data significantly improves the performance of state-of-the-art segmentation models, including nnUNet and SwinUNETR.

JBHI Journal 2025 Journal Article

Predicting Clinical Anticancer Drug Response of Patients by Using Domain Alignment and Prototypical Learning

  • Wei Peng
  • Chuyue Chen
  • Wei Dai
  • Ning Yu
  • Jianxin Wang

Anticancer drug response prediction is crucial in developing personalized treatment plans for cancer patients. However, because high-quality patient anticancer drug response data are scarce and cell line data and patient data have different distributions, models trained solely on cell line data perform poorly. Some existing methods predict anticancer drug response by transferring knowledge from the cell line domain to the patient domain using transfer learning. However, the robustness of these classifiers is affected by anomalies in the cell line data, and they do not utilize the knowledge in the unlabeled target domain data. To this end, we propose a model called DAPL to predict patient responses to anticancer drugs. The model extracts domain-invariant features from cell lines and patients by constructing multiple VAEs and extracts drug features using GNNs. These features are then combined for prototypical learning to train a classifier, resulting in better predictions of patient anticancer drug response. We used the cell line datasets CCLE and GDSC as source domains and the patient datasets TCGA and PDTC as target domains and conducted experiments. The results indicate that DAPL shows excellent performance in predicting patient anticancer drug response compared to other state-of-the-art methods.

EAAI Journal 2025 Journal Article

Rapid prediction of thermal stress on satellites via domain decomposition-based Hybrid Fourier Neural Operator

  • Kangrui Zhou
  • Wei Peng
  • Xiaoya Zhang
  • Xu Liu
  • Wen Yao

Rapid thermal stress analysis is crucial for the thermal design of satellites. To overcome the disadvantages of traditional algorithms in terms of efficiency, deep learning methods have been used to tackle these problems. However, using uniform grid-based techniques is challenging when faced with complex geometric shapes. To address this, we introduce the domain decomposition-based Hybrid Fourier Neural Operator (HFNO), a comprehensive framework for learning a multi-scale and end-to-end operator on two-dimensional point clouds. We then propose two decomposition metrics: a stress gradient-based metric for scenarios with prior knowledge of training data, and a mesh density-based metric for scenarios without prior knowledge. Leveraging K-Dimension tree-based domain decomposition optimized via Monte Carlo tree search, we decompose the computational domain into several disjoint rectangular subdomains. In the proposed hybrid framework, a Geometry-aware Fourier Neural Operator (Geo-FNO) is used to deal with subdomains with high-frequency information, while a Non-Uniform Fourier Neural Operator (NU-FNO) is used to deal with subdomains with low-frequency information. This framework effectively combines the advantages of two Fourier Neural Operator variants, overcoming the issue of large prediction errors on the subdomains with high-frequency information and ensuring stable prediction performance across different positions. Furthermore, we introduce a boundary loss term during the training process to enhance continuity across subdomain boundaries. The numerical results demonstrate that our method achieves a superior balance between efficiency and precision, surpassing that of a single algorithm.

ICLR Conference 2025 Conference Paper

Self-supervised Monocular Depth Estimation Robust to Reflective Surface Leveraged by Triplet Mining

  • Wonhyeok Choi
  • Kyumin Hwang
  • Wei Peng
  • Minwoo Choi
  • Sunghoon Im

Self-supervised monocular depth estimation (SSMDE) aims to predict the dense depth map of a monocular image, by learning depth from RGB image sequences, eliminating the need for ground-truth depth labels. Although this approach simplifies data acquisition compared to supervised methods, it struggles with reflective surfaces, as they violate the assumptions of Lambertian reflectance, leading to inaccurate training on such surfaces. To tackle this problem, we propose a novel training strategy for an SSMDE by leveraging triplet mining to pinpoint reflective regions at the pixel level, guided by the camera geometry between different viewpoints. The proposed reflection-aware triplet mining loss specifically penalizes the inappropriate photometric error minimization on the localized reflective regions while preserving depth accuracy on non-reflective areas. We also incorporate a reflection-aware knowledge distillation method that enables a student model to selectively learn the pixel-level knowledge from reflective and non-reflective regions. This results in robust depth estimation across areas. Evaluation results on multiple datasets demonstrate that our method effectively enhances depth quality on reflective surfaces and outperforms state-of-the-art SSMDE baselines.

JBHI Journal 2025 Journal Article

The Large Language Models on Biomedical Data Analysis: A Survey

  • Wei Lan
  • Zhentao Tang
  • Mingyang Liu
  • Qingfeng Chen
  • Wei Peng
  • Yi-Ping Phoebe Chen
  • Yi Pan

With the rapid development of Large Language Model (LLM) technology, it has become an indispensable force in biomedical data analysis research. However, biomedical researchers currently have limited knowledge about LLM. Therefore, there is an urgent need for a summary of LLM applications in biomedical data analysis. Herein, we propose this review by summarizing the latest research work on LLM in biomedicine. In this review, LLM techniques are first outlined. We then discuss biomedical datasets and frameworks for biomedical data analysis, followed by a detailed analysis of LLM applications in genomics, proteomics, transcriptomics, radiomics, single-cell analysis, medical texts and drug discovery. Finally, the challenges of LLM in biomedical data analysis are discussed. In summary, this review is intended for researchers interested in LLM technology and aims to help them understand and apply LLM in biomedical data analysis research.

AAAI Conference 2024 Conference Paper

Learning from Failure: Improving Meeting Summarization without Good Samples

  • Ke Wang
  • Xiutian Zhao
  • Wei Peng

Existing methods aligning language models with various human needs rely heavily on high-quality and task-specific data. However, industrial deployment of task-specific language models often encounters challenges in the availability of appropriate training samples. Taking meeting summarization for instance, public datasets are scarce, and private corpora are also hard to obtain due to privacy issues or resource-demanding annotation. To improve meeting summarization in the absence of positively-rated (i.e., "good") samples, we propose Score Tuning, a cold start tuning framework that leverages bad samples of distinguishable degrees to incrementally enhance the performance of summary generation without an initial presence of good samples. Our method utilizes asynchronous and numerical human feedback that measures the quality of generated summaries. Formulating data into triplets of (transcript, summary, score), our approach instructs a pre-trained model to learn the association between summary qualities and human-rated scores and hence to generate better summaries corresponding to higher scores. The experiment results show that our method is effective in improving meeting summarization on both English and Chinese corpora while requiring less annotated data and training resources compared to existing alignment methods. Additionally, we also preliminarily explore the transferability of our approach in machine translation tasks and demonstrate its potential for future development and usage in other domains.

AAAI Conference 2024 Conference Paper

MorphVAE: Advancing Morphological Design of Voxel-Based Soft Robots with Variational Autoencoders

  • Junru Song
  • Yang Yang
  • Wei Peng
  • Weien Zhou
  • Feifei Wang
  • Wen Yao

Soft robot design is an intricate field with unique challenges due to its complex and vast search space. In the past literature, evolutionary computation algorithms, including novel probabilistic generative models (PGMs), have shown potential in this realm. However, these methods are sample inefficient and predominantly focus on rigid robots in locomotion tasks, which limit their performance and application in robot design automation. In this work, we propose MorphVAE, an innovative PGM that incorporates a multi-task training scheme and a meticulously crafted sampling technique termed "continuous natural selection", aimed at bolstering sample efficiency. This method empowers us to gain insights from assessed samples across diverse tasks and temporal evolutionary stages, while simultaneously maintaining a delicate balance between optimization efficiency and biodiversity. Through extensive experiments in various locomotion and manipulation tasks, we substantiate the efficiency of MorphVAE in generating high-performing and diverse designs, surpassing the performance of competitive baselines.

AAAI Conference 2024 Conference Paper

Self-Distillation Regularized Connectionist Temporal Classification Loss for Text Recognition: A Simple Yet Effective Approach

  • Ziyin Zhang
  • Ning Lu
  • Minghui Liao
  • Yongshuai Huang
  • Cheng Li
  • Min Wang
  • Wei Peng

Text recognition methods are gaining rapid development. Some advanced techniques, e.g., powerful modules, language models, and un- and semi-supervised learning schemes, consecutively push the performance on public benchmarks forward. However, the problem of how to better optimize a text recognition model from the perspective of loss functions is largely overlooked. CTC-based methods, widely used in practice due to their good balance between performance and inference speed, still grapple with accuracy degradation. This is because CTC loss emphasizes the optimization of the entire sequence target while neglecting to learn individual characters. We propose a self-distillation scheme for CTC-based model to address this issue. It incorporates a framewise regularization term in CTC loss to emphasize individual supervision, and leverages the maximizing-a-posteriori of latent alignment to solve the inconsistency problem that arises in distillation between CTC-based models. We refer to the regularized CTC loss as Distillation Connectionist Temporal Classification (DCTC) loss. DCTC loss is module-free, requiring no extra parameters, longer inference lag, or additional training data or phases. Extensive experiments on public benchmarks demonstrate that DCTC can boost text recognition model accuracy by up to 2.6%, without any of these drawbacks.

EAAI Journal 2023 Journal Article

Joint deep reversible regression model and physics-informed unsupervised learning for temperature field reconstruction

  • Zhiqiang Gong
  • Weien Zhou
  • Jun Zhang
  • Wei Peng
  • Wen Yao

Temperature monitoring over heat source components in engineering systems, such as energy systems and electronic equipment, is essential to guarantee the working performance of these components. However, prior methods, which mainly use interpolation to reconstruct the overall temperature field from limited monitoring points, require large amounts of temperature tensors for an accurate estimation. This may affect the availability and reliability of the system. To solve the problem, this work develops a novel reconstruction method which joins the deep reversible regression model and physics-informed unsupervised learning for temperature field reconstruction of heat-source systems (TFR-HSS). Firstly, we define the TFR-HSS mathematically, numerically model the system with discrete grids, and hence transform the task into an image-to-image regression problem. Then, this work develops the deep reversible regression model which can better learn physical information, especially over the area near the boundaries of the system. Finally, this work proposes the physics-informed reconstruction loss with the physical characteristics of the system and learns the deep model without labelled samples. Experimental studies have been conducted over typical two-dimensional heat-source systems to validate the effectiveness of the proposed method. Under the proposed method, the mean average error of the constructed temperature field can achieve about 0.1 K, 50% lower than other methods. Besides, the proposed method takes 5.2 ms per sample for inference, which can provide real-time predictions.

AAAI Conference 2023 Conference Paper

Learning to Know Myself: A Coarse-to-Fine Persona-Aware Training Framework for Personalized Dialogue Generation

  • Yunpeng Li
  • Yue Hu
  • Yajing Sun
  • Luxi Xing
  • Ping Guo
  • Yuqiang Xie
  • Wei Peng

A critical challenge for open-domain dialogue agents is to generate persona-relevant and consistent responses. Due to the nature of persona sparsity in conversation scenarios, previous persona-based dialogue agents trained with Maximum Likelihood Estimation tend to overlook the given personas and generate responses irrelevant or inconsistent with personas. To address this problem, we propose a two-stage coarse-to-fine persona-aware training framework to improve the persona consistency of a dialogue agent progressively. Specifically, our framework first trains the dialogue agent to answer the constructed persona-aware questions, making it highly sensitive to the personas to generate persona-relevant responses. Then the dialogue agent is further trained with a contrastive learning paradigm by explicitly perceiving the difference between the consistent and the generated inconsistent responses, forcing it to pay more attention to the key persona information to generate consistent responses. By applying our proposed training framework to several representative baseline models, experimental results show significant boosts on both automatic and human evaluation metrics, especially the consistency of generated responses.

IJCAI Conference 2022 Conference Paper

Control Globally, Understand Locally: A Global-to-Local Hierarchical Graph Network for Emotional Support Conversation

  • Wei Peng
  • Yue Hu
  • Luxi Xing
  • Yuqiang Xie
  • Yajing Sun
  • Yunpeng Li

Emotional support conversation aims at reducing the emotional distress of the help-seeker, which is a new and challenging task. It requires the system to explore the cause of the help-seeker's emotional distress and understand their psychological intention to provide supportive responses. However, existing methods mainly focus on the sequential contextual information, ignoring the hierarchical relationships with the global cause and local psychological intention behind conversations, which leads to a weak ability to provide emotional support. In this paper, we propose a Global-to-Local Hierarchical Graph Network (GLHG) to capture the multi-source information (global cause, local intentions and dialog history) and model hierarchical relationships between them, which consists of a multi-source encoder, a hierarchical graph reasoner, and a global-guide decoder. Furthermore, a novel training objective is designed to monitor semantic information of the global cause. Experimental results on the emotional support conversation dataset, ESConv, confirm that the proposed GLHG achieves state-of-the-art performance on the automatic and human evaluations.

NeurIPS Conference 2022 Conference Paper

FNeVR: Neural Volume Rendering for Face Animation

  • Bohan Zeng
  • Boyu Liu
  • Hong Li
  • Xuhui Liu
  • Jianzhuang Liu
  • Dapeng Chen
  • Wei Peng
  • Baochang Zhang

Face animation, one of the hottest topics in computer vision, has achieved a promising performance with the help of generative models. However, it remains a critical challenge to generate identity preserving and photo-realistic images due to the sophisticated motion deformation and complex facial detail modeling. To address these problems, we propose a Face Neural Volume Rendering (FNeVR) network to fully explore the potential of 2D motion warping and 3D volume rendering in a unified framework. In FNeVR, we design a 3D Face Volume Rendering (FVR) module to enhance the facial details for image rendering. Specifically, we first extract 3D information with a well designed architecture, and then introduce an orthogonal adaptive ray-sampling module for efficient rendering. We also design a lightweight pose editor, enabling FNeVR to edit the facial pose in a simple yet effective way. Extensive experiments show that our FNeVR obtains the best overall quality and performance on widely used talking-head benchmarks.

IS Journal 2022 Journal Article

Multiscale 3D-Shift Graph Convolution Network for Emotion Recognition From Human Actions

  • Henglin Shi
  • Wei Peng
  • Haoyu Chen
  • Xin Liu
  • Guoying Zhao

Emotion recognition from body gestures is challenging since similar emotions can be expressed by arbitrary spatial configurations of joints, which results in relying on modeling spatial-temporal patterns from a more global level. However, most recent powerful graph convolution networks (GCNs) separate the spatial and temporal modeling into isolated processes, where GCN models spatial interactions using partially fixed adjacent matrices and 1D convolution captures temporal dynamics, which is insufficient for emotion recognition. In this work, we propose the 3D-Shift GCN, which enables interactions of joints within a spatial-temporal volume for global feature extraction. Besides, we further develop a multiscale architecture, the MS-Shift GCN, to fuse features captured under different temporal ranges for modeling richer dynamics. After conducting evaluation on two regular action recognition benchmarks and two gesture based emotion recognition datasets, the results show that the proposed method outperforms several state-of-the-art methods.

JBHI Journal 2022 Journal Article

Predicting Drug Response Based on Multi-Omics Fusion and Graph Convolution

  • Wei Peng
  • Tielin Chen
  • Wei Dai

Different cancer patients may respond differently to cancer treatment due to the heterogeneity of cancer. It is an urgent task to develop an efficient computational method to identify drug responses in different cell lines, which guides us to design personalized therapy for an individual patient. Hence, we propose an end-to-end algorithm, namely MOFGCN, to predict drug response in cell lines based on Multi-Omics Fusion and Graph Convolution Network. MOFGCN first fuses multiple omics data to calculate the cell line similarity and then constructs a heterogeneous network by combining the cell line similarity, drug similarity, and the known cell line-drug associations. Secondly, it learns the latent features for cancer cell lines and drugs by performing graph convolution operations on the heterogeneous network. Finally, MOFGCN applies the linear correlation coefficient to reconstruct the cancer cell line-drug correlation matrix to predict drug sensitivity. To our knowledge, this is the first attempt to combine graph convolutional neural network and linear correlation coefficient for this significant task. We performed extensive evaluation experiments on the Genomics of Drug Sensitivity in Cancer (GDSC) and Cancer Cell Line Encyclopedia (CCLE) databases to validate MOFGCN’s performance. The experimental results show that MOFGCN is superior to the state-of-the-art algorithms in predicting missing drug responses. It also leads to higher performance in predicting drug responses for new cell lines, new drugs, and targeted drugs.
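The final decoding step described in this abstract (a linear correlation coefficient between learned cell line and drug features) can be illustrated with a minimal sketch. It assumes the "linear correlation coefficient" is the Pearson correlation between embedding vectors; the function names and the toy embeddings below are hypothetical and are not taken from the MOFGCN code.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def correlation_decoder(cell_emb, drug_emb):
    """Score every (cell line, drug) pair by the correlation of their embeddings."""
    return [[pearson(c, d) for d in drug_emb] for c in cell_emb]

# Toy embeddings (hypothetical values): 2 cell lines, 2 drugs, 3 features each.
cell_emb = [[1.0, 2.0, 3.0], [3.0, 1.0, 2.0]]
drug_emb = [[2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]

# scores[i][j] lies in [-1, 1]; a higher value suggests a stronger predicted association.
scores = correlation_decoder(cell_emb, drug_emb)
```

In a real pipeline the embeddings would come from the graph convolution layers, and the scores would typically be passed through an activation before being compared with known responses; both of those steps are outside this sketch.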

JMLR Journal 2022 Journal Article

Scalable and Efficient Hypothesis Testing with Random Forests

  • Tim Coleman
  • Wei Peng
  • Lucas Mentch

Throughout the last decade, random forests have established themselves as among the most accurate and popular supervised learning methods. While their black-box nature has made their mathematical analysis difficult, recent work has established important statistical properties like consistency and asymptotic normality by considering subsampling in lieu of bootstrapping. Though such results open the door to traditional inference procedures, all formal methods suggested thus far place severe restrictions on the testing framework and their computational overhead often precludes their practical scientific use. Here we propose a hypothesis test to formally assess feature significance, which uses permutation tests to circumvent computationally infeasible estimates of nuisance parameters. This test is intended to be analogous to the F-test for linear regression. We establish asymptotic validity of the test via exchangeability arguments and show that the test maintains high power with orders of magnitude fewer computations. Importantly, the procedure scales easily to big data settings where large training and testing sets may be employed, conducting statistically valid inference without the need to construct additional models. Simulations and applications to ecological data, where random forests have recently shown promise, are provided.

EAAI Journal 2022 Journal Article

Temperature field inversion of heat-source systems via physics-informed neural networks

  • Xu Liu
  • Wei Peng
  • Zhiqiang Gong
  • Weien Zhou
  • Wen Yao

Temperature field inversion of heat-source systems (TFI-HSS) with limited observations is essential to monitor the system health. Although some methods such as interpolation have been proposed to solve TFI-HSS, those existing methods ignore correlations between data constraints and physics constraints, resulting in low precision. In this work, we develop a physics-informed neural network-based temperature field inversion (PINN-TFI) method to solve the TFI-HSS task and a coefficient matrix condition number based position selection of observations (CMCN-PSO) method to select optimal positions of noisy observations. For the TFI-HSS task, the PINN-TFI method encodes constraint terms into the loss function, and thus the task is transformed into an optimization problem of minimizing the loss function. In addition, we have found that noise significantly affects the reconstruction performance of the PINN-TFI method. To alleviate the effect of noise in observations, we propose the CMCN-PSO method to find optimal positions, where the condition number of observations is used to evaluate positions. The results demonstrate that the PINN-TFI method can significantly improve prediction precision and the CMCN-PSO method can find good positions to improve the robustness of the PINN-TFI method.

IS Journal 2021 Journal Article

Adaptive Modality Distillation for Separable Multimodal Sentiment Analysis

  • Wei Peng
  • Xiaopeng Hong
  • Guoying Zhao

Multimodal sentiment analysis has increasingly attracted attention since with the arrival of complementary data streams, it has great potential to improve and go beyond unimodal sentiment analysis. In this article, we present an efficient separable multimodal learning method to deal with the tasks with modality missing issue. In this method, the multimodal tensor is utilized to guide the evolution of each separated modality representation. To save the computational expense, Tucker decomposition is introduced, which leads to a general extension of the low-rank tensor fusion method with more modality interactions. The method, in turn, enhances our modality distillation processing. Comprehensive experiments on three popular multimodal sentiment analysis datasets, CMU-MOSI, POM, and IEMOCAP, show a superior performance especially when only partial modalities are available.

AIIM Journal 2021 Journal Article

Network differentiation: A computational method of pathogenesis diagnosis in traditional Chinese medicine based on systems science

  • Qiang Xu
  • Qiang Guo
  • Chun-Xia Wang
  • Song Zhang
  • Chuan-Biao Wen
  • Tao Sun
  • Wei Peng
  • Jun Chen

Resembling the role of disease diagnosis in Western medicine, pathogenesis (also called Bing Ji) diagnosis is one of the most important tasks in traditional Chinese medicine (TCM). In TCM theory, pathogenesis is a complex system composed of a group of interrelated factors, which is highly consistent with the character of systems science (SS). In this paper, we introduce a heuristic definition called pathogenesis network (PN) to represent pathogenesis in the form of a directed graph. Accordingly, a computational method of pathogenesis diagnosis, called network differentiation (ND), is proposed by integrating the holism principle in SS. ND consists of three stages. The first stage is to generate all possible diagnoses by a Cartesian product over the specified prior knowledge corresponding to the input symptoms. The second stage is to screen the valid diagnoses by the holism principle. The third stage is to pick out the clinical diagnosis by physician-computer interaction. Some theorems are stated and proved for the further optimization of ND in this paper. We conducted simulation experiments on 100 clinical cases. The experimental results show that our proposed method has an excellent capability to fit the holistic thinking in the process of physician inference.
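
The first two stages described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the symptom-to-factor table, the function names, and the toy validity rule are all hypothetical, and candidate diagnoses are simplified to sets of factors rather than the directed pathogenesis networks the paper defines.

```python
from itertools import product

# Hypothetical prior knowledge: each symptom maps to candidate pathogenesis factors.
PRIOR_KNOWLEDGE = {
    "fatigue": ["qi deficiency", "dampness"],
    "pale tongue": ["qi deficiency", "blood deficiency"],
}


def generate_candidates(symptoms, knowledge):
    """Stage 1 (sketch): pick one factor per symptom, in every possible combination."""
    options = [knowledge[s] for s in symptoms]
    return [frozenset(combo) for combo in product(*options)]


def screen(candidates, is_valid):
    """Stage 2 (sketch): keep distinct candidates that pass a validity check.

    `is_valid` is a stand-in for the holism principle, which is not specified here.
    """
    kept = []
    for candidate in candidates:
        if is_valid(candidate) and candidate not in kept:
            kept.append(candidate)
    return kept


candidates = generate_candidates(["fatigue", "pale tongue"], PRIOR_KNOWLEDGE)
# Toy rule (assumption): accept only diagnoses explained by a single factor.
valid = screen(candidates, lambda c: len(c) <= 1)
```

With two symptoms and two candidate factors each, the Cartesian product yields four candidate diagnoses, which shows why the number of candidates grows multiplicatively with the number of symptoms and why a screening stage is needed before physician review.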

EAAI Journal 2020 Journal Article

Interval type-2 fuzzy logic based transmission power allocation strategy for lifetime maximization of WSNs

  • Wei Peng
  • Chengdong Li
  • Guiqing Zhang
  • Jianqiang Yi

In wireless sensor networks (WSNs), it is critical to design an advisable transmission power allocation strategy for balancing the latency and energy efficiency, and prolonging the lifetime of WSNs. However, some measured key parameters, e.g., data latency, energy consumption and communication radius, are with high levels of uncertainties, which deteriorate the transmission power allocation performance greatly. How to employ an advanced method to deal with the uncertainties and to further improve the network performance is a pressing issue. Type-2 fuzzy logic system (T2FLS) as a powerful tool for handling the uncertainties provides an effective way for designing such advisable allocation strategies. Therefore, this paper adopts the interval T2FLS (IT2FLS) to design the transmission power allocation (TPA) strategy for lifetime maximization of WSNs. Firstly, the problem of lifetime enhancement in WSNs is formulated in detail, and then it is converted into a TPA problem. Secondly, the IT2FLS method is applied to the transmission power decision making process for maximizing the lifetime of WSNs. In the designed IT2FLS-based TPA strategy, expected latency, residual energy and distance between nodes are taken as input variables, while the transmission power and communication radius are considered as the output variables. Finally, both simulation and experiment results are given. The results indicate that the proposed TPA strategy using IT2FLS can effectively realize the tradeoff between the latency and energy efficiency, and can prolong the network lifetime of the WSNs. Moreover, compared with other TPA strategies, including the minimum total energy algorithm, the flow augmentation algorithm and the type-1 fuzzy logic method, the proposed IT2FLS-based TPA strategy has obvious advantages in terms of network lifetime, average latency and energy consumption.

AAAI Conference 2020 Conference Paper

Learning Graph Convolutional Network for Skeleton-Based Human Action Recognition by Neural Searching

  • Wei Peng
  • Xiaopeng Hong
  • Haoyu Chen
  • Guoying Zhao

Human action recognition from skeleton data, fuelled by the Graph Convolutional Network (GCN) with its powerful capability of modeling non-Euclidean data, has attracted lots of attention. However, many existing GCNs provide a pre-defined graph structure and share it through the entire network, which can lose implicit joint correlations especially for the higher-level features. Besides, the mainstream spectral GCN is approximated by one-order hop such that higher-order connections are not well involved. All of these require huge efforts to design a better GCN architecture. To address these problems, we turn to Neural Architecture Search (NAS) and propose the first automatically designed GCN for this task. Specifically, we explore the spatial-temporal correlations between nodes and build a search space with multiple dynamic graph modules. Besides, we introduce multiple-hop modules and expect to break the limitation of representational capacity caused by one-order approximation. Moreover, a corresponding sampling- and memory-efficient evolution strategy is proposed to search in this space. The resulting architecture proves the effectiveness of the higher-order approximation and the layer-wise dynamic graph modules. To evaluate the performance of the searched model, we conduct extensive experiments on two very large scale skeleton-based action recognition datasets. The results show that our model gets the state-of-the-art results in terms of given metrics.

JBHI Journal 2020 Journal Article

Multi-Task Joint Learning Model for Segmenting and Classifying Tongue Images Using a Deep Neural Network

  • Qiang Xu
  • Yu Zeng
  • Wenjun Tang
  • Wei Peng
  • Tingwei Xia
  • Zongrun Li
  • Fei Teng
  • Weihong Li

Automatic tongue image segmentation and tongue image classification are two crucial tongue characterization tasks in traditional Chinese medicine (TCM). Due to the complexity of tongue segmentation and fine-grained traits of tongue image classification, both tasks are challenging. Fortunately, from the perspective of computer vision, these two tasks are highly interrelated, making them compatible with the idea of Multi-Task Joint learning (MTL). By sharing the underlying parameters and adding two different task loss functions, an MTL method for segmenting and classifying tongue images is proposed in this paper. Moreover, two state-of-the-art deep neural network variants (UNET and Discriminative Filter Learning (DFL)) are fused into the MTL to perform these two tasks. To the best of our knowledge, our method is the first attempt to manage both tasks simultaneously with MTL. We conducted extensive experiments with the proposed method. The experimental results show that our joint method outperforms the existing tongue characterization methods. Besides, visualizations and ablation studies are provided to aid in understanding our approach, which suggest that our method is highly consistent with human perception.

TIST Journal 2012 Journal Article

Mining the “Voice of the Customer” for Business Prioritization

  • Wei Peng
  • Tong Sun
  • Shriram Revankar
  • Tao Li

To gain competitiveness and sustained growth in the 21st century, most businesses are on a mission to become more customer-centric. In order to succeed in this endeavor, it is crucial not only to synthesize and analyze the VOC (the VOice of the Customer) data (i.e., the feedback or requirements raised by customers), but also to quickly turn these data into actionable knowledge. Although there are many technologies being developed in this complex problem space, most existing approaches in analyzing customer requests are ad hoc, time-consuming, error-prone, people-based processes which hardly scale well as the quantity of customer information explodes. This often results in the slow response to customer requests. In this article, in order to mine VOC to extract useful knowledge for the best product or service quality, we develop a hybrid framework that integrates domain knowledge with data-driven approaches to analyze the semi-structured customer requests. The framework consists of capturing functional features, discovering the overlap or correlation among the features, and identifying the evolving feature trend by using the knowledge transformation model. In addition, since understanding the relative importance of the individual customer request is very critical and has a direct impact on the effective prioritization in the development process, we develop a novel semantic enhanced link-based ranking (SELRank) algorithm for relatively rating/ranking both customer requests and products. The framework has been successfully applied on Xerox Office Group Feature Enhancement Requirements (XOG FER) datasets to analyze customer requests.

IS Journal 2008 Journal Article

Managing Household Wind-Energy Generation

  • G. James
  • Wei Peng
  • Ke Deng

This article describes the use of intelligent-agent technology to aggregate wind-energy generation installed at numerous households and battery storage. This scenario creates a "virtual generator" that can be dispatched on the electricity grid in a manner similar to centralized generation. The purpose of aggregation is to sell renewable generation to the electricity network and market at a price commensurate with its true value.