Arrow Research search

Author name cluster

Kun Huang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

18 papers
2 author rows

Possible papers

18

ECAI Conference 2023 Conference Paper

Grafting Fine-Tuning and Reinforcement Learning for Empathetic Emotion Elicitation in Dialog Generation

  • Ying Zhu
  • Bo Wang 0011
  • Dongming Zhao
  • Kun Huang
  • Zhuoxuan Jiang
  • Ruifang He
  • Yuexian Hou

For human-like dialogue systems, it is important both to inject empathetic ability and to elicit the interlocutor's positive emotions, yet existing studies mostly focus on only one of these two research lines. In this work, we propose a novel, grafted task named Empathetic Emotion Elicitation Dialog to make a dialog system possess both abilities simultaneously. We do not train an empathetic dialog system and an emotion elicitation dialog system separately and then simply concatenate the responses generated by these two systems, which would cause illogical and repetitive responses. Instead, we propose a unified solution: (1) To generate empathetic responses and emotion elicitation responses within the same semantic space, we design a unified framework. (2) The unified framework has three stages: it first retrieves empathetic and emotion elicitation exemplars as external knowledge, then fine-tunes emotion/action prediction on a pre-trained language model to enhance the empathetic ability, and finally models user feedback via reinforcement learning to enhance the emotion elicitation ability. Experiments show that our method outperforms the baselines in response generation quality while simultaneously empathizing with the user and eliciting their positive emotions.

JBHI Journal 2023 Journal Article

LAGAN: Lesion-Aware Generative Adversarial Networks for Edema Area Segmentation in SD-OCT Images

  • Yuhui Tao
  • Xiao Ma
  • Yizhe Zhang
  • Kun Huang
  • Zexuan Ji
  • Wen Fan
  • Songtao Yuan
  • Qiang Chen

A large volume of labeled data is a cornerstone for deep learning (DL) based segmentation methods. Medical images require domain experts to annotate, and full segmentation annotations of large volumes of medical data are difficult, if not impossible, to acquire in practice. Compared with full annotations, image-level labels are multiple orders of magnitude faster and easier to obtain. Image-level labels contain rich information that correlates with the underlying segmentation tasks and should be utilized in modeling segmentation problems. In this article, we aim to build a robust DL-based lesion segmentation model using only image-level labels (normal vs. abnormal). Our method consists of three main steps: (1) training an image classifier with image-level labels; (2) utilizing a model visualization tool to generate an object heat map for each training sample according to the trained classifier; (3) based on the generated heat maps (as pseudo-annotations) and an adversarial learning framework, constructing and training an image generator for Edema Area Segmentation (EAS). We name the proposed method Lesion-Aware Generative Adversarial Networks (LAGAN) as it combines the merits of supervised learning (being lesion-aware) and adversarial training (for image generation). Additional technical treatments, such as the design of a multi-scale patch-based discriminator, further enhance the effectiveness of our proposed method. We validate the superior performance of LAGAN via comprehensive experiments on two publicly available datasets (i.e., AI Challenger and RETOUCH).

ICLR Conference 2023 Conference Paper

Learning Sparse Group Models Through Boolean Relaxation

  • Yijie Wang
  • Yuan Zhou 0007
  • Xiaoqing Huang
  • Kun Huang
  • Jie Zhang
  • Jianzhu Ma

We introduce an efficient algorithmic framework for learning sparse group models formulated as the natural convex relaxation of a cardinality-constrained program with Boolean variables. We provide theoretical techniques to characterize the equivalent condition when the relaxation achieves the exact integral optimal solution, as well as a rounding algorithm to produce a feasible integral solution once the optimal relaxation solution is fractional. We demonstrate the power of our equivalent condition by applying it to two ensembles of random problem instances that are challenging and popularly used in literature and prove that our method achieves exactness with overwhelming probability and nearly optimal sample complexity. Empirically, we use synthetic datasets to demonstrate that our proposed method significantly outperforms the state-of-the-art group sparse learning models in terms of individual and group support recovery when the number of samples is small. Furthermore, we show the out-performance of our method in cancer drug response prediction.

AAMAS Conference 2023 Conference Paper

Think Twice: A Human-like Two-stage Conversational Agent for Emotional Response Generation

  • Yushan Qian
  • Bo Wang
  • Shangzhao Ma
  • Wu Bin
  • Shuo Zhang
  • Dongming Zhao
  • Kun Huang
  • Yuexian Hou

Towards human-like dialogue systems, current emotional dialogue approaches jointly model emotion and semantics with a unified neural network. This strategy tends to generate safe responses due to the mutual restriction between emotion and semantics, and requires a rare, large-scale emotion-annotated dialogue corpus. Inspired by the "think twice" behavior in human intelligent dialogue, we propose a two-stage conversational agent for the generation of emotional dialogue. First, a dialogue model trained without the emotion-annotated dialogue corpus generates a prototype response that meets the contextual semantics. Second, the first-stage prototype is modified by a controllable emotion refiner with the empathy hypothesis. Experimental results on the DailyDialog and EmpatheticDialogues datasets demonstrate that the proposed conversational agent outperforms the compared models in emotion generation and maintains semantic performance in both automatic and human evaluations.

NeurIPS Conference 2023 Conference Paper

Towards Efficient Pre-Trained Language Model via Feature Correlation Distillation

  • Kun Huang
  • Xin Guo
  • Meng Wang

Knowledge Distillation (KD) has emerged as a promising approach for compressing large Pre-trained Language Models (PLMs). The performance of KD relies on how effectively the knowledge is formulated and transferred from the teacher model to the student model. Prior arts mainly focus on directly aligning output features from the transformer block, which may impose overly strict constraints on the student model's learning process and complicate training by introducing extra parameters and computational cost. Moreover, our analysis indicates that the different relations within self-attention, as adopted in other works, involve more computational complexity and can easily be constrained by the number of heads, potentially leading to suboptimal solutions. To address these issues, we propose a novel approach that builds relationships directly from output features. Specifically, we introduce token-level and sequence-level relations concurrently to fully exploit the knowledge from the teacher model. Furthermore, we propose a correlation-based distillation loss to alleviate the exact-match properties inherent in traditional KL divergence or MSE loss functions. Our method, dubbed FCD, is a simple yet effective way to compress various architectures (BERT, RoBERTa, and GPT) and model sizes (base-size and large-size). Extensive experimental results demonstrate that our distilled, smaller language models significantly surpass existing KD methods across various NLP tasks.
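The relation-based idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration: `token_relation` is one plausible instantiation of a token-level relation (pairwise cosine similarities of output features), and the loss uses Pearson correlation between relation matrices so only the relational pattern, not exact values, must match; the paper's exact definitions may differ.

```python
import numpy as np

def token_relation(feats):
    """Token-level relation matrix: cosine similarities between the
    L2-normalized output features of all token pairs (illustrative
    choice; FCD's exact relation may differ)."""
    f = feats / np.linalg.norm(feats, axis=-1, keepdims=True)
    return f @ f.T

def correlation_loss(student, teacher):
    """Correlation-based distillation loss: 1 - Pearson correlation
    between the flattened student and teacher relation matrices,
    relaxing the exact-match behavior of MSE/KL objectives."""
    s = token_relation(student).ravel()
    t = token_relation(teacher).ravel()
    s = (s - s.mean()) / (s.std() + 1e-8)
    t = (t - t.mean()) / (t.std() + 1e-8)
    return 1.0 - float(np.mean(s * t))
```

Note that the loss is invariant to a uniform rescaling of either model's features, which is one way a correlation objective avoids forcing an exact match.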

JBHI Journal 2022 Journal Article

A Deep Language Model for Symptom Extraction From Clinical Text and its Application to Extract COVID-19 Symptoms From Social Media

  • Xiao Luo
  • Priyanka Gandhi
  • Susan Storey
  • Kun Huang

Patients experience various symptoms when they have either acute or chronic diseases or undergo treatments for them. Symptoms are often indicators of the severity of the disease and the need for hospitalization. Symptoms are often described in free text written as clinical notes in the Electronic Health Records (EHR) and are not integrated with other clinical factors for disease prediction and healthcare outcome management. In this research, we propose a novel deep language model to extract patient-reported symptoms from clinical text. The deep language model integrates syntactic and semantic analysis for symptom extraction and distinguishes the actual symptoms reported by patients from conditional or negated symptoms. The deep language model can extract both complex and straightforward symptom expressions. We used a real-world clinical notes dataset to evaluate our model and demonstrated that it achieves superior performance compared to three other state-of-the-art symptom extraction models. We extensively analyzed our model to illustrate its effectiveness by examining each component's contribution. Finally, we applied our model to a COVID-19 tweets dataset to extract COVID-19 symptoms. The results show that our model can identify all the symptoms suggested by the Centers for Disease Control and Prevention (CDC) ahead of their timeline, as well as many rare symptoms.

ICRA Conference 2022 Conference Paper

Accurate Calibration of Multi-Perspective Cameras from a Generalization of the Hand-Eye Constraint

  • Yifu Wang
  • Wenqing Jiang
  • Kun Huang
  • Sören Schwertfeger
  • Laurent Kneip

Multi-perspective cameras are quickly gaining importance in many applications such as smart vehicles and virtual or augmented reality. However, a large system size or absence of overlap in neighbouring fields-of-view often complicate their calibration. We present a novel solution which relies on the availability of an external motion capture system. Our core contribution consists of an extension to the hand-eye calibration problem which jointly solves multi-eye-to-base problems in closed form. We furthermore demonstrate its equivalence to the multi-eye-in-hand problem. The practical validity of our approach is supported by our experiments, indicating that the method is highly efficient and accurate, and outperforms existing closed-form alternatives.

ICLR Conference 2022 Conference Paper

Know Thyself: Transferable Visual Control Policies Through Robot-Awareness

  • Edward S. Hu
  • Kun Huang
  • Oleh Rybkin
  • Dinesh Jayaraman

Training visual control policies from scratch on a new robot typically requires generating large amounts of robot-specific data. How might we leverage data previously collected on another robot to reduce or even completely remove this need for robot-specific data? We propose a "robot-aware control" paradigm that achieves this by exploiting readily available knowledge about the robot. We then instantiate this in a robot-aware model-based RL policy by training modular dynamics models that couple a transferable, robot-aware world dynamics module with a robot-specific, potentially analytical, robot dynamics module. This also enables us to set up visual planning costs that separately consider the robot agent and the world. Our experiments on tabletop manipulation tasks with simulated and real robots demonstrate that these plug-in improvements dramatically boost the transferability of visual model-based RL policies, even permitting zero-shot transfer of visual manipulation skills onto new robots. Project website: https://www.seas.upenn.edu/~hued/rac

AAAI Conference 2022 Conference Paper

Width & Depth Pruning for Vision Transformers

  • Fang Yu
  • Kun Huang
  • Meng Wang
  • Yuan Cheng
  • Wei Chu
  • Li Cui

Transformer models have demonstrated their promising potential and achieved excellent performance on a series of computer vision tasks. However, the huge computational cost of vision transformers hinders their deployment and application to edge devices. Recent works have proposed to find and remove the unimportant units of vision transformers. Despite achieving remarkable results, these methods take only the width dimension of the network into consideration and ignore network depth, which is another important dimension for pruning vision transformers. Therefore, we propose a Width & Depth Pruning (WDPruning) framework that reduces both width and depth dimensions simultaneously. Specifically, for width pruning, a set of learnable pruning-related parameters is used to adaptively adjust the width of the transformer. For depth pruning, we introduce several shallow classifiers by using the intermediate information of the transformer blocks, which allows images to be classified by shallow classifiers instead of the deeper ones. In the inference period, all of the blocks after the shallow classifiers can be dropped, so they bring no additional parameters or computation. Experimental results on benchmark datasets demonstrate that the proposed method can significantly reduce the computational costs of mainstream vision transformers such as DeiT and Swin Transformer with a minor accuracy drop. In particular, on ILSVRC-12, we achieve over a 22% pruning ratio of FLOPs by compressing DeiT-Base, even with an increase of 0.14% Top-1 accuracy.

JBHI Journal 2021 Journal Article

A Computational Framework to Analyze the Associations Between Symptoms and Cancer Patient Attributes Post Chemotherapy Using EHR Data

  • Xiao Luo
  • Priyanka Gandhi
  • Susan Storey
  • Zuoyi Zhang
  • Zhi Han
  • Kun Huang

Patients with cancer, such as breast and colorectal cancer, often experience different symptoms post-chemotherapy. The symptoms could be fatigue, gastrointestinal (nausea, vomiting, lack of appetite), psychoneurological (depressive symptoms, anxiety), or other types. Previous research focused on understanding the symptoms using survey data. In this research, we propose to utilize the data within the Electronic Health Record (EHR). A computational framework is developed that uses a natural language processing (NLP) pipeline to extract the clinician-documented symptoms from clinical notes. Then, a patient clustering method based on symptom severity levels groups the patients into clusters. Association rule mining is used to analyze the associations between symptoms and patient attributes (smoking history, number of comorbidities, diabetes status, age at diagnosis) within the patient clusters. The results show that the various symptom types and severity levels have different associations between breast and colorectal cancers and across different timeframes post-chemotherapy. The results also show that patients with breast or colorectal cancers who smoke and have severe fatigue are likely to have severe gastrointestinal symptoms six months after chemotherapy. Our framework can be generalized to analyze symptoms or symptom clusters of other chronic diseases where symptom management is critical.

ICRA Conference 2021 Conference Paper

B-splines for Purely Vision-based Localization and Mapping on Non-holonomic Ground Vehicles

  • Kun Huang
  • Yifu Wang
  • Laurent Kneip

Purely vision-based localization and mapping is a cost-effective and thus attractive solution to localization and mapping on smart ground vehicles. However, the accuracy and especially robustness of vision-only solutions remain rivalled by more expensive, lidar-based multi-sensor alternatives. We show that a significant increase in robustness can be achieved if taking non-holonomic kinematic constraints on the vehicle motion into account. Rather than using approximate planar motion models or simple, pair-wise regularization terms, we demonstrate the use of B-splines for an exact imposition of smooth, non-holonomic trajectories inside the 6 DoF bundle adjustment. We introduce both hard and soft formulations and compare their computational efficiency and accuracy against traditional solutions. Through results on both simulated and real data, we demonstrate a significant improvement in robustness and accuracy in degrading visual conditions.

IROS Conference 2021 Conference Paper

Dynamic Event Camera Calibration

  • Kun Huang
  • Yifu Wang
  • Laurent Kneip

Camera calibration is an important prerequisite towards the solution of 3D computer vision problems. Traditional methods rely on static images of a calibration pattern. This raises interesting challenges towards the practical usage of event cameras, which notably require image change to produce sufficient measurements. The current standard for event camera calibration therefore consists of using flashing patterns. They have the advantage of simultaneously triggering events in all reprojected pattern feature locations, but it is difficult to construct or use such patterns in the field. We present the first dynamic event camera calibration algorithm. It calibrates directly from events captured during relative motion between camera and calibration pattern. The method is propelled by a novel feature extraction mechanism for calibration patterns, and leverages existing calibration tools before optimizing all parameters through a multi-segment continuous-time formulation. As demonstrated through our results on real data, the obtained calibration method is highly convenient and reliably calibrates from data sequences spanning less than 10 seconds.

NeurIPS Conference 2021 Conference Paper

Fast Projection onto the Capped Simplex with Applications to Sparse Regression in Bioinformatics

  • Man Shun Ang
  • Jianzhu Ma
  • Nianjun Liu
  • Kun Huang
  • Yijie Wang

We consider the problem of projecting a vector onto the so-called k-capped simplex, which is a hypercube cut by a hyperplane. For an n-dimensional input vector with bounded elements, we found that a simple algorithm based on Newton's method is able to solve the projection problem to high precision with a complexity of roughly O(n), which has a much lower computational cost compared with the existing sorting-based methods proposed in the literature. We provide a theory for partial explanation and justification of the method. We demonstrate that the proposed algorithm can produce a solution of the projection problem with high precision on large-scale datasets, and that it significantly outperforms the state-of-the-art methods in terms of runtime (about 6-8 times faster than commercial software with respect to CPU time for input vectors with 1 million variables or more). We further illustrate the effectiveness of the proposed algorithm on solving sparse regression in a bioinformatics problem. Empirical results on a GWAS dataset (with 1,500,000 single-nucleotide polymorphisms) show that, when using the proposed method to accelerate the Projected Quasi-Newton (PQN) method, the accelerated PQN algorithm is able to handle huge-scale regression problems and is more efficient (about 3-6 times faster) than the current state-of-the-art methods.
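The projection problem in the abstract is concrete enough to sketch. The k-capped simplex is {x : 0 ≤ xᵢ ≤ 1, Σxᵢ = k}, and the projection of v has the form xᵢ = clip(vᵢ − τ, 0, 1) for a scalar τ found by root-finding. The sketch below uses bisection on τ for robustness; it is an illustrative variant, not the paper's Newton-type implementation, and the function name is our own.

```python
import numpy as np

def project_capped_simplex(v, k, tol=1e-10, max_iter=200):
    """Project v onto {x : 0 <= x_i <= 1, sum(x) = k} by bisection
    on the scalar dual variable tau, where x_i = clip(v_i - tau, 0, 1).
    Assumes 0 <= k <= len(v)."""
    v = np.asarray(v, dtype=float)
    # s(tau) = sum(clip(v - tau, 0, 1)) is nonincreasing in tau;
    # this bracket gives s(lo) = n >= k and s(hi) = 0 <= k.
    lo, hi = v.min() - 1.0, v.max()
    for _ in range(max_iter):
        tau = 0.5 * (lo + hi)
        s = np.clip(v - tau, 0.0, 1.0).sum()
        if abs(s - k) < tol:
            break
        if s > k:
            lo = tau  # too much mass: increase tau to shrink entries
        else:
            hi = tau
    return np.clip(v - tau, 0.0, 1.0)
```

For example, `project_capped_simplex([0.5, 2.0, -1.0, 0.9], 2)` yields a point whose entries lie in [0, 1] and sum to 2.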

ICRA Conference 2021 Conference Paper

Fusing RGBD Tracking and Segmentation Tree Sampling for Multi-Hypothesis Volumetric Segmentation

  • Andrew Price
  • Kun Huang
  • Dmitry Berenson

Despite rapid progress in scene segmentation in recent years, 3D segmentation methods are still limited when there is severe occlusion. The key challenge is estimating the segment boundaries of (partially) occluded objects, which are inherently ambiguous when considering only a single frame. In this work, we propose Multihypothesis Segmentation Tracking (MST), a novel method for volumetric segmentation in changing scenes, which allows scene ambiguity to be tracked and our estimates to be adjusted over time as we interact with the scene. Two main innovations allow us to tackle this difficult problem: 1) A novel way to sample possible segmentations from a segmentation tree; and 2) A novel approach to fusing tracking results with multiple segmentation estimates. These methods allow MST to track the segmentation state over time and incorporate new information, such as new objects being revealed. We evaluate our method on several cluttered tabletop environments in simulation and reality. Our results show that MST outperforms baselines in all tested scenes.

IJCAI Conference 2021 Conference Paper

Transfer Learning via Optimal Transportation for Integrative Cancer Patient Stratification

  • Ziyu Liu
  • Wei Shao
  • Jie Zhang
  • Min Zhang
  • Kun Huang

The stratification of early-stage cancer patients for the prediction of clinical outcome is a challenging task, since cancer is associated with various molecular aberrations. A single biomarker often cannot provide sufficient information to stratify early-stage patients effectively. Understanding the complex mechanism behind cancer development calls for exploiting biomarkers from multiple modalities of data, such as histopathology images and genomic data. The integrative analysis of these biomarkers sheds light on cancer diagnosis, subtyping, and prognosis. Another difficulty is that labels for early-stage cancer patients are scarce and not reliable enough for predicting survival times. Given that different cancer types share some commonalities, we explore whether the knowledge learned from one cancer type can be utilized to improve prognosis accuracy for another. We propose a novel unsupervised multi-view transfer learning algorithm to simultaneously analyze multiple biomarkers in different cancer types. We integrate multiple views using non-negative matrix factorization and formulate the transfer learning model based on optimal transport theory to align features of different cancer types. We evaluate the stratification performance on three early-stage cancers from The Cancer Genome Atlas (TCGA) project. Compared with other benchmark methods, our framework achieves superior accuracy for patient outcome prediction.

JBHI Journal 2020 Journal Article

AI in Medical Imaging Informatics: Current Challenges and Future Directions

  • Andreas S. Panayides
  • Amir Amini
  • Nenad D. Filipovic
  • Ashish Sharma
  • Sotirios A. Tsaftaris
  • Alistair Young
  • David Foran
  • Nhan Do

This paper reviews state-of-the-art research solutions across the spectrum of medical imaging informatics, discusses clinical translation, and provides future directions for advancing clinical practice. More specifically, it summarizes advances in medical imaging acquisition technologies for different modalities, highlighting the necessity for efficient medical data management strategies in the context of AI in big healthcare data analytics. It then provides a synopsis of contemporary and emerging algorithmic methods for disease classification and organ/tissue segmentation, focusing on AI and deep learning architectures that have already become the de facto approach. The clinical benefits of in-silico modelling advances linked with evolving 3D reconstruction and visualization applications are further documented. Concluding, integrative analytics approaches driven by associated research branches highlighted in this study promise to revolutionize imaging informatics as known today across the healthcare continuum for both radiology and digital pathology applications. The latter is projected to enable informed, more accurate diagnosis, timely prognosis, and effective treatment planning, underpinning precision medicine.

ICRA Conference 2020 Conference Paper

Reliable frame-to-frame motion estimation for vehicle-mounted surround-view camera systems

  • Yifu Wang
  • Kun Huang
  • Xin Peng 0005
  • Hongdong Li
  • Laurent Kneip

Modern vehicles are often equipped with a surround-view multi-camera system. The current interest in autonomous driving invites the investigation of how to use such systems for a reliable estimation of relative vehicle displacement. Existing camera pose algorithms either work for a single camera, make overly simplified assumptions, are computationally expensive, or simply become degenerate under non-holonomic vehicle motion. In this paper, we introduce a new, reliable solution able to handle all kinds of relative displacements in the plane despite the possibly non-holonomic characteristics. We furthermore introduce a novel two-view optimization scheme which minimizes a geometrically relevant error without relying on 3D point related optimization variables. Our method leads to highly reliable and accurate frame-to-frame visual odometry with a full-size, vehicle-mounted surround-view camera system.

AAAI Conference 2019 Conference Paper

Efficient Quantization for Neural Networks with Binary Weights and Low Bitwidth Activations

  • Kun Huang
  • Bingbing Ni
  • Xiaokang Yang

Quantization has shown stunning efficiency on deep neural networks, especially for portable devices with limited resources. Most existing works uncritically extend weight quantization methods to activations. However, we take the view that the best performance can be obtained by applying different quantization methods to weights and activations respectively. In this paper, we design a new activation function dubbed CReLU from the quantization perspective and further complement this design with an appropriate initialization method and training procedure. Moreover, we develop a specific quantization strategy in which we formulate the forward and backward approximation of weights with binary values and quantize the activations to low bitwidth using a linear or logarithmic quantizer. We show, for the first time, that our final quantized model with binary weights and ultra-low-bitwidth activations outperforms the previous best models by large margins on ImageNet, while achieving nearly a 10.85× theoretical speedup with ResNet-18. Furthermore, ablation experiments and theoretical analysis demonstrate the effectiveness and robustness of CReLU in comparison with other activation functions.
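The two quantizers named in the abstract can be sketched generically. Both functions below are illustrative assumptions, not the paper's code: weight binarization uses the common mean-absolute-value scaling, and the activation quantizer is a plain uniform (linear) one; the paper's CReLU design and exact formulations are not reproduced here.

```python
import numpy as np

def binarize_weights(w):
    """Binarize weights to {-alpha, +alpha}, with alpha the mean
    absolute value (a common scaling choice; the paper's exact
    formulation may differ)."""
    w = np.asarray(w, dtype=float)
    alpha = np.abs(w).mean()
    return alpha * np.sign(w)

def quantize_activations(a, bits=2, max_val=1.0):
    """Uniform (linear) quantizer for non-negative activations:
    clip to [0, max_val], then round onto 2**bits - 1 equal steps."""
    levels = 2 ** bits - 1
    a = np.clip(np.asarray(a, dtype=float), 0.0, max_val)
    return np.round(a / max_val * levels) / levels * max_val
```

During training, such quantizers are typically paired with a straight-through estimator so gradients can flow through the rounding step; that detail is omitted in this forward-pass sketch.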