JBHI 2026 Journal Article
HCA-Net: Hierarchical Contextual Attention Network for Lightweight and Accurate Polyp Segmentation
- Chengcheng Li
- Huiying Xu
- Xinzhong Zhu
- Huiling Chen
- Xinwang Liu
- Yun Liu
- Chang Tang
- Zhendong Chen
Early detection of colorectal polyps is crucial for clinical screening and cancer prevention, where accurate and efficient automatic segmentation plays a pivotal role. However, colonoscopy images often suffer from low contrast, blurred boundaries, and scale variations, making segmentation challenging. Existing encoder-decoder networks (e.g., U-Net) suffer from asymmetric supervision and feature redundancy, which in turn lead to semantic inconsistency and loss of fine details. While deeper or hybrid designs alleviate these issues, their high complexity and computational burden limit feasibility in real-time clinical practice. To address these challenges, we propose a lightweight segmentation framework, Hierarchical Contextual Attention Network (HCA-Net), consisting of the Redundancy-Suppressed Dual-Path Downsampling (RS-DPD) module and the Boundary-Aware Semantic Alignment Upsampling (BA-SAU) module, applied to the encoder and decoder, respectively. RS-DPD suppresses redundancy while preserving fine-grained details through a dual-path design, whereas BA-SAU leverages cross-layer contextual attention to enforce semantic consistency and enhance boundary sensitivity. Both modules are built upon our proposed Hierarchical Contextual Attention (HCA) mechanism, which combines convolutional projection with pooling-based compression to achieve efficient global modeling and accurate local boundary restoration. In addition, a composite boundary-aware loss function is designed to improve pixel-level accuracy, structural consistency, and robustness in low-contrast and boundary-ambiguous regions. Extensive experiments on public colorectal polyp datasets demonstrate that HCA-Net achieves state-of-the-art (SOTA) segmentation accuracy with significantly improved efficiency, while maintaining robustness under low-contrast and blurred-boundary conditions.
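The abstract's core efficiency claim rests on pooling-based compression inside the attention mechanism: if keys and values are computed from a spatially pooled copy of the feature map, attention cost drops from O(N²) to O(N²/r) for pooling factor r, while queries keep full resolution for boundary detail. The paper's exact HCA formulation is not given here, so the sketch below is a generic single-head, NumPy illustration of this pooled-attention idea; the function name `pooled_attention`, the random projection weights, and the 1-D token pooling are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pooled_attention(x, pool=4, seed=0):
    """Single-head attention with pooling-compressed keys/values (sketch).

    x: (N, C) tokens from a flattened H*W feature map, N divisible by `pool`.
    Queries stay at full resolution; keys/values come from average-pooled
    tokens, so the attention matrix is (N, N/pool) instead of (N, N).
    Weights are random stand-ins for learned convolutional projections.
    """
    N, C = x.shape
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))
    q = x @ Wq                                   # full-resolution queries
    xp = x.reshape(N // pool, pool, C).mean(1)   # compress tokens by `pool`
    k, v = xp @ Wk, xp @ Wv
    attn = softmax(q @ k.T / np.sqrt(C))         # (N, N/pool) attention map
    return attn @ v                              # (N, C) contextualized tokens
```

For a 352×352 feature map this kind of compression is what makes global modeling affordable in a lightweight network; the full-resolution query path is what allows local boundary restoration on top of the compressed context.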
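The composite boundary-aware loss is described only at a high level (pixel-level accuracy, structural consistency, boundary robustness). A common way to realize those three goals is a weighted sum of binary cross-entropy, Dice loss, and a term evaluated only near object boundaries; the sketch below follows that convention as a plausible stand-in, not the paper's actual loss. The 4-neighbour boundary extraction and the weights `w` are assumptions.

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    # pixel-level accuracy term: mean binary cross-entropy
    p = np.clip(pred, eps, 1 - eps)
    return -(target * np.log(p) + (1 - target) * np.log(1 - p)).mean()

def dice_loss(pred, target, eps=1e-6):
    # structural-consistency term: region overlap, robust to class imbalance
    inter = (pred * target).sum()
    return 1 - (2 * inter + eps) / (pred.sum() + target.sum() + eps)

def boundary_map(mask):
    # crude boundary: a pixel whose 4-neighbour differs in the binary mask
    m = mask.astype(bool)
    b = np.zeros_like(m)
    b[1:, :] |= m[1:, :] ^ m[:-1, :]
    b[:, 1:] |= m[:, 1:] ^ m[:, :-1]
    return b

def composite_loss(pred, target, w=(1.0, 1.0, 1.0)):
    """Hypothetical composite boundary-aware loss: BCE + Dice + boundary BCE."""
    bnd = boundary_map(target)
    bnd_term = bce_loss(pred[bnd], target[bnd]) if bnd.any() else 0.0
    return w[0] * bce_loss(pred, target) + w[1] * dice_loss(pred, target) \
        + w[2] * bnd_term
```

The boundary term re-weights exactly the low-contrast transition pixels where plain BCE and Dice are weakest, which is the stated motivation for a boundary-aware design.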