Author name cluster

Shuhao Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers

1 author row

JBHI Journal 2021 Journal Article

iPro2L-PSTKNC: A Two-Layer Predictor for Discovering Various Types of Promoters by Position Specific of Nucleotide Composition

Yinuo Lyu
Wenying He
Shuhao Li
Quan Zou
Fei Guo

Promoters are DNA regulatory elements located proximal to the transcription start site, which are in charge of the initiation of specific gene transcription. In Escherichia coli, promoters can be recognized by σ factors that have multiple families based on distinct function and structure, such as σ 24, σ 28, σ 32, σ 38, σ 54 and σ 70. At present, biological methods are mainly used to identify these promoters. However, because it is time-consuming and material-consuming to do biological experiments, computational biology algorithm has emerged as a more effective way to predict the classification. In this study, we develop a novel two-layer seamless predictor called iPro2L-PSTKNC to identify the promoters of the E. coli genome, which based on the feature extraction model we newly proposed that is named as the position specific tendencies of k-mer nucleotide composition (PSTKNC). On the first layer, it is a binary classification predicting whether a sequence is promoter or not. And the second layer is a multiple classification identifying which type the identified promoter belongs to. The ensemble classification SVM performsbest comparing with other algorithms, which gets a promising accuracy and the Matthews correlation coefficient (MCC) at 90. 05% and 80. 13%. Our data and code are available at https://github.com/lyuyinuo/iPro2L-PSTKNC.

Details DOI

AAAI Conference 2020 Conference Paper

Author Name Disambiguation on Heterogeneous Information Network with Adversarial Representation Learning

Haiwen Wang
Ruijie Wan
Chuan Wen
Shuhao Li
Yuting Jia
Weinan Zhang
Xinbing Wang

Author name ambiguity causes inadequacy and inconvenience in academic information retrieval, which raises the necessity of author name disambiguation (AND). Existing AND methods can be divided into two categories: the models focusing on content information to distinguish whether two papers are written by the same author, the models focusing on relation information to represent information as edges on the network and to quantify the similarity among papers. However, the former requires adequate labeled samples and informative negative samples, and are also ineffective in measuring the high-order connections among papers, while the latter needs complicated feature engineering or supervision to construct the network. We propose a novel generative adversarial framework to grow the two categories of models together: (i) the discriminative module distinguishes whether two papers are from the same author, and (ii) the generative module selects possibly homogeneous papers directly from the heterogeneous information network, which eliminates the complicated feature engineering. In such a way, the discriminative module guides the generative module to select homogeneous papers, and the generative module generates high-quality negative samples to train the discriminative module to make it aware of high-order connections among papers. Furthermore, a self-training strategy for the discriminative module and a random walk based generating algorithm are designed to make the training stable and efﬁcient. Extensive experiments on two real-world AND benchmarks demonstrate that our model provides signiﬁcant performance improvement over the state-of-the-art methods.

PDF Details