AAAI 2007

A Randomized String Kernel and Its Application to RNA Interference

Conference Paper Machine Learning Artificial Intelligence

Abstract

String kernels model sequence similarity directly, without first extracting numerical features into a vector space. Because they capture complex sequence traits better, string kernels often achieve better prediction performance. RNA interference is an important biological mechanism with many therapeutic applications. In this setting, strings can represent the target messenger RNAs and the initiating short RNAs, and string kernels can be applied for learning and prediction. However, existing string kernels were not developed specifically for RNA applications. Moreover, most existing string kernels are n-gram based, so they suffer from high dimensionality and cannot preserve the ordering of subsequences. We propose a randomized string kernel for use with support vector regression, with the aim of better predicting silencing efficacy scores for candidate sequences and, ultimately, improving the efficiency of biological experiments. We show that this kernel is positive definite and analyze its randomization error rates. Empirical results on biological data demonstrate that the proposed kernel performs better than existing string kernels and achieves significant improvements over kernels computed from numerical descriptors extracted according to structural and thermodynamic rules. It is also computationally more efficient.
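
To make the limitation of n-gram kernels concrete, the following is a minimal sketch of a plain n-gram (spectrum) kernel, the kind of existing baseline the abstract refers to. It is not the randomized kernel proposed in the paper, which the abstract does not specify. The function names and the two short RNA strings are hypothetical and chosen only for illustration.

```python
from collections import Counter


def ngram_counts(seq: str, n: int) -> Counter:
    """Count every contiguous length-n substring (n-gram) of seq."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def spectrum_kernel(s: str, t: str, n: int = 2) -> int:
    """Inner product of the n-gram count vectors of s and t."""
    cs, ct = ngram_counts(s, n), ngram_counts(t, n)
    return sum(count * ct[gram] for gram, count in cs.items())


# Two made-up RNA strings that contain the same 2-grams
# (AU, UA, AG, GA) but in a different order.
x = "AUAGA"
y = "AGAUA"

# The kernel cannot tell x from y: k(x, y) equals k(x, x).
print(spectrum_kernel(x, y, n=2), spectrum_kernel(x, x, n=2))

# The implicit feature space has 4**n dimensions for the
# four-letter RNA alphabet, which grows quickly with n.
print([4 ** n for n in range(1, 7)])
```

Because the kernel depends only on n-gram counts, any two sequences with the same counts are indistinguishable to it, regardless of where each n-gram occurs. This is the loss of subsequence ordering described above.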

Context

Venue
AAAI Conference on Artificial Intelligence
Paper id
1125867987859925813