Alignment-free Prediction of Ribonucleases using a Computational Chemistry approach: Comparison with HMM model and Isolation from Schizosaccharomyces pombe, Prediction, and Experimental assay of a new sequence
Published: 15 November 2008 by MDPI in The 12th International Electronic Conference on Synthetic Organic Chemistry, session Computational Chemistry
Abstract: The study of type III RNases constitutes an important area in molecular biology. It is known that the pac1+ gene encodes a particular RNase III that shares low amino acid similarity with the products of other genes despite having double-stranded ribonuclease activity. Bioinformatics methods based on sequence alignment may fail when the amino acid identity between the query sequence and others with similar functions (remote homologues) is low, or when no similar sequence is recorded in the database. Quantitative Structure-Activity Relationships (QSAR) applied to protein sequences may allow an alignment-independent prediction of protein function. These sequence QSAR-like methods often use 1D sequence numerical parameters as the input to seek sequence-function relationships. However, a prior 2D representation of the sequences may uncover useful higher-order information. In the work described here, we calculated for the first time the Spectral Moments of a Markov Matrix (MMM) associated with a 2D-HP-map of a protein sequence. We used MMM values to characterize numerically 81 sequences of type III RNases and 133 proteins of a control group. We subsequently developed one MMM-QSAR model and one classic Hidden Markov Model (HMM) based on the same data. The MMM-QSAR model showed a discrimination power of 97.35% for RNases versus other proteins without using alignment, a result as good as that of the known HMM techniques. We also report for the first time the isolation of a new Pac1 protein (DQ647826) from Schizosaccharomyces pombe, strain 428-4-1. The MMM-QSAR model predicts the new RNase III with the same accuracy as other classical alignment methods. An experimental assay of this protein confirms the predicted activity. The present results suggest that MMM-QSAR models may be used for protein function annotation without sequence alignment and with the same accuracy as classic HMM models.
Keywords: Spectral graph theory / Hidden Markov Model / Ribonucleases / Pac1 / Protein 2D representations
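
The abstract does not spell out how the 2D-HP-map or its Markov matrix is built, so the following Python sketch only illustrates the general idea under stated assumptions. The four residue classes, the step directions, the unweighted self-loop adjacency, and all function names are hypothetical choices made here, not the authors' scheme. Each residue moves a walk one step on a 2D grid according to its class, residues landing on the same coordinate merge into one node, the node graph is row-normalized into a stochastic matrix, and the spectral moments are taken as the traces of successive powers of that matrix.

```python
# Hypothetical residue classes and step directions (assumptions for illustration,
# not taken from the abstract).
STEP_BY_CLASS = {
    "hydrophobic": (0, 1),   # move up
    "polar": (1, 0),         # move right
    "acidic": (0, -1),       # move down
    "basic": (-1, 0),        # move left
}
CLASS_MEMBERS = {
    "hydrophobic": "AVLIMFWPGC",
    "polar": "STYNQ",
    "acidic": "DE",
    "basic": "KRH",
}
RESIDUE_CLASS = {aa: cls for cls, members in CLASS_MEMBERS.items() for aa in members}


def build_map(sequence):
    """Walk the sequence on a 2D grid; return (node_count, set of node-index edges)."""
    nodes = {}            # grid coordinate -> node index
    edges = set()
    x, y = 0, 0
    previous = None
    for aa in sequence.upper():
        dx, dy = STEP_BY_CLASS[RESIDUE_CLASS[aa]]
        x, y = x + dx, y + dy
        # Residues that land on an already visited coordinate share its node.
        current = nodes.setdefault((x, y), len(nodes))
        if previous is not None:
            edges.add((previous, current))
        previous = current
    return len(nodes), edges


def markov_matrix(node_count, edges):
    """Row-normalize a symmetric adjacency matrix with self-loops."""
    adjacency = [[1.0 if i == j else 0.0 for j in range(node_count)]
                 for i in range(node_count)]
    for i, j in edges:
        adjacency[i][j] = adjacency[j][i] = 1.0
    return [[value / sum(row) for value in row] for row in adjacency]


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moments(matrix, max_order):
    """Return [trace(M^0), trace(M^1), ..., trace(M^max_order)]."""
    n = len(matrix)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    moments = []
    for _ in range(max_order + 1):
        moments.append(sum(power[i][i] for i in range(n)))
        power = matmul(power, matrix)
    return moments


def sequence_descriptors(sequence, max_order=5):
    """Alignment-free numerical descriptors for one protein sequence."""
    node_count, edges = build_map(sequence)
    return spectral_moments(markov_matrix(node_count, edges), max_order)
```

Calling `sequence_descriptors("MKDELLA")` returns one fixed-length vector per sequence regardless of sequence length, which is what lets such descriptors feed a classifier without any alignment step. A real model would presumably weight nodes by physicochemical properties rather than use a plain adjacency matrix; that detail is outside what the abstract states.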