CMB-2016v6n3

Computational Molecular Biology 2016, Vol.6, No.3, 1-6

http://cmb.biopublisher.ca


role in different types of cancer, for instance. Breast cancer is often caused by an error in the production of RNA.
This regulatory role makes microRNA a very interesting drug target. Millar (2006) explored some of the
similarities and differences between the miRNA systems of plants and animals and examined whether they are
fundamentally different or simply variations on a theme, which gives insight into miRNA and its biological
importance. Similar work was done by Pant et al. (2009), who used a support vector machine to classify plant
and animal miRNAs. Given the importance of miRNA, RNA interference (RNAi) also came into focus.
Aagaard and Rossi (2007) studied the therapeutic importance of RNAi and suggested that it could be the
next biological source for treating diseases. Söllner and Mayer (2006) studied machine learning approaches for
the prediction of linear B-cell epitopes on proteins. Their approach combines several parameters previously associated
with antigenicity and adds novel parameters based on amino acid frequencies and amino acid
neighborhood propensities. In that study, machine learning classifiers clearly outperformed the reference classification
systems on the HIV epitope validation set.

Hallett and co-workers (2006) studied the prediction of the subcellular localization of viral proteins within a
mammalian host cell. They used the PSLT predictor, which considers the combinatorial presence of domains and
targeting signals in human proteins to predict localization. Knowing where proteins localize greatly helps to identify
signature proteins for HIV drug target sites. Song and Shi (2010) combined the composition of dipeptide categories
with a K-nearest neighbor classifier and tested the method on a known dataset of 317 apoptosis proteins, reporting
a total prediction accuracy of 88.3%. These results indicate that the composition of dipeptide categories combined
with a K-nearest neighbor classifier is very useful for predicting the subcellular location of apoptosis proteins.
Harrison and Langdale (2006) used both amino acid and nucleotide data to generate a phylogeny by distance-based
and likelihood methods, and the results were further analyzed with a Bayesian algorithm. Using DNA data to
generate the alignments is likely to produce alignments that sometimes do not reflect the actual mutational history.
The protein sequence is under selective constraint for protein function and structure, and these are conserved over
much longer periods than the individual codon choices; hence amino acid sequences are important for studying phylogeny.
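The combination of dipeptide-based features with a K-nearest neighbor classifier mentioned above can be illustrated with a minimal sketch. It is not the method of Song and Shi (2010): it uses plain dipeptide frequencies over the 20 standard amino acids rather than their dipeptide categories, Euclidean distance, and a majority vote. The sequences, the labels, and the function names `dipeptide_composition` and `knn_predict` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def dipeptide_composition(seq):
    """Return the 400 dipeptide frequencies of seq (simplified feature, an assumption)."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Count only dipeptides made of the 20 standard residues.
    counts = Counter(p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[a + b] / total if total else 0.0
            for a in AMINO_ACIDS for b in AMINO_ACIDS]

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def knn_predict(train, query_seq, k=3):
    """Label query_seq by majority vote among its k nearest training sequences."""
    q = dipeptide_composition(query_seq)
    nearest = sorted(train, key=lambda item: euclidean(dipeptide_composition(item[0]), q))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: made-up sequences and labels, not real apoptosis proteins.
TRAIN = [
    ("MKKLLKKLLKKL", "membrane"),
    ("KLLKKLLKKLLK", "membrane"),
    ("LLKKLLKKLLKK", "membrane"),
    ("DEDEDEDEGSGS", "cytoplasm"),
    ("GSDEDEDEDEGS", "cytoplasm"),
    ("EDEDGSGSDEDE", "cytoplasm"),
]

print(knn_predict(TRAIN, "KKLLKKLLKK", k=3))
```

On real data the choice of k, the distance measure, and the grouping of amino acids into categories would all need to be tuned and validated, for example by jackknife or cross-validation.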

Prosperi (2009) studied different machine learning and feature selection methods for classifying HIV
treatment success based on viral genotype, therapy, and derived input features. HIV-positive persons with low
CD4 counts can show retinal damage and visual field defects; Kozak and Sample (2007) applied a support vector
machine and a relevance vector machine (RVM), which were sufficiently sensitive to
distinguish these eyes from normal eyes. Nanni and his team (2009) proposed a protein classification approach that
combines surface analysis with the primary structure of proteins. Emily et al. (2007) proposed a hybrid prediction
method for Gram-negative bacteria that combines a one-versus-one support vector machine (SVM) model with a
structural homology approach. The SVM model comprises a number of binary classifiers that incorporate biological
features derived from Gram-negative bacteria translocation pathways, while the structural homology component shows
the common amino acids of these bacteria.
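The one-versus-one scheme described above, in which a multi-class problem is split into a set of binary classifiers, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the model of Emily et al. (2007): each binary classifier is a linear SVM fitted by plain subgradient descent on the hinge loss, the two-dimensional features are invented, and the class labels and function names (`train_linear_svm`, `train_one_vs_one`, `predict_one_vs_one`) are hypothetical.

```python
from collections import Counter
from itertools import combinations

def train_linear_svm(X, y, epochs=500, lr=0.01, lam=0.01):
    """Fit a linear SVM (hinge loss + L2 penalty) by subgradient descent; y in {-1, +1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Sample violates the margin: step on both the hinge and penalty terms.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Margin satisfied: only the L2 penalty shrinks the weights.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def train_one_vs_one(X, labels):
    """Train one binary SVM for every unordered pair of classes."""
    models = {}
    for pos, neg in combinations(sorted(set(labels)), 2):
        subset = [(x, 1 if lab == pos else -1)
                  for x, lab in zip(X, labels) if lab in (pos, neg)]
        models[(pos, neg)] = train_linear_svm([x for x, _ in subset],
                                              [s for _, s in subset])
    return models

def predict_one_vs_one(models, x):
    """Each pairwise classifier casts one vote; the most-voted class wins."""
    votes = Counter()
    for (pos, neg), (w, bias) in models.items():
        score = sum(wj * xj for wj, xj in zip(w, x)) + bias
        votes[pos if score >= 0 else neg] += 1
    return votes.most_common(1)[0][0]

# Hypothetical, well-separated toy features; labels are illustrative only.
X = [(0.0, 0.0), (0.5, 0.2), (0.2, 0.5),
     (5.0, 0.0), (5.2, 0.4), (4.8, 0.3),
     (0.0, 5.0), (0.3, 5.2), (0.4, 4.8)]
LABELS = ["cytoplasm"] * 3 + ["periplasm"] * 3 + ["outer_membrane"] * 3

MODELS = train_one_vs_one(X, LABELS)
print(predict_one_vs_one(MODELS, (5.0, 0.2)))
```

With n classes the scheme trains n(n-1)/2 binary classifiers, so three classes give three pairwise models here. A practical system would use a tested SVM library and kernel functions rather than this hand-written linear solver.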

G-protein coupled receptors (GPCRs), the seven-transmembrane domain receptors, comprise the largest family of
proteins targeted by drug discovery. Together with structures of the prototypical GPCR rhodopsin, solved structures of
other liganded GPCRs promise to provide insights into the structural basis of the superfamily's biochemical
functions and to assist in the development of new therapeutic modalities and drugs. Neberg (2007) proposed
an evolutionary analysis of GPCRs by DNA extraction methods. Evolutionary data from both sequenced genomes and
targeted retrieval of orthologs are increasingly used as a source of structural information. Recent success in
sequencing and functionally expressing GPCRs from fossils opens the possibility of studying signaling pathways
even in extinct species.

Steffen et al. (2008) predicted the outcome of a therapy attempt for a patient who carries an HIV variant with a set of
observed genetic properties; such predictions need to be made for hundreds of possible combinations of drugs,
which use similar biochemical mechanisms. In a paired t-test, distribution matching was significantly better than
the reference methods. As the significance of machine learning techniques such as support vector machines increases
