Prediction of Protein Subcellular Localization: A Machine Learning Approach

Date of Submission: 
June 24, 2010
Report Number: 
10-014
Report PDF: 
Abstract: 
Subcellular localization is a key functional characteristic of proteins. Optimally combining available information is one of the key challenges in today's knowledge-based subcellular localization prediction approaches. This study explores machine learning approaches for the prediction of protein subcellular localization that use resources concerning Gene Ontology and secondary structures. Using the spectrum kernel for feature representation of amino acid sequences and secondary structures, we explore an SVM-based learning method that classifies six subcellular localization sites: endoplasmic reticulum, extracellular, Golgi, membrane, mitochondria, and nucleus.