TOPTMH: Topology Predictor for Transmembrane Alpha-Helices
Alpha-helical transmembrane proteins mediate many key biological processes and represent 20-30% of all genes in many organisms. Because their high-resolution 3D structures are difficult to determine experimentally, computational methods that predict their topology (the transmembrane helical segments and their orientation) are essential to advancing the understanding of membrane protein structure and function.
We developed a new topology prediction method for transmembrane helices called TOPTMH that combines a helix residue predictor with a helix segment identification method and determines the overall orientation using the positive-inside rule. The residue predictor is built using Support Vector Machines (SVMs) that utilize evolutionary information, in the form of PSI-BLAST generated sequence profiles, to annotate each residue with its likelihood of being part of a helix segment. The helix segment identification method combines the segments predicted by two Hidden Markov Models (HMMs): one based on the SVM predictions and the other based on the hydrophobicity values of the sequence's amino acids. This approach combines the power of SVM-based models to discriminate between helical and non-helical residues with the power of HMMs to identify contiguous segments of helical residues, taking into account both the SVM predictions and the hydrophobicity values of neighboring residues.
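The orientation step can be illustrated with a minimal sketch of the positive-inside rule, which states that loops on the cytoplasmic side tend to be enriched in lysine (K) and arginine (R). This is not the TOPTMH implementation: the function names `loop_regions` and `n_terminus_side`, the simple K/R count, and the tie-breaking choice are all assumptions made for illustration.

```python
def loop_regions(seq_len, helices):
    """Return half-open (start, end) loop regions around sorted helix segments."""
    regions = []
    prev_end = 0
    for start, end in helices:
        regions.append((prev_end, start))
        prev_end = end
    regions.append((prev_end, seq_len))
    return regions


def n_terminus_side(seq, helices):
    """Guess whether the N-terminus is 'in' (cytoplasmic) or 'out'.

    Loops alternate sides of the membrane, so even-indexed loops share the
    side of the N-terminus. The side with more K/R residues is taken to be
    the inside. Ties default to 'in' (an arbitrary, hypothetical choice).
    """
    even = odd = 0
    for i, (start, end) in enumerate(loop_regions(len(seq), helices)):
        positives = sum(1 for residue in seq[start:end] if residue in "KR")
        if i % 2 == 0:
            even += positives
        else:
            odd += positives
    return "in" if even >= odd else "out"


# Toy two-helix sequence with K/R-rich terminal loops.
toy_seq = "MKRK" + "L" * 20 + "DE" + "L" * 20 + "KRRK"
toy_helices = [(4, 24), (26, 46)]
print(n_terminus_side(toy_seq, toy_helices))
```

In the toy example both terminal loops carry the positive charges, so the sketch places the N-terminus inside; a real predictor would likely weight loop length and residue position as well.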
We present empirical results on two standard datasets and show that both the per-residue (Q2) and per-segment (Qok) scores obtained by TOPTMH are higher than those achieved by well-known methods such as Phobius and MEMSAT3. In addition, on an independent static benchmark, TOPTMH achieved the highest scores on high-resolution sequences (Q2 score of 84% and Qok score of 86%) among existing state-of-the-art systems, while maintaining a low signal peptide error.
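For readers unfamiliar with the two scores, the sketch below shows one common way to compute them. The definitions are assumptions, since the abstract does not state them: Q2 is taken as the fraction of residues whose two-state label (helix or non-helix) is predicted correctly, and Qok as the fraction of proteins in which every helix is predicted, with predicted and observed helices paired in order and required to overlap by at least `min_overlap` residues. The function names and the default overlap of 3 are hypothetical.

```python
def q2(pred, true):
    """Per-residue accuracy over two-state strings ('H' = helix, '-' = other)."""
    if len(pred) != len(true):
        raise ValueError("prediction and annotation must have equal length")
    return sum(p == t for p, t in zip(pred, true)) / len(true)


def helix_segments(labels):
    """Return half-open (start, end) index pairs for each run of 'H'."""
    segments = []
    start = None
    for i, label in enumerate(labels):
        if label == "H" and start is None:
            start = i
        elif label != "H" and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(labels)))
    return segments


def protein_ok(pred, true, min_overlap=3):
    """True if helix counts match and each in-order pair overlaps enough."""
    pred_segs = helix_segments(pred)
    true_segs = helix_segments(true)
    if len(pred_segs) != len(true_segs):
        return False
    return all(
        min(p_end, t_end) - max(p_start, t_start) >= min_overlap
        for (p_start, p_end), (t_start, t_end) in zip(pred_segs, true_segs)
    )


def qok(pairs, min_overlap=3):
    """Fraction of (pred, true) pairs for which all helices are correct."""
    return sum(protein_ok(p, t, min_overlap) for p, t in pairs) / len(pairs)


observed = "--HHHHH---HHHHH--"
predicted = "---HHHHH--HHHH---"
print(q2(predicted, observed), qok([(predicted, observed)]))
```

Pairing helices strictly in order is a simplification; published benchmarks may match segments differently or use a different overlap threshold.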