A Summary of Missense Prediction Methods
Dr. Andrew C.R. Martin, UCL

www.bioinf.org.uk

 

Methods and Approaches

Most mutations are 'Loss of Function' – some are 'Gain of Function' (generally through loss of regulation). A small number are actually 'change of function' – e.g. of specificity (estimated at 5% of cancer mutations [9]).

Many methods are purely sequence-based (e.g. SIFT). Protein structure information has been incorporated through rule-based approaches [70] or machine learning. More sophisticated alignment information has been used by exploiting hidden Markov models (e.g. subPSEC and PANTHER). When a structure is not available, comparative modelling may be used (e.g. LS-SNP) and the application of ab initio structural models has also been explored [65]. Various servers exploit a combination of sequence, structural and evolutionary features (e.g. SNAP, PMUT and CanPredict).

Sequence and evolutionary conservation-based methods

e.g. SIFT, Align-GVGD, MutationAssessor, PANTHER, MAPP

Empirical rules

e.g. PolyPhen

Protein Structure

e.g. SNP@Domain, BONGO, SNPs3D

Protein sequence and structure-based methods

e.g. PolyPhen, PolyPhen-2, LS-SNP/PDB, SNPeffect, BONGO

Direct methods

These compute a score from a theoretical model of what happens when a mutation occurs (e.g. SIFT, PANTHER, etc.)
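
As a rough illustration of a direct score, the sketch below computes a Shannon-entropy conservation value for one alignment column and reports it only when the mutant residue is never observed at that position. This is not the SIFT or PANTHER algorithm; the function names, the toy columns and the scoring rule are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_conservation(column):
    """Return 1 - normalised Shannon entropy of an alignment column.

    Gaps ('-') are ignored. 1.0 means fully conserved; lower values
    mean more residue variety at the position.
    """
    residues = [aa for aa in column if aa != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Normalise by the maximum entropy over the 20 standard amino acids.
    return 1.0 - entropy / math.log2(20)


def naive_direct_score(column, mutant):
    """Hypothetical score: conservation if the mutant residue is unseen
    in the column, otherwise 0.0 (the substitution is taken as tolerated)."""
    if mutant in column:
        return 0.0
    return column_conservation(column)
```

For example, naive_direct_score("LLLLLLLL", "D") gives the maximum score because the column is invariant and aspartate is never seen, whereas a mutation to any residue already present in the column scores zero.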

Machine-learning methods

These use machine learning (such as neural nets, SVMs, random forests, etc.) and can combine different properties of the native and mutant residues, such as size and polarity, together with other information such as structural environment (e.g. accessibility, H-bonding) and evolutionary conservation.

e.g. PMut, SNAP, PhD-SNP, SNPs&GO, Parepro, CanPredict, nsSNPAnalyzer, MutPred, Hansa, MutationTaster
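
The kind of input such a method might be trained on can be sketched as a small feature vector. The example below combines changes in hydropathy (Kyte-Doolittle scale), size (side-chain heavy-atom count, used here only as a crude proxy) and formal charge with two externally supplied values. The function name, the choice of features and the accessibility and conservation inputs are assumptions for illustration and are not taken from any of the tools listed above.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Number of non-hydrogen side-chain atoms: a crude stand-in for size.
SIDE_CHAIN_ATOMS = {
    "G": 0, "A": 1, "S": 2, "C": 2, "V": 3, "T": 3, "P": 3,
    "D": 4, "N": 4, "I": 4, "L": 4, "M": 4, "E": 5, "Q": 5,
    "K": 5, "H": 6, "F": 7, "R": 7, "Y": 8, "W": 10,
}

# Formal side-chain charge at neutral pH (histidine treated as neutral).
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def substitution_features(native, mutant, accessibility, conservation):
    """Build an illustrative feature vector for a native -> mutant change.

    accessibility and conservation are assumed to be precomputed
    values in [0, 1] supplied by the caller.
    """
    return [
        HYDROPATHY[mutant] - HYDROPATHY[native],              # polarity change
        SIDE_CHAIN_ATOMS[mutant] - SIDE_CHAIN_ATOMS[native],  # size change
        CHARGE.get(mutant, 0) - CHARGE.get(native, 0),        # charge change
        accessibility,
        conservation,
    ]
```

Vectors like this, computed for mutations with known outcomes, would then be passed to a classifier such as a neural network, SVM or random forest; that training step is not shown here.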

Missense Prediction Tool Catalogue

http://www.ngrl.org.uk/Manchester/page/missense-prediction-tool-catalogue

A Summary of Prediction Methods and Databases

Ref – reference (see below)
St – use of structural data: Y = required; (Y) = used if available (predicts structural information otherwise)
M – generates models: Y = yes; P = precomputed only; H = highlights where the mutation is but doesn't model it
Pre – Are data pre-calculated: Y = yes (novel mutations cannot be uploaded); NS = No web server

Program Ref St M Pre Notes
SNPs3D 1, 70, 72, 73 Y P Y snps3d.org
Uses structures, sequence profiles, pathways together with conservation scores from MutDB to train SVMs to make destabilization predictions
StSNP 2 Y Y Y ilyinlab.org/StSNP/
Pre-calculated analysis uses pathways
ModSNP 3 Y Y Simply provides models and SIFT results (No longer available?)
MutDB 4 Y H Y www.mutdb.org
Precalculated set of mutant models
LS-SNP 5, 57, 58 Y P Y www.salilab.org/LS-SNP
Uses an SVM trained with rule-based annotation of structure, sequence and evolution to look for destabilization, proximity to ligands and interfaces and exploits information from OMIM on similar known PDs
TopoSNP 6 Y P Y gila.bioe.uic.edu/snp/toposnp/
Classifies residues based on location (surface pocket or interior void; convex or depressed surface; internal) and combines this with a conservation score derived from Pfam.
SNPeffect 7 Y N N snpeffect.vib.be/
Assesses stability (FoldX), aggregation, amyloidosis, proximity to functional sites and cellular processing
nsSNPAnalyzer 8 Y N http://snpanalyzer.uthsc.edu/
Exploits SIFT and structural features to train a random forest
FIS 9 NS 'Functional Impact Score' – exploits evolutionary information from multiple sequence alignments.
MutationTaster 10 N www.mutationtaster.org
Uses conservation, effects on splicing, protein features and mRNA production/stability
SNAP 11, 12 (Y) N N www.rostlab.org/services/SNAP
Uses neural networks with data from the sequence and PolyPhen and SIFT predictions. In addition it uses predicted structural features (solvent accessibility, secondary structure and flexibility), but can exploit actual structural data if available.
Condel 13 bg.upf.edu/condel
Uses a weighted average score from a number of predictors. The original paper uses LogRE, MAPP, MutationAssessor, PolyPhen-2 and SIFT, but the latest version uses just MutationAssessor and FATHMM.
FATHMM 14 fathmm.biocompute.org.uk
Exploits HMMs to represent a protein family and exploits species-specific weights.
SAAPdb 28 Y N Y www.bioinf.org.uk/saap/db/ [NO LONGER MAINTAINED]
A pre-calculated database of the structural effects of mutations. Used a number of rule-based analyses of structural effects together with a conservation score.
SAAPdap 15 Y N www.bioinf.org.uk/saap/dap/
A pipeline for calculating the structural effects of mutations (replaces SAAPdb). Uses a number of rule-based analyses of structural effects together with a conservation score.
SAAPpred 15 Y N www.bioinf.org.uk/saap/dap/
A random-forest predictor based on the structural analyses from SAAPdap
MutPred 16 mutpred.mutdb.org
Uses a Random Forest predictor with data based on predicted protein structure and dynamics, predicted functional properties and sequence and evolutionary information.
CADD 17 cadd.gs.washington.edu
A meta-predictor that uses support vector machines with results from SIFT, PolyPhen, conservation, predicted effects on regulation and the 'Grantham' score for amino acid differences. Designed to be expandable.
SNPs&GO 18, 36 snps.biofold.org/snps-and-go
Uses results from PANTHER together with functional information from GO and sequence information – both from the local environment and from profiles from multiple sequence alignments.
SNPs&GO3D N/A Y snps.biofold.org/snps-and-go
As SNPs&GO, but also uses structural data
SIFT 19, 64 sift.bii.a-star.edu.sg/
An evolutionary method which calculates a sophisticated residue conservation score from multiple alignment
PolyPhen/PolyPhen-2 20, 21, 22, 67 (Y) N genetics.bwh.harvard.edu/pph/
genetics.bwh.harvard.edu/pph2/
Uses machine learning on a set of eight sequence- and three structure-based features. If no structure is available, the structural features are predicted.
Panther/subPSEC 23, 24, 25 www.pantherdb.org
PSEC is a position-specific evolutionary conservation score and subPSEC is a difference in PSEC scores for a substitution. Panther exploits these scores derived from HMMs (PANTHER/lib) together with an ontology of protein function (PANTHER/X – a simplified form of GO) to make predictions.
PhD-SNP 26 gpcr.biocomp.unibo.it/cgi/predictors/PhD-SNP/PhD-SNP.cgi
Uses a support vector machine with local sequence environment and a profile derived from a multiple sequence alignment
PMut 27, 45 (Y) mmb.pcb.ub.es/PMut/
Uses PHD secondary structure and accessibility prediction (or observed if a structure is available), together with statistical potentials from Prosa-II to evaluate stability, mutation matrix scores, changes in amino acid properties, a sequence potential, PSSM, a conservation score and SwissProt annotations to train a neural network.
SDM 29, 71 Y www-cryst.bioc.cam.ac.uk/~sdm/sdm.php
Assesses stability using environment-specific substitution tables and local structural environment (secondary structure, solvent accessibility, H-bonds), together with functional information from the Catalytic Site Atlas and UniProt.
MutationAssessor 30 mutationassessor.org
Uses 'combinatorial entropy optimization' (CEO) to look at sets of evolutionarily related proteins and find key functional residues to which it applies a conservation score.
LogRE / CanPredict 31, 39, 56 lpgws.nci.nih.gov/cgi-bin/GeneViewer.cgi [SEEMS NOT TO BE AVAILABLE]
LogRE is a score calculated from a Hidden Markov Model for a substitution that is exploited by CanPredict
MAPP / ProPhylER 32, 33 mendel.stanford.edu/sidowlab/downloads/MAPP/ [DOWNLOADABLE SOFTWARE]
www.prophyler.org [SEEMS NOT TO BE AVAILABLE]
ProPhylER uses the MAPP score, which takes data from a multiple alignment and converts a position in the alignment to a vector describing the importance of 6 physicochemical properties (hydropathy, polarity, charge, volume, and free energy in alpha-helices and in beta-strands).
SuSPect 34, 35, 77 www.sbg.bio.ic.ac.uk/servers/suspect/
Concentrates on stability and interfaces and protein network information
SNP@Domain 37 H sysbio.kribb.re.kr:8080/domainsnp/
FOLD-X 38 Y foldxsuite.crg.eu/
FOLD-X is an online force-field for calculating energy – it has been widely used for calculating stability changes on mutation.
PoPMuSiC 40, 41, 42 (Y) dezyme.com/en/Software
VEP 49 www.ensembl.org/info/docs/tools/vep/
Links ENSEMBL to variant effect predictors (currently SIFT and PolyPhen-2)
BONGO 46 www.bongo.cl.cam.ac.uk/Bongo2/Bongo.htm [NOT AVAILABLE]
A protein structure is converted to a graph, based on its amino acid interactions. Those residues of key importance for structural stability are determined by these interactions. The substituted amino acids are modelled and the impact of the change determined based on the changes in the network.
HANSA 47 hansa.cdfd.org.in:8080/
Combines 10 different properties of substitutions to partition disease and neutral mutations: 6 features related to the specific position of the mutation and probabilities of the amino acids; 2 features of protein structural environment; 2 features based on likelihood of the amino acid substitutions.
Parepro 48 www.mobioinfor.cn/parepro/
Three attributes are characterised from homologues collected using PSI-BLAST: (i) property differences between the ‘new’ amino acid and those in the alignment; (ii) the distribution of amino acids at the position; (iii) the sequence environment (upstream and downstream amino acids)
transFIC N/A bg.upf.edu/transfic/home
Exploits Functional Impact Scores with SIFT, PolyPhen-2 and MutationAssessor to score cancer mutations
[Westhead] 59 NS Evaluates two machine learning methods in prediction from sequence
[Cui] 50 Y NS compbio.utmem.edu/snp/dataset/ [NO LONGER AVAILABLE]
Evaluates two machine learning methods and uses structural information from homologues and sequence profiles from multiple alignment
[Kohane] 51 NS Uses Bayesian methods using frequency data and hydrophobicity on some specific datasets
CHASM 52 NS Cancer-specific High-throughput Annotation of Somatic Mutations. Uses a random forest to identify driver mutations in cancer.
B-SIFT 62 NS A modified version of SIFT which is able to identify both deleterious and a subset of activating mutations given a protein sequence and a query mutation within that sequence.
[Baker] 65 Y NS Uses classification-tree and logistic-regression machine learning methods with solvent-accessibility, Cβ density and SIFT scores.
SNPdryad 75 snps.ccbr.utoronto.ca:8080/SNPdryad/
Uses only protein orthologs in building a multiple sequence alignment to derive a novel conservation scoring scheme with a Random Forest classifier.
STRUM 78 Y http://zhanglab.ccmb.med.umich.edu/STRUM/
Predicts stability changes caused by single-point mutations. Starting from wild-type sequences, 3D models are constructed using I-TASSER and physics- and knowledge-based energy functions derived from the I-TASSER models are used for machine learning.
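
The weighted-average idea noted for Condel in the table can be sketched as follows. The function name, the weights and the scores are invented for illustration, and the scores are assumed to be already normalised to [0, 1] with higher meaning more damaging; how Condel actually derives its weights is not reproduced here.

```python
def weighted_consensus(scores, weights):
    """Weighted average of per-predictor scores.

    scores and weights are dicts keyed by predictor name. Only
    predictors present in both dicts contribute to the result.
    """
    shared = [name for name in scores if name in weights]
    if not shared:
        raise ValueError("no predictor has both a score and a weight")
    total_weight = sum(weights[name] for name in shared)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in shared) / total_weight


# Hypothetical normalised scores and weights for one variant.
example_scores = {"MutationAssessor": 1.0, "FATHMM": 0.0}
example_weights = {"MutationAssessor": 3.0, "FATHMM": 1.0}
consensus = weighted_consensus(example_scores, example_weights)
```

Skipping predictors that lack either a score or a weight is one simple way to cope with tools that return no prediction for a given variant.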

Reviews etc

Ref 2 has a useful comparison of some of the resources in Table 1

Refs 43, 44, 63, 66, 68 are extensive reviews

Refs 55 and 61 are reviews of methods used for cancer mutations

References

[1] Yue, P., Melamud, E. and Moult, J. (2006) SNPs3D: candidate gene and SNP selection for association studies. BMC Bioinformatics, 7:166–166.

[2] Uzun, A., Leslin, C.M., Abyzov, A. and Ilyin, V. (2007) Structure SNP (StSNP): a web server for mapping and modeling nsSNPs on protein structures with linkage to metabolic pathways. Nucleic Acids Res, 35:W384–W392.

[3] Yip, Y.L., Scheib, H., Diemand, A.V., Gattiker, A., Famiglietti, L.M. Gasteiger, E. and Bairoch, A. (2004) The SwissProt variant page and the ModSNP database: a resource for sequence and structure information on human protein variants. Hum Mutat, 23:464–470.

[4] Dantzer, J., Moad, C., Heiland, R. and Mooney, S. (2005) MutDB services: Interactive structural analysis of mutation data. Nucleic Acids Res, 33:W311–W314.

[5] Karchin, R., Diekhans, M., Kelly, L., Thomas, D.J., Pieper, U., Eswar, N., Haussler, D. and Sali, A. (2005) LS-SNP: large-scale annotation of coding non-synonymous SNPs based on multiple information sources. Bioinformatics, 21:2814–2820.

[6] Stitziel, N.O., Binkowski, T.A., Tseng, Y.Y., Kasif, S. and Liang, J. (2004) topoSNP: a topographic database of non-synonymous single nucleotide polymorphisms with and without known disease association. Nucleic Acids Res, 32:D520–D522.

[7] Reumers, J., Schymkowitz, J., Ferkinghoff-Borg, J., Stricher, F., Serrano, L. and Rousseau, F. (2005) SNPeffect: a database mapping molecular phenotypic effects of human non-synonymous coding SNPs. Nucleic Acids Res, 33:D527–D532.

[8] Lei Bao, Mi Zhou, and Yan Cui. (2005) nsSNPAnalyzer: identifying disease-associated nonsynonymous single nucleotide polymorphisms. Nucleic Acids Res, 33:W480–W482.

[9] Boris Reva, Yevgeniy Antipin, and Chris Sander. (2011) Predicting the functional impact of protein mutations: Application to cancer genomics. Nucleic Acids Res, 39:e118–e118.

[10] Jana Marie Schwarz, Christian Rödelsperger, Markus Schuelke, and Dominik Seelow. (2010) MutationTaster evaluates disease-causing potential of sequence alterations. Nature Methods, 7:575–576.

[11] Bromberg,Y. and Rost,B. (2007) SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res., 35, 3823–3835.

[12] Yana Bromberg, Guy Yachdav, and Burkhard Rost. (2008) SNAP predicts effect of mutations on protein function. Bioinformatics, 24:2397–2398.

[13] Abel González-Pérez and Nuria López-Bigas. (2011) Improving the assessment of the outcome of nonsynonymous SNVs with a consensus deleteriousness score, Condel. Am J Hum Genet, 88:440–449.

[14] Hashem A. Shihab, Julian Gough, David N. Cooper, Peter D. Stenson, Gary L. A. Barker, Keith J. Edwards, Ian N. M. Day, and Tom R. Gaunt. (2013) Predicting the functional, molecular, and phenotypic consequences of amino acid substitutions using hidden Markov models. Hum Mutat, 34:57–65.

[15] Nouf S Al-Numair and Andrew C R Martin. (2013) The SAAP pipeline and database: tools to analyze the impact and predict the pathogenicity of mutations. BMC Genomics, 14(3):1–11.

[16] Biao Li, Vidhya G. Krishnan, Matthew E. Mort, Fuxiao Xin, Kishore K. Kamati, David N. Cooper, Sean D. Mooney, and Predrag Radivojac. (2009) Automated inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics, 25:2744–2750.

[17] Martin Kircher, Daniela M. Witten, Preti Jain, Brian J. O’Roak, Gregory M. Cooper, and Jay Shendure. (2014) A general framework for estimating the relative pathogenicity of human genetic variants. Nature Genetics, 46:310–315.

[18] Remo Calabrese, Emidio Capriotti, Piero Fariselli, Pier Luigi Martelli, and Rita Casadio. (2009) Functional annotations improve the predictive score of human disease-related mutations in proteins. Human Mutation, 30:1237–1244.

[19] Pauline C. Ng and Steven Henikoff. (2003) SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res, 31:3812–3814.

[20] Ramensky V, Bork P, Sunyaev S (2002) Human non-synonymous SNPs: server and survey. Nucleic Acids Res 30:3894-3900

[21] Ivan A. Adzhubei, Steffen Schmidt, Leonid Peshkin, Vasily E. Ramensky, Anna Gerasimova, Peer Bork, Alexey S. Kondrashov, and Shamil R. Sunyaev. (2010) A method and server for predicting damaging missense mutations. Nature Methods, 7:248–249.

[22] Ivan A. Adzhubei, Daniel M. Jordan, and Shamil R. Sunyaev. (2013) Predicting functional effect of human missense mutations using PolyPhen-2. Curr Protoc Hum Genet, 76:7.20.

[23] Paul D. Thomas, Michael J. Campbell, Anish Kejariwal, Huaiyu Mi, Brian Karlak, Robin Daverman, Karen Diemer, Anushya Muruganujan, Apurva Narechania (2003) PANTHER: A Library of Protein Families and Subfamilies Indexed by Function. Genome Res. 13(9):2129-2141.

[24] Paul D. Thomas and Anish Kejariwal (2004) Coding single-nucleotide polymorphisms associated with complex vs. Mendelian disease: Evolutionary evidence for differences in molecular effects PNAS 101:15398–15403

[25] Liam R Brunham, Roshni R Singaraja, Terry D Pape, Anish Kejariwal, Paul D Thomas, Michael R Hayden (2005) Accurate Prediction of the Functional Significance of Single Nucleotide Polymorphisms and Mutations in the ABCA1 Gene. PLoS Genet 1(6): e83. doi: 10.1371/journal.pgen.0010083

[26] Capriotti, E., Calabrese, R. & Casadio, R. (2006) Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics, 22:2729-2734.

[27] Ferrer-Costa C, Gelpí JL, Zamakola L, Parraga I, de la Cruz X, Orozco M. (2005) PMUT: a web-based tool for the annotation of pathological mutations on proteins. Bioinformatics. 21:3176-8.

[28] Hurst, J.M., McMillan, L.E.M., Porter, C.T., Allen, J. Fakorede, A. and Martin, A.C.R. (2009) The SAAPdb web resource: a large scale structural analysis of mutant proteins, Human Mutation, 30:616-624.

[29] Worth CL, Preissner R, & Blundell TL. (2011) SDM – a server for predicting effects of mutations on protein stability and malfunction. Nucleic Acids Res. 39:W215-22.

[30] Reva, B., Antipin, Y., and Sander, C. (2007). Determinants of protein function revealed by combinatorial entropy optimization. Genome Biol. 8, R232.

[31] Clifford, R.J., Edmonson, M.N., Nguyen, C., & Buetow, K.H. (2004). Large-scale analysis of non-synonymous coding region single nucleotide polymorphisms. Bioinformatics 20:1006–1014.

[32] Stone, E.A., and Sidow, A. (2005). Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity. Genome Res. 15, 978–986.

[33] Binkley, J., Karra, K., Kirby, A., Hosobuchi, M., Stone, E.A., & Sidow, A. (2010). ProPhylER: A curated online resource for protein function and structure based on evolutionary constraint analyses. Genome Res. 20:142–154.

[34] Yates, C.M. & Sternberg, M.J. (2013) Proteins and Domains Vary in Their Tolerance of Non-Synonymous Single Nucleotide Polymorphisms (nsSNPs) J. Mol. Biol. 425, 1274–1286

[35] Yates, C.M. & Sternberg, M.J. (2013) The Effect of Non-Synonymous Single Nucleotide Polymorphisms (nsSNPs) on Protein-Protein Interactions. J. Mol. Biol. 425:3949–63

[36] Emidio Capriotti, Remo Calabrese, Piero Fariselli, Pier Luigi Martelli, Russ B Altman, Rita Casadio (2013) WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation BMC Genomics 14(Suppl 3):S6

[37] Chasman,D. and Adams,R.M. (2001) Predicting the functional consequences of non-synonymous single nucleotide polymorphisms: structure-based assessment of amino acid variation. J. Mol. Biol., 307:683–706.

[38] Schymkowitz, J., Borg, J., Stricher, F., Nys, R., Rousseau, F., Serrano, L. (2005) The FoldX web server: an online force field. Nucleic Acids Res. 33:W382-8.

[39] Kaminker J.S., Zhang Y., Waugh A., Haverty P.M., Peters B., Sebisanovic D., Stinson J., Forrest W.F., Bazan J.F., Seshagiri S., Zhang Z. (2007). Distinguishing cancer-associated missense mutations from common polymorphisms. Cancer Research 67, 465-73.

[40] Dehouck Y, Grosfils A, Folch B, Gilis D, Bogaerts Ph, Rooman M. (2009) Prediction of protein stability changes upon mutations using statistical potentials and neural networks: PoPMuSiC 2.0. Bioinformatics 25:2537-2543

[41] Dehouck Y, Kwasigroch JM, Gilis D, Rooman M. (2011) PoPMuSiC 2.1: a web server for the estimation of protein stability changes upon mutation and sequence optimality. BMC Bioinformatics 12:151

[42] Gonnelli G, Rooman M, Dehouck Y. (2012) Structure-based mutant stability predictions on proteins of unknown structure. Journal of Biotechnology 161:287-293

[43] Casandra Riera, Sergio Lois, and Xavier de la Cruz. (2014) Prediction of pathological mutations in proteins: the challenge of integrating sequence conservation and structure stability principles. WIREs Computational Molecular Science, 4(3):249–268.

[44] Tavtigian, S.V., Greenblatt, M.S., Lesueur, F. and Byrnes, G.B. for the IARC Unclassified Genetic Variants Working Group (2008) In silico analysis of missense substitutions using sequence-alignment based methods. Hum Mutat. 29: 1327–1336

[45] Ferrer-Costa, C., Orozco, M., de la Cruz, X. (2004) Sequence-Based Prediction of Pathological Mutations. PROTEINS: Structure, Function, and Bioinformatics 57:811–819

[46] Cheng T.M.K., Lu Y-E, Vendruscolo M., Lio P., Blundell T.L. (2008) Prediction by graph theoretic measures of structural effects in proteins arising from non-synonymous single nucleotide polymorphisms. PLoS Comp. Biology. 4 (7) e1000135.

[47] Acharya V. and Nagarajaram H.A. (2012) Hansa: An automated method for discriminating disease and neutral human nsSNPs. Human Mutation 2:332-337.

[48] Tian J., Wu N., Guo X., Guo J. Zhang J., Fan Y. (2007) Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines. BMC Bioinformatics 8:450-464.

[49] McLaren W, Pritchard B, Rios D, Chen Y, Flicek P and Cunningham F. (2010) Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 26:2069-70

[50] Bao,L. and Cui,Y. (2005) Prediction of the phenotypic effects of non-synonymous single nucleotide polymorphisms using structural and evolutionary information. Bioinformatics, 21, 2185–2190.

[51] Cai,Z., Tsung,E.F., Marinescu,V.D., Ramoni,M.F., Riva,A. and Kohane, I.S. (2004) Bayesian approach to discovering pathogenic SNPs in conserved protein domains. Hum. Mutat., 24, 178–184.

[52] Carter,H., Chen,S., Isik,L., Tyekucheva,S., Velculescu,V.E., Kinzler,K.W., Vogelstein,B. and Karchin,R. (2009) Cancer-specific high-throughput annotation of somatic mutations: computational prediction of driver missense mutations. Cancer Res., 69, 6660–6667.

*[53] Chan,P.A. et al. (2007) Interpreting missense variants: comparing computational methods in human disease genes CDKN2A, MLH1, MSH2, MECP2, and tyrosinase (TYR). Hum. Mutat., 28, 683–693.

*[54] Cooper, G.M. et al. (2005) Distribution and intensity of constraint in mammalian genomic sequence. Genome Res., 15, 901–913.

[55] Hon,L.S. et al. (2008) Computational approaches for predicting causal missense mutations in cancer genome projects. Curr. Bioinformatics, 3, 46–55.

[56] Kaminker,J.S. et al. (2007a) CanPredict: a computational tool for predicting cancer-associated missense mutations. Nucleic Acids Res., 35, W595–W598. [WITH 39]

[57] Karchin,R. (2009) Next generation tools for the annotation of human SNPs. Brief Bioinformatics, 10, 35–52.

[58] Karchin,R., Kelly,L. and Sali,A. (2005) Improving functional annotation of non-synonymous SNPs with information theory. In Klein,T.E., Hunter,L., Dunker,A.K., Jung,T. and Altman,R.B. (eds), Proceedings of the Pacific Symposium on Biocomputing 2005 (PSB 2005), January 4–8, Hawaii, USA.

[59] Krishnan,V.G. and Westhead,D.R. (2003) A comparative study of machine-learning methods to predict the effects of single nucleotide polymorphisms on protein function. Bioinformatics, 19, 2199–2209.

*[60] Kulkarni,V. et al. (2008) Exhaustive prediction of disease susceptibility to coding base changes in the human genome. BMC Bioinformatics, 9(Suppl. 9), S3.

[61] Lee,W., Yue,P. and Zhang,Z. (2009) Analytical methods for inferring functional effects of single base pair substitutions in human cancers. Hum.Genet., 126, 481–498.

[62] Lee,W., Zhang,Y., Mukhyala,K., Lazarus,R.A. and Zhang,Z. (2009) Bi-directional SIFT predicts a subset of activating mutations. PLoS One, 4, e8311.

[63] Mooney,S. (2005) Bioinformatics approaches and resources for single nucleotide polymorphism functional analysis. Brief. Bioinform., 6, 44–56.

[64] Ng,P.C. and Henikoff,S. (2001) Predicting deleterious amino acid substitutions. Genome Res., 11, 863–874.

[65] Saunders,C.T. and Baker,D. (2002) Evaluation of structural and evolutionary contributions to deleterious mutation prediction. J. Mol. Biol., 322, 891–901

[66] Steward,R.E. et al. (2003) Molecular basis of inherited diseases: a structural perspective. Trends Genet., 19, 505–513.

[67] Sunyaev,S.R., Ramensky,V., Koch,I., Lathe,W.,3rd, Kondrashov,A.S. and Bork,P. (2001) Prediction of deleterious human alleles. Hum. Mol. Genet., 10, 591–597.

[68] Teng,S., Michonova-Alexova,E. and Alexov,E. (2008) Approaches and resources for prediction of the effects of non-synonymous single nucleotide polymorphism on protein function and interactions. Curr. Pharm. Biotechnol., 9, 123–133.

*[69] Thomas,P.D. and Kejariwal,A. (2004) Coding single-nucleotide polymorphisms associated with complex vs. Mendelian disease: evolutionary evidence for differences in molecular effects. Proc. Natl Acad. Sci. USA, 101, 15398–15403.

[70] Wang,Z. and Moult,J. (2001) SNPs, protein structure, and disease. Hum. Mutat., 17, 263–270.

[71] Worth CL, Bickerton GR, Schreyer A, Forman JR, Cheng TM, Lee S, Gong S, Burke DF, Blundell TL. (2007) A structural bioinformatics approach to the analysis of nonsynonymous single nucleotide polymorphisms (nsSNPs) and their relation to disease. J Bioinform Comput Biol 5:1297–1318.

[72] Yue P, Li Z, Moult J. (2005) Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol 353:459–473.

[73] Yue,P. and Moult,J. (2006) Identification and analysis of deleterious human SNPs. J. Mol. Biol., 356, 1263–1274.

[75] Wong, K-C. and Zhang, Z. (2014) 'SNPdryad: Predicting Deleterious Non-synonymous Human SNPs Using Only Orthologous Protein Sequences' Bioinformatics 30: 1112-1119.

*[76] Juritz, E., Fornasari, M.S., Martelli, P.L., Fariselli, P., Casadio, R. and Parisi, G. (2012) 'On the effect of protein conformation diversity in discriminating among neutral and disease related single amino acid substitutions' BMC Genomics 13(Suppl 4):S5

[77] Yates, C.M., Filippis, I., Kelley, L.A. and Sternberg, M.J.E. (2014) 'SuSPect: Enhanced Prediction of Single Amino Acid Variant (SAV) Phenotype Using Network Features' J. Mol. Biol., 426, 2692-2701.

[78] Quan, L., Lv, Q., Zhang, Y. (2016) 'STRUM: structure-based prediction of protein stability changes upon single-point mutation', Bioinformatics 32:2936-2946

* Reference does not appear in the table above