PSIPRED: meaning and origin - Open Encyclopedia

PSI-blast based secondary structure PREDiction (PSIPRED) is a method for predicting protein secondary structure. Its algorithm uses artificial neural networks, a machine learning technique.^[1]^[2]^[3] It runs as a server-side program with a website as its front-end interface, and it predicts a protein's secondary structure (beta sheets, alpha helices, and coils) from the primary sequence.

PSIPRED is available as a web service and as software. The software is distributed as source code under a license that is technically proprietary: modification is allowed, but for-profit distribution of the software and its results is forbidden, which gives it freeware-like terms.

Secondary structure

Secondary structure prediction is a set of bioinformatics methods that aim to predict the local secondary structures of proteins and RNA sequences from their primary structure alone (the amino acid or nucleotide sequence, respectively). For proteins, a prediction assigns regions of the amino acid sequence as likely alpha helices, beta strands (often noted as extended conformations), or turns. The success of a prediction is determined by comparing it to the results of the DSSP algorithm applied to the crystal structure of the protein; for nucleic acids, it may be determined from the hydrogen bonding pattern. Specialized algorithms have been developed to detect specific well-defined patterns such as transmembrane helices and coiled coils in proteins, or canonical micro-RNA structures in RNA.

Basic information

The method uses information from evolutionarily related proteins to predict the secondary structure of a new amino acid sequence. PSI-BLAST is used to find related sequences and to build a position-specific scoring matrix. This matrix is processed by an artificial neural network,^[2]^[5] which was constructed and trained to predict the secondary structure of the input sequence;^[6] in short, it is a machine learning method.^[7]

Prediction algorithm (method)

The prediction algorithm is split into three stages: generating a sequence profile, predicting an initial secondary structure, and filtering the predicted structure.^[8] In the first stage, PSIPRED normalizes the sequence profile generated by PSI-BLAST.^[2]
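The text does not say how the profile is normalized. As a minimal sketch, assume each raw log-odds score is squashed into the range 0 to 1 with the standard logistic function; that choice, the function name `scale_profile`, and the toy matrix are all assumptions made for illustration.

```python
import math

def scale_profile(pssm):
    """Map every raw profile score to the open interval (0, 1).

    Assumption: logistic squashing, 1 / (1 + e^-x), applied element-wise.
    `pssm` is a list of rows, one row of 20 scores per residue.
    """
    return [[1.0 / (1.0 + math.exp(-score)) for score in row] for row in pssm]

# Hypothetical 2-residue profile with 20 log-odds scores per position.
raw = [[0] * 20, [-3, 5] + [0] * 18]
scaled = scale_profile(raw)
```

A score of 0 maps to 0.5, negative scores fall below 0.5, and positive scores rise above it, so all network inputs share one bounded scale.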

In the second stage, a neural network predicts the initial secondary structure. For each amino acid in the sequence, the network is fed a window of 15 residues centered on that amino acid. Each window position carries its 20 profile values plus one extra unit indicating whether the window extends past the N or C terminus of the chain. This gives an input layer of 315 units, divided into 15 groups of 21 units. The network has one hidden layer of 75 units and 3 output nodes (one for each secondary structure element: helix, sheet, coil).^[5]
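The input layout described above can be sketched as follows. The layer sizes (315, 75, 3) come from the text; the logistic activation, the random untrained weights, the zero-filling of off-chain slots, and all function names are assumptions for illustration only.

```python
import math
import random

def encode_window(profile, center, width=15):
    """Flatten a window of profile rows into one input vector.

    Each slot contributes 21 units: 20 scaled scores plus a flag that is
    1.0 when the slot lies beyond the N or C terminus of the chain.
    """
    half = width // 2
    units = []
    for pos in range(center - half, center + half + 1):
        if 0 <= pos < len(profile):
            units.extend(profile[pos])
            units.append(0.0)
        else:
            units.extend([0.0] * 20)  # no residue here: scores left at zero
            units.append(1.0)
    return units

def layer(inputs, weights, biases):
    """One fully connected layer with a logistic activation (assumed)."""
    return [1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, inputs)) + b)))
            for row, b in zip(weights, biases)]

def random_layer(n_in, n_out, rng):
    """Untrained placeholder weights; a real network would load trained ones."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
w1, b1 = random_layer(315, 75, rng)
w2, b2 = random_layer(75, 3, rng)

profile = [[0.5] * 20 for _ in range(30)]  # placeholder 30-residue profile
x = encode_window(profile, 0)              # window hanging over the N terminus
hidden = layer(x, w1, b1)
scores = layer(hidden, w2, b2)             # helix, sheet, coil (meaningless here)
```

For the first residue, seven of the fifteen slots fall before the start of the chain, so seven terminus flags are set.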

In the third stage, a second neural network filters the structure predicted by the first network. This network is also fed a window of 15 positions, and the indicator marking whether a window position lies past a chain terminus is forwarded as well. This results in 60 input units, divided into 15 groups of four (the three first-network outputs plus the terminus indicator). The network has one hidden layer of 60 units and three output nodes (one for each secondary structure element: helix, sheet, coil).^[8]
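A sketch of how the 60 inputs of the filtering network could be assembled. It assumes each group of four holds the three first-network scores for one position followed by the terminus flag; the ordering inside a group, the function name, and the sample scores are hypothetical.

```python
def encode_filter_window(first_pass, center, width=15):
    """Build the second-network input for the residue at `center`.

    `first_pass` holds one [helix, sheet, coil] score triple per residue.
    Slots that fall off either end of the chain get zero scores and a
    terminus flag of 1.0.
    """
    half = width // 2
    units = []
    for pos in range(center - half, center + half + 1):
        if 0 <= pos < len(first_pass):
            units.extend(first_pass[pos])
            units.append(0.0)
        else:
            units.extend([0.0, 0.0, 0.0])
            units.append(1.0)
    return units

# Made-up first-pass output for a 20-residue chain.
first_pass = [[0.7, 0.1, 0.2] for _ in range(20)]
x = encode_filter_window(first_pass, 19)  # window hanging over the C terminus
```

Because the second network sees neighboring predictions rather than profile scores, it can smooth out isolated calls that are unlikely in a real structure, such as a single helix residue inside a strand.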

The three final output nodes deliver a score for each secondary structure element at the central position of the window. PSIPRED assigns that residue the element with the highest score, and repeating this for every residue yields the predicted secondary structure.^[8] Prediction accuracy is reported as the Q3 value: the fraction of residues whose secondary structure state (helix, strand, or coil) is predicted correctly.^[8]
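The final call and the Q3 measure can be illustrated with a short sketch. The one-letter codes H, E, and C, the order of the output nodes, and the example scores are assumptions chosen for the illustration.

```python
STATES = "HEC"  # assumed node order: helix, strand (sheet), coil

def call_states(scores):
    """Pick the highest-scoring state for every residue."""
    return "".join(STATES[max(range(3), key=row.__getitem__)] for row in scores)

def q3(predicted, observed):
    """Fraction of residues whose three-state label matches the reference."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)

# Made-up per-residue scores for a 4-residue chain.
scores = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
]
predicted = call_states(scores)   # "HHEC"
observed = "HHCC"                 # hypothetical reference assignment
accuracy = q3(predicted, observed)  # 0.75
```

Here three of the four residues agree with the reference, so Q3 is 0.75.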

References

1. Gajendra P. S. Raghava; Harpreet Kaur. "Prediction of beta turn types". http://www.imtech.res.in/raghava/betaturns/method.html. Accessed 5 May 2014.
2. Yi-Ping Phoebe Chen (18 January 2005). Bioinformatics Technologies. Springer. p. 107. ISBN 978-3-540-20873-0. https://books.google.com/books?id=M93XsFrH0VsC&pg=PA107
3. Cuff, James A.; Barton, Geoffrey A. (15 August 2000). "Application of multiple sequence alignment profiles to improve protein secondary structure prediction". Proteins. 40 (3): 502–11. doi:10.1002/1097-0134(20000815)40:3<502::aid-prot170>3.0.co;2-q. PMID 10861942.
4. Heringa, Jaap (2000). "Computational Methods for Protein Secondary Structure Prediction Using Multiple Sequence Alignments". Current Protein & Peptide Science. 1 (3): 273–301. doi:10.2174/1389203003381324. CiteSeerX 10.1.1.470.7673. http://www.eurekaselect.com/81747/article
5. S. C. Rastogi; Namitra Mendiratta; Parag Rastogi (22 May 2013). Bioinformatics: Methods and Applications: (Genomics, Proteomics and Drug Discovery). PHI Learning Pvt. Ltd. p. 302. ISBN 978-81-203-4785-4. https://books.google.com/books?id=MAIiyX06fEYC&pg=PA302
6. "PSIPRED | Bioinformatic Technology". 10 April 2014. http://bioinformatictools.wordpress.com/tag/psipred/. Accessed 7 May 2014.
7. "PSIPRED overview". http://bioinf.cs.ucl.ac.uk/index.php?id=779. Accessed 7 May 2014.
8. Jones, David T. (17 September 1999). "Protein Secondary Structure Prediction Based on Position-specific Scoring Matrices". Journal of Molecular Biology. 292 (2): 195–202. doi:10.1006/jmbi.1999.3091. PMID 10493868. http://www.cifn.unam.mx/~contrera/bioinfo/papers/5_psipred1999.pdf. Accessed 7 May 2014.