PredSL

PredSL combines several methods to predict a protein's localization to the chloroplast and the thylakoids, the mitochondrion, and the secretory pathway.
As input, PredSL requires the protein's sequence in FASTA format.
The algorithm consists of 10 steps:

STEP 1:
Initially, the 100 N-terminal residues of the sequence are encoded (see supplementary material: neural network training) and fed to a first layer of two neural networks, which determine whether or not each residue belongs to a chloroplast transit peptide (cTP) or a mitochondrial transit peptide (mTP). This step yields 100 scores (one per residue) from each network.
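As an illustration of the input handling only, the following minimal Python sketch reads FASTA text and extracts the 100 N-terminal residues. The function names read_fasta and n_terminal are hypothetical, and the residue encoding itself is not shown because it is described only in the supplementary material.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def n_terminal(sequence, length=100):
    """Return the first `length` residues (fewer if the sequence is shorter)."""
    return sequence[:length]


if __name__ == "__main__":
    for name, seq in read_fasta(">example\nMKTAYIAKQR\nQISFVKSHFS\n"):
        print(name, n_terminal(seq))
```

How PredSL treats sequences shorter than 100 residues is not stated here; the sketch simply returns whatever is available.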

STEP 2:
We have set a cutoff at which residues are considered to no longer belong to a transit peptide. Using it, we calculate two approximate cleavage sites (one from each network) from the 100 scores calculated in step 1.
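One possible reading of this step is sketched below. It assumes that scores lie between 0 and 1, that the cutoff is 0.5, and that the approximate cleavage site is the first residue whose score falls below the cutoff. None of these details are given in the description above, so the function name and the default value are hypothetical.

```python
def approximate_cleavage_site(scores, cutoff=0.5):
    """Return the index of the first residue scoring below `cutoff`.

    Assumption: residues before that index are treated as the transit
    peptide. If no score falls below the cutoff, len(scores) is returned.
    """
    for index, score in enumerate(scores):
        if score < cutoff:
            return index
    return len(scores)


if __name__ == "__main__":
    example_scores = [0.9, 0.8, 0.7, 0.2, 0.1]
    print(approximate_cleavage_site(example_scores))
```

The same function would be applied once to the cTP scores and once to the mTP scores, giving the two approximate sites.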

STEP 3:
We take a window of 40 positions around the approximate cleavage site estimated in step 2 and use a set of neural networks to predict the cleavage site. This gives one predicted cleavage site for the hypothetical cTP and one for the hypothetical mTP.
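The window extraction might look like the sketch below. It assumes the window is centred on the approximate site (20 positions on each side) and is truncated at the ends of the sequence. The centring, the truncation rule, and the name window_around are assumptions, and the neural networks that score the window are not reproduced.

```python
def window_around(sequence, site, width=40):
    """Return (start, subsequence) for a window of `width` positions
    centred on `site`, truncated at the sequence boundaries."""
    half = width // 2
    start = max(0, site - half)
    end = min(len(sequence), site + half)
    return start, sequence[start:end]


if __name__ == "__main__":
    seq = "M" + "A" * 99
    start, window = window_around(seq, 50)
    print(start, len(window))
```

Returning the start offset alongside the subsequence lets a position predicted inside the window be mapped back to a position in the full sequence.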

STEP 3 (continued):
We calculate the average of the scores of the hypothetical peptides predicted by each network, which results in two scores.
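The averaging can be sketched as follows, assuming the hypothetical peptide spans the residues from the N-terminus up to (but not including) the predicted cleavage site. That boundary convention and the function name are assumptions.

```python
def mean_peptide_score(scores, cleavage_site):
    """Average the per-residue scores of the predicted peptide,
    i.e. positions 0 .. cleavage_site - 1. Returns 0.0 for an empty peptide."""
    peptide_scores = scores[:cleavage_site]
    if not peptide_scores:
        return 0.0
    return sum(peptide_scores) / len(peptide_scores)


if __name__ == "__main__":
    ctp_scores = [1.0, 0.5, 0.0, 0.0]
    print(mean_peptide_score(ctp_scores, 2))
```

Calling it once with the cTP scores and site and once with the mTP scores and site gives the two averages.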

STEP 4:
We feed the 100 scores from step 1 to two neural networks (one for the cTP and one for the mTP), and we get two more scores. These scores represent the probability that the sequence has an mTP or a cTP.

STEP 5:
We use PrediSi to calculate one more score for each sequence. This score represents the probability of a sequence belonging to a secreted protein.

STEP 6:
We use a Markov chain-based program that discriminates between two categories to obtain 6 more scores for plant proteins and 3 more for non-plant proteins. (See supplementary material: Markov chains.)
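The program itself is described only in the supplementary material. As a generic illustration of how Markov chains can discriminate between two categories, the sketch below trains a first-order chain on each category and scores a sequence by its log-odds ratio. The chain order, the pseudocount smoothing, and all names are assumptions and do not claim to be the PredSL implementation.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def train_markov(sequences, pseudocount=1.0):
    """Estimate first-order transition probabilities P(b | a),
    smoothed with a pseudocount so that no probability is zero."""
    counts = {a: {b: pseudocount for b in AMINO_ACIDS} for a in AMINO_ACIDS}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            if a in counts and b in counts[a]:
                counts[a][b] += 1
    model = {}
    for a, row in counts.items():
        total = sum(row.values())
        model[a] = {b: c / total for b, c in row.items()}
    return model


def log_odds(sequence, model_pos, model_neg):
    """Sum of log(P_pos / P_neg) over consecutive residue pairs.
    Positive values favour the first category; pairs containing
    non-standard residues are skipped."""
    score = 0.0
    for a, b in zip(sequence, sequence[1:]):
        if a in model_pos and b in model_pos[a]:
            score += math.log(model_pos[a][b] / model_neg[a][b])
    return score


if __name__ == "__main__":
    pos = train_markov(["ALALALAL"] * 5)
    neg = train_markov(["AGAGAGAG"] * 5)
    print(log_odds("ALAL", pos, neg), log_odds("AGAG", pos, neg))
```

One reading consistent with the score counts is one comparison per pair of localization classes (6 pairs among 4 classes, 3 pairs among 3), but the text above does not state this.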

STEP 7:
We use HMMER to get two additional scores for each protein: one indicating the presence or absence of a cTP and one indicating the presence or absence of an mTP.

STEP 8:
We feed all the scores we gathered (13 for the plant and 7 for the non-plant proteins) to a neural network that makes the final prediction.
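The score counts can be checked by assembling the input vector as in the sketch below. For plant proteins the steps above give 2 averaged scores, 2 network scores, 1 PrediSi score, 6 Markov chain scores and 2 HMMER scores, which totals 13. The non-plant breakdown used in the example (1 + 1 + 1 + 3 + 1 = 7, obtained by dropping the cTP scores) is an assumption that matches the stated total, and the function name and the ordering of the features are hypothetical.

```python
def build_feature_vector(avg_scores, nn_scores, predisi_score,
                         markov_scores, hmmer_scores, plant):
    """Concatenate the per-step scores into the input of the final network
    and check the length against the totals stated in the text."""
    features = [*avg_scores, *nn_scores, predisi_score,
                *markov_scores, *hmmer_scores]
    expected = 13 if plant else 7
    if len(features) != expected:
        raise ValueError(
            f"expected {expected} scores, got {len(features)}")
    return features


if __name__ == "__main__":
    plant_vector = build_feature_vector(
        avg_scores=[0.7, 0.2], nn_scores=[0.8, 0.1], predisi_score=0.05,
        markov_scores=[0.0] * 6, hmmer_scores=[12.3, -4.0], plant=True)
    print(len(plant_vector))
```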

STEP 9:
Finally, if a sequence is predicted to belong to a chloroplast protein, we use HMMER to determine whether a lumenal transit peptide (lTP) is present.

STEP 10:
If the user requests it, PredSL can generate a graph for each case. The graphs for the chloroplast and mitochondrial sequences are created using the scores from Step 1, taking a window around the predicted cleavage site from Step 3. For the secreted proteins, the graphs are created using the hydrophobicity index (Kyte and Doolittle, 1982) for a window around the cleavage site predicted by PrediSi.
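The data behind the secreted-protein graph could be prepared as in the sketch below, which looks up the Kyte-Doolittle hydropathy value of each residue in a window around a given cleavage site. The scale values are those of the published index. The window size (20 residues on each side), the absence of smoothing, and the function name are assumptions, and the plotting step is left out.

```python
# Kyte-Doolittle hydropathy index (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_around(sequence, site, flank=20):
    """Return (position, hydropathy) pairs for the residues within
    `flank` positions of `site`, truncated at the sequence ends.
    Non-standard residues get None instead of a value."""
    start = max(0, site - flank)
    end = min(len(sequence), site + flank)
    return [(i, KYTE_DOOLITTLE.get(sequence[i])) for i in range(start, end)]


if __name__ == "__main__":
    for position, value in hydropathy_around("MKILVAAGS", 4, flank=2):
        print(position, value)
```

The resulting pairs can be passed to any plotting library, with position on the x-axis and hydropathy on the y-axis.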