FindingSeMo

Find Motifs
Annotate Function

HELP

Parameters

X-cutoff:

During the construction of a motif from the suffix tree, the X-cutoff parameter determines which amino acids are preferred at a given position in the motif by comparing their frequency at that position with their natural frequency of occurrence.
Each position can have one preferred amino acid or several (a box). If no amino acid is preferred, all the branches at the node are merged, and the position is denoted by an 'X'.
The X-cutoff is a relative parameter. The recommended values are between 1 and 3.
An X-cutoff value of 1 selects amino acids that occur more often than their natural frequencies. Larger values of X-cutoff correspond to more stringent selection.
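
The selection at a single position can be sketched as follows. This is only an illustration of the idea described above, not FindingSeMo's actual code: the function name position_symbol, the "observed fraction divided by background fraction" ratio, and the uniform background table are all assumptions made for the example.

```python
from collections import Counter

# Placeholder background: a uniform 5% for each of the 20 amino acids.
# This is an assumption for illustration; real natural frequencies differ.
BACKGROUND = {aa: 0.05 for aa in "ACDEFGHIKLMNPQRSTVWY"}


def position_symbol(residues, x_cutoff=1.0, background=BACKGROUND):
    """Return the motif symbol for one position (hypothetical helper).

    residues: the amino acids observed on the branches at this node.
    An amino acid is treated as preferred when its observed fraction,
    divided by its background fraction, exceeds x_cutoff.
    """
    counts = Counter(residues)
    total = sum(counts.values())
    preferred = sorted(
        aa for aa, n in counts.items()
        if (n / total) / background[aa] > x_cutoff
    )
    if not preferred:
        return "X"            # nothing preferred: branches merged
    if len(preferred) == 1:
        return preferred[0]   # a single preferred amino acid
    return "[" + "".join(preferred) + "]"  # a box of preferred amino acids


print(position_symbol("AAAAALLLLL", x_cutoff=3.0))
print(position_symbol("ACDEFGHIKL", x_cutoff=3.0))
```

Under these assumptions, raising x_cutoff turns positions with many weakly enriched residues into 'X', which matches the "more stringent selection" behaviour described above.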

T-cutoff:

During the construction of a motif from the suffix tree, the T-cutoff parameter determines which branches in the tree are statistically significant. When no significant branches remain, the pattern built so far is reported as a motif.
T-cutoff is a relative parameter, and the recommended values are between 1 and 3.
A T-cutoff value of 1 keeps branches that occur more often than expected by chance. Larger values of T-cutoff correspond to more stringent pruning.
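
A minimal sketch of this pruning step is given below. It assumes that significance is measured as the ratio of a branch's observed count to its expected (chance) count; the function names and the way expected counts are supplied are hypothetical and are not taken from FindingSeMo.

```python
def significant_branches(branch_counts, expected_counts, t_cutoff=1.0):
    """Keep branches whose observed/expected ratio exceeds t_cutoff.

    branch_counts and expected_counts map a branch label (an amino acid)
    to its observed count and its assumed chance count, respectively.
    """
    return {
        label: count
        for label, count in branch_counts.items()
        if count / expected_counts[label] > t_cutoff
    }


def extend_or_report(pattern, branch_counts, expected_counts, t_cutoff=1.0):
    """Decide whether to keep extending a pattern or report it as a motif."""
    kept = significant_branches(branch_counts, expected_counts, t_cutoff)
    if not kept:
        # No significant branch is left, so the pattern is reported.
        return ("report", pattern)
    return ("extend", sorted(kept))


observed = {"S": 12, "T": 3}
expected = {"S": 4.0, "T": 4.0}
print(extend_or_report("GK", observed, expected, t_cutoff=2.0))
print(extend_or_report("GK", observed, expected, t_cutoff=3.0))
```

With these example numbers, a cutoff of 2 keeps only the S branch, while a cutoff of 3 prunes both branches and the pattern GK is reported.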

Algorithm

FindingSeMo constructs a suffix tree of the input sequences. Motifs are then found by traversing this tree.
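
To make the data structure concrete, here is a small sketch of an uncompressed suffix trie built over several sequences. It is a simplified stand-in for a suffix tree (a real suffix tree compresses single-child paths and is built far more efficiently), and the function names and node layout are assumptions for illustration only.

```python
def build_suffix_trie(sequences):
    """Insert every suffix of every sequence into a nested-dict trie.

    Each node stores how many suffixes pass through it, which equals the
    number of occurrences of the substring spelled out by its path.
    """
    root = {"count": 0, "children": {}}
    for seq in sequences:
        for start in range(len(seq)):
            node = root
            for residue in seq[start:]:
                children = node["children"]
                if residue not in children:
                    children[residue] = {"count": 0, "children": {}}
                node = children[residue]
                node["count"] += 1
    return root


def count_occurrences(trie, pattern):
    """Return how often pattern occurs across all inserted sequences."""
    node = trie
    for residue in pattern:
        node = node["children"].get(residue)
        if node is None:
            return 0
    return node["count"]


trie = build_suffix_trie(["GKSGKT", "AGKS"])
print(count_occurrences(trie, "GK"))
```

The per-node counts are what make branch-level decisions possible: each child of a node corresponds to one way of extending the current pattern by one residue.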

FindingSeMo can:

Find common motifs in a set of sequences
Annotate the likely function of a protein by detecting plausible signature motifs in its sequence

1. To find motifs: FindingSeMo takes as input multiple sequences in FASTA format (see the example for a sample file). It finds common motifs within these sequences. The output lists the motifs found, the number of times each motif was detected, the number of sequences in which it was found, and a score (based on the observed/expected ratio) for the motif. The larger the score, the greater the motif's significance.
2. To annotate function: Significant sequence motifs were pre-detected in sequences from the UniProt50 dataset and were correlated with the functions of the proteins that contained these motifs. For functional annotation, we pick out, from the input sequence, those motifs that have strong functional correlations.
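
The two counts reported for each motif (number of occurrences and number of sequences containing it) can be reproduced with a short sketch like the one below. It assumes a motif is written with plain residues, boxes such as [ST], and 'X' for any residue; the helper names are hypothetical. The score is not computed here, because the help text does not specify how the expected count is derived.

```python
import re


def parse_fasta(text):
    """Parse FASTA text into a dict mapping sequence name to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else "seq%d" % (len(records) + 1)
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def motif_stats(motif, records):
    """Return (total occurrences, number of sequences containing the motif)."""
    # 'X' matches any residue; the lookahead lets overlapping hits be counted.
    pattern = re.compile("(?=" + motif.replace("X", ".") + ")")
    occurrences = 0
    sequences_hit = 0
    for seq in records.values():
        hits = len(pattern.findall(seq))
        occurrences += hits
        if hits:
            sequences_hit += 1
    return occurrences, sequences_hit


example = ">p1 example protein\nGKSGKT\n>p2\nAGKS\nGKT\n"
records = parse_fasta(example)
print(motif_stats("GK[ST]", records))
```

Here the sequence names p1 and p2 and their residues are made-up example data, not taken from the tool's sample file.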