MnM - Minimotif Miner

Understanding the Frequency Score (FS)

NOTE: The Search Results Page now ranks motifs by FS.

Calculation

The Frequency Score (FS) measures how often a motif occurs in the query protein relative to the entire proteome. It is the frequency of the motif in the query protein divided by the frequency of the motif in the entire proteome. Frequencies are calculated from the amino acid composition of the motif and the proteome. A score of 1 indicates that the motif is observed at its predicted frequency; scores above 1 indicate over-representation and scores below 1 indicate under-representation.
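
The ratio described above can be illustrated with a minimal sketch. This is not the MnM implementation: it assumes that the frequency in the query is the fraction of sequence windows matching the motif, and that the frequency in the proteome is the chance probability of a match derived from amino acid composition, with positions treated as independent. The function names, the list-of-positions motif representation, and the uniform composition are all hypothetical choices made for illustration.

```python
# Hypothetical sketch of a frequency-score style ratio (not the MnM code).
# A motif is a list with one entry per position: a string of allowed
# residues, or "x" for a wildcard position that accepts any residue.

def expected_frequency(motif, composition):
    """Chance probability that a random window matches the motif,
    assuming independent positions drawn from `composition`."""
    probability = 1.0
    for allowed in motif:
        if allowed == "x":
            continue  # wildcard: matches with probability 1
        probability *= sum(composition[aa] for aa in allowed)
    return probability

def observed_frequency(motif, sequence):
    """Fraction of windows in `sequence` that match the motif."""
    windows = len(sequence) - len(motif) + 1
    if windows <= 0:
        return 0.0
    hits = sum(
        all(allowed == "x" or sequence[start + offset] in allowed
            for offset, allowed in enumerate(motif))
        for start in range(windows)
    )
    return hits / windows

def frequency_score(motif, sequence, composition):
    """Observed frequency divided by expected frequency."""
    expected = expected_frequency(motif, composition)
    if expected == 0:
        raise ValueError("motif has zero expected frequency")
    return observed_frequency(motif, sequence) / expected

# Assumed uniform composition; a real proteome has unequal frequencies.
uniform = {aa: 0.05 for aa in "ACDEFGHIKLMNPQRSTVWY"}
pxxp = ["P", "x", "x", "P"]
score = frequency_score(pxxp, "APAAPAAA", uniform)
```

In this toy case the motif matches one of five windows, so the score is well above 1, which reads as over-representation under the stated assumptions.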


Limitations

The complexity of a motif strongly influences its FS. The calculation assumes that proteome sequences are random. An identified motif may be buried and not accessible for function. Scores do not take into account that some motifs may be specific to certain subcellular or extracellular compartments (this can be partially addressed by choosing organelles on the input page). Furthermore, the motif definition, which often varies between references, can influence the motif score.
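
To see why motif complexity matters so much, consider a rough sketch under a random-sequence model. It assumes all 20 amino acids are equally likely (a simplification, not a real proteome composition) and that each fixed motif position allows exactly one residue. The function name and parameters are hypothetical and are not part of MnM.

```python
# Hypothetical illustration: score of a single motif hit as a function
# of how many positions in the motif are fixed to one residue.

UNIFORM_P = 1 / 20  # assumed equal frequency for each amino acid

def single_hit_score(fixed_positions, motif_length, protein_length):
    """Observed/expected ratio for one match in a protein."""
    windows = protein_length - motif_length + 1
    observed = 1 / windows                   # one matching window
    expected = UNIFORM_P ** fixed_positions  # chance match per window
    return observed / expected

# One hit in a 300-residue protein, for two motifs of different complexity.
simple_score = single_hit_score(fixed_positions=2, motif_length=4,
                                protein_length=300)
complex_score = single_hit_score(fixed_positions=5, motif_length=5,
                                 protein_length=300)
```

Under these assumptions a single occurrence of the five-position motif scores several thousand times higher than a single occurrence of the two-position motif, so scores are best compared between motifs of similar complexity.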


Advantages

Scoring identifies motifs that are highly over-represented in a protein. It also identifies when a protein has multiple occurrences of the same motif.


Validation

Analysis of over 2300 validated motifs annotated in the SwissProt database shows that the FS is globally significant when compared to analysis with a randomized motif database.