Disclaimer: This is a collection of links and pages I gathered while doing a literature
review on HMMs in bioinformatics for CMPUT 606. I do not consider myself to be an expert
on the subject, but I thought it would be useful to others to have a collection of
starting points listed somewhere.
Profile HMMs are statistical models that capture what the amino acid sequences of a
protein family have in common. They are considered more expressive than a standard
consensus sequence or a regular expression: profile HMMs allow position-dependent
insertion and deletion penalties, and they can use a separate distribution for
inserted portions of the amino acid sequence. Once a model has been trained
on a number of amino acid sequences from a given family or group, it is most
commonly used for three purposes:
- By aligning sequences to the model, one can construct multiple alignments.
- The model itself can offer insight into the characteristics of the
family when one examines the structure and probabilities of the trained HMM.
- The model can be used to score how well a new protein sequence fits the
family motif. For example, one could train a model on a number of proteins in
a family, and then match sequences in a database to that model in order to try
to find other family members. This technique is also used to
infer protein structure and function.
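To make the scoring use a bit more concrete, here is a minimal sketch of Viterbi scoring against a toy profile HMM. Everything in it is an assumption made for illustration and is not taken from any of the programs linked on this page: the function names `make_toy_model` and `viterbi_score`, the probabilities, and the simplification that insert and delete states are never connected directly. The transitions are stored per position, which is where the position-dependent insertion and deletion penalties live.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def make_toy_model(consensus, p_consensus=0.8):
    """Hypothetical model: one match state per consensus residue."""
    other = (1.0 - p_consensus) / (len(AMINO_ACIDS) - 1)
    match_emit = [
        {aa: (p_consensus if aa == c else other) for aa in AMINO_ACIDS}
        for c in consensus
    ]
    # Inserted residues use their own (here uniform) distribution.
    insert_emit = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    # trans[k] holds the transitions leaving position k (k = 0 is the begin
    # state). Each position has its own dict, so penalties can differ per position.
    inner = {"MM": 0.90, "MI": 0.05, "MD": 0.05,
             "IM": 0.6, "II": 0.4, "DM": 0.7, "DD": 0.3}
    # The last position can only insert or move on to the end state.
    last = {"MM": 0.95, "MI": 0.05, "MD": 0.0,
            "IM": 0.6, "II": 0.4, "DM": 1.0, "DD": 0.0}
    trans = [dict(inner) for _ in consensus] + [last]
    return match_emit, insert_emit, trans


def viterbi_score(model, seq):
    """Log probability of the single best path that emits seq."""
    match_emit, insert_emit, trans = model
    L, n = len(match_emit), len(seq)
    # vm/vi/vd[k][i]: best log probability of emitting seq[:i] and ending in
    # the match / insert / delete state of position k.
    vm = [[NEG_INF] * (n + 1) for _ in range(L + 1)]
    vi = [[NEG_INF] * (n + 1) for _ in range(L + 1)]
    vd = [[NEG_INF] * (n + 1) for _ in range(L + 1)]
    vm[0][0] = 0.0  # begin state, nothing emitted yet
    for k in range(L + 1):
        for i in range(n + 1):
            if k > 0:
                prev = trans[k - 1]
                # Delete state: skip position k without emitting a residue.
                vd[k][i] = max(vm[k - 1][i] + safe_log(prev["MD"]),
                               vd[k - 1][i] + safe_log(prev["DD"]))
                if i > 0:
                    # Match state: emit seq[i-1] from position k's distribution.
                    vm[k][i] = safe_log(match_emit[k - 1][seq[i - 1]]) + max(
                        vm[k - 1][i - 1] + safe_log(prev["MM"]),
                        vi[k - 1][i - 1] + safe_log(prev["IM"]),
                        vd[k - 1][i - 1] + safe_log(prev["DM"]))
            if i > 0:
                # Insert state: emit seq[i-1] while staying at position k.
                vi[k][i] = safe_log(insert_emit[seq[i - 1]]) + max(
                    vm[k][i - 1] + safe_log(trans[k]["MI"]),
                    vi[k][i - 1] + safe_log(trans[k]["II"]))
    end = trans[L]
    return max(vm[L][n] + safe_log(end["MM"]),
               vi[L][n] + safe_log(end["IM"]),
               vd[L][n] + safe_log(end["DM"]))


model = make_toy_model("ACDE")
for s in ("ACDE", "ACE", "ACWDE", "WWWW"):
    print(s, round(viterbi_score(model, s), 3))
```

The consensus sequence should score highest, with the deletion (`ACE`) and insertion (`ACWDE`) variants paying the corresponding transition penalties. Real tools typically train these probabilities from a multiple alignment and report a log-odds score against a background model rather than a raw log probability, so treat the numbers here as illustrative only.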
Some particularly useful links to start with:
- Profile hidden Markov models. S.R. Eddy. Bioinformatics 14:755-763, 1998. A review of the
profile HMM literature from 1996-1998.
Abstract/reprints: [Bioinformatics Online] [PostScript] [PDF].
- Many of this page's links are borrowed from the pointers in this paper.
Although slightly out of date, it offers a good introduction to profile HMMs
and their capabilities, and it briefly describes all of the programs linked
below, along with some others.
- Hidden Markov Models in Computational Biology: Applications to Protein Modeling.
Krogh, A., Brown, M., Mian, I.S., Sjolander, K. and Haussler, D. (1994) Journal of Molecular Biology, 235:1501-1531.
- One of the earliest papers I've seen on the subject.
- New Link: Instead of building your own models, you can search with a target
amino acid sequence against the database of models at Pfam. References can be
found on the Pfam site and on Sean Eddy's page.
Listed below are links to several profile programs that make use of HMMs.
Site maintained by: Colin Cherry.