What types of probability are involved in creating an HMM?
What do we need to calculate the probability of observing a sequence?
How probable are the observations under a specified model?
Forward algorithm
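A minimal sketch of the forward algorithm, to make the answer concrete. The two-state model (I = CpG island, B = background) and all its probabilities are made-up illustrative values, not taken from these notes.

```python
# Toy HMM: every number below is an assumption chosen for illustration.
STATES = ["I", "B"]  # I = CpG island, B = background
START_P = {"I": 0.5, "B": 0.5}
TRANS_P = {
    "I": {"I": 0.9, "B": 0.1},
    "B": {"I": 0.1, "B": 0.9},
}
EMIT_P = {
    "I": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "B": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}


def forward(obs, states=STATES, start_p=START_P, trans_p=TRANS_P, emit_p=EMIT_P):
    """Return P(obs | model), summed over all hidden-state paths."""
    # alpha[s] = P(observations seen so far, current hidden state = s)
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        # Extend every partial path by one step and sum over the previous state.
        alpha = {
            s: emit_p[s][symbol] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


if __name__ == "__main__":
    print(forward("CGCG"))
```

For long sequences the products underflow, so practical implementations usually work with log probabilities or rescale alpha at every step.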
What are the most probable hidden states of a model for the observations?
Viterbi algorithm (it uses dynamic programming to find the single most probable path of hidden states, without enumerating every possible path)
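A minimal sketch of the Viterbi algorithm. It has the same shape as the forward recursion, but takes the maximum over previous states instead of the sum and remembers the best path. The two-state model (I = CpG island, B = background) and its probabilities are made-up illustrative values.

```python
# Toy HMM: every number below is an assumption chosen for illustration.
STATES = ["I", "B"]  # I = CpG island, B = background
START_P = {"I": 0.5, "B": 0.5}
TRANS_P = {
    "I": {"I": 0.9, "B": 0.1},
    "B": {"I": 0.1, "B": 0.9},
}
EMIT_P = {
    "I": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "B": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}


def viterbi(obs, states=STATES, start_p=START_P, trans_p=TRANS_P, emit_p=EMIT_P):
    """Return (probability, path) of the most probable hidden-state path."""
    # best[s] = (probability of the best path ending in s, that path)
    best = {s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}
    for symbol in obs[1:]:
        new_best = {}
        for s in states:
            # Pick the best predecessor state r for a path that ends in s.
            prob, path = max(
                ((best[r][0] * trans_p[r][s], best[r][1]) for r in states),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * emit_p[s][symbol], path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])


if __name__ == "__main__":
    prob, path = viterbi("CGCG")
    print(prob, "".join(path))
```

Storing the whole path per state keeps the sketch short; implementations for long sequences usually keep back-pointers and log probabilities instead.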
How can we learn the HMM parameters given a set of sequences?
Why are CpG dinucleotides underrepresented in the genome?
Because the cytosine in a CpG is usually methylated, and methylated C readily mutates to T (by deamination)
Where is methylation suppressed?
Around promoters and start regions of genes. CpG dinucleotides are therefore more frequent in these regions, which are called CpG islands.
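One way to see this enrichment is the observed/expected CpG ratio of a sequence window. A minimal sketch follows; the function name is hypothetical, and any threshold used to call a region an island is a convention that these notes do not specify.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0  # no CpG is possible without both C and G
    return seq.count("CG") * len(seq) / (c_count * g_count)


if __name__ == "__main__":
    print(cpg_obs_exp("CGCGCGTACG"))
```

A ratio well below 1 is what CpG depletion looks like in bulk genomic sequence; island regions sit closer to 1.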
How do we build an HMM for sequence profiles?
How do we find similar sequences?