LAMARC offers three ways to handle variation in mutation rate among markers.

- Variation within a contiguous segment.
- Known variation among segments, regions, and/or data types.
- Unknown variation among regions.
- Bottom line.

**Variation within a contiguous segment.** If the mutation
rate (or fixation rate) may vary from site to site within a single
contiguous genetic segment, such as a DNA sequence or group of
linked microsatellites, the best approach is to use the "Multiple
rate categories" option of the appropriate data model. This
option is described in the data models
section of the documentation.

**Known variation among segments, regions, and/or data types.** If the relative mutation rates are known for a contiguous stretch of markers, for example a DNA sequence containing both introns and exons, assign each intron and exon to its own segment and set the known relative mutation rates explicitly.

Even if you are not sure of the exact ratio of mutation rates among your data types, a reasonable guess is still better than accepting the default of identical rates everywhere. If you are not sure whether microsatellites mutate 1000x or only 100x faster than DNA, pick an intermediate value. Assuming that they mutate at the same rate will definitely give bad results.
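One way to pick an intermediate value when the plausible ratios span an order of magnitude is the geometric mean, which lies halfway between the two bounds on a log scale. The sketch below is only an illustration: the choice of the geometric mean is an assumption of this example, not a LAMARC recommendation, and the function name `intermediate_ratio` is hypothetical.

```python
import math

def intermediate_ratio(low, high):
    """Geometric mean of two positive bounds on a relative mutation rate."""
    if low <= 0 or high <= 0:
        raise ValueError("bounds must be positive")
    return math.sqrt(low * high)

# Assumed bounds: microsatellites mutate between 100x and 1000x faster than DNA.
ratio = intermediate_ratio(100, 1000)
print(f"relative rate to try: about {ratio:.0f}x")
```

For bounds of 100x and 1000x this gives a value of roughly 316x, which could then be supplied as the relative mutation rate for the microsatellite data.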

**Unknown variation among regions.** If you suspect that
your regions vary in mutation rate, but you don't have any
information on their specific rates, you can assume that
these rates are drawn from a gamma distribution. The
gamma distribution is a somewhat arbitrarily chosen, flexible statistical
distribution which varies from looking exponential when its
scaled shape parameter α is low, to looking like an increasingly narrow
bell curve as α increases. Low values of α correspond
to cases in which most regions are nearly invariant, and a few
evolve rapidly. High values of α correspond
to cases in which the single-region mutation rates are approximately
normally distributed about the mean single-region mutation rate.
(The gamma distribution actually has two parameters, a "shape
parameter" α and a "scale parameter" β, but LAMARC
sets β = 1/α to avoid overparameterization, and to
allow it to work with a distribution whose mean, the product αβ,
is 1.)
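To get a feel for how α shapes the distribution, the following sketch draws relative rates from a gamma distribution with shape α and scale 1/α using Python's standard library. It is not LAMARC code; the function name `draw_relative_rates`, the sample size, and the 0.01 cutoff for "nearly invariant" are assumptions made for illustration. With this parameterization the mean is 1 and the variance is 1/α.

```python
import random
import statistics

def draw_relative_rates(alpha, n, seed=0):
    """Draw n relative rates from a gamma with shape alpha and scale 1/alpha."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    # random.gammavariate(shape, scale) has mean shape * scale, which is 1 here.
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n)]

for alpha in (0.1, 1.0, 10.0, 100.0):
    rates = draw_relative_rates(alpha, 20000)
    mean = statistics.fmean(rates)
    variance = statistics.pvariance(rates)
    # Arbitrary cutoff: regions with a relative rate below 0.01 count as nearly invariant.
    nearly_invariant = sum(r < 0.01 for r in rates) / len(rates)
    print(f"alpha={alpha:>5}: mean={mean:.2f} variance={variance:.2f} "
          f"(theory {1 / alpha:.2f}) fraction below 0.01: {nearly_invariant:.2f}")
```

At low α a large share of the draws falls close to zero while a few are very large; at high α the draws cluster tightly around 1.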

LAMARC can estimate α if you have no prior idea of what a good value would be (though a reasonable starting guess will speed up maximization). In practice, it needs more than two or three regions to make a reasonable estimate of α. If you have only 2-3 regions, it is best to guess at the ratio of their mutation rates, or to fix α to a value you find reasonable; estimation of α is likely to fail because not enough information is available. (With only one region, α cannot be estimated and will not be used.) Information on setting this option is available in the gamma parameter section of the LAMARC menu documentation.

If your data consist of several microsatellites and several DNA or SNP regions, the real distribution of mutation rates probably resembles a two-humped camel and not a gamma distribution at all. You can try fitting a gamma anyway, but be aware that you are fitting an inappropriate model. A better alternative is to guess the relative mutation rates: the large difference between microsatellites and DNA data probably trumps any differences within each group. You can even do both, supplying a relative mutation rate constant for each region and then adding a gamma on top. We believe that this has the effect of drawing the different regions from versions of the same gamma but with its mean shifted by the given mutation rate ratio. However, this combination has not been extensively tested: use it at your own risk. It assumes that α is the same for DNA and microsatellites, which is probably not the case, but sometimes a shaky assumption is better than nothing.
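The combined approach can be pictured with a small simulation in which each region's rate is its supplied constant multiplied by a draw from a mean-1 gamma. This is one reading of the description above, not a statement of how LAMARC computes anything internally; the function name `region_rates`, the region names, the 300x ratio, and α = 5 are all hypothetical.

```python
import random

def region_rates(constants, alpha, seed=0):
    """Scale one mean-1 gamma draw per region by that region's rate constant."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    return {region: constant * rng.gammavariate(alpha, 1.0 / alpha)
            for region, constant in constants.items()}

# Hypothetical data set: three DNA regions and three microsatellite regions,
# with microsatellites guessed to mutate about 300x faster than DNA.
constants = {"dna1": 1.0, "dna2": 1.0, "dna3": 1.0,
             "msat1": 300.0, "msat2": 300.0, "msat3": 300.0}

for region, rate in region_rates(constants, alpha=5.0).items():
    print(f"{region}: relative rate {rate:.2f}")
```

The simulated rates form two clusters, one around each constant, with the same relative spread in both. That shared spread is the assumption that α is the same for both data types.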

Please note that LAMARC can only apply a gamma distribution
to single-region relative mutation rates if all populations
are assumed to remain constant in their respective sizes.
This is due to a mathematical complication in the way LAMARC
implements the gamma distribution (in this case, LAMARC does
*not* approximate the gamma distribution by a histogram of
relative rates). This means that LAMARC cannot simultaneously
model the gamma "force" and the force of exponential population
growth, even for fixed values of α or *g*. If you
believe one or more of your populations is rapidly growing or
shrinking, and you think the single-region relative mutation
rates are approximately gamma-distributed for your data,
then your best bet is to estimate the relative rates by some
other method, supply these to LAMARC as constants, and then
proceed to estimate growth rates.

Also, because of the way LAMARC implements this feature, it can only be used for maximum-likelihood analyses. If you want to perform a Bayesian analysis, and you think the single-region relative mutation rates are approximately gamma-distributed for your data, then your best bet is to estimate the relative rates by some other method, supply these to LAMARC as constants, and then proceed with your Bayesian analysis.