DragonFly On-Line Manual Pages

MIGRATE(1)             DragonFly General Commands Manual            MIGRATE(1)

NAME
       MIGRATE - estimate population parameters: migration rate and population
       size

SYNOPSIS
       migrate-n

DESCRIPTION
       Migrate estimates population parameters (effective population size and
       migration rates) using genetic data (electrophoretic markers,
       microsatellite markers, sequence data, and single nucleotide
       polymorphism data). It is a maximum likelihood or Bayesian estimator
       that uses a coalescent theory approach, taking into account the
       history of mutations and the uncertainty of the genealogy.

       A copy of the manual in PDF format is available from
       http://popgen.scs.fsu.edu

OPTIONS
       There are no options on the command line, but you can specify the
       options in a parmfile or in the menu.

PARMFILE OPTIONS
       The parmfile options are split into Datatype, Input/Output, Start
       parameters, and Search strategy.

DATATYPE
       datatype=<Allele | Microsatellites | Brownian | Sequences |
       Nucleotide-polymorphisms | Panel-SNP | Genealogies >
              Specifies the datatype used for the analyses. If the data do
              not match the chosen type, the program will crash.

              Allele: infinite allele model, suitable for electrophoretic
              markers; perhaps the "best" guess for codominant markers whose
              mutation model is unknown.

              Microsatellites: a simple electrophoretic ladder model is used
              for the change along the branches of the genealogy.

              Brownian: a Brownian motion approximation to the stepwise
              mutation model for microsatellites is used. This is MUCH faster
              than the exact model, but it is not a good approximation if
              population sizes are small (say below 10).

              Sequences: the data are DNA or RNA sequences and the mutation
              model used is F84, first used by Felsenstein in 1984 (the same
              model as in dnaml, PHYLIP version 3.5). A description of this
              model can be found in Swofford et al. 1996.

              Nucleotide-polymorphisms: [SNP] the data likelihood is corrected
              for sampling only variable sites. We assume that the data were
              used to find the SNPs.

              Panel-SNP: the data likelihood is corrected for using a panel of
              SNP sites that were polymorphic. The panel has to be population
              1.

              Genealogies: reads the sumfile of a previous run. With this
              option the genealogy sampling step is skipped and the
              genealogies provided in the sumfile are analyzed. This datatype
              makes it easy to rerun the program for different likelihood
              ratio tests or different settings for the profile likelihood
              printouts.

Sequence data specific options
       freq-from-data=< Yes | No:freqA freqG freqC freqT>

       ttratio=< r1 r2 .....>

       interleaved=<Yes | No >

       categories=<Yes | No>
              If you specify Yes you need a file named catfile in the same
              directory with the following syntax: number_of_categories cat1
              cat2 cat3 .. categorylabel_for_each_site, one such line for each
              locus. A # in the first column starts a comment line. The
              example is for a data set with 2 loci of 20 base pairs each:
                 # Example catfile for two loci
                 # in migrate you can use # as comments
                 2 1 10          11111111112222222222
                 5 0.1 2 5 23 3 11111122223333445555
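
              The catfile layout described above can be read with a few lines
              of Python. This is only a sketch, not part of migrate: the
              function name parse_catfile is hypothetical, and it assumes
              that category labels are single digits from 1 to
              number_of_categories, which matches the example but is not
              stated explicitly here.

```python
EXAMPLE = """\
# Example catfile for two loci
# in migrate you can use # as comments
2 1 10          11111111112222222222
5 0.1 2 5 23 3 11111122223333445555
"""


def parse_catfile(text):
    """Return one (rates, labels) pair per locus line of a catfile."""
    loci = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split()
        n = int(fields[0])                      # number_of_categories
        rates = [float(x) for x in fields[1:1 + n]]
        if len(rates) != n:
            raise ValueError("too few category rates: " + line)
        label_text = "".join(fields[1 + n:])    # one label per site
        labels = []
        for ch in label_text:
            # Assumption: labels are single digits in the range 1..n.
            if not ch.isdigit() or not 1 <= int(ch) <= n:
                raise ValueError("bad category label: " + ch)
            labels.append(int(ch))
        loci.append((rates, labels))
    return loci


for rates, labels in parse_catfile(EXAMPLE):
    print(len(labels), "sites, rates", rates)
```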

       rates=< n : r1 r2 r3 ..rn>

       prob-rates=< n : p1 p2 p3 ... pn>

       autocorrelation=<Yes:value | No>

       weights=<Yes | No>
              If you specify Yes you need a file named weightfile with weights
              for each site. The weights can be the digits 0-9 and the letters
              A-Z, so you have 35 possible weights available.
                   # Example weightfile for two loci
                   11111111112222222222
                   1111112222AAAA445XXXX5
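
              A weight line can be decoded as sketched below. The numeric
              meaning of the letters is an assumption borrowed from the
              PHYLIP convention (A=10 up to Z=35); this page only lists the
              allowed symbols. The name decode_weights is hypothetical.

```python
import string

# Assumption: 0-9 map to 0..9 and A-Z map to 10..35 (PHYLIP convention).
WEIGHT_VALUE = {
    symbol: value
    for value, symbol in enumerate(string.digits + string.ascii_uppercase)
}


def decode_weights(line):
    """Translate one weightfile line into a list of integer site weights."""
    weights = []
    for symbol in line.strip():
        if symbol not in WEIGHT_VALUE:
            raise ValueError("invalid weight symbol: " + symbol)
        weights.append(WEIGHT_VALUE[symbol])
    return weights


print(decode_weights("19AZ"))
```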

       distfile=<Yes | No>
              You can supply a distance file for each locus (using PHYLIP
              syntax). The order of individuals must be the same as in the
              infile. This option appears in the menu when you choose

              0 Start genealogy is estimated using a UPGMA topology

              The distance file is then used to create a UPGMA tree with a
              minimal number of migration events. For large trees this option
              helps to get better starting trees than the automatic tree
              generation, which uses a rather unsophisticated distance method
              (differences).

       usertree=<Yes | No>
              If you specify Yes you need a file named intree containing a
              starting tree for each locus. These trees MUST have migration
              events in them!

Microsatellite data
       micro-threshold=value
              Specifies the window in which probabilities of change are
              calculated. If we have allele 34, then only the probabilities
              of a change from 34 to 35-44 and 24-34 are considered. The
              higher this value is, the longer you wait for your result;
              choosing it too small will produce wrong results. Default is
              micro-threshold=10
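
              The window arithmetic from the example can be written out as
              follows. This is an illustration only; allele_window is a
              hypothetical helper, and how migrate treats the edges of the
              allele range is not covered here.

```python
def allele_window(allele, threshold=10):
    """Return the lowest and highest allele state considered for a change."""
    return allele - threshold, allele + threshold


low, high = allele_window(34)
print(low, high)
```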

Electrophoretic data
       No special variables.

Nucleotide polymorphism
       Similar to sequence data.

INPUT/OUTPUT
       infile=filename
              Default is infile

       random-seed=<Auto | Noauto | Own:seedvalue>
              The random number seed guarantees that you can reproduce a run
              exactly.  Good random number seeds are (value * 4) + 1.  If you
              do not specify the random number seed (random-seed=Auto) the
              program will use the system clock. With random-seed=Noauto the
              program expects to find a file named seedfile with the random
              number seed. With random-seed=Own:seedvalue you can specify the
              seed value in the parmfile (or in the menu).

       title=titletext

       progress=<Yes|No|Verbose>
              The default is progress=Yes

       outfile=filename
              The default is outfile=outfile

       print-data=<Yes|No>
              Print the data in the outfile. Default is print-data=No.

       print-fst=<Yes|No>
              Print a table of an FST estimate for comparison (Beerli and
              Felsenstein 1999, Beerli 1998) [not recommended].

       plot=<No | Yes>[:<Outfile|Both>[:<std|log>:{mig-axis-start,mig-axis-end,
       theta-axis-start,theta-axis-end}<:printpos<M | Nm>>]]
              If plot=No then no plot of the parameter space is shown in the
              outfile. If Yes, you can specify whether you want only the
              ASCII-graphics plot in the outfile (Outfile) or, in addition,
              the accurate numbers in a separate file, the mathfile (Both),
              using printpos "pixel" in each direction. The last option (M or
              N) lets you define whether you want the plot in M=m/mu or
              (default) 4Nm units. Default is plot=Yes:Outfile. Example of a
              more complicated statement:
              plot=Yes:Both:std:0,10,0,0.025:100N
              For the syntax of the mathfile see the documentation.

       profile=<No|Yes<:<Fast|Percentile|Spline|Discrete|Quick >><:M | Nm >
              Print profile likelihood. See section Likelihood ratio tests
              and profile likelihood. Default is profile=Yes:Fast:N.

       l-ratio=<None | <Mean|Loci>:testparam> (N-POP)
              Likelihood ratio tests. See section Likelihood ratio tests and
              profile likelihood. Default is l-ratio=None.

       print-trees=<All | None | Last | Best>
              Default is print-trees=None

       mathfile=filename

       sumfile=<No | Yes | Yes:filename >
              Intermediate results of the genealogy sampling process are saved
              into a file named sumfile or into a file whose name you specify.
              You can use this sumfile to rerun the program for further
              analysis, e.g. calculating likelihood ratios or profile
              likelihoods; see datatype=Genealogies.

START VALUES FOR THE PARAMETERS
       theta=<Fst | Own:{value1,value2 ,...}>
              With Fst the program tries to use an FST-based measure
              (Maynard Smith 1970, Nei and Feldman 1972).
              Own:{value1,value2,...} defines arbitrary start values.

       migration=<Fst|Own:Migration matrix > (N-POP)
              The migration matrix is an n by n table with - on the diagonal.
              For four populations it can look like this:
                  migration=OWN:{ - 1.0 1.1 1.2 0.9 - 0.8 0.7 2.1 2.2 - 2.3
                  1.4 1.5 1.6 - }
              or like this:
                  migration=OWN:{ -   1.0 1.1 1.2
                                  0.9 -   0.8 0.7
                                  2.1 2.2 -   2.3
                                  1.4 1.5 1.6 -   }
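
              For more than a few populations it may be easier to generate
              the statement than to type it. The sketch below builds the
              one-line form from a nested list; format_migration is a
              hypothetical helper, and the diagonal entries of the input are
              ignored and written as -.

```python
def format_migration(rates):
    """Build a migration=OWN:{...} statement from an n by n list of rates."""
    n = len(rates)
    cells = []
    for i, row in enumerate(rates):
        if len(row) != n:
            raise ValueError("the migration matrix must be square")
        for j, value in enumerate(row):
            # The diagonal carries no migration rate and is written as "-".
            cells.append("-" if i == j else str(value))
    return "migration=OWN:{ " + " ".join(cells) + " }"


matrix = [
    [None, 1.0, 1.1, 1.2],
    [0.9, None, 0.8, 0.7],
    [2.1, 2.2, None, 2.3],
    [1.4, 1.5, 1.6, None],
]
print(format_migration(matrix))
```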

       mutation=<Gamma | NoGamma>
              The default is mutation=NoGamma

       fst-type=<Theta | Migration >

       custom-migration=< NONE|migration - matrix >
              The migration matrix contains the migration rates from j to i on
              row i, and the Theta values are on the diagonal. The migration
              matrix can consist of connections that are

              *: no restriction

              0: not estimated

              m: mean value of either 4Nm or M.

              s: symmetric migration [only for M]

              c: constant value (together with migration=OWN.. or theta=OWN..)

              The values can be separated by blanks or newlines.  A few
              examples for 4 populations:

              Full model: custom-migration={**** **** **** ****}

              N-island model: custom-migration={m m m m mm mm m mmm mmmm}

              Stepping Stone model: with symmetric migrations, and
              unrestricted Theta estimates: custom-migration={*s00 s*s0 0s*s 00s*}

              Source-Sink: (the first population is the source):
              custom-migration={*000**000**0*000}
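
              The full model and the stepping stone model above follow a
              simple pattern that extends to other numbers of populations.
              The sketch below generates both; the function name and the
              model labels "full" and "stepping-stone" are hypothetical, and
              the stepping stone layout is assumed to be linear (neighbours
              are populations i-1 and i+1), as in the 4-population example.

```python
def custom_migration(n, model):
    """Return a custom-migration statement for n populations."""
    def cell(i, j):
        if model == "full":
            return "*"                      # everything unrestricted
        if model == "stepping-stone":
            if i == j:
                return "*"                  # diagonal: unrestricted
            # symmetric exchange with direct neighbours only
            return "s" if abs(i - j) == 1 else "0"
        raise ValueError("unknown model: " + model)

    rows = ["".join(cell(i, j) for j in range(n)) for i in range(n)]
    return "custom-migration={" + " ".join(rows) + "}"


print(custom_migration(4, "full"))
print(custom_migration(4, "stepping-stone"))
```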

SEARCH STRATEGY
       Please read the documentation; these settings are important and will
       influence the accuracy of your results.

       short-chains=value
              Default is 10.

       short-inc=value
              Default is 20.

       short-sample=value
              Default is 500.

       long-chains=value
              Default is 2.

       long-inc=value
              Default is 20.

       long-sample=value
              Default is 5000.

       burn-in=value
              Default is 10000.

       replicate=<NO | YES<:LONGCHAINS | number>>

       heating=<NO | YES<:{1,1.1,1.2,1.3}>>
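
       The numeric search-strategy settings and the defaults listed above can
       be collected and written out as option=value lines. This is a sketch
       only: parmfile_lines is a hypothetical helper, and it assumes that a
       parmfile holds one option=value statement per line, which this page
       does not spell out.

```python
# Defaults as listed in the SEARCH STRATEGY section of this page.
SEARCH_DEFAULTS = {
    "short-chains": 10,
    "short-inc": 20,
    "short-sample": 500,
    "long-chains": 2,
    "long-inc": 20,
    "long-sample": 5000,
    "burn-in": 10000,
}


def parmfile_lines(overrides=None):
    """Return option=value lines, with defaults replaced by any overrides."""
    overrides = overrides or {}
    unknown = set(overrides) - set(SEARCH_DEFAULTS)
    if unknown:
        raise KeyError("not a search-strategy option: " + ", ".join(sorted(unknown)))
    settings = dict(SEARCH_DEFAULTS)
    settings.update(overrides)
    return [f"{name}={value}" for name, value in settings.items()]


print("\n".join(parmfile_lines({"long-sample": 10000})))
```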

Obscure options
       see documentation

BUGS
       This man page is not up to date and misses the Bayesian inference
       section, but see documentation.

MAIN DISTRIBUTION WEBSITE
       http://popgen.csit.fsu.edu

SEE ALSO
       coalesce, fluctuate, recombine, lamarc (the program) available from
       http://evolution.gs.washington.edu/lamarc.html

AUTHOR
       Peter Beerli <beerli@csit.fsu.edu>

       [if you use this man page, please let me know]

4.2 Berkeley Distribution        July 20 2006                       MIGRATE(1)