v1.1.alpha10: FastTree CAT model support and big-tree heuristics
28 Oct 2011, by Erick
This rolls out our FastTree CAT model support and a new collection of heuristics for the initial evaluation phase of placement.
To run pplacer using a FastTree tree, build your tree using the -gtr flag
and save the log file using the -log option. The log file is used in the
same way as the statistics file when building a reference package. If you
haven’t built one already, just have a look at the taxtastic quickstart.
From there, the reference package is used just like any other.
Note that pplacer won’t have to re-infer site categories, which saves time,
if you are using the alignment in the reference package. Placing on a
FastTree tree takes about 1/4 the memory of placing on the equivalent tree
inferred using GTRGAMMA in RAxML.
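For the scripting-inclined, the workflow above can be sketched as a few Python helpers that only assemble the command lines. This is a rough sketch under assumptions, not an official recipe: the file names and the locus are placeholders, -nt assumes a nucleotide alignment, and the taxit create options shown are this sketch's assumption, so confirm them with taxit create --help before relying on them.

```python
import shlex
from typing import List


def fasttree_command(alignment: str, log_path: str) -> List[str]:
    """FastTree with the GTR model, writing its log to log_path.

    FastTree prints the tree on stdout, so the caller redirects it to a file.
    """
    return ["FastTree", "-nt", "-gtr", "-log", log_path, alignment]


def taxit_create_command(refpkg: str, locus: str, alignment: str,
                         tree: str, log_path: str) -> List[str]:
    """Reference package build; the FastTree log stands in for the stats file.

    The option names here are assumed, not taken from the post.
    """
    return ["taxit", "create", "-P", refpkg, "-l", locus,
            "--aln-fasta", alignment,
            "--tree-file", tree,
            "--tree-stats", log_path]


def pplacer_command(refpkg: str, queries: str) -> List[str]:
    """Placement of aligned query sequences on the reference package."""
    return ["pplacer", "-c", refpkg, queries]


# Placeholder file names, for illustration only.
commands = [
    fasttree_command("ref.fasta", "ref.log"),  # redirect stdout to ref.tre
    taxit_create_command("my.refpkg", "16s", "ref.fasta", "ref.tre", "ref.log"),
    pplacer_command("my.refpkg", "queries_aligned.fasta"),
]
for command in commands:
    print(shlex.join(command))
```

Each helper returns an argument list, so it can be handed to subprocess.run directly once the paths are real.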
This release also contains command line flags that control the new “fig”
heuristics. These heuristics greatly accelerate placement on reference
trees when the reference tree is big (e.g. > 10k leaves). In short, the
tree gets divided up into subtrees that we call “figs”. These are
connected units of the tree such that the distance between any two
leaves is less than the value specified with the --fig-cutoff flag on
the command line. The initial evaluation of edges for a placement then
happens in three phases: first, evaluate each of the figs using
representative edges. Then, merge figs that are close to one another in
score and sort them. Finally, treat each (potentially merged) fig as a
unit in the baseball heuristics: once we have tried all of the edges of
one fig, we drop down to the next-highest-scoring fig and evaluate its
edges. We have not seen a noticeable drop in accuracy using
--fig-cutoff 0.2, and it’s much faster for a 35K taxon tree. Your
mileage may vary.
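To make the three phases a bit more concrete, here is a toy sketch in Python of the order in which edges would get visited. It is not pplacer's implementation: the data layout, the function names, and the tolerance parameter that decides when two figs are "close in score" are all assumptions made for illustration, and the baseball stopping rule itself is left to the caller.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical layout: fig name -> (representative score, edge ids in the fig).
# Higher scores are better. None of these names come from pplacer.
Fig = Tuple[float, List[int]]


def merge_close_figs(figs: Dict[str, Fig], tolerance: float) -> List[Fig]:
    """Sort figs best-first, merging a fig into the group before it when its
    score is within `tolerance` of that group's best score."""
    ordered = sorted(figs.values(), key=lambda fig: fig[0], reverse=True)
    merged: List[Fig] = []
    for score, edges in ordered:
        if merged and merged[-1][0] - score < tolerance:
            best_score, best_edges = merged[-1]
            merged[-1] = (best_score, best_edges + list(edges))
        else:
            merged.append((score, list(edges)))
    return merged


def edge_visit_order(figs: Dict[str, Fig], tolerance: float) -> Iterator[int]:
    """Yield edge ids one (possibly merged) fig at a time, best fig first.

    A caller applying a stopping rule simply stops consuming the iterator.
    """
    for _, edges in merge_close_figs(figs, tolerance):
        for edge_id in edges:
            yield edge_id


# Toy input: figs "b" and "c" score within 1.0 of each other, so they merge.
example_figs = {
    "a": (-20.0, [1, 2]),
    "b": (-10.5, [3]),
    "c": (-10.0, [4, 5]),
}
print(list(edge_visit_order(example_figs, tolerance=1.0)))  # [4, 5, 3, 1, 2]
```

Because the order is a lazy iterator, edges in low-scoring figs are never touched if the caller stops early, which is where the speedup on big trees would come from.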
Both of these new features are experimental.
I’m afraid that there is a new version of the installation script for those of you who are compiling. Our recent work placing on big trees broke our previous XML library and we’ve had to replace it. We’re hoping this will be the last change for a while.