Boosted Decision Trees and Lists (BDTL)
The BDTL software package implements two main boosting algorithms, each with many variations. The first is Turian and Melamed's extension of confidence-rated boosting (Schapire & Singer, 1999); see Joseph Turian's thesis (NYU, 2007) for details. The second is Galron and Melamed's extension of the first, which boosts decision lists rather than decision trees. (Every decision tree ensemble is equivalent to some decision list ensemble, and vice versa.)
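For readers unfamiliar with the underlying method, here is a minimal sketch of one step of plain confidence-rated boosting as described by Schapire & Singer (1999), not of the Turian and Melamed or Galron and Melamed extensions. It assumes a weak hypothesis that splits the examples into two blocks by a binary test. The function name, argument names, and the smoothing constant eps are illustrative assumptions and are not part of BDTL.

```python
import math

def confidence_rated_split(weights, labels, in_block, eps=1e-12):
    """Score one binary split in the style of Schapire & Singer (1999).

    weights  : current non-negative example weights (assumed to sum to 1)
    labels   : +1 / -1 class labels
    in_block : True if the example satisfies the split test
    eps      : smoothing constant (an assumption) to avoid log(0)

    Returns (c_in, c_out, z): the confidence for each block and the
    normalizer Z, which the weak learner tries to minimize.
    """
    w_pos = [0.0, 0.0]  # weight of positive examples per block
    w_neg = [0.0, 0.0]  # weight of negative examples per block
    for w, y, b in zip(weights, labels, in_block):
        j = 1 if b else 0
        if y > 0:
            w_pos[j] += w
        else:
            w_neg[j] += w

    # Confidence of block j: c_j = 0.5 * ln(W+_j / W-_j), smoothed.
    conf = [0.5 * math.log((w_pos[j] + eps) / (w_neg[j] + eps)) for j in (0, 1)]
    # With these confidences, Z = 2 * sum_j sqrt(W+_j * W-_j).
    z = 2.0 * sum(math.sqrt(w_pos[j] * w_neg[j]) for j in (0, 1))
    return conf[1], conf[0], z
```

A split that separates the classes perfectly gives Z = 0, while an uninformative split on uniformly weighted, balanced data gives Z = 1.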
Major features of the software include:
Classification and regression
Scales to large data sets: tested on 3M+ examples with 1M+ features, and limited only by your computer's RAM.
Easily customizable loss functions and regularization methods. Currently implemented ones are logistic and exponential loss for classification, and squared loss for regression, each with L1 or L2 regularization.
Selection of weak learners to directly optimize the regularized training objective.
One-shot training following an entire regularization path, which can save a lot of time during hyperparameter optimization.
Continuous checkpointing, so that if a long learning cycle crashes, you can continue training where it left off.
Both binary and scalar feature types.
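As a rough illustration of the loss functions and regularizers named in the feature list, the following sketch writes out the standard textbook definitions of logistic, exponential, and squared loss and combines them with an L1 or L2 penalty. It is not taken from the BDTL source code. All names are hypothetical, and which parameters BDTL actually penalizes is an assumption here; the sketch simply penalizes a generic list of model weights.

```python
import math

def logistic_loss(margin):
    """log(1 + exp(-margin)), where margin = label * score, computed stably."""
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def exponential_loss(margin):
    """exp(-margin), the loss minimized by AdaBoost-style boosting."""
    return math.exp(-margin)

def squared_loss(residual):
    """Squared error for regression, where residual = target - prediction."""
    return residual * residual

def regularized_objective(example_losses, model_weights, lam, norm="l1"):
    """Total training loss plus lam times an L1 or L2 penalty.

    model_weights is a generic list of parameters (an assumption);
    lam is the regularization strength.
    """
    if norm == "l1":
        penalty = sum(abs(w) for w in model_weights)
    elif norm == "l2":
        penalty = sum(w * w for w in model_weights)
    else:
        raise ValueError("norm must be 'l1' or 'l2'")
    return sum(example_losses) + lam * penalty
```

Under this formulation, a candidate weak learner can be scored by how much it reduces the value returned by regularized_objective, which is one way to read "directly optimize the regularized training objective".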
To get started, read the file README.1st in the top level directory.
Questions, suggestions, and offers of collaboration are most welcome.
Changes to previous version:
- updated for gcc-4.8
- added missing script used in sandbox evaluation