InputType
- The input type for the tree.
OutputType
- The output type for the tree.
public class CategorizationTreeLearner<InputType,OutputType> extends AbstractDecisionTreeLearner<InputType,OutputType> implements SupervisedBatchLearner<InputType,OutputType,CategorizationTree<InputType,OutputType>>
The CategorizationTreeLearner class implements a supervised learning algorithm for learning a categorization tree.

Modifier and Type | Field and Description |
---|---|
static int |
DEFAULT_LEAF_COUNT_THRESHOLD
The default threshold for making a leaf node based on count.
|
static int |
DEFAULT_MAX_DEPTH
The default maximum depth to grow the tree to.
|
protected int |
leafCountThreshold
The threshold for making a node a leaf, based on how many
instances fall in the node.
|
protected int |
maxDepth
The maximum depth for the tree.
|
protected java.util.Map<OutputType,java.lang.Double> |
priors
Prior probabilities for the different categories.
|
protected java.util.Map<OutputType,java.lang.Integer> |
trainCounts
How often each category appears in training data.
|
Inherited fields: deciderLearner, DEFAULT_ITERATION, iteration

Constructor and Description |
---|
CategorizationTreeLearner()
Creates a new instance of CategorizationTreeLearner.
|
CategorizationTreeLearner(DeciderLearner<? super InputType,OutputType,?,?> deciderLearner)
Creates a new instance of CategorizationTreeLearner.
|
CategorizationTreeLearner(DeciderLearner<? super InputType,OutputType,?,?> deciderLearner,
int leafCountThreshold,
int maxDepth)
Creates a new instance of CategorizationTreeLearner.
|
CategorizationTreeLearner(DeciderLearner<? super InputType,OutputType,?,?> deciderLearner,
int leafCountThreshold,
int maxDepth,
java.util.Map<OutputType,java.lang.Double> priors)
Creates a new instance of CategorizationTreeLearner.
|
Modifier and Type | Method and Description |
---|---|
CategorizationTreeLearner<InputType,OutputType> |
clone()
This makes public the clone method on the Object class and
removes the exception that it throws.
|
int |
getLeafCountThreshold()
Gets the leaf count threshold, which determines the number of elements
at which to make a node into a leaf.
|
int |
getMaxDepth()
Gets the maximum depth to grow the tree.
|
static <OutputType> DefaultDataDistribution<OutputType> |
getOutputCounts(java.util.Collection<? extends InputOutputPair<?,OutputType>> data)
Creates a histogram of values based on the output values in the given
collection of pairs.
|
CategorizationTree<InputType,OutputType> |
learn(java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>> data)
The learn method creates an object of ResultType using
data of type DataType, using some form of "learning" algorithm.
|
protected CategorizationTreeNode<InputType,OutputType,?> |
learnNode(java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>> data,
AbstractDecisionTreeNode<InputType,OutputType,?> parent)
Recursively learns the categorization tree using the given collection
of data, returning the created node.
|
void |
setCategoryPriors(java.util.Map<OutputType,java.lang.Double> priors)
Set prior category probabilities.
|
void |
setLeafCountThreshold(int leafCountThreshold)
Sets the leaf count threshold, which determines the number of elements
at which to make a node into a leaf.
|
void |
setMaxDepth(int maxDepth)
Sets the maximum depth to grow the tree.
|
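The leaf count threshold and maximum depth listed above act as stopping criteria while the tree grows. The following is a minimal, self-contained sketch of how such a check could look. It does not use the library: the class `LeafRuleSketch` and the method `shouldBeLeaf` are hypothetical names, and both the "at or below the threshold" comparison and the depth numbering are assumptions. Only the rules that the threshold is non-negative and that a maximum depth of zero or less means no limit come from this page.

```java
/** Hypothetical stand-in for a stopping check; not the library's implementation. */
final class LeafRuleSketch {
    private final int leafCountThreshold;
    private final int maxDepth;

    LeafRuleSketch(int leafCountThreshold, int maxDepth) {
        // Documented constraint: the leaf count threshold must be non-negative.
        if (leafCountThreshold < 0) {
            throw new IllegalArgumentException("leafCountThreshold must be non-negative");
        }
        this.leafCountThreshold = leafCountThreshold;
        this.maxDepth = maxDepth;
    }

    /**
     * Assumption: a node becomes a leaf when it holds at most
     * leafCountThreshold instances, or when its depth has reached maxDepth.
     * A maxDepth of zero or less disables the depth limit.
     */
    boolean shouldBeLeaf(int instanceCount, int depth) {
        boolean tooFewInstances = instanceCount <= leafCountThreshold;
        boolean depthLimited = maxDepth > 0 && depth >= maxDepth;
        return tooFewInstances || depthLimited;
    }

    public static void main(String[] args) {
        LeafRuleSketch limited = new LeafRuleSketch(2, 3);
        System.out.println(limited.shouldBeLeaf(10, 1)); // false
        System.out.println(limited.shouldBeLeaf(2, 1));  // true
        System.out.println(limited.shouldBeLeaf(10, 3)); // true

        LeafRuleSketch unlimited = new LeafRuleSketch(2, 0);
        System.out.println(unlimited.shouldBeLeaf(10, 50)); // false
    }
}
```

Keeping both criteria in one predicate makes it easy to see that either one alone is enough to stop splitting a node.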
Inherited methods: areAllOutputsEqual, getDeciderLearner, learnChildNodes, setDeciderLearner, splitData
Inherited methods: addIterativeAlgorithmListener, fireAlgorithmEnded, fireAlgorithmStarted, fireStepEnded, fireStepStarted, getIteration, getListeners, removeIterativeAlgorithmListener, setIteration, setListeners

public static final int DEFAULT_LEAF_COUNT_THRESHOLD
public static final int DEFAULT_MAX_DEPTH
protected int leafCountThreshold
protected int maxDepth
protected java.util.Map<OutputType,java.lang.Double> priors
protected transient java.util.Map<OutputType,java.lang.Integer> trainCounts
public CategorizationTreeLearner()
public CategorizationTreeLearner(DeciderLearner<? super InputType,OutputType,?,?> deciderLearner)
deciderLearner
- The learner for the decision function.
public CategorizationTreeLearner(DeciderLearner<? super InputType,OutputType,?,?> deciderLearner, int leafCountThreshold, int maxDepth)
deciderLearner
- The learner for the decision function.
leafCountThreshold
- The leaf count threshold. Must be non-negative.
maxDepth
- The maximum depth for the tree.
public CategorizationTreeLearner(DeciderLearner<? super InputType,OutputType,?,?> deciderLearner, int leafCountThreshold, int maxDepth, java.util.Map<OutputType,java.lang.Double> priors)
deciderLearner
- The learner for the decision function.
leafCountThreshold
- The leaf count threshold. Must be non-negative.
maxDepth
- The maximum depth for the tree.
priors
- Prior probabilities for categories. (See setCategoryPriors().)
public CategorizationTreeLearner<InputType,OutputType> clone()
Description copied from class: AbstractCloneableSerializable
This makes public the clone method on the Object class and
removes the exception that it throws. Its default behavior is to
automatically create a clone of the exact type of object that the
clone is called on and to copy all primitives but to keep all references,
which means it is a shallow copy.
Extensions of this class may want to override this method (but call
super.clone()) to implement a "smart copy", that is, to target
the most common use case for creating a copy of the object. Because
the default behavior is a shallow copy, extending classes only need
to handle fields that need a deeper copy (or those that need to
be reset). Some of the methods in ObjectUtil may be helpful in
implementing a custom clone method.
Note: The contract of this method is that you must use
super.clone() as the basis for your implementation.
clone in interface CloneableSerializable
clone in class AbstractDecisionTreeLearner<InputType,OutputType>
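The "smart copy" contract described above can be illustrated with plain Java. This is a sketch of the pattern only, not the library's code: `CloneSketch` and its fields `settings` and `sharedCache` are hypothetical, and which fields a real learner copies deeply is not stated on this page. The sketch starts from super.clone(), swallows the checked exception, deep-copies one field, and leaves the other as a shared reference.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical class showing a clone() built on super.clone(). */
final class CloneSketch implements Cloneable {
    int maxDepth = 5;                                   // primitive: copied by value
    Map<String, Double> settings = new HashMap<>();     // deep-copied in clone()
    Map<String, Integer> sharedCache = new HashMap<>(); // left as a shared reference

    @Override
    public CloneSketch clone() {
        try {
            // Start from the shallow copy produced by Object.clone().
            CloneSketch copy = (CloneSketch) super.clone();
            // Only fields that need a deeper copy are handled explicitly.
            copy.settings = new HashMap<>(this.settings);
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen because this class implements Cloneable.
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        CloneSketch original = new CloneSketch();
        original.settings.put("a", 0.5);

        CloneSketch copy = original.clone();
        copy.settings.put("b", 0.5);

        System.out.println(original.settings.containsKey("b"));   // false
        System.out.println(copy.sharedCache == original.sharedCache); // true
    }
}
```

Because the method is declared without a throws clause, callers can clone the object without wrapping the call in a try block.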
public CategorizationTree<InputType,OutputType> learn(java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>> data)
Description copied from interface: BatchLearner
The learn method creates an object of ResultType using
data of type DataType, using some form of "learning" algorithm.
learn in interface BatchLearner<java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,CategorizationTree<InputType,OutputType>>
data
- The data that the learning algorithm will use to create an
object of ResultType.
protected CategorizationTreeNode<InputType,OutputType,?> learnNode(java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>> data, AbstractDecisionTreeNode<InputType,OutputType,?> parent)
learnNode in class AbstractDecisionTreeLearner<InputType,OutputType>
data
- The set of data to learn a node from.
parent
- The parent node.
public static <OutputType> DefaultDataDistribution<OutputType> getOutputCounts(java.util.Collection<? extends InputOutputPair<?,OutputType>> data)
OutputType
- The type of the outputs to count over.
data
- The data to create the output count histogram for.
public int getLeafCountThreshold()
public void setLeafCountThreshold(int leafCountThreshold)
leafCountThreshold
- The leaf count threshold. Must be non-negative.
public int getMaxDepth()
public void setMaxDepth(int maxDepth)
maxDepth
- The maximum depth to grow the tree. Zero or less means no
maximum depth.
public void setCategoryPriors(java.util.Map<OutputType,java.lang.Double> priors)
Set prior category probabilities. A higher prior probability for a category will cause the tree learner to weight examples from that category more highly.
If the priors are not manually specified (through this method or passing priors into the constructor), prior probabilities default to the frequencies of the different categories in the training data.
priors
- If null, use default prior probabilities. Otherwise, priors
becomes the new prior weights. In the latter case,
priors.keySet() must contain the same values as the possible
categories in the data passed to the learn() method.
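
The default described for setCategoryPriors (category frequencies in the training data) and the key-set requirement on explicit priors can be sketched in plain Java. The class `PriorsSketch` and the methods `countOutputs` and `resolvePriors` are hypothetical helpers written for this illustration. They use ordinary maps instead of the library's DefaultDataDistribution, and throwing an exception on mismatched keys is an assumption, since this page does not say how the learner reacts.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helpers; they do not call the library. */
final class PriorsSketch {
    /** Builds a histogram of how often each output value appears. */
    static <T> Map<T, Integer> countOutputs(Collection<T> outputs) {
        Map<T, Integer> counts = new LinkedHashMap<>();
        for (T output : outputs) {
            counts.merge(output, 1, Integer::sum);
        }
        return counts;
    }

    /**
     * Returns the given priors if non-null, otherwise the category
     * frequencies from the training counts. Rejecting mismatched keys
     * with an exception is an assumption of this sketch.
     */
    static <T> Map<T, Double> resolvePriors(Map<T, Integer> counts, Map<T, Double> priors) {
        if (priors != null) {
            if (!priors.keySet().equals(counts.keySet())) {
                throw new IllegalArgumentException("priors must cover exactly the training categories");
            }
            return priors;
        }
        int total = 0;
        for (int count : counts.values()) {
            total += count;
        }
        Map<T, Double> frequencies = new LinkedHashMap<>();
        for (Map.Entry<T, Integer> entry : counts.entrySet()) {
            frequencies.put(entry.getKey(), entry.getValue() / (double) total);
        }
        return frequencies;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countOutputs(List.of("spam", "ham", "spam", "spam"));
        System.out.println(counts);                      // {spam=3, ham=1}
        System.out.println(resolvePriors(counts, null)); // {spam=0.75, ham=0.25}
    }
}
```

Passing explicit priors such as 0.5 for each category would replace the 0.75 and 0.25 frequencies, which corresponds to weighting the rarer category more heavily than its share of the training data.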