OutputType - The type of the output categories to learn over.
@PublicationReference(author="Wikipedia", title="Decision tree learning", year=2010, type=WebPage, url="http://en.wikipedia.org/wiki/Decision_tree_learning#Gini_impurity")
public class VectorThresholdGiniImpurityLearner<OutputType> extends AbstractVectorThresholdMaximumGainLearner<OutputType>
DEFAULT_MIN_SPLIT_SIZE, dimensionsToConsider, minSplitSize
Constructor and Description |
---|
VectorThresholdGiniImpurityLearner() Creates a new instance of VectorThresholdGiniImpurityLearner. |
VectorThresholdGiniImpurityLearner(int minSplitSize) Creates a new VectorThresholdGiniImpurityLearner. |
Modifier and Type | Method and Description |
---|---|
VectorThresholdGiniImpurityLearner<OutputType> | clone() This makes public the clone method on the Object class and removes the exception that it throws. |
double | computeSplitGain(DefaultDataDistribution<OutputType> baseCounts, DefaultDataDistribution<OutputType> positiveCounts, DefaultDataDistribution<OutputType> negativeCounts) Computes the split gain by computing the Gini impurity for the given split. |
static <DataType> double | giniImpurity(DefaultDataDistribution<DataType> counts) Computes the Gini impurity of a histogram. |
computeBestGainAndThreshold, computeBestGainAndThreshold, getDimensionsToConsider, getMinSplitSize, learn, setDimensionsToConsider, setMinSplitSize
public VectorThresholdGiniImpurityLearner()
public VectorThresholdGiniImpurityLearner(int minSplitSize)
Creates a new VectorThresholdGiniImpurityLearner.
minSplitSize - The minimum split size. Must be positive.
public VectorThresholdGiniImpurityLearner<OutputType> clone()
Description copied from class: AbstractCloneableSerializable
This makes public the clone method on the Object class and removes the exception that it throws. Its default behavior is to automatically create a clone of the exact type of object that the clone is called on and to copy all primitives but to keep all references, which means it is a shallow copy.
Extensions of this class may want to override this method (but call super.clone()) to implement a "smart copy", that is, to target the most common use case for creating a copy of the object. Because the default behavior is a shallow copy, extending classes only need to handle fields that need a deeper copy (or those that need to be reset). Some of the methods in ObjectUtil may be helpful in implementing a custom clone method.
Note: The contract of this method is that you must use super.clone() as the basis for your implementation.
clone in interface CloneableSerializable
clone in class AbstractVectorThresholdMaximumGainLearner<OutputType>
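The "smart copy" contract described for clone() can be illustrated with a minimal sketch in plain Java. The class ThresholdHolder and its fields are hypothetical and stand in for a learner subclass; the sketch uses java.lang.Cloneable directly rather than AbstractCloneableSerializable, so it only mirrors the pattern and is not the library's implementation.

```java
// Hypothetical class for illustration only; not part of the library.
final class ThresholdHolder implements Cloneable {
    int minSplitSize = 1;                 // primitive: copied by super.clone()
    int[] dimensionsToConsider = {0, 2};  // reference: shared after a shallow copy

    @Override
    public ThresholdHolder clone() {
        try {
            // Start from super.clone(), as the contract requires.
            ThresholdHolder copy = (ThresholdHolder) super.clone();
            // Deepen only the fields that must not be shared with the original.
            copy.dimensionsToConsider = (this.dimensionsToConsider == null)
                ? null
                : this.dimensionsToConsider.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen because this class implements Cloneable.
            throw new AssertionError(e);
        }
    }
}
```

Because super.clone() already copies primitives and preserves the runtime type, the override only has to replace the array reference with its own copy.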
public double computeSplitGain(DefaultDataDistribution<OutputType> baseCounts, DefaultDataDistribution<OutputType> positiveCounts, DefaultDataDistribution<OutputType> negativeCounts)
computeSplitGain in class AbstractVectorThresholdMaximumGainLearner<OutputType>
baseCounts - The histogram of counts before the split.
positiveCounts - The counts on the positive side of the threshold.
negativeCounts - The counts on the negative side of the threshold.
public static <DataType> double giniImpurity(DefaultDataDistribution<DataType> counts)
DataType - The type of data the counts are over.
counts - The distribution to compute the impurity over.
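As a rough illustration of what giniImpurity and computeSplitGain calculate, the following self-contained sketch uses the textbook definition from the cited Wikipedia article: impurity is 1 minus the sum of squared category fractions, and gain is the base impurity minus the size-weighted impurities of the two sides. The class GiniSketch, its method names, the use of a plain Map in place of DefaultDataDistribution, and the exact weighting are all assumptions made for illustration; the library's actual code may differ.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not part of the library.
final class GiniSketch {
    private GiniSketch() {}

    /** Sums the counts in a histogram. */
    static <T> double sumCounts(Map<T, Integer> counts) {
        double sum = 0.0;
        for (int c : counts.values()) {
            sum += c;
        }
        return sum;
    }

    /** Gini impurity: 1 - sum over categories of (fraction squared). */
    static <T> double giniImpurity(Map<T, Integer> counts) {
        double total = sumCounts(counts);
        if (total <= 0.0) {
            return 0.0; // assumption: an empty histogram is treated as pure
        }
        double impurity = 1.0;
        for (int c : counts.values()) {
            double fraction = c / total;
            impurity -= fraction * fraction;
        }
        return impurity;
    }

    /** Gain: base impurity minus the size-weighted impurity of each side. */
    static <T> double splitGain(Map<T, Integer> baseCounts,
                                Map<T, Integer> positiveCounts,
                                Map<T, Integer> negativeCounts) {
        double total = sumCounts(baseCounts);
        if (total <= 0.0) {
            return 0.0; // assumption: nothing to gain from an empty split
        }
        double positiveWeight = sumCounts(positiveCounts) / total;
        double negativeWeight = sumCounts(negativeCounts) / total;
        return giniImpurity(baseCounts)
            - positiveWeight * giniImpurity(positiveCounts)
            - negativeWeight * giniImpurity(negativeCounts);
    }
}
```

Under this definition, a histogram with 5 examples in each of two categories has impurity 0.5, and a threshold that separates the two categories perfectly has gain 0.5.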