public class VectorThresholdVarianceLearner extends AbstractCloneableSerializable implements VectorThresholdLearner<java.lang.Double>
VectorThresholdVarianceLearner computes the best threshold over
a dataset of vectors using the reduction in variance to determine the
optimal index and threshold. This is an implementation of what is used in
the CART regression tree algorithm.

| Modifier and Type | Field and Description |
|---|---|
| static int | DEFAULT_MIN_SPLIT_SIZE: The default value for the minimum split size is 1. |
| protected int[] | dimensionsToConsider: The array of 0-based dimensions to consider in the input. |
| protected int | minSplitSize: The threshold for allowing a split to be made, determined by how many instances fall on each of the left and right sides of the split. |

| Constructor and Description |
|---|
| VectorThresholdVarianceLearner(): Creates a new VectorThresholdVarianceLearner. |
| VectorThresholdVarianceLearner(int minSplitSize): Creates a new VectorThresholdVarianceLearner. |
| VectorThresholdVarianceLearner(int minSplitSize, int... dimensionsToConsider): Creates a new VectorThresholdVarianceLearner. |

| Modifier and Type | Method and Description |
|---|---|
| DefaultPair<java.lang.Double,java.lang.Double> | computeBestGainThreshold(java.util.Collection<? extends InputOutputPair<? extends Vectorizable,java.lang.Double>> data, int dimension, double baseVariance): Computes the best gain-threshold pair for the given dimension on the given data, where gain is the reduction in variance. |
| int[] | getDimensionsToConsider(): Gets the dimensions that the learner is to consider. |
| int | getMinSplitSize(): Gets the minimum split size. |
| VectorElementThresholdCategorizer | learn(java.util.Collection<? extends InputOutputPair<? extends Vectorizable,java.lang.Double>> data): Learns a VectorElementThresholdCategorizer from the given data by picking the vector element and threshold that maximize the reduction in variance. |
| void | setDimensionsToConsider(int... dimensionsToConsider): Sets the dimensions that the learner is to consider. |
| void | setMinSplitSize(int minSplitSize): Sets the minimum split size. |

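
To make the description of computeBestGainThreshold more concrete, the following is a minimal, self-contained sketch of a variance-reduction threshold search over a single dimension. It does not use this library's classes: the class name VarianceSplitSketch, its method names, the plain double[] inputs, and the choice of the midpoint between neighboring values as the candidate threshold are all assumptions made for illustration, not the library's actual implementation.

```java
import java.util.Arrays;

final class VarianceSplitSketch {

    /** Population variance from a count, a sum, and a sum of squares. */
    static double variance(int n, double sum, double sumSq) {
        double mean = sum / n;
        // Clamp tiny negative values caused by floating-point rounding.
        return Math.max(0.0, sumSq / n - mean * mean);
    }

    /**
     * Returns {gain, threshold} for the best split of one dimension, or
     * null when no split leaves minSplitSize instances on both sides.
     * Gain is baseVariance minus the size-weighted variance of the two sides.
     */
    static double[] bestGainThreshold(double[] values, double[] outputs,
            int minSplitSize, double baseVariance) {
        final int n = values.length;

        // Sort instance indices by their value in this dimension.
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));

        double totalSum = 0.0;
        double totalSumSq = 0.0;
        for (double y : outputs) {
            totalSum += y;
            totalSumSq += y * y;
        }

        double leftSum = 0.0;
        double leftSumSq = 0.0;
        double bestGain = Double.NEGATIVE_INFINITY;
        double bestThreshold = Double.NaN;

        for (int k = 1; k < n; k++) {
            // Move the (k-1)-th sorted instance to the left side.
            double y = outputs[order[k - 1]];
            leftSum += y;
            leftSumSq += y * y;

            double prev = values[order[k - 1]];
            double next = values[order[k]];
            if (prev == next) {
                continue; // Equal values cannot be separated by a threshold.
            }
            int nLeft = k;
            int nRight = n - k;
            if (nLeft < minSplitSize || nRight < minSplitSize) {
                continue; // The split would leave one side too small.
            }

            double leftVar = variance(nLeft, leftSum, leftSumSq);
            double rightVar = variance(nRight, totalSum - leftSum,
                    totalSumSq - leftSumSq);
            double gain = baseVariance
                    - (nLeft * leftVar + nRight * rightVar) / n;
            if (gain > bestGain) {
                bestGain = gain;
                bestThreshold = (prev + next) / 2.0; // Assumed: midpoint.
            }
        }
        return Double.isNaN(bestThreshold)
                ? null : new double[] {bestGain, bestThreshold};
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4};
        double[] outputs = {1, 1, 5, 5};
        double base = variance(4, 12.0, 52.0); // sum = 12, sum of squares = 52
        double[] result = bestGainThreshold(values, outputs, 1, base);
        System.out.println("gain=" + result[0] + ", threshold=" + result[1]);
        // prints "gain=4.0, threshold=2.5"
    }
}
```

In this example the split between 2 and 3 separates the outputs perfectly, so the gain equals the base variance. Raising the hypothetical minSplitSize argument to 3 leaves no admissible split for four instances, and the sketch returns null.
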
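Methods inherited from class AbstractCloneableSerializable: clone
Methods inherited from class java.lang.Object: equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface CloneableSerializable: clone

public static final int DEFAULT_MIN_SPLIT_SIZE
The default value for the minimum split size is 1.

protected int minSplitSize
The threshold for allowing a split to be made, determined by how many instances fall on each of the left and right sides of the split.

protected int[] dimensionsToConsider
The array of 0-based dimensions to consider in the input.
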
public VectorThresholdVarianceLearner()
Creates a new VectorThresholdVarianceLearner.

public VectorThresholdVarianceLearner(int minSplitSize)
Creates a new VectorThresholdVarianceLearner.
Parameters:
minSplitSize - The minimum split size. Must be positive.

public VectorThresholdVarianceLearner(int minSplitSize, int... dimensionsToConsider)
Creates a new VectorThresholdVarianceLearner.
Parameters:
minSplitSize - The minimum split size. Must be positive.
dimensionsToConsider - The array of vector dimensions to consider. Null means all of them are considered.

public VectorElementThresholdCategorizer learn(java.util.Collection<? extends InputOutputPair<? extends Vectorizable,java.lang.Double>> data)
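Learns a VectorElementThresholdCategorizer from the given data by picking the vector element and threshold that maximize the reduction in variance.
Specified by:
learn in interface BatchLearner<java.util.Collection<? extends InputOutputPair<? extends Vectorizable,java.lang.Double>>,VectorElementThresholdCategorizer>
Parameters:
data - The data to learn from.

public DefaultPair<java.lang.Double,java.lang.Double> computeBestGainThreshold(java.util.Collection<? extends InputOutputPair<? extends Vectorizable,java.lang.Double>> data, int dimension, double baseVariance)
Computes the best gain-threshold pair for the given dimension on the given data, where gain is the reduction in variance.
Parameters:
data - The data to use.
dimension - The dimension to compute the best threshold over.
baseVariance - The variance of the data.

public int[] getDimensionsToConsider()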
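Description copied from interface: DimensionFilterableLearner
Gets the dimensions that the learner is to consider.
Specified by:
getDimensionsToConsider in interface DimensionFilterableLearner

public void setDimensionsToConsider(int... dimensionsToConsider)
Description copied from interface: DimensionFilterableLearner
Sets the dimensions that the learner is to consider.
Specified by:
setDimensionsToConsider in interface DimensionFilterableLearner
Parameters:
dimensionsToConsider - The array of vector dimensions to consider. Null means all of them are considered.

public int getMinSplitSize()
Gets the minimum split size.

public void setMinSplitSize(int minSplitSize)
Sets the minimum split size.
Parameters:
minSplitSize - The minimum split size. Must be positive.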
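
The interaction between dimensionsToConsider and minSplitSize described above can be illustrated with a small, self-contained sketch. It is not this library's code: DimensionSelectionSketch, the Split holder, the double[][] data layout, the "value < threshold goes left" convention, and the midpoint candidate thresholds are all hypothetical choices made for the example. The only behaviors taken from the documentation are that a null dimension array means every dimension is considered and that a split needs at least minSplitSize instances on each side.

```java
import java.util.Arrays;

final class DimensionSelectionSketch {

    /** Hypothetical result holder: chosen dimension, threshold, and gain. */
    static final class Split {
        final int dimension;
        final double threshold;
        final double gain;

        Split(int dimension, double threshold, double gain) {
            this.dimension = dimension;
            this.threshold = threshold;
            this.gain = gain;
        }
    }

    /** Population variance from a count, a sum, and a sum of squares. */
    static double variance(int n, double sum, double sumSq) {
        double mean = sum / n;
        return Math.max(0.0, sumSq / n - mean * mean);
    }

    /** Variance reduction of one split, or NaN if a side is too small. */
    static double gain(double[][] inputs, double[] outputs, int dim,
            double threshold, int minSplitSize) {
        int nl = 0, nr = 0;
        double ls = 0.0, lss = 0.0, rs = 0.0, rss = 0.0;
        for (int i = 0; i < outputs.length; i++) {
            double y = outputs[i];
            if (inputs[i][dim] < threshold) {
                nl++; ls += y; lss += y * y;
            } else {
                nr++; rs += y; rss += y * y;
            }
        }
        if (nl < minSplitSize || nr < minSplitSize) {
            return Double.NaN;
        }
        int n = nl + nr;
        double base = variance(n, ls + rs, lss + rss);
        double children =
                (nl * variance(nl, ls, lss) + nr * variance(nr, rs, rss)) / n;
        return base - children;
    }

    /** Picks the best split; a null dimension array means all dimensions. */
    static Split bestSplit(double[][] inputs, double[] outputs,
            int minSplitSize, int[] dimensionsToConsider) {
        int[] dims = dimensionsToConsider;
        if (dims == null) {
            dims = new int[inputs[0].length];
            for (int d = 0; d < dims.length; d++) {
                dims[d] = d;
            }
        }
        Split best = null;
        for (int dim : dims) {
            // Candidate thresholds: midpoints between distinct sorted values.
            double[] column = new double[inputs.length];
            for (int i = 0; i < inputs.length; i++) {
                column[i] = inputs[i][dim];
            }
            Arrays.sort(column);
            for (int k = 1; k < column.length; k++) {
                if (column[k] == column[k - 1]) {
                    continue;
                }
                double threshold = (column[k - 1] + column[k]) / 2.0;
                double g = gain(inputs, outputs, dim, threshold, minSplitSize);
                if (!Double.isNaN(g) && (best == null || g > best.gain)) {
                    best = new Split(dim, threshold, g);
                }
            }
        }
        return best; // null when no admissible split exists
    }

    public static void main(String[] args) {
        double[][] inputs = {{1, 0}, {2, 1}, {3, 0}, {4, 1}};
        double[] outputs = {1, 5, 1, 5};
        Split all = bestSplit(inputs, outputs, 1, null);
        Split onlyFirst = bestSplit(inputs, outputs, 1, new int[] {0});
        System.out.println("all dimensions: " + all.dimension);
        System.out.println("dimension 0 only: " + onlyFirst.dimension);
    }
}
```

In the sample data the second dimension separates the outputs perfectly, so it wins when all dimensions are considered, while restricting the search to dimension 0 forces a weaker split on that dimension.
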