InputType
- The input type for supervised learning. Passed on to the internal
learning algorithm. Also the input type of the learned ensemble.
OutputType
- The output type for supervised learning. Passed on to the internal
learning algorithm. Also the output type of the learned ensemble.
MemberType
- The type of ensemble member created by the inner learning algorithm.
Usually an evaluator.
EnsembleType
- The type of ensemble that the algorithm fills with ensemble members.
@PublicationReference(title="Bagging Predictors", author="Leo Breiman", year=1996, type=Journal, publication="Machine Learning", pages={123,140}, url="http://www.springerlink.com/index/L4780124W2874025.pdf")
public abstract class AbstractBaggingLearner<InputType,OutputType,MemberType,EnsembleType extends Evaluator<? super InputType,? extends OutputType>>
extends AbstractAnytimeSupervisedBatchLearner<InputType,OutputType,EnsembleType>
implements Randomized, BatchLearnerContainer<BatchLearner<? super java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,? extends MemberType>>
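The bagging idea behind this class (Breiman, 1996) can be illustrated without the library: each iteration draws a bag by sampling the data with replacement, trains one ensemble member on that bag, and the ensemble combines the members' outputs. The following is a minimal, library-independent sketch. `BaggingSketch`, `learnStump`, `vote`, and `demoData` are hypothetical names, the base learner is a toy one-dimensional threshold, and the sample-count formula is an assumption rather than the class's confirmed behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.DoublePredicate;

/** Library-independent bagging sketch; every name here is hypothetical. */
class BaggingSketch {

    /** Toy base learner: a 1-D threshold midway between the class means of the bag. */
    static DoublePredicate learnStump(List<double[]> bag) {
        double sumPos = 0.0, sumNeg = 0.0;
        int nPos = 0, nNeg = 0;
        for (double[] example : bag) {      // example[0] = input, example[1] = label (0 or 1)
            if (example[1] > 0.5) { sumPos += example[0]; nPos++; }
            else { sumNeg += example[0]; nNeg++; }
        }
        if (nPos == 0) return x -> false;   // bag contained only negatives
        if (nNeg == 0) return x -> true;    // bag contained only positives
        final double meanPos = sumPos / nPos;
        final double meanNeg = sumNeg / nNeg;
        final double threshold = (meanPos + meanNeg) / 2.0;
        final boolean positiveAbove = meanPos >= meanNeg;
        return x -> positiveAbove ? x > threshold : x < threshold;
    }

    /** Learns one member per iteration, each on its own bootstrap sample. */
    static List<DoublePredicate> learn(List<double[]> data, int maxIterations,
                                       double percentToSample, Random random) {
        // Assumed conversion from a fraction of the data to a sample count.
        int sampleCount = Math.max(1, (int) (percentToSample * data.size()));
        List<DoublePredicate> ensemble = new ArrayList<>();
        for (int i = 0; i < maxIterations; i++) {
            List<double[]> bag = new ArrayList<>(sampleCount);
            for (int s = 0; s < sampleCount; s++) {
                bag.add(data.get(random.nextInt(data.size()))); // with replacement
            }
            ensemble.add(learnStump(bag));
        }
        return ensemble;
    }

    /** Combines the members by strict majority vote. */
    static boolean vote(List<DoublePredicate> ensemble, double x) {
        int yes = 0;
        for (DoublePredicate member : ensemble) {
            if (member.test(x)) yes++;
        }
        return 2 * yes > ensemble.size();
    }

    /** Two well-separated clusters: inputs 0..4 are negative, 10..14 positive. */
    static List<double[]> demoData() {
        List<double[]> data = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            data.add(new double[] {i, 0.0});
            data.add(new double[] {10.0 + i, 1.0});
        }
        return data;
    }

    public static void main(String[] args) {
        List<DoublePredicate> ensemble = learn(demoData(), 25, 1.0, new Random(42L));
        System.out.println("members: " + ensemble.size());
        System.out.println("vote(13.0): " + vote(ensemble, 13.0));
    }
}
```

In the real class the base learner is supplied as a `BatchLearner`, and how members are combined depends on the concrete `EnsembleType`; majority voting is only one possible choice.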
Modifier and Type | Field and Description |
---|---|
protected java.util.ArrayList<InputOutputPair<? extends InputType,OutputType>> |
bag
The current bag of data.
|
protected int[] |
dataInBag
The number of times each example appears in the current bag; a count of
zero means the example is not in the bag.
|
protected java.util.ArrayList<? extends InputOutputPair<? extends InputType,OutputType>> |
dataList
The data stored for efficient random access.
|
static int |
DEFAULT_MAX_ITERATIONS
The default maximum number of iterations is 100.
|
static double |
DEFAULT_PERCENT_TO_SAMPLE
The default percent to sample is 1.0 (which represents 100%).
|
protected EnsembleType |
ensemble
The ensemble being created by the learner.
|
protected BatchLearner<? super java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,? extends MemberType> |
learner
The learner to use to create the ensemble member on each iteration.
|
protected double |
percentToSample
The percentage of the data to sample with replacement on each iteration,
expressed as a fraction (1.0 represents 100%).
|
protected java.util.Random |
random
The random number generator to use.
|
Inherited fields: data, keepGoing, maxIterations, DEFAULT_ITERATION, iteration
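The relationship between the `dataList`, `bag`, and `dataInBag` fields can be sketched as follows: sampling with replacement picks random indices into the data list, copies the chosen examples into the bag, and increments a per-example counter. This is an illustration under assumptions, not the class's actual code; `BagFillSketch` is a hypothetical stand-in that uses `String` examples instead of `InputOutputPair` objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in showing how a bag and its per-example counts relate. */
class BagFillSketch {
    final ArrayList<String> dataList;          // copy of the data for O(1) indexed access
    final ArrayList<String> bag = new ArrayList<>();
    final int[] dataInBag;                     // dataInBag[i] = copies of dataList.get(i) in bag
    final Random random;

    BagFillSketch(List<String> data, Random random) {
        this.dataList = new ArrayList<>(data);
        this.dataInBag = new int[data.size()];
        this.random = random;
    }

    /** Refills the bag by drawing sampleCount examples with replacement. */
    void fillBag(int sampleCount) {
        bag.clear();
        Arrays.fill(dataInBag, 0);
        for (int i = 0; i < sampleCount; i++) {
            int index = random.nextInt(dataList.size());
            bag.add(dataList.get(index));
            dataInBag[index]++;
        }
    }

    public static void main(String[] args) {
        BagFillSketch sketch =
            new BagFillSketch(Arrays.asList("a", "b", "c", "d", "e"), new Random(7L));
        sketch.fillBag(5);
        System.out.println("bag = " + sketch.bag);
        System.out.println("dataInBag = " + Arrays.toString(sketch.dataInBag));
    }
}
```

Whatever the random draws are, the counts always sum to the requested sample count, and an example that was never drawn keeps a count of zero.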
Constructor and Description |
---|
AbstractBaggingLearner()
Creates a new instance of AbstractBaggingLearner.
|
AbstractBaggingLearner(BatchLearner<? super java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,? extends MemberType> learner)
Creates a new instance of AbstractBaggingLearner.
|
AbstractBaggingLearner(BatchLearner<? super java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,? extends MemberType> learner,
int maxIterations,
double percentToSample,
java.util.Random random)
Creates a new instance of AbstractBaggingLearner.
|
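The three constructors suggest a common chaining pattern in which the shorter forms fill in the documented defaults (100 iterations, a sample fraction of 1.0). The sketch below shows that pattern with a hypothetical `BaggingConfigSketch` class that omits the learner argument for brevity. The choice of `new Random()` as the default generator, the exception type, and the `sampleCountFor` helper are assumptions made for illustration, not confirmed behavior of `AbstractBaggingLearner`.

```java
import java.util.Random;

/** Hypothetical configuration holder mirroring the documented defaults. */
class BaggingConfigSketch {
    static final int DEFAULT_MAX_ITERATIONS = 100;        // documented default
    static final double DEFAULT_PERCENT_TO_SAMPLE = 1.0;  // documented default (100%)

    final int maxIterations;
    final double percentToSample;
    final Random random;

    /** Default constructor delegates to the full one (assumed chaining). */
    BaggingConfigSketch() {
        this(DEFAULT_MAX_ITERATIONS, DEFAULT_PERCENT_TO_SAMPLE, new Random());
    }

    BaggingConfigSketch(int maxIterations, double percentToSample, Random random) {
        // The documentation says the value must be positive; rejecting it with
        // IllegalArgumentException is an assumption of this sketch.
        if (!(percentToSample > 0.0)) {
            throw new IllegalArgumentException("percentToSample must be positive");
        }
        this.maxIterations = maxIterations;
        this.percentToSample = percentToSample;
        this.random = random;
    }

    /** Assumed conversion from the fraction to a per-iteration sample count. */
    int sampleCountFor(int dataSize) {
        return Math.max(1, (int) (percentToSample * dataSize));
    }

    public static void main(String[] args) {
        BaggingConfigSketch config = new BaggingConfigSketch();
        System.out.println(config.maxIterations + " iterations, sample count for 50 examples: "
            + config.sampleCountFor(50));
    }
}
```

Because the fraction is not capped at 1.0 in this sketch, a value such as 2.0 draws a bag twice the size of the data set.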
Modifier and Type | Method and Description |
---|---|
protected abstract void |
addEnsembleMember(MemberType member)
Adds a new member to the ensemble.
|
protected void |
cleanupAlgorithm()
Called to clean up the learning algorithm's state after learning has
finished.
|
protected abstract EnsembleType |
createInitialEnsemble()
Create the initial, empty ensemble for the algorithm to use.
|
protected void |
fillBag(int sampleCount)
Fills the internal bag field by sampling the given number of samples.
|
java.util.ArrayList<InputOutputPair<? extends InputType,OutputType>> |
getBag()
Gets the most recently created bag.
|
int[] |
getDataInBag()
Gets the array of counts of the number of samples of each example in
the current bag.
|
java.util.ArrayList<? extends InputOutputPair<? extends InputType,OutputType>> |
getDataList()
Gets the data the learner is using as an array list.
|
EnsembleType |
getEnsemble()
Gets the ensemble created by this learner.
|
BatchLearner<? super java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,? extends MemberType> |
getLearner()
Gets the learner used to learn each ensemble member.
|
double |
getPercentToSample()
Gets the percentage of the total data to sample on each iteration.
|
java.util.Random |
getRandom()
Gets the random number generator used by this object.
|
EnsembleType |
getResult()
Gets the ensemble created by this learner.
|
protected boolean |
initializeAlgorithm()
Called to initialize the learning algorithm's state based on the
data that is stored in the data field.
|
protected void |
setBag(java.util.ArrayList<InputOutputPair<? extends InputType,OutputType>> bag)
Sets the most recently created bag.
|
protected void |
setDataInBag(int[] dataInBag)
Sets the array of counts of the number of samples of each example in
the current bag.
|
protected void |
setDataList(java.util.ArrayList<? extends InputOutputPair<? extends InputType,OutputType>> dataList)
Sets the data the learner is using as an array list.
|
protected void |
setEnsemble(EnsembleType ensemble)
Sets the ensemble created by this learner.
|
void |
setLearner(BatchLearner<? super java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,? extends MemberType> learner)
Sets the learner used to learn each ensemble member.
|
void |
setPercentToSample(double percentToSample)
Sets the percentage of the data to sample (with replacement) on each
iteration.
|
void |
setRandom(java.util.Random random)
Sets the random number generator used by this object.
|
protected boolean |
step()
Called to take a single step of the learning algorithm.
|
Methods inherited from superclasses:
clone, getData, getKeepGoing, learn, setData, setKeepGoing, stop
getMaxIterations, isResultValid, setMaxIterations
addIterativeAlgorithmListener, fireAlgorithmEnded, fireAlgorithmStarted, fireStepEnded, fireStepStarted, getIteration, getListeners, removeIterativeAlgorithmListener, setIteration, setListeners
Methods inherited from java.lang.Object:
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods declared by implemented interfaces:
learn
clone
getMaxIterations, setMaxIterations
addIterativeAlgorithmListener, getIteration, removeIterativeAlgorithmListener
isResultValid
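The method summary follows a template-method layout: the class drives the loop through `initializeAlgorithm()`, `step()`, and `cleanupAlgorithm()`, while subclasses supply `createInitialEnsemble()` and `addEnsembleMember(...)`. The sketch below mimics that layout in a self-contained form. `BaggingTemplateSketch`, `MeanBaggingSketch`, and `learnMember` are hypothetical names, and the meaning given to the boolean return values (true means "continue") is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in mirroring the hook structure of the method summary. */
abstract class BaggingTemplateSketch<MemberType, EnsembleType> {
    protected final List<Double> data;
    protected final int maxIterations;
    protected final Random random;
    protected EnsembleType ensemble;
    protected int iteration;

    BaggingTemplateSketch(List<Double> data, int maxIterations, Random random) {
        this.data = data;
        this.maxIterations = maxIterations;
        this.random = random;
    }

    // Hooks that a concrete subclass fills in.
    protected abstract EnsembleType createInitialEnsemble();
    protected abstract void addEnsembleMember(MemberType member);
    protected abstract MemberType learnMember(List<Double> bag); // stands in for the inner learner

    /** Returns false when there is nothing to learn from (assumed semantics). */
    protected boolean initializeAlgorithm() {
        if (data.isEmpty()) return false;
        ensemble = createInitialEnsemble();
        iteration = 0;
        return true;
    }

    /** One iteration: draw a bag, learn a member, add it to the ensemble. */
    protected boolean step() {
        List<Double> bag = new ArrayList<>(data.size());
        for (int i = 0; i < data.size(); i++) {
            bag.add(data.get(random.nextInt(data.size())));
        }
        addEnsembleMember(learnMember(bag));
        iteration++;
        return true; // true = keep going (assumed semantics)
    }

    EnsembleType learn() {
        if (!initializeAlgorithm()) return null;
        while (iteration < maxIterations && step()) {
            // each step adds exactly one ensemble member
        }
        return ensemble;
    }
}

/** Concrete subclass: members are bag means, the ensemble is a list of them. */
class MeanBaggingSketch extends BaggingTemplateSketch<Double, List<Double>> {
    MeanBaggingSketch(List<Double> data, int maxIterations, Random random) {
        super(data, maxIterations, random);
    }

    @Override protected List<Double> createInitialEnsemble() { return new ArrayList<>(); }

    @Override protected void addEnsembleMember(Double member) { ensemble.add(member); }

    @Override protected Double learnMember(List<Double> bag) {
        double sum = 0.0;
        for (double value : bag) sum += value;
        return sum / bag.size();
    }

    public static void main(String[] args) {
        List<Double> members =
            new MeanBaggingSketch(Arrays.asList(1.0, 2.0, 3.0, 4.0), 10, new Random(3L)).learn();
        System.out.println(members.size() + " members learned");
    }
}
```

Keeping ensemble construction behind two abstract hooks lets the sampling loop stay identical for every kind of ensemble, which appears to be the reason `MemberType` and `EnsembleType` are separate type parameters.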
public static final int DEFAULT_MAX_ITERATIONS
public static final double DEFAULT_PERCENT_TO_SAMPLE
protected BatchLearner<? super java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,? extends MemberType> learner
protected double percentToSample
protected java.util.Random random
protected transient EnsembleType ensemble
protected transient java.util.ArrayList<? extends InputOutputPair<? extends InputType,OutputType>> dataList
protected transient int[] dataInBag
protected transient java.util.ArrayList<InputOutputPair<? extends InputType,OutputType>> bag
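The declarations above mark the working state (`ensemble`, `dataList`, `dataInBag`, `bag`) as `transient`, while the configuration (`learner`, `percentToSample`, `random`) is not. Assuming the learner is written out with standard Java serialization, transient fields are skipped and come back as `null` after a round trip. The sketch below demonstrates that language behavior with a hypothetical `TransientStateSketch` class; it does not use the library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

/** Hypothetical class with one persistent setting and two transient working fields. */
class TransientStateSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    double percentToSample = 1.0;                         // configuration: serialized
    transient ArrayList<String> bag = new ArrayList<>();  // working state: skipped
    transient int[] dataInBag = {2, 0, 1};                // working state: skipped

    /** Serializes the object to memory and reads it back. */
    static TransientStateSketch roundTrip(TransientStateSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TransientStateSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        TransientStateSketch original = new TransientStateSketch();
        original.percentToSample = 0.5;
        TransientStateSketch copy = roundTrip(original);
        System.out.println("percentToSample kept: " + copy.percentToSample);
        System.out.println("bag after round trip: " + copy.bag);
    }
}
```

Field initializers do not run during deserialization, so code that reads transient state from a restored object has to rebuild it first, for example by running the learner again.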
public AbstractBaggingLearner()
public AbstractBaggingLearner(BatchLearner<? super java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,? extends MemberType> learner)
learner
- The learner to use to create the ensemble member on each iteration.
public AbstractBaggingLearner(BatchLearner<? super java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,? extends MemberType> learner, int maxIterations, double percentToSample, java.util.Random random)
learner
- The learner to use to create the ensemble member on each iteration.
maxIterations
- The maximum number of iterations to run for, which is also the
number of learners to create.
percentToSample
- The percentage of the total size of the data to sample on each
iteration. Must be positive.
random
- The random number generator to use.
protected boolean initializeAlgorithm()
initializeAlgorithm in class AbstractAnytimeBatchLearner<java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,EnsembleType>
protected boolean step()
step in class AbstractAnytimeBatchLearner<java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,EnsembleType>
protected abstract EnsembleType createInitialEnsemble()
protected abstract void addEnsembleMember(MemberType member)
member
- The new member to add to the ensemble.
protected void fillBag(int sampleCount)
sampleCount
- The number of samples to draw.
protected void cleanupAlgorithm()
cleanupAlgorithm in class AbstractAnytimeBatchLearner<java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,EnsembleType>
public EnsembleType getResult()
getResult in interface AnytimeAlgorithm<EnsembleType>
public BatchLearner<? super java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,? extends MemberType> getLearner()
getLearner in interface BatchLearnerContainer<BatchLearner<? super java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,? extends MemberType>>
public void setLearner(BatchLearner<? super java.util.Collection<? extends InputOutputPair<? extends InputType,OutputType>>,? extends MemberType> learner)
learner
- The learner used for each ensemble member.
public double getPercentToSample()
public void setPercentToSample(double percentToSample)
percentToSample
- The percent of the data to sample on each iteration. Must be greater
than zero. Defaults to 1.0 (100%).
public java.util.Random getRandom()
getRandom in interface Randomized
public void setRandom(java.util.Random random)
setRandom in interface Randomized
random
- The random number generator for this object to use.
public EnsembleType getEnsemble()
protected void setEnsemble(EnsembleType ensemble)
ensemble
- The ensemble created by this learner.
public java.util.ArrayList<? extends InputOutputPair<? extends InputType,OutputType>> getDataList()
protected void setDataList(java.util.ArrayList<? extends InputOutputPair<? extends InputType,OutputType>> dataList)
dataList
- The data as an array list.
public int[] getDataInBag()
protected void setDataInBag(int[] dataInBag)
dataInBag
- The bag counts.
public java.util.ArrayList<InputOutputPair<? extends InputType,OutputType>> getBag()
protected void setBag(java.util.ArrayList<InputOutputPair<? extends InputType,OutputType>> bag)
bag
- The most recently created bag.
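One common use of per-example bag counts such as those returned by `getDataInBag()` is to find the examples that the most recent bag left out, which can then serve as held-out data for the member trained on that bag. The sketch below shows only the index bookkeeping. `OutOfBagSketch` and `outOfBagIndices` are hypothetical names, and nothing here is confirmed to be how the library or its subclasses use the counts.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists the examples absent from the current bag. */
class OutOfBagSketch {

    /** Returns the indices whose bag count is zero, in ascending order. */
    static List<Integer> outOfBagIndices(int[] dataInBag) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < dataInBag.length; i++) {
            if (dataInBag[i] == 0) {
                indices.add(i);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        // Counts for four examples: the second and fourth were never drawn.
        int[] dataInBag = {2, 0, 1, 0};
        System.out.println("out-of-bag indices: " + outOfBagIndices(dataInBag));
    }
}
```

The returned indices refer to positions in the data list, so they line up with `getDataList()` as long as the counts were produced for that same list.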