DataType
- The type of the data to cluster. This is typically defined by the divergence function used.
ClusterType
- The type of Cluster created by the algorithm. This is typically defined by the cluster creator function used.
@CodeReview(reviewer="Kevin R. Dixon", date="2008-07-22", changesNeeded=false, comments={"Made setRemovalThreshold check to ensure removalThreshold is < 1.0","Cleaned up javadoc.","Code generally looks fine."})
public class KMeansClustererWithRemoval<DataType,ClusterType extends Cluster<DataType>>
extends KMeansClusterer<DataType,ClusterType>
assignments, clusterCounts, clusters, DEFAULT_MAX_ITERATIONS, DEFAULT_NUM_REQUESTED_CLUSTERS, divergenceFunction, initializer, numRequestedClusters
data, keepGoing
maxIterations
DEFAULT_ITERATION, iteration
Constructor and Description |
---|
KMeansClustererWithRemoval()
Default constructor
|
KMeansClustererWithRemoval(int numRequestedClusters,
int maxIterations,
FixedClusterInitializer<ClusterType,DataType> initializer,
ClusterDivergenceFunction<ClusterType,DataType> divergenceFunction,
ClusterCreator<ClusterType,DataType> creator,
double removalThreshold)
Creates a new instance of KMeansClustererWithRemoval using the given parameters.
|
Modifier and Type | Method and Description |
---|---|
double |
getRemovalThreshold()
Getter for removalThreshold
|
protected void |
removeCluster(int clusterIndex)
Removes the cluster at the specified index and updates the internal
bookkeeping accordingly.
|
void |
setRemovalThreshold(double removalThreshold)
Setter for removalThreshold
|
protected boolean |
step()
Do a step of the clustering algorithm.
|
assignDataFromIndices, assignDataToClusters, cleanupAlgorithm, clone, createClustersFromAssignments, getAssignments, getClosestClusterIndex, getCluster, getClusterCounts, getClusters, getCreator, getDivergenceFunction, getInitializer, getNumChanged, getNumClusters, getNumElements, getNumRequestedClusters, getPerformance, getResult, initializeAlgorithm, setAssignment, setClusters, setCreator, setData, setDivergenceFunction, setInitializer, setNumChanged, setNumRequestedClusters
getData, getKeepGoing, learn, setKeepGoing, stop
getMaxIterations, isResultValid, setMaxIterations
addIterativeAlgorithmListener, fireAlgorithmEnded, fireAlgorithmStarted, fireStepEnded, fireStepStarted, getIteration, getListeners, removeIterativeAlgorithmListener, setIteration, setListeners
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
learn
getMaxIterations, setMaxIterations
addIterativeAlgorithmListener, getIteration, removeIterativeAlgorithmListener
isResultValid
public KMeansClustererWithRemoval()
public KMeansClustererWithRemoval(int numRequestedClusters, int maxIterations, FixedClusterInitializer<ClusterType,DataType> initializer, ClusterDivergenceFunction<ClusterType,DataType> divergenceFunction, ClusterCreator<ClusterType,DataType> creator, double removalThreshold)
numRequestedClusters
- The number of clusters requested (k).
maxIterations
- Number of iterations before stopping.
initializer
- The initializer for the clusters.
divergenceFunction
- The divergence function.
creator
- The cluster creator.
removalThreshold
- Fraction of the expected number of data points
assigned to a cluster below which the cluster will be removed. (Suppose
there are 1000 data points, 10 clusters, and removalThreshold=0.1. A
cluster may be removed only if it has fewer than
0.1*1000/10 = 10 elements assigned to it.)
public double getRemovalThreshold()
public void setRemovalThreshold(double removalThreshold)
removalThreshold
- Fraction of the expected number of data points
assigned to a cluster below which the cluster will be removed. (Suppose
there are 1000 data points, 10 clusters, and removalThreshold=0.1. A
cluster may be removed only if it has fewer than
0.1*1000/10 = 10 elements assigned to it.) Must be less than 1.0.
protected void removeCluster(int clusterIndex)
clusterIndex
- Zero-based index of the cluster to remove.
protected boolean step()
KMeansClusterer
step
in class KMeansClusterer<DataType,ClusterType extends Cluster<DataType>>
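The removalThreshold arithmetic described for the constructor and for setRemovalThreshold can be sketched in standalone Java. This is an illustration only and not the library's implementation: the class RemovalThresholdSketch and its methods minimumMembership and removalCandidates are hypothetical names introduced here. Whether the real step() removes every such cluster in a single pass is not stated in this documentation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the documented API.
public class RemovalThresholdSketch {

    // Minimum membership = removalThreshold * (numElements / numClusters),
    // i.e. a fraction of the expected number of points per cluster.
    public static double minimumMembership(
            double removalThreshold, int numElements, int numClusters) {
        // Mirrors the documented constraint "Must be less than 1.0".
        if (removalThreshold >= 1.0) {
            throw new IllegalArgumentException(
                "removalThreshold must be less than 1.0");
        }
        if (numClusters <= 0) {
            throw new IllegalArgumentException(
                "numClusters must be positive");
        }
        return removalThreshold * numElements / numClusters;
    }

    // Returns the zero-based indices of clusters whose membership count
    // falls below the minimum and which are therefore eligible for removal.
    public static List<Integer> removalCandidates(
            int[] clusterCounts, double removalThreshold) {
        int numElements = 0;
        for (int count : clusterCounts) {
            numElements += count;
        }
        double minimum = minimumMembership(
            removalThreshold, numElements, clusterCounts.length);

        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < clusterCounts.length; i++) {
            if (clusterCounts[i] < minimum) {
                candidates.add(i);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        // The example from the parameter description:
        // 1000 data points, 10 clusters, removalThreshold = 0.1.
        System.out.println(minimumMembership(0.1, 1000, 10));

        // Made-up membership counts that sum to 1000 across 10 clusters.
        int[] counts = {300, 200, 150, 120, 100, 80, 36, 9, 3, 2};
        System.out.println(removalCandidates(counts, 0.1));
    }
}
```

With the made-up counts above, only clusters holding fewer than 10 points are reported as candidates. A cluster with exactly 10 points would be kept, because the comparison is strict.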