DataType - The type of the data to cluster. This is typically defined by the divergence function used.
ClusterType - The type of Cluster created by the algorithm. This is typically defined by the cluster creator function used.
@PublicationReference(author="Halil Bisgin", title="Parallel Clustering Algorithms with Application to Climatology", type=Thesis, year=2007, url="http://www.halilbisgin.com/thesis/thesis.pdf")
public class ParallelizedKMeansClusterer<DataType,ClusterType extends Cluster<DataType>>
extends KMeansClusterer<DataType,ClusterType>
implements ParallelAlgorithm
Modifier and Type | Class and Description |
---|---|
protected class | ParallelizedKMeansClusterer.AssignDataToCluster: Callable task for the evaluate() method. |
protected class | ParallelizedKMeansClusterer.CreateClustersFromAssignments: Callable task that creates clusters from assigned data. |
assignments, clusterCounts, clusters, DEFAULT_MAX_ITERATIONS, DEFAULT_NUM_REQUESTED_CLUSTERS, divergenceFunction, initializer, numRequestedClusters
data, keepGoing
maxIterations
DEFAULT_ITERATION, iteration
Constructor and Description |
---|
ParallelizedKMeansClusterer(): Default constructor. |
ParallelizedKMeansClusterer(int numRequestedClusters, int maxIterations, java.util.concurrent.ThreadPoolExecutor threadPool, FixedClusterInitializer<ClusterType,DataType> initializer, ClusterDivergenceFunction<? super ClusterType,? super DataType> divergenceFunction, ClusterCreator<ClusterType,DataType> creator): Creates a new instance of ParallelizedKMeansClusterer. |
Modifier and Type | Method and Description |
---|---|
protected int[] | assignDataToClusters(java.util.Collection<? extends DataType> data): Creates the cluster assignments given the current locations of clusters. |
ParallelizedKMeansClusterer<DataType,ClusterType> | clone(): This makes public the clone method on the Object class and removes the exception that it throws. |
protected void | createAssignmentTasks(): Creates the assignment tasks given the number of threads requested. |
protected void | createClustersFromAssignments(): Creates the set of clusters using the current cluster assignments. |
int | getNumThreads(): Gets the number of threads in the thread pool. |
java.util.concurrent.ThreadPoolExecutor | getThreadPool(): Gets the thread pool for the algorithm to use. |
protected boolean | initializeAlgorithm(): Called to initialize the learning algorithm's state based on the data that is stored in the data field. |
void | setThreadPool(java.util.concurrent.ThreadPoolExecutor threadPool): Sets the thread pool for the algorithm to use. |
assignDataFromIndices, cleanupAlgorithm, getAssignments, getClosestClusterIndex, getCluster, getClusterCounts, getClusters, getCreator, getDivergenceFunction, getInitializer, getNumChanged, getNumClusters, getNumElements, getNumRequestedClusters, getPerformance, getResult, setAssignment, setClusters, setCreator, setData, setDivergenceFunction, setInitializer, setNumChanged, setNumRequestedClusters, step
getData, getKeepGoing, learn, setKeepGoing, stop
getMaxIterations, isResultValid, setMaxIterations
addIterativeAlgorithmListener, fireAlgorithmEnded, fireAlgorithmStarted, fireStepEnded, fireStepStarted, getIteration, getListeners, removeIterativeAlgorithmListener, setIteration, setListeners
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
learn
getMaxIterations, setMaxIterations
addIterativeAlgorithmListener, getIteration, removeIterativeAlgorithmListener
isResultValid
public ParallelizedKMeansClusterer()
public ParallelizedKMeansClusterer(int numRequestedClusters, int maxIterations, java.util.concurrent.ThreadPoolExecutor threadPool, FixedClusterInitializer<ClusterType,DataType> initializer, ClusterDivergenceFunction<? super ClusterType,? super DataType> divergenceFunction, ClusterCreator<ClusterType,DataType> creator)
numRequestedClusters - The number of clusters requested (k).
maxIterations - Maximum number of iterations before stopping.
threadPool - Thread pool to use for parallelization.
initializer - The initializer for the clusters.
divergenceFunction - The divergence function.
creator - The cluster creator.
public ParallelizedKMeansClusterer<DataType,ClusterType> clone()
AbstractCloneableSerializable
This makes public the clone method on the Object class and removes the exception that it throws. Its default behavior is to automatically create a clone of the exact type of object that the clone is called on and to copy all primitives but to keep all references, which means it is a shallow copy.
Extensions of this class may want to override this method (but call super.clone()) to implement a "smart copy", that is, to target the most common use case for creating a copy of the object. Because the default behavior is a shallow copy, extending classes only need to handle fields that need a deeper copy (or those that need to be reset). Some of the methods in ObjectUtil may be helpful in implementing a custom clone method.
Note: The contract of this method is that you must use super.clone() as the basis for your implementation.
clone
in interface CloneableSerializable
clone
in class KMeansClusterer<DataType,ClusterType extends Cluster<DataType>>
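The shallow-copy contract described for clone() can be illustrated with a small, self-contained sketch. It uses plain java.lang.Cloneable rather than the library's AbstractCloneableSerializable, and the class CloneSketch and its fields are hypothetical names chosen only for illustration.

```java
import java.util.ArrayList;

// Hypothetical class, not part of the library: shows a "smart copy" built on super.clone().
class CloneSketch implements Cloneable {
    int maxIterations = 100;                       // primitive: copied by value
    ArrayList<String> labels = new ArrayList<>();  // reference: shared unless copied explicitly

    @Override
    public CloneSketch clone() {
        try {
            // super.clone() is the basis: it yields a shallow copy of the exact runtime type.
            CloneSketch copy = (CloneSketch) super.clone();
            // Only fields that need a deeper copy are handled here.
            copy.labels = new ArrayList<>(this.labels);
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen because this class implements Cloneable.
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        CloneSketch original = new CloneSketch();
        original.labels.add("a");
        CloneSketch copy = original.clone();
        copy.labels.add("b");
        // The original list is unaffected by changes to the copy.
        System.out.println(original.labels.size() + " " + copy.labels.size()); // prints 1 2
    }
}
```

Without the explicit copy of labels, both objects would share one list, which is the shallow-copy behavior the description warns about.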
public java.util.concurrent.ThreadPoolExecutor getThreadPool()
ParallelAlgorithm
getThreadPool
in interface ParallelAlgorithm
public void setThreadPool(java.util.concurrent.ThreadPoolExecutor threadPool)
ParallelAlgorithm
setThreadPool
in interface ParallelAlgorithm
threadPool - Thread pool used for parallelization.
public int getNumThreads()
ParallelAlgorithm
getNumThreads
in interface ParallelAlgorithm
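As a minimal sketch of the kind of pool that setThreadPool accepts, the following builds a fixed-size java.util.concurrent.ThreadPoolExecutor using only the standard library. The helper name fixedPool is hypothetical, and how getNumThreads derives its value from the pool is not stated here; the sketch simply reads the pool's maximum size.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, not part of the library.
final class ThreadPoolSketch {

    /** Builds a pool whose core and maximum sizes are both numThreads. */
    static ThreadPoolExecutor fixedPool(int numThreads) {
        return new ThreadPoolExecutor(
            numThreads, numThreads,          // core and maximum pool size
            0L, TimeUnit.MILLISECONDS,       // keep-alive time for idle extra threads
            new LinkedBlockingQueue<Runnable>());
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = fixedPool(4);
        try {
            // With the library on the classpath, the pool could be handed to a clusterer:
            // clusterer.setThreadPool(pool);  // clusterer is an assumed, pre-built instance
            System.out.println(pool.getMaximumPoolSize()); // prints 4
        } finally {
            pool.shutdown(); // release the threads once clustering is finished
        }
    }
}
```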
protected void createAssignmentTasks()
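The idea of creating one assignment task per thread can be sketched with the standard library alone. This is not the library's implementation: it assumes points are double[] vectors compared by squared Euclidean distance, and the names ParallelAssignmentSketch, closest, and assign are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of the parallel assignment step of k-means.
final class ParallelAssignmentSketch {

    /** Returns the index of the centroid closest to the point (squared Euclidean distance). */
    static int closest(double[] point, double[][] centroids) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0.0;
            for (int d = 0; d < point.length; d++) {
                double diff = point[d] - centroids[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    /** Splits the data into up to numTasks contiguous chunks and assigns each chunk in its own task. */
    static int[] assign(double[][] data, double[][] centroids, int numTasks, ExecutorService pool)
            throws Exception {
        int[] assignments = new int[data.length];
        int chunk = (data.length + numTasks - 1) / numTasks; // ceiling division
        List<Callable<Void>> tasks = new ArrayList<>();
        for (int start = 0; start < data.length; start += chunk) {
            final int from = start;
            final int to = Math.min(start + chunk, data.length);
            tasks.add(() -> {
                // Each task writes only to its own slice, so no locking is needed.
                for (int i = from; i < to; i++) {
                    assignments[i] = closest(data[i], centroids);
                }
                return null;
            });
        }
        // invokeAll blocks until every task is done; get() surfaces any task failure.
        for (Future<Void> f : pool.invokeAll(tasks)) {
            f.get();
        }
        return assignments;
    }

    public static void main(String[] args) throws Exception {
        double[][] data = {{0, 0}, {1, 0}, {9, 9}, {10, 10}};
        double[][] centroids = {{0, 0}, {10, 10}};
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(Arrays.toString(assign(data, centroids, 2, pool))); // prints [0, 0, 1, 1]
        } finally {
            pool.shutdown();
        }
    }
}
```

Giving each task a disjoint index range keeps the shared assignments array free of write conflicts, which is one common way to parallelize this step.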
protected boolean initializeAlgorithm()
AbstractAnytimeBatchLearner
initializeAlgorithm
in class KMeansClusterer<DataType,ClusterType extends Cluster<DataType>>
protected int[] assignDataToClusters(java.util.Collection<? extends DataType> data)
KMeansClusterer
assignDataToClusters
in class KMeansClusterer<DataType,ClusterType extends Cluster<DataType>>
data - Data to assign.
protected void createClustersFromAssignments()
KMeansClusterer
createClustersFromAssignments
in class KMeansClusterer<DataType,ClusterType extends Cluster<DataType>>
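Creating clusters from the current assignments can likewise be sketched as one task per cluster. The sketch below assumes each cluster is summarized by the mean of its assigned double[] points; the library itself delegates this to a ClusterCreator, and the names ParallelCentroidSketch, mean, and centroids are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of rebuilding clusters in parallel from assignments.
final class ParallelCentroidSketch {

    /** Computes the mean of all points assigned to the given cluster, or null if it is empty. */
    static double[] mean(double[][] data, int[] assignments, int cluster) {
        double[] sum = null;
        int count = 0;
        for (int i = 0; i < data.length; i++) {
            if (assignments[i] != cluster) {
                continue;
            }
            if (sum == null) {
                sum = new double[data[i].length];
            }
            for (int d = 0; d < sum.length; d++) {
                sum[d] += data[i][d];
            }
            count++;
        }
        if (sum == null) {
            return null;
        }
        for (int d = 0; d < sum.length; d++) {
            sum[d] /= count;
        }
        return sum;
    }

    /** Runs one task per cluster; results come back in the same order as the tasks. */
    static double[][] centroids(double[][] data, int[] assignments, int k, ExecutorService pool)
            throws Exception {
        List<Callable<double[]>> tasks = new ArrayList<>();
        for (int c = 0; c < k; c++) {
            final int cluster = c;
            tasks.add(() -> mean(data, assignments, cluster));
        }
        List<Future<double[]>> results = pool.invokeAll(tasks);
        double[][] out = new double[k][];
        for (int c = 0; c < k; c++) {
            out[c] = results.get(c).get();
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        double[][] data = {{0, 0}, {2, 0}, {10, 10}};
        int[] assignments = {0, 0, 1};
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // prints [[1.0, 0.0], [10.0, 10.0]]
            System.out.println(Arrays.deepToString(centroids(data, assignments, 2, pool)));
        } finally {
            pool.shutdown();
        }
    }
}
```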