ClusterType - Type of Cluster<DataType> used in the learn() method.
DataType - The algorithm operates on a Collection<DataType>, so DataType will be something like Vector or String.

@PublicationReference(author={"David Arthur","Sergei Vassilvitskii"}, title="k-means++: the advantages of careful seeding", year=2007, type=Conference, publication="Proceedings of the eighteenth annual ACM-SIAM Symposium on Discrete algorithms (SODA)", url="http://portal.acm.org/citation.cfm?id=1283383.1283494")
public class DistanceSamplingClusterInitializer<ClusterType extends Cluster<DataType>,DataType>
extends AbstractMinDistanceFixedClusterInitializer<ClusterType,DataType>
A FixedClusterInitializer that initializes clusters by first selecting a random point for the first cluster and then randomly sampling each successive cluster with probability weighted by the squared minimum distance from each point to the clusters already selected. This is also known as the K-means++ initialization algorithm.

Inherited fields: creator, random, divergenceFunction
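
The seeding procedure described above can be sketched without the library. The following is a minimal illustration, assuming points are plain double[] arrays compared by squared Euclidean distance; the class KMeansPlusPlusSeedingSketch and its methods are hypothetical names and are not part of this API, which works through a DivergenceFunction and a ClusterCreator instead.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical, library-independent sketch of K-means++ seeding on double[] points. */
final class KMeansPlusPlusSeedingSketch {

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    /** Returns the indices of k distinct points chosen as initial cluster centers. */
    static int[] chooseSeeds(double[][] points, int k, Random random) {
        int n = points.length;
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        int[] seeds = new int[k];
        boolean[] selected = new boolean[n];
        double[] minSquared = new double[n];
        Arrays.fill(minSquared, Double.POSITIVE_INFINITY);

        // The first center is drawn uniformly at random.
        int next = random.nextInt(n);
        for (int c = 0; c < k; c++) {
            seeds[c] = next;
            selected[next] = true;

            // Refresh each point's squared distance to its nearest chosen center.
            double total = 0.0;
            for (int i = 0; i < n; i++) {
                minSquared[i] = Math.min(minSquared[i], squaredDistance(points[i], points[next]));
                if (!selected[i]) {
                    total += minSquared[i];
                }
            }
            if (c + 1 == k) {
                break;
            }

            // Draw the next center with probability proportional to minSquared.
            double target = random.nextDouble() * total;
            for (int i = 0; i < n; i++) {
                if (selected[i]) {
                    continue;
                }
                next = i; // remains the last unselected index if rounding exhausts the loop
                target -= minSquared[i];
                if (target < 0.0) {
                    break;
                }
            }
        }
        return seeds;
    }
}
```

Points that coincide with an already chosen center carry zero weight, so they are only picked when nothing else remains.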
Constructor and Description |
---|
DistanceSamplingClusterInitializer() Creates a new, empty instance of DistanceSamplingClusterInitializer. |
DistanceSamplingClusterInitializer(DivergenceFunction<? super DataType,? super DataType> divergenceFunction, ClusterCreator<ClusterType,DataType> creator, java.util.Random random) Creates a new instance of DistanceSamplingClusterInitializer. |
Modifier and Type | Method and Description |
---|---|
DistanceSamplingClusterInitializer<ClusterType,DataType> | clone() This makes public the clone method on the Object class and removes the exception that it throws. |
protected int | selectNextClusterIndex(double[] minDistances, boolean[] selected) Select the index for the next cluster based on the given minimum distances and array indicating which clusters have already been selected. |
Inherited methods: getCreator, getRandom, initializeClusters, setCreator, setRandom, getDivergenceFunction, setDivergenceFunction

public DistanceSamplingClusterInitializer()
Creates a new, empty instance of DistanceSamplingClusterInitializer.

public DistanceSamplingClusterInitializer(DivergenceFunction<? super DataType,? super DataType> divergenceFunction, ClusterCreator<ClusterType,DataType> creator, java.util.Random random)
Creates a new instance of DistanceSamplingClusterInitializer.
divergenceFunction - The divergence function to use.
creator - The cluster creator to use.
random - The random number generator to use.

public DistanceSamplingClusterInitializer<ClusterType,DataType> clone()
Description copied from class: AbstractCloneableSerializable
This makes public the clone method on the Object class and removes the exception that it throws. Its default behavior is to automatically create a clone of the exact type of object that the clone is called on and to copy all primitives but to keep all references, which means it is a shallow copy.
Extensions of this class may want to override this method (but call super.clone()) to implement a "smart copy", that is, to target the most common use case for creating a copy of the object. Because the default behavior is a shallow copy, extending classes only need to handle fields that need a deeper copy (or those that need to be reset). Some of the methods in ObjectUtil may be helpful in implementing a custom clone method.
Note: The contract of this method is that you must use super.clone() as the basis for your implementation.
Specified by: clone in interface CloneableSerializable
Overrides: clone in class AbstractMinDistanceFixedClusterInitializer<ClusterType extends Cluster<DataType>,DataType>
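
The "smart copy" contract described above can be illustrated with plain java.lang.Cloneable. This is a minimal sketch only: CloneableBaseSketch and CenterListSketch are hypothetical classes invented for the example, and the base class merely imitates the behavior this page attributes to AbstractCloneableSerializable (a public clone() with no checked exception and a shallow copy by default).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical base class: makes clone() public and hides the checked exception. */
class CloneableBaseSketch implements Cloneable {
    @Override
    public CloneableBaseSketch clone() {
        try {
            // Object.clone() creates an instance of the runtime type and copies
            // primitives and references as-is (a shallow copy).
            return (CloneableBaseSketch) super.clone();
        } catch (CloneNotSupportedException e) {
            // Cannot happen because this class implements Cloneable.
            throw new RuntimeException(e);
        }
    }
}

/** Hypothetical subclass with one primitive field and one mutable reference field. */
class CenterListSketch extends CloneableBaseSketch {
    int maxClusters = 3;
    List<double[]> centers = new ArrayList<>();

    @Override
    public CenterListSketch clone() {
        // Start from super.clone(), as the contract requires.
        CenterListSketch copy = (CenterListSketch) super.clone();
        // Only the mutable reference field needs a deeper copy;
        // maxClusters was already copied by the shallow clone.
        copy.centers = new ArrayList<>();
        for (double[] center : this.centers) {
            copy.centers.add(center.clone());
        }
        return copy;
    }
}
```

After CenterListSketch.clone(), changes to the copy's centers do not affect the original, while a field that was not deep-copied would still be shared between the two objects.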

protected int selectNextClusterIndex(double[] minDistances, boolean[] selected)
Description copied from class: AbstractMinDistanceFixedClusterInitializer
Select the index for the next cluster based on the given minimum distances and array indicating which clusters have already been selected.
Specified by: selectNextClusterIndex in class AbstractMinDistanceFixedClusterInitializer<ClusterType extends Cluster<DataType>,DataType>
minDistances - The array of minimum distances.
selected - The array corresponding to whether or not an item has already been selected.
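
One way such a selection step could work is sketched below. It assumes that minDistances holds unsquared, non-negative distances and that each unselected index is drawn with probability proportional to its squared distance. DistanceSamplingSketch is a hypothetical class, and the Random argument is passed explicitly here only to keep the example self-contained; the documented method takes just the two arrays. This is not the library's implementation.

```java
import java.util.Random;

/** Hypothetical stand-alone sketch of distance-weighted index sampling. */
final class DistanceSamplingSketch {

    /** Returns an unselected index, or -1 if every index is already selected. */
    static int selectNextClusterIndex(double[] minDistances, boolean[] selected, Random random) {
        int n = minDistances.length;
        double[] cumulative = new double[n];
        double total = 0.0;
        int firstUnselected = -1;

        // Running totals of squared distances; selected items contribute nothing.
        for (int i = 0; i < n; i++) {
            if (!selected[i]) {
                total += minDistances[i] * minDistances[i];
                if (firstUnselected < 0) {
                    firstUnselected = i;
                }
            }
            cumulative[i] = total;
        }

        // All remaining weights are zero: fall back to the first unselected index.
        if (total <= 0.0) {
            return firstUnselected;
        }

        // Pick the first positive-weight index whose running total exceeds the target.
        double target = random.nextDouble() * total;
        int chosen = firstUnselected;
        for (int i = 0; i < n; i++) {
            if (selected[i] || minDistances[i] == 0.0) {
                continue;
            }
            chosen = i;
            if (cumulative[i] > target) {
                return i;
            }
        }
        return chosen; // reached only if floating-point rounding exhausts the loop
    }
}
```

For example, with minDistances {1.0, 3.0} and nothing selected, the weights are 1 and 9, so index 1 is returned roughly nine times out of ten.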