DataType - The type of data the algorithm is to cluster, which it
passes to the divergence function. For example, this could be
Vector or String.
@CodeReview(reviewer="Kevin R. Dixon", date="2008-07-22", changesNeeded=false, comments={"Removed transient declaration on members.","Fixed a few typos in javadoc.","Added PublicationReference annotation.","Added comment about use of direct-member access.","Code generally looked fine."})
@PublicationReference(author={"Brendan J. Frey","Delbert Dueck"}, title="Clustering by Passing Messages Between Data Points.", type=Journal, publication="Science", notes="Volume 315, number 5814", pages={972,976}, year=2007)
public class AffinityPropagation<DataType>
extends AbstractAnytimeBatchLearner<java.util.Collection<? extends DataType>,java.util.Collection<CentroidCluster<DataType>>>
implements BatchClusterer<DataType,CentroidCluster<DataType>>, MeasurablePerformanceAlgorithm, DivergenceFunctionContainer<DataType,DataType>
The AffinityPropagation algorithm requires three parameters:
a divergence function, a value to use for self-divergence, and a damping
factor (called lambda in the paper; 0.5 is the default). It clusters by
passing messages between points to determine the best exemplar for each
point.
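As a rough illustration of how the first two parameters might fit together, the sketch below builds a similarity matrix by negating each pairwise divergence and placing the negated self-divergence on the diagonal. This mirrors the "preference" convention of the Frey and Dueck paper and is an assumption about the setup, not a copy of this class's code. The names `SimilaritySketch` and `buildSimilarities` are hypothetical, and a plain `ToDoubleBiFunction` stands in for `DivergenceFunction`.

```java
import java.util.List;
import java.util.function.ToDoubleBiFunction;

/** Hypothetical helper for illustration only; not part of the library. */
final class SimilaritySketch
{
    /**
     * Builds an n-by-n similarity matrix. Off-diagonal entries are the
     * negated divergence between two examples; diagonal entries are the
     * negated self-divergence (an assumed convention).
     */
    static <T> double[][] buildSimilarities(
        final List<T> examples,
        final ToDoubleBiFunction<? super T, ? super T> divergence,
        final double selfDivergence)
    {
        final int n = examples.size();
        final double[][] similarities = new double[n][n];
        for (int i = 0; i < n; i++)
        {
            for (int j = 0; j < n; j++)
            {
                similarities[i][j] = (i == j)
                    ? -selfDivergence
                    : -divergence.applyAsDouble(examples.get(i), examples.get(j));
            }
        }
        return similarities;
    }
}
```

Under this convention, a larger self-divergence makes every point less willing to serve as its own exemplar, which is consistent with the note that the self-divergence value controls how many clusters are generated.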
| Modifier and Type | Field and Description |
|---|---|
| protected int[] | assignments: The assignments of each example to an exemplar (cluster). |
| protected double[][] | availabilities: The array of example-example availabilities. |
| protected int | changedCount: The number of examples that have changed assignments in the last iteration. |
| protected java.util.HashMap<java.lang.Integer,CentroidCluster<DataType>> | clusters: The clusters that have been found so far. |
| protected double | dampingFactor: The damping factor (lambda). |
| static double | DEFAULT_DAMPING_FACTOR: The default damping factor (lambda) is 0.5. |
| static int | DEFAULT_MAX_ITERATIONS: The default maximum number of iterations is 100. |
| static double | DEFAULT_SELF_DIVERGENCE: The default self similarity is 0.0. |
| protected DivergenceFunction<? super DataType,? super DataType> | divergence: The divergence function to use. |
| protected int | exampleCount: The number of examples. |
| protected java.util.ArrayList<DataType> | examples: The examples. |
| protected double | oneMinusDampingFactor: The cached value of one minus the damping factor. |
| protected double[][] | responsibilities: The array of example-example responsibilities. |
| protected double[][] | similarities: The array of example-example similarities. |

Inherited fields: data, keepGoing, maxIterations, DEFAULT_ITERATION, iteration

| Constructor and Description |
|---|
| AffinityPropagation(): Creates a new instance of AffinityPropagation. |
| AffinityPropagation(DivergenceFunction<? super DataType,? super DataType> divergence, double selfDivergence): Creates a new instance of AffinityPropagation. |
| AffinityPropagation(DivergenceFunction<? super DataType,? super DataType> divergence, double selfDivergence, double dampingFactor): Creates a new instance of AffinityPropagation. |
| AffinityPropagation(DivergenceFunction<? super DataType,? super DataType> divergence, double selfDivergence, double dampingFactor, int maxIterations): Creates a new instance of AffinityPropagation. |

| Modifier and Type | Method and Description |
|---|---|
| protected void | assignCluster(int i, int newAssignment): Assigns example "i" to the new cluster index. |
| protected void | cleanupAlgorithm(): Called to clean up the learning algorithm's state after learning has finished. |
| AffinityPropagation<DataType> | clone(): This makes public the clone method on the Object class and removes the exception that it throws. |
| protected int[] | getAssignments(): Gets the assignments of examples to exemplars (clusters). |
| protected double[][] | getAvailabilities(): Gets the availability values. |
| int | getChangedCount(): Gets the number of cluster assignments that have changed in the most recent iteration. |
| protected java.util.HashMap<java.lang.Integer,CentroidCluster<DataType>> | getClusters(): Gets the current clusters, which is a sparse mapping of exemplar identifier to cluster object. |
| double | getDampingFactor(): Gets the damping factor. |
| DivergenceFunction<? super DataType,? super DataType> | getDivergence(): Gets the divergence function used by the algorithm. |
| DivergenceFunction<? super DataType,? super DataType> | getDivergenceFunction(): Gets the divergence function used by this object. |
| protected java.util.ArrayList<DataType> | getExamples(): Gets the array list of examples to cluster. |
| NamedValue<java.lang.Integer> | getPerformance(): Gets the performance, which is the number changed on the last iteration. |
| protected double[][] | getResponsibilities(): Gets the responsibility values. |
| java.util.ArrayList<CentroidCluster<DataType>> | getResult(): Gets the current result of the algorithm. |
| double | getSelfDivergence(): Gets the value used for self-divergence, which controls how many clusters are generated. |
| protected double[][] | getSimilarities(): Gets the array of similarities. |
| protected boolean | initializeAlgorithm(): Called to initialize the learning algorithm's state based on the data that is stored in the data field. |
| protected void | setAssignments(int[] assignments): Sets the assignments of examples to exemplars (clusters). |
| protected void | setAvailabilities(double[][] availabilities): Sets the availability values. |
| protected void | setChangedCount(int changedCount): Sets the number of cluster assignments that have changed in the most recent iteration. |
| protected void | setClusters(java.util.HashMap<java.lang.Integer,CentroidCluster<DataType>> clusters): Sets the current clusters, which is a sparse mapping of exemplar identifier to cluster object. |
| void | setDampingFactor(double dampingFactor): Sets the damping factor. |
| void | setDivergence(DivergenceFunction<? super DataType,? super DataType> divergence): Sets the divergence function used by the algorithm. |
| protected void | setExamples(java.util.ArrayList<DataType> examples): Sets the array list of examples to cluster. |
| protected void | setResponsibilities(double[][] responsibilities): Sets the responsibility values. |
| void | setSelfDivergence(double selfDivergence): Sets the value used for self-divergence, which controls how many clusters are generated. |
| protected void | setSimilarities(double[][] similarities): Sets the array of similarities. |
| protected boolean | step(): Called to take a single step of the learning algorithm. |
| protected void | updateAssignments(): Updates the assignments of all the examples to their exemplars (clusters) using the current availability and responsibility values. |
| protected void | updateAvailabilities(): Updates the availabilities matrix based on the current responsibility values. |
| protected void | updateResponsibilities(): Updates the responsibilities matrix using the similarity values and the current availability values. |

Inherited methods, grouped by the superclass or interface that declares them:
getData, getKeepGoing, learn, setData, setKeepGoing, stop
getMaxIterations, isResultValid, setMaxIterations
addIterativeAlgorithmListener, fireAlgorithmEnded, fireAlgorithmStarted, fireStepEnded, fireStepStarted, getIteration, getListeners, removeIterativeAlgorithmListener, setIteration, setListeners
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
learn
getMaxIterations, setMaxIterations
addIterativeAlgorithmListener, getIteration, removeIterativeAlgorithmListener
isResultValid

public static final int DEFAULT_MAX_ITERATIONS
public static final double DEFAULT_SELF_DIVERGENCE
public static final double DEFAULT_DAMPING_FACTOR
protected DivergenceFunction<? super DataType,? super DataType> divergence
protected double dampingFactor
protected double oneMinusDampingFactor
protected transient int exampleCount
protected java.util.ArrayList<DataType> examples
protected double[][] similarities
protected double[][] responsibilities
protected double[][] availabilities
protected int[] assignments
protected int changedCount
protected java.util.HashMap<java.lang.Integer,CentroidCluster<DataType>> clusters
public AffinityPropagation()
public AffinityPropagation(DivergenceFunction<? super DataType,? super DataType> divergence, double selfDivergence)
divergence - The divergence function to use to determine the
divergence between two examples.
selfDivergence - The value for self-divergence to use, which
controls the number of clusters created.
public AffinityPropagation(DivergenceFunction<? super DataType,? super DataType> divergence, double selfDivergence, double dampingFactor)
divergence - The divergence function to use to determine the
divergence between two examples.
selfDivergence - The value for self-divergence to use, which
controls the number of clusters created.
dampingFactor - The damping factor (lambda). Must be between 0.0
and 1.0.
public AffinityPropagation(DivergenceFunction<? super DataType,? super DataType> divergence, double selfDivergence, double dampingFactor, int maxIterations)
divergence - The divergence function to use to determine the
divergence between two examples.
selfDivergence - The value for self-divergence to use, which
controls the number of clusters created.
dampingFactor - The damping factor (lambda). Must be between 0.0
and 1.0.
maxIterations - The maximum number of iterations.
public AffinityPropagation<DataType> clone()
Description copied from class: AbstractCloneableSerializableObject
This makes public the clone method on the Object class and
removes the exception that it throws. Its default behavior is to
automatically create a clone of the exact type of object that the
clone is called on and to copy all primitives but to keep all references,
which means it is a shallow copy.
Extensions of this class may want to override this method (but call
super.clone()) to implement a "smart copy", that is, a copy that targets
the most common use case for creating a copy of the object. Because
the default behavior is a shallow copy, extending classes only need
to handle fields that need a deeper copy (or those that need to
be reset). Some of the methods in ObjectUtil may be helpful in
implementing a custom clone method.
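A minimal sketch of the "smart copy" pattern described here, written against plain `java.lang.Cloneable` rather than the library's own base class so that it compiles on its own. The class `SmartCopySketch` and its fields are hypothetical and exist only to show the shape of such an override.

```java
import java.util.Arrays;

/** Hypothetical class used only to illustrate the "smart copy" pattern. */
class SmartCopySketch implements Cloneable
{
    /** Primitive field: already copied by the shallow super.clone(). */
    double dampingFactor = 0.5;

    /** Mutable array: shared after a shallow copy, so it needs a deeper copy. */
    double[] values = { 1.0, 2.0 };

    @Override
    public SmartCopySketch clone()
    {
        try
        {
            // Start from the shallow copy produced by super.clone().
            final SmartCopySketch copy = (SmartCopySketch) super.clone();
            // Deep-copy only the field that must not be shared.
            copy.values = Arrays.copyOf(this.values, this.values.length);
            return copy;
        }
        catch (CloneNotSupportedException e)
        {
            // Not expected, because this class implements Cloneable.
            throw new AssertionError(e);
        }
    }
}
```

Only the array is handled explicitly; the primitive field comes along with the shallow copy, which is the division of labor the description suggests.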
Note: The contract of this method is that you must use
super.clone() as the basis for your implementation.
Specified by: clone in interface CloneableSerializable
Overrides: clone in class AbstractAnytimeBatchLearner<java.util.Collection<? extends DataType>,java.util.Collection<CentroidCluster<DataType>>>
protected boolean initializeAlgorithm()
Description copied from class: AbstractAnytimeBatchLearner
Called to initialize the learning algorithm's state based on the
data that is stored in the data field.
Specified by: initializeAlgorithm in class AbstractAnytimeBatchLearner<java.util.Collection<? extends DataType>,java.util.Collection<CentroidCluster<DataType>>>
protected boolean step()
Description copied from class: AbstractAnytimeBatchLearner
Called to take a single step of the learning algorithm.
Specified by: step in class AbstractAnytimeBatchLearner<java.util.Collection<? extends DataType>,java.util.Collection<CentroidCluster<DataType>>>
protected void updateResponsibilities()
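For orientation, the responsibility update can be sketched from the rule in the Frey and Dueck paper cited in the class annotation: r(i,k) is set to s(i,k) minus the largest a(i,k') + s(i,k') over all k' other than k, and the new value is blended with the old one using the damping factor lambda. The class below is a hypothetical, standalone sketch under those assumptions and may differ from what this method actually does internally.

```java
/** Hypothetical sketch of a damped responsibility update; not the library's code. */
final class ResponsibilitySketch
{
    /**
     * Updates responsibilities in place. Assumes square n-by-n matrices
     * with n of at least 2 and a damping factor between 0.0 and 1.0.
     */
    static void updateResponsibilities(
        final double[][] similarities,
        final double[][] availabilities,
        final double[][] responsibilities,
        final double dampingFactor)
    {
        final int n = similarities.length;
        for (int i = 0; i < n; i++)
        {
            // Track the largest and second-largest a(i,k) + s(i,k) in row i.
            double best = Double.NEGATIVE_INFINITY;
            double secondBest = Double.NEGATIVE_INFINITY;
            int bestIndex = -1;
            for (int k = 0; k < n; k++)
            {
                final double value = availabilities[i][k] + similarities[i][k];
                if (value > best)
                {
                    secondBest = best;
                    best = value;
                    bestIndex = k;
                }
                else if (value > secondBest)
                {
                    secondBest = value;
                }
            }

            for (int k = 0; k < n; k++)
            {
                // The maximum over k' != k is the second-best when k is the best.
                final double maxOther = (k == bestIndex) ? secondBest : best;
                final double target = similarities[i][k] - maxOther;
                // Damping: lambda times the old value plus (1 - lambda) times the new.
                responsibilities[i][k] = dampingFactor * responsibilities[i][k]
                    + (1.0 - dampingFactor) * target;
            }
        }
    }
}
```

Tracking the two largest row values keeps the update at O(n^2) overall instead of recomputing a maximum for every (i, k) pair.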
protected void updateAvailabilities()
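The availability update can likewise be sketched from the Frey and Dueck rule: for i different from k, a(i,k) is the minimum of 0 and r(k,k) plus the sum of the positive responsibilities that other points (excluding i and k) send to k, while the self-availability a(k,k) is the sum of positive responsibilities sent to k by the other points. The class below is a hypothetical, standalone illustration with damping applied as lambda times the old value plus (1 - lambda) times the new one; it is not taken from this class's source.

```java
/** Hypothetical sketch of a damped availability update; not the library's code. */
final class AvailabilitySketch
{
    /** Updates availabilities in place from square n-by-n responsibilities. */
    static void updateAvailabilities(
        final double[][] responsibilities,
        final double[][] availabilities,
        final double dampingFactor)
    {
        final int n = responsibilities.length;
        for (int k = 0; k < n; k++)
        {
            // Sum of positive responsibilities sent to candidate exemplar k
            // by every point other than k itself.
            double positiveSum = 0.0;
            for (int i = 0; i < n; i++)
            {
                if (i != k)
                {
                    positiveSum += Math.max(0.0, responsibilities[i][k]);
                }
            }

            for (int i = 0; i < n; i++)
            {
                final double target;
                if (i == k)
                {
                    // Self-availability: accumulated positive evidence for k.
                    target = positiveSum;
                }
                else
                {
                    // Exclude point i's own contribution, then cap at zero.
                    final double withoutI =
                        positiveSum - Math.max(0.0, responsibilities[i][k]);
                    target = Math.min(0.0, responsibilities[k][k] + withoutI);
                }
                availabilities[i][k] = dampingFactor * availabilities[i][k]
                    + (1.0 - dampingFactor) * target;
            }
        }
    }
}
```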
protected void updateAssignments()
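One plausible reading of the assignment step, following the paper's rule, is that each example i picks the exemplar k that maximizes a(i,k) + r(i,k), and the number of examples whose choice changed is recorded (compare the changedCount field). The sketch below is hypothetical and standalone; the tie-breaking rule (lowest index wins) is an assumption.

```java
/** Hypothetical sketch of exemplar assignment; not the library's code. */
final class AssignmentSketch
{
    /**
     * Assigns each example to the exemplar with the largest
     * availability plus responsibility, updating assignments in place.
     *
     * @return The number of examples whose assignment changed.
     */
    static int updateAssignments(
        final double[][] availabilities,
        final double[][] responsibilities,
        final int[] assignments)
    {
        final int n = assignments.length;
        int changed = 0;
        for (int i = 0; i < n; i++)
        {
            int best = 0;
            double bestValue = Double.NEGATIVE_INFINITY;
            for (int k = 0; k < n; k++)
            {
                final double value = availabilities[i][k] + responsibilities[i][k];
                if (value > bestValue)
                {
                    bestValue = value;
                    best = k;
                }
            }
            if (assignments[i] != best)
            {
                assignments[i] = best;
                changed++;
            }
        }
        return changed;
    }
}
```

A caller could treat a return value of zero over one or more consecutive iterations as a sign that the clustering has settled.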
protected void assignCluster(int i, int newAssignment)
i - The index of the example to assign to the cluster.
newAssignment - The new assignment for "i".
protected void cleanupAlgorithm()
Description copied from class: AbstractAnytimeBatchLearner
Called to clean up the learning algorithm's state after learning has
finished.
Specified by: cleanupAlgorithm in class AbstractAnytimeBatchLearner<java.util.Collection<? extends DataType>,java.util.Collection<CentroidCluster<DataType>>>
public java.util.ArrayList<CentroidCluster<DataType>> getResult()
Description copied from interface: AnytimeAlgorithm
Gets the current result of the algorithm.
Specified by: getResult in interface AnytimeAlgorithm<java.util.Collection<CentroidCluster<DataType>>>
public DivergenceFunction<? super DataType,? super DataType> getDivergence()
public void setDivergence(DivergenceFunction<? super DataType,? super DataType> divergence)
divergence - The divergence function.
public double getSelfDivergence()
public void setSelfDivergence(double selfDivergence)
selfDivergence - The value for self-divergence.
public double getDampingFactor()
public void setDampingFactor(double dampingFactor)
dampingFactor - The damping factor. Must be between 0.0 and 1.0.
protected java.util.ArrayList<DataType> getExamples()
protected void setExamples(java.util.ArrayList<DataType> examples)
examples - The array list of examples to cluster.
protected double[][] getSimilarities()
protected void setSimilarities(double[][] similarities)
similarities - The array of similarities.
protected double[][] getResponsibilities()
protected void setResponsibilities(double[][] responsibilities)
responsibilities - The responsibilities.
protected double[][] getAvailabilities()
protected void setAvailabilities(double[][] availabilities)
availabilities - The availabilities.
protected int[] getAssignments()
protected void setAssignments(int[] assignments)
assignments - The assignments of examples to exemplars (clusters).
public int getChangedCount()
protected void setChangedCount(int changedCount)
changedCount - The number of changed cluster assignments.
protected java.util.HashMap<java.lang.Integer,CentroidCluster<DataType>> getClusters()
protected void setClusters(java.util.HashMap<java.lang.Integer,CentroidCluster<DataType>> clusters)
clusters - The current clusters.
public DivergenceFunction<? super DataType,? super DataType> getDivergenceFunction()
Description copied from interface: DivergenceFunctionContainer
Gets the divergence function used by this object.
Specified by: getDivergenceFunction in interface DivergenceFunctionContainer<DataType,DataType>
public NamedValue<java.lang.Integer> getPerformance()
Specified by: getPerformance in interface MeasurablePerformanceAlgorithm