@PublicationReference(author="Thomas Hofmann", title="Probabilistic Latent Semantic Analysis", year=1999, type=Conference, publication="Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI)", pages={289,296}, url="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.33.1187")
@PublicationReference(author="Thomas Hofmann", title="Probabilistic Latent Semantic Indexing", year=1999, type=Conference, publication="Proceedings of the 22nd Conference of the ACM Special Interest Group on Information Retrieval (SIGIR)", pages={50,57}, url="http://portal.acm.org/citation.cfm?id=312649")
@PublicationReference(author="Thomas Hofmann", title="Unsupervised Learning by Probabilistic Latent Semantic Analysis", year=2001, type=Journal, publication="Machine Learning", pages={177,196}, url="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.130.6341")
public class ProbabilisticLatentSemanticAnalysis
extends AbstractAnytimeBatchLearner<java.util.Collection<? extends Vectorizable>,ProbabilisticLatentSemanticAnalysis.Result>
implements Randomized, VectorFactoryContainer
| Modifier and Type | Class and Description |
|---|---|
| static class | ProbabilisticLatentSemanticAnalysis.LatentData: The information about each latent variable. |
| static class | ProbabilisticLatentSemanticAnalysis.Result: The dimensionality transform created by probabilistic latent semantic analysis. |
| static class | ProbabilisticLatentSemanticAnalysis.StatusPrinter: Prints out the status of the probabilistic latent semantic analysis algorithm. |
| Modifier and Type | Field and Description |
|---|---|
| protected double | changeOfLogLikelihood: The change in log-likelihood of the algorithm from the current iteration. |
| static int | DEFAULT_MAX_ITERATIONS: The default maximum number of iterations is 250. |
| static double | DEFAULT_MINIMUM_CHANGE: The default minimum change is 1.0E-10. |
| static int | DEFAULT_REQUESTED_RANK: The default requested rank is 10. |
| protected int | documentCount: The number of documents. |
| protected Matrix | documentsByTerms: The document-by-term matrix. |
| protected int | latentCount: The number of latent variables. |
| protected ProbabilisticLatentSemanticAnalysis.LatentData[] | latents: The information about each of the latent variables. |
| protected double | logLikelihood: The current log-likelihood of the algorithm. |
| protected MatrixFactory<? extends Matrix> | matrixFactory: The matrix factory. |
| protected double | minimumChange: The minimum change required in log-likelihood to continue iterating. |
| protected java.util.Random | random: The random number generator to use. |
| protected int | requestedRank: The requested rank to reduce the dimensionality to. |
| protected ProbabilisticLatentSemanticAnalysis.Result | result: The result being produced by the algorithm. |
| protected int | termCount: The number of terms. |
| protected VectorFactory<? extends Vector> | vectorFactory: The vector factory. |
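For orientation, the count and likelihood fields above can be read against the model in the cited Hofmann papers. The block below is a sketch under assumptions: it uses the symmetric parameterization from those papers, writes n(d, w) for the entry of the document-by-term matrix, and maps D, T, and K to documentCount, termCount, and latentCount. This page does not state which parameterization the class uses internally, or whether logLikelihood is normalized.

```latex
P(d, w) = \sum_{z=1}^{K} P(z)\, P(d \mid z)\, P(w \mid z),
\qquad
\mathcal{L} = \sum_{d=1}^{D} \sum_{w=1}^{T} n(d, w)\, \log P(d, w)
```

Under this reading, changeOfLogLikelihood would be the difference in \mathcal{L} between two successive iterations, which is the quantity compared against minimumChange.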
Inherited fields: data, keepGoing, maxIterations, DEFAULT_ITERATION, iteration

| Constructor and Description |
|---|
| ProbabilisticLatentSemanticAnalysis(): Creates a new ProbabilisticLatentSemanticAnalysis with default parameters. |
| ProbabilisticLatentSemanticAnalysis(int requestedRank): Creates a new ProbabilisticLatentSemanticAnalysis with the given rank and otherwise default parameters. |
| ProbabilisticLatentSemanticAnalysis(int requestedRank, double minimumChange, java.util.Random random): Creates a new ProbabilisticLatentSemanticAnalysis with the given parameters. |
| ProbabilisticLatentSemanticAnalysis(java.util.Random random): Creates a new ProbabilisticLatentSemanticAnalysis with default parameters and the given random number generator. |
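The four constructors differ only in which parameters fall back to defaults. The following is a minimal sketch of that overload pattern in plain Java. `PlsaSettingsSketch` is a hypothetical stand-in, not the library class. The default values are the ones documented on this page, the rank check follows the "must be positive" rule documented for setRequestedRank, and the use of `new Random()` when no generator is supplied is an assumption.

```java
import java.util.Random;

/** Hypothetical stand-in mirroring the documented defaults and constructor overloads. */
final class PlsaSettingsSketch {
    static final int DEFAULT_REQUESTED_RANK = 10;
    static final int DEFAULT_MAX_ITERATIONS = 250;
    static final double DEFAULT_MINIMUM_CHANGE = 1.0E-10;

    final int requestedRank;
    final double minimumChange;
    final int maxIterations = DEFAULT_MAX_ITERATIONS;
    final Random random;

    PlsaSettingsSketch() {
        // Assumption: an unseeded generator is used when none is given.
        this(new Random());
    }

    PlsaSettingsSketch(Random random) {
        this(DEFAULT_REQUESTED_RANK, DEFAULT_MINIMUM_CHANGE, random);
    }

    PlsaSettingsSketch(int requestedRank) {
        this(requestedRank, DEFAULT_MINIMUM_CHANGE, new Random());
    }

    PlsaSettingsSketch(int requestedRank, double minimumChange, Random random) {
        // Validation rules taken from the setter documentation on this page.
        if (requestedRank <= 0) {
            throw new IllegalArgumentException("requestedRank must be positive");
        }
        if (minimumChange < 0.0) {
            throw new IllegalArgumentException("minimumChange must be non-negative");
        }
        this.requestedRank = requestedRank;
        this.minimumChange = minimumChange;
        this.random = random;
    }
}
```

Routing every overload through the full constructor keeps the validation in one place, so a bad rank is rejected no matter which overload the caller picks.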
| Modifier and Type | Method and Description |
|---|---|
| protected void | cleanupAlgorithm(): Called to clean up the learning algorithm's state after learning has finished. |
| MatrixFactory<? extends Matrix> | getMatrixFactory(): Gets the matrix factory to use. |
| double | getMinimumChange(): Gets the minimum change in log-likelihood to allow before stopping the algorithm. |
| java.util.Random | getRandom(): Gets the random number generator used by this object. |
| int | getRequestedRank(): Gets the requested rank to conduct the analysis for. |
| ProbabilisticLatentSemanticAnalysis.Result | getResult(): Gets the current result of the algorithm. |
| VectorFactory<? extends Vector> | getVectorFactory(): Gets the vector factory to use. |
| protected boolean | initializeAlgorithm(): Called to initialize the learning algorithm's state based on the data that is stored in the data field. |
| void | setMatrixFactory(MatrixFactory<? extends Matrix> matrixFactory): Sets the matrix factory to use. |
| void | setMinimumChange(double minimumChange): Sets the minimum change in log-likelihood to allow before stopping the algorithm. |
| void | setRandom(java.util.Random random): Sets the random number generator used by this object. |
| void | setRequestedRank(int requestedRank): Sets the requested rank to conduct the analysis for. |
| void | setVectorFactory(VectorFactory<? extends Vector> vectorFactory): Sets the vector factory to use. |
| protected boolean | step(): Called to take a single step of the learning algorithm. |
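The three protected hooks in the table (initializeAlgorithm, step, cleanupAlgorithm) suggest a template-method loop driven by the base class. The sketch below shows one plausible shape of such a loop in plain Java. `AnytimeLoopSketch` and `HalvingSketch` are hypothetical names, and the control flow (a `false` return from `step()` ends the loop, the loop is capped by a maximum iteration count, cleanup is skipped when initialization fails) is an assumption, not a description of AbstractAnytimeBatchLearner.

```java
/** Hypothetical initialize/step/cleanup skeleton; not the library's base class. */
abstract class AnytimeLoopSketch {
    protected int maxIterations = 250;
    protected int iteration = 0;

    protected abstract boolean initializeAlgorithm();
    protected abstract boolean step();
    protected abstract void cleanupAlgorithm();

    /** Runs the loop and returns the number of steps taken. */
    final int run() {
        iteration = 0;
        if (!initializeAlgorithm()) {
            return 0; // Nothing to do if initialization reports failure.
        }
        boolean keepGoing = true;
        while (keepGoing && iteration < maxIterations) {
            iteration++;
            keepGoing = step(); // A false return requests a stop.
        }
        cleanupAlgorithm();
        return iteration;
    }
}

/** Toy subclass: halves a value until the per-step change drops below a threshold. */
final class HalvingSketch extends AnytimeLoopSketch {
    final double minimumChange;
    double value;
    boolean cleaned = false;

    HalvingSketch(double minimumChange) {
        this.minimumChange = minimumChange;
    }

    @Override
    protected boolean initializeAlgorithm() {
        value = 1.0;
        return true;
    }

    @Override
    protected boolean step() {
        double previous = value;
        value = previous / 2.0;
        // Continue only while the change is at least the threshold.
        return (previous - value) >= minimumChange;
    }

    @Override
    protected void cleanupAlgorithm() {
        cleaned = true;
    }
}
```

With a threshold of 0.1 the toy subclass stops on its own after a few steps. With a threshold of 0.0 the change never falls below the threshold, so the iteration cap is what ends the loop.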
Inherited methods:
clone, getData, getKeepGoing, learn, setData, setKeepGoing, stop
getMaxIterations, isResultValid, setMaxIterations
addIterativeAlgorithmListener, fireAlgorithmEnded, fireAlgorithmStarted, fireStepEnded, fireStepStarted, getIteration, getListeners, removeIterativeAlgorithmListener, setIteration, setListeners
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait

public static final int DEFAULT_REQUESTED_RANK
public static final int DEFAULT_MAX_ITERATIONS
public static final double DEFAULT_MINIMUM_CHANGE
protected int requestedRank
protected double minimumChange
protected java.util.Random random
protected VectorFactory<? extends Vector> vectorFactory
protected MatrixFactory<? extends Matrix> matrixFactory
protected transient Matrix documentsByTerms
protected transient int termCount
protected transient int documentCount
protected transient int latentCount
protected transient ProbabilisticLatentSemanticAnalysis.LatentData[] latents
protected transient double logLikelihood
protected transient double changeOfLogLikelihood
protected transient ProbabilisticLatentSemanticAnalysis.Result result
public ProbabilisticLatentSemanticAnalysis()

public ProbabilisticLatentSemanticAnalysis(java.util.Random random)
random - The random number generator to use.

public ProbabilisticLatentSemanticAnalysis(int requestedRank)
requestedRank - The requested rank. Must be positive.

public ProbabilisticLatentSemanticAnalysis(int requestedRank, double minimumChange, java.util.Random random)
requestedRank - The requested rank. Must be positive.
minimumChange - The minimum change in log-likelihood to allow before stopping.
random - The random number generator to use.

protected boolean initializeAlgorithm()
Specified by: initializeAlgorithm in class AbstractAnytimeBatchLearner<java.util.Collection<? extends Vectorizable>,ProbabilisticLatentSemanticAnalysis.Result>

protected boolean step()
Specified by: step in class AbstractAnytimeBatchLearner<java.util.Collection<? extends Vectorizable>,ProbabilisticLatentSemanticAnalysis.Result>

protected void cleanupAlgorithm()
Specified by: cleanupAlgorithm in class AbstractAnytimeBatchLearner<java.util.Collection<? extends Vectorizable>,ProbabilisticLatentSemanticAnalysis.Result>

public ProbabilisticLatentSemanticAnalysis.Result getResult()
Specified by: getResult in interface AnytimeAlgorithm<ProbabilisticLatentSemanticAnalysis.Result>

public java.util.Random getRandom()
Specified by: getRandom in interface Randomized

public void setRandom(java.util.Random random)
Specified by: setRandom in interface Randomized
random - The random number generator for this object to use.

public VectorFactory<? extends Vector> getVectorFactory()
Specified by: getVectorFactory in interface VectorFactoryContainer

public void setVectorFactory(VectorFactory<? extends Vector> vectorFactory)
vectorFactory - The vector factory to use.

public MatrixFactory<? extends Matrix> getMatrixFactory()

public void setMatrixFactory(MatrixFactory<? extends Matrix> matrixFactory)
matrixFactory - The matrix factory to use.

public int getRequestedRank()

public void setRequestedRank(int requestedRank)
requestedRank - The requested rank. Must be positive.

public double getMinimumChange()

public void setMinimumChange(double minimumChange)
minimumChange - The minimum change in log-likelihood to allow before stopping. Must be non-negative.
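
To make the roles of requestedRank, minimumChange, the iteration cap, and the random generator concrete, here is a minimal PLSA-by-EM sketch in plain Java. It is an illustration under assumptions, not this class's implementation: `PlsaEmSketch` and all of its members are hypothetical, it works on a dense `double[][]` count matrix instead of the library's Matrix and Vectorizable types, and it uses the symmetric P(z), P(d|z), P(w|z) parameterization from the cited papers with random initialization.

```java
import java.util.Random;

/** Hypothetical PLSA-by-EM sketch on a dense count matrix; not the library's code. */
final class PlsaEmSketch {
    final double[] pZ;      // P(z)
    final double[][] pDz;   // P(d|z), indexed [z][d]
    final double[][] pWz;   // P(w|z), indexed [z][w]
    final double initialLogLikelihood;
    double logLikelihood;
    int iterations = 0;

    /** Assumes counts is rectangular, non-negative, and has at least one positive entry. */
    PlsaEmSketch(double[][] counts, int rank, double minimumChange,
                 int maxIterations, Random random) {
        pZ = new double[rank];
        pDz = new double[rank][];
        pWz = new double[rank][];
        for (int z = 0; z < rank; z++) {
            pZ[z] = 1.0 / rank;
            pDz[z] = randomDistribution(counts.length, random);
            pWz[z] = randomDistribution(counts[0].length, random);
        }
        initialLogLikelihood = computeLogLikelihood(counts);
        logLikelihood = initialLogLikelihood;
        boolean keepGoing = true;
        while (keepGoing && iterations < maxIterations) {
            emStep(counts);
            iterations++;
            double updated = computeLogLikelihood(counts);
            // Stop once the improvement drops below the threshold.
            keepGoing = (updated - logLikelihood) >= minimumChange;
            logLikelihood = updated;
        }
    }

    /** Mixture probability P(d, w) = sum over z of P(z) P(d|z) P(w|z). */
    double joint(int d, int w) {
        double sum = 0.0;
        for (int z = 0; z < pZ.length; z++) {
            sum += pZ[z] * pDz[z][d] * pWz[z][w];
        }
        return sum;
    }

    private double computeLogLikelihood(double[][] counts) {
        double total = 0.0;
        for (int d = 0; d < counts.length; d++) {
            for (int w = 0; w < counts[d].length; w++) {
                if (counts[d][w] > 0.0) {
                    total += counts[d][w] * Math.log(joint(d, w));
                }
            }
        }
        return total;
    }

    private void emStep(double[][] counts) {
        int rank = pZ.length, docs = counts.length, terms = counts[0].length;
        double[] newZ = new double[rank];
        double[][] newDz = new double[rank][docs];
        double[][] newWz = new double[rank][terms];
        double totalCount = 0.0;
        for (int d = 0; d < docs; d++) {
            for (int w = 0; w < terms; w++) {
                double n = counts[d][w];
                if (n <= 0.0) continue;
                double joint = joint(d, w);
                for (int z = 0; z < rank; z++) {
                    // E-step: posterior P(z|d,w), weighted by the count n(d,w).
                    double weight = n * pZ[z] * pDz[z][d] * pWz[z][w] / joint;
                    newZ[z] += weight;
                    newDz[z][d] += weight;
                    newWz[z][w] += weight;
                }
                totalCount += n;
            }
        }
        // M-step: normalize the accumulated expected counts.
        for (int z = 0; z < rank; z++) {
            for (int d = 0; d < docs; d++) pDz[z][d] = newDz[z][d] / newZ[z];
            for (int w = 0; w < terms; w++) pWz[z][w] = newWz[z][w] / newZ[z];
            pZ[z] = newZ[z] / totalCount;
        }
    }

    private static double[] randomDistribution(int size, Random random) {
        double[] p = new double[size];
        double sum = 0.0;
        for (int i = 0; i < size; i++) {
            p[i] = random.nextDouble() + 1.0e-3; // Offset keeps every entry positive.
            sum += p[i];
        }
        for (int i = 0; i < size; i++) p[i] /= sum;
        return p;
    }
}
```

Passing a seeded `java.util.Random` makes the random initialization, and therefore the fitted model, repeatable across runs. That is the practical reason to inject the generator through a constructor or setter rather than create it internally.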