@PublicationReference(author="Thomas Hofmann",title="Probabilistic Latent Semantic Analysis",year=1999,type=Conference,publication="Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI)",pages={289,296},url="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.33.1187")
@PublicationReference(author="Thomas Hofmann",title="Probabilistic Latent Semantic Indexing",year=1999,type=Conference,publication="Proceedings of the 22nd Conference of the ACM Special Interest Group on Information Retrieval (SIGIR)",pages={50,57},url="http://portal.acm.org/citation.cfm?id=312649")
@PublicationReference(author="Thomas Hofmann",title="Unsupervised Learning by Probabilistic Latent Semantic Analysis",year=2001,type=Journal,publication="Machine Learning",pages={177,196},url="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.130.6341")
public class ProbabilisticLatentSemanticAnalysis
extends AbstractAnytimeBatchLearner<java.util.Collection<? extends Vectorizable>,ProbabilisticLatentSemanticAnalysis.Result>
implements Randomized, VectorFactoryContainer
Modifier and Type | Class and Description |
---|---|
static class |
ProbabilisticLatentSemanticAnalysis.LatentData
The information about each latent variable.
|
static class |
ProbabilisticLatentSemanticAnalysis.Result
The dimensionality transform created by probabilistic latent semantic
analysis.
|
static class |
ProbabilisticLatentSemanticAnalysis.StatusPrinter
Prints out the status of the probabilistic latent semantic analysis
algorithm.
|
Modifier and Type | Field and Description |
---|---|
protected double |
changeOfLogLikelihood
The change in log-likelihood of the algorithm from the current
iteration.
|
static int |
DEFAULT_MAX_ITERATIONS
The default maximum number of iterations is 250.
|
static double |
DEFAULT_MINIMUM_CHANGE
The default minimum change is 1.0E-10.
|
static int |
DEFAULT_REQUESTED_RANK
The default requested rank is 10.
|
protected int |
documentCount
The number of documents.
|
protected Matrix |
documentsByTerms
The document-by-term matrix.
|
protected int |
latentCount
The number of latent variables.
|
protected ProbabilisticLatentSemanticAnalysis.LatentData[] |
latents
The information about each of the latent variables.
|
protected double |
logLikelihood
The current log-likelihood of the algorithm.
|
protected MatrixFactory<? extends Matrix> |
matrixFactory
The matrix factory.
|
protected double |
minimumChange
The minimum change required in log-likelihood to continue iterating.
|
protected java.util.Random |
random
The random number generator to use.
|
protected int |
requestedRank
The requested rank to reduce the dimensionality to.
|
protected ProbabilisticLatentSemanticAnalysis.Result |
result
The result being produced by the algorithm.
|
protected int |
termCount
The number of terms.
|
protected VectorFactory<? extends Vector> |
vectorFactory
The vector factory.
|
Fields inherited from superclasses:
data, keepGoing
maxIterations
DEFAULT_ITERATION, iteration
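The `minimumChange` and `maxIterations` fields together suggest a two-part stopping rule: iterate until the log-likelihood changes by less than the minimum change, or until the iteration cap is reached. The following is a minimal sketch of such a rule in plain Java, not the library's code. The class `ConvergenceSketch`, the method `runUntilConverged`, and the toy update function are hypothetical, and whether the library compares strictly or non-strictly against `minimumChange` is an assumption.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch of a log-likelihood-based stopping rule. */
final class ConvergenceSketch {
    // Values taken from the documented defaults.
    static final int DEFAULT_MAX_ITERATIONS = 250;
    static final double DEFAULT_MINIMUM_CHANGE = 1.0e-10;

    /**
     * Applies the update until the log-likelihood changes by less than
     * minimumChange or maxIterations updates have run.
     *
     * @return the number of updates performed
     */
    static int runUntilConverged(double initialLogLikelihood,
                                 DoubleUnaryOperator update,
                                 double minimumChange,
                                 int maxIterations) {
        double logLikelihood = initialLogLikelihood;
        int iteration = 0;
        while (iteration < maxIterations) {
            double updated = update.applyAsDouble(logLikelihood);
            double change = Math.abs(updated - logLikelihood);
            logLikelihood = updated;
            iteration++;
            // Assumption: a strict comparison ends the loop.
            if (change < minimumChange) {
                break;
            }
        }
        return iteration;
    }

    public static void main(String[] args) {
        // Toy update that halves the distance to a limit of -100.
        int steps = runUntilConverged(
            -200.0,
            ll -> -100.0 + 0.5 * (ll + 100.0),
            DEFAULT_MINIMUM_CHANGE,
            DEFAULT_MAX_ITERATIONS);
        System.out.println("Stopped after " + steps + " iterations");
    }
}
```

With a tolerance this small, the iteration cap mostly acts as a safety net for runs that converge slowly.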
Constructor and Description |
---|
ProbabilisticLatentSemanticAnalysis()
Creates a new ProbabilisticLatentSemanticAnalysis with default parameters.
|
ProbabilisticLatentSemanticAnalysis(int requestedRank)
Creates a new ProbabilisticLatentSemanticAnalysis with the given rank
and otherwise default parameters.
|
ProbabilisticLatentSemanticAnalysis(int requestedRank,
double minimumChange,
java.util.Random random)
Creates a new ProbabilisticLatentSemanticAnalysis with the given
parameters.
|
ProbabilisticLatentSemanticAnalysis(java.util.Random random)
Creates a new ProbabilisticLatentSemanticAnalysis with default parameters
and the given random number generator.
|
Modifier and Type | Method and Description |
---|---|
protected void |
cleanupAlgorithm()
Called to clean up the learning algorithm's state after learning has
finished.
|
MatrixFactory<? extends Matrix> |
getMatrixFactory()
Gets the matrix factory to use.
|
double |
getMinimumChange()
Gets the minimum change in log-likelihood to allow before stopping the
algorithm.
|
java.util.Random |
getRandom()
Gets the random number generator used by this object.
|
int |
getRequestedRank()
Gets the requested rank to conduct the analysis for.
|
ProbabilisticLatentSemanticAnalysis.Result |
getResult()
Gets the current result of the algorithm.
|
VectorFactory<? extends Vector> |
getVectorFactory()
Gets the vector factory to use.
|
protected boolean |
initializeAlgorithm()
Called to initialize the learning algorithm's state based on the
data that is stored in the data field.
|
void |
setMatrixFactory(MatrixFactory<? extends Matrix> matrixFactory)
Sets the matrix factory to use.
|
void |
setMinimumChange(double minimumChange)
Sets the minimum change in log-likelihood to allow before stopping the
algorithm.
|
void |
setRandom(java.util.Random random)
Sets the random number generator used by this object.
|
void |
setRequestedRank(int requestedRank)
Sets the requested rank to conduct the analysis for.
|
void |
setVectorFactory(VectorFactory<? extends Vector> vectorFactory)
Sets the vector factory to use.
|
protected boolean |
step()
Called to take a single step of the learning algorithm.
|
Methods inherited from superclasses:
clone, getData, getKeepGoing, learn, setData, setKeepGoing, stop
getMaxIterations, isResultValid, setMaxIterations
addIterativeAlgorithmListener, fireAlgorithmEnded, fireAlgorithmStarted, fireStepEnded, fireStepStarted, getIteration, getListeners, removeIterativeAlgorithmListener, setIteration, setListeners
Methods inherited from class java.lang.Object:
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interfaces:
getMaxIterations, setMaxIterations
addIterativeAlgorithmListener, getIteration, removeIterativeAlgorithmListener
isResultValid
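The method summary describes three protected hooks (`initializeAlgorithm()`, `step()`, `cleanupAlgorithm()`) alongside the inherited `learn` and `stop`. The sketch below shows how a template-method driver might call those hooks. It is a simplified stand-in, not the actual `AbstractAnytimeBatchLearner`: the classes `AnytimeLearnerSketch` and `SumLearner` are hypothetical, and the exact loop the library runs (including listener events) is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of an anytime batch learner's control flow. */
abstract class AnytimeLearnerSketch<D, R> {
    protected D data;
    protected int maxIterations = 250;
    protected int iteration;
    protected volatile boolean keepGoing;

    protected abstract boolean initializeAlgorithm();
    protected abstract boolean step();
    protected abstract void cleanupAlgorithm();
    public abstract R getResult();

    /** Requests that learning end after the current step. */
    public void stop() {
        keepGoing = false;
    }

    public R learn(D data) {
        this.data = data;
        this.iteration = 0;
        this.keepGoing = true;
        if (!initializeAlgorithm()) {
            return null; // Nothing to learn from.
        }
        while (keepGoing && iteration < maxIterations) {
            iteration++;
            if (!step()) {
                break; // The step reported that it is finished.
            }
        }
        cleanupAlgorithm();
        return getResult();
    }
}

/** Toy learner that adds one data element per step. */
final class SumLearner extends AnytimeLearnerSketch<List<Double>, Double> {
    final List<String> trace = new ArrayList<>();
    private double sum;
    private int index;

    @Override protected boolean initializeAlgorithm() {
        trace.add("initialize");
        sum = 0.0;
        index = 0;
        return data != null && !data.isEmpty();
    }

    @Override protected boolean step() {
        trace.add("step");
        sum += data.get(index++);
        return index < data.size();
    }

    @Override protected void cleanupAlgorithm() {
        trace.add("cleanup");
    }

    @Override public Double getResult() {
        return sum;
    }

    public static void main(String[] args) {
        SumLearner learner = new SumLearner();
        Double result = learner.learn(Arrays.asList(1.0, 2.0, 3.0));
        System.out.println(result + " via " + learner.trace);
    }
}
```

Because `getResult()` can be called at any point, a caller that invokes `stop()` early still receives the best result computed so far, which is the sense in which the learner is "anytime".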
public static final int DEFAULT_REQUESTED_RANK
public static final int DEFAULT_MAX_ITERATIONS
public static final double DEFAULT_MINIMUM_CHANGE
protected int requestedRank
protected double minimumChange
protected java.util.Random random
protected VectorFactory<? extends Vector> vectorFactory
protected MatrixFactory<? extends Matrix> matrixFactory
protected transient Matrix documentsByTerms
protected transient int termCount
protected transient int documentCount
protected transient int latentCount
protected transient ProbabilisticLatentSemanticAnalysis.LatentData[] latents
protected transient double logLikelihood
protected transient double changeOfLogLikelihood
protected transient ProbabilisticLatentSemanticAnalysis.Result result
public ProbabilisticLatentSemanticAnalysis()
public ProbabilisticLatentSemanticAnalysis(java.util.Random random)
Parameters:
random - The random number generator to use.
public ProbabilisticLatentSemanticAnalysis(int requestedRank)
Parameters:
requestedRank - The requested rank. Must be positive.
public ProbabilisticLatentSemanticAnalysis(int requestedRank, double minimumChange, java.util.Random random)
Parameters:
requestedRank - The requested rank. Must be positive.
minimumChange - The minimum change in log-likelihood to allow before stopping. Must be non-negative.
random - The random number generator to use.
protected boolean initializeAlgorithm()
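Description copied from class: AbstractAnytimeBatchLearner
Called to initialize the learning algorithm's state based on the data that is stored in the data field.
Specified by: initializeAlgorithm in class AbstractAnytimeBatchLearner<java.util.Collection<? extends Vectorizable>,ProbabilisticLatentSemanticAnalysis.Result>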
protected boolean step()
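Description copied from class: AbstractAnytimeBatchLearner
Called to take a single step of the learning algorithm.
Specified by: step in class AbstractAnytimeBatchLearner<java.util.Collection<? extends Vectorizable>,ProbabilisticLatentSemanticAnalysis.Result>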
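To make a single learning step concrete, here is a minimal sketch of one expectation-maximization iteration for PLSA. It uses the symmetric parameterization P(d, w) = sum over z of P(z) P(d|z) P(w|z) from the Hofmann papers cited above. It is not the library's implementation: the class `PlsaEmSketch`, its field names, and the dense `double[][]` count matrix are hypothetical, and which parameterization the library uses internally is an assumption.

```java
import java.util.Random;

/** Hypothetical sketch of one EM iteration for PLSA on a dense count matrix. */
final class PlsaEmSketch {
    final double[][] counts;   // n(d, w): documents by terms
    final int docs;
    final int terms;
    final int latents;
    final double[] pZ;         // P(z)
    final double[][] pDgivenZ; // P(d | z), indexed [z][d]
    final double[][] pWgivenZ; // P(w | z), indexed [z][w]

    PlsaEmSketch(double[][] counts, int latents, Random random) {
        this.counts = counts;
        this.docs = counts.length;
        this.terms = counts[0].length;
        this.latents = latents;
        this.pZ = randomDistribution(latents, random);
        this.pDgivenZ = new double[latents][];
        this.pWgivenZ = new double[latents][];
        for (int z = 0; z < latents; z++) {
            pDgivenZ[z] = randomDistribution(docs, random);
            pWgivenZ[z] = randomDistribution(terms, random);
        }
    }

    static double[] randomDistribution(int n, Random random) {
        double[] p = new double[n];
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            p[i] = random.nextDouble() + 1.0e-6; // strictly positive start
            sum += p[i];
        }
        for (int i = 0; i < n; i++) {
            p[i] /= sum;
        }
        return p;
    }

    /** Joint probability P(d, w) under the current parameters. */
    double joint(int d, int w) {
        double p = 0.0;
        for (int z = 0; z < latents; z++) {
            p += pZ[z] * pDgivenZ[z][d] * pWgivenZ[z][w];
        }
        return p;
    }

    /** Log-likelihood: sum of n(d, w) * log P(d, w) over observed pairs. */
    double logLikelihood() {
        double ll = 0.0;
        for (int d = 0; d < docs; d++) {
            for (int w = 0; w < terms; w++) {
                if (counts[d][w] > 0.0) {
                    ll += counts[d][w] * Math.log(joint(d, w));
                }
            }
        }
        return ll;
    }

    /** Runs one EM iteration and returns the updated log-likelihood. */
    double step() {
        double[] newZ = new double[latents];
        double[][] newD = new double[latents][docs];
        double[][] newW = new double[latents][terms];
        for (int d = 0; d < docs; d++) {
            for (int w = 0; w < terms; w++) {
                double n = counts[d][w];
                if (n <= 0.0) {
                    continue;
                }
                double norm = joint(d, w);
                for (int z = 0; z < latents; z++) {
                    // E-step: posterior P(z | d, w), weighted by the count.
                    double weight = n * pZ[z] * pDgivenZ[z][d] * pWgivenZ[z][w] / norm;
                    newZ[z] += weight;
                    newD[z][d] += weight;
                    newW[z][w] += weight;
                }
            }
        }
        // M-step: renormalize the accumulated expected counts.
        double total = 0.0;
        for (double v : newZ) {
            total += v;
        }
        for (int z = 0; z < latents; z++) {
            for (int d = 0; d < docs; d++) {
                pDgivenZ[z][d] = newD[z][d] / newZ[z];
            }
            for (int w = 0; w < terms; w++) {
                pWgivenZ[z][w] = newW[z][w] / newZ[z];
            }
            pZ[z] = newZ[z] / total;
        }
        return logLikelihood();
    }

    public static void main(String[] args) {
        double[][] counts = {{2, 1, 0, 0}, {3, 2, 0, 0}, {0, 0, 1, 2}, {0, 0, 2, 3}};
        PlsaEmSketch model = new PlsaEmSketch(counts, 2, new Random(7L));
        for (int i = 0; i < 20; i++) {
            System.out.println("log-likelihood: " + model.step());
        }
    }
}
```

Each EM iteration should leave the log-likelihood unchanged or higher, which is why the change in log-likelihood is a natural convergence signal.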
protected void cleanupAlgorithm()
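Description copied from class: AbstractAnytimeBatchLearner
Called to clean up the learning algorithm's state after learning has finished.
Specified by: cleanupAlgorithm in class AbstractAnytimeBatchLearner<java.util.Collection<? extends Vectorizable>,ProbabilisticLatentSemanticAnalysis.Result>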
public ProbabilisticLatentSemanticAnalysis.Result getResult()
Description copied from interface: AnytimeAlgorithm
Gets the current result of the algorithm.
Specified by: getResult in interface AnytimeAlgorithm<ProbabilisticLatentSemanticAnalysis.Result>
public java.util.Random getRandom()
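Description copied from interface: Randomized
Gets the random number generator used by this object.
Specified by: getRandom in interface Randomized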
public void setRandom(java.util.Random random)
Description copied from interface: Randomized
Sets the random number generator used by this object.
Specified by: setRandom in interface Randomized
Parameters:
random - The random number generator for this object to use.
public VectorFactory<? extends Vector> getVectorFactory()
Specified by: getVectorFactory in interface VectorFactoryContainer
public void setVectorFactory(VectorFactory<? extends Vector> vectorFactory)
Parameters:
vectorFactory - The vector factory to use.
public MatrixFactory<? extends Matrix> getMatrixFactory()
public void setMatrixFactory(MatrixFactory<? extends Matrix> matrixFactory)
Parameters:
matrixFactory - The matrix factory to use.
public int getRequestedRank()
public void setRequestedRank(int requestedRank)
Parameters:
requestedRank - The requested rank. Must be positive.
public double getMinimumChange()
public void setMinimumChange(double minimumChange)
Parameters:
minimumChange - The minimum change in log-likelihood to allow before stopping. Must be non-negative.
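The setter contracts above (a positive rank, a non-negative minimum change) can be enforced with simple argument checks. The sketch below shows one way to do that in plain Java. The class `PlsaParametersSketch` is hypothetical, and the exception type the library throws on invalid input is an assumption.

```java
/** Hypothetical sketch of the documented parameter constraints. */
final class PlsaParametersSketch {
    // Starting values mirror the documented defaults.
    private int requestedRank = 10;
    private double minimumChange = 1.0e-10;

    int getRequestedRank() {
        return requestedRank;
    }

    void setRequestedRank(int requestedRank) {
        if (requestedRank <= 0) {
            throw new IllegalArgumentException(
                "requestedRank must be positive: " + requestedRank);
        }
        this.requestedRank = requestedRank;
    }

    double getMinimumChange() {
        return minimumChange;
    }

    void setMinimumChange(double minimumChange) {
        // NaN fails every comparison, so it is rejected explicitly.
        if (Double.isNaN(minimumChange) || minimumChange < 0.0) {
            throw new IllegalArgumentException(
                "minimumChange must be non-negative: " + minimumChange);
        }
        this.minimumChange = minimumChange;
    }

    public static void main(String[] args) {
        PlsaParametersSketch parameters = new PlsaParametersSketch();
        parameters.setRequestedRank(5);
        parameters.setMinimumChange(0.0); // Zero is allowed.
        System.out.println(parameters.getRequestedRank()
            + ", " + parameters.getMinimumChange());
    }
}
```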