- Type Parameters:
- InputType - Input class of the Evaluator that we are trying to minimize, such as Vector
- OutputType - Output class of the Evaluator that we are trying to minimize, such as Double
- EvaluatorType - Evaluator class that this minimization algorithm can handle, such as Evaluator or DifferentiableEvaluator.
- All Superinterfaces:
- AnytimeAlgorithm<InputOutputPair<InputType,OutputType>>, BatchLearner<EvaluatorType,InputOutputPair<InputType,OutputType>>, java.lang.Cloneable, CloneableSerializable, IterativeAlgorithm, java.io.Serializable, StoppableAlgorithm
- All Known Subinterfaces:
- LineMinimizer<EvaluatorType>
- All Known Implementing Classes:
- AbstractAnytimeFunctionMinimizer, AbstractAnytimeLineMinimizer, ConjugateGradientMatrixSolver, ConjugateGradientWithPreconditionerMatrixSolver, FunctionMinimizerBFGS, FunctionMinimizerConjugateGradient, FunctionMinimizerDFP, FunctionMinimizerDirectionSetPowell, FunctionMinimizerFletcherReeves, FunctionMinimizerGradientDescent, FunctionMinimizerLiuStorey, FunctionMinimizerNelderMead, FunctionMinimizerPolakRibiere, FunctionMinimizerQuasiNewton, IterativeMatrixSolver, LineMinimizerBacktracking, LineMinimizerDerivativeBased, LineMinimizerDerivativeFree, OverconstrainedConjugateGradientMatrixMinimizer, SteepestDescentMatrixSolver
public interface FunctionMinimizer<InputType,OutputType,EvaluatorType extends Evaluator<? super InputType,? extends OutputType>>
extends BatchLearner<EvaluatorType,InputOutputPair<InputType,OutputType>>, AnytimeAlgorithm<InputOutputPair<InputType,OutputType>>
Interface for unconstrained minimization of nonlinear functions. Implementing
classes find the input that minimizes the output of a function. To find the
parameter set that minimizes a cost function, use ParameterCostMinimizer.
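For instance, a minimizer is driven like any other BatchLearner: pass the function to learn() and read back the minimizing InputOutputPair. The following is a minimal sketch, not taken from the library itself, that minimizes the two-dimensional Rosenbrock function with FunctionMinimizerBFGS; the default constructor, the setInitialGuess/setTolerance setters, the DifferentiableEvaluator bound, and the package locations are assumptions based on the class list above.

```java
import gov.sandia.cognition.learning.algorithm.minimization.FunctionMinimizerBFGS;
import gov.sandia.cognition.learning.data.InputOutputPair;
import gov.sandia.cognition.math.DifferentiableEvaluator;
import gov.sandia.cognition.math.matrix.Vector;
import gov.sandia.cognition.math.matrix.VectorFactory;
import gov.sandia.cognition.util.AbstractCloneableSerializable;

public class RosenbrockMinimizationExample
{
    /** Two-dimensional Rosenbrock function with its analytic gradient. */
    static class Rosenbrock
        extends AbstractCloneableSerializable
        implements DifferentiableEvaluator<Vector, Double, Vector>
    {
        @Override
        public Double evaluate(Vector v)
        {
            double x = v.getElement(0), y = v.getElement(1);
            return (1.0 - x) * (1.0 - x) + 100.0 * (y - x * x) * (y - x * x);
        }

        @Override
        public Vector differentiate(Vector v)
        {
            double x = v.getElement(0), y = v.getElement(1);
            double dx = -2.0 * (1.0 - x) - 400.0 * x * (y - x * x);
            double dy = 200.0 * (y - x * x);
            return VectorFactory.getDefault().copyValues(dx, dy);
        }
    }

    public static void main(String[] args)
    {
        FunctionMinimizerBFGS minimizer = new FunctionMinimizerBFGS();
        minimizer.setInitialGuess(VectorFactory.getDefault().copyValues(-1.0, 1.0));
        minimizer.setTolerance(1e-6);

        // learn() runs the minimization and returns the (argmin, minimum value) pair.
        InputOutputPair<Vector, Double> result = minimizer.learn(new Rosenbrock());
        System.out.println("Minimum at " + result.getInput()
            + " with value " + result.getOutput());
    }
}
```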
Broadly speaking, this package is decomposed into multivariate (Vector) and line (scalar or Double) minimization. Furthermore,
each of these categories can rely exclusively on function evaluations
(derivative-free) or could additionally require first-order derivative
(gradient) information. Generally speaking, derivative-based minimizers
require fewer function evaluations and gradient evaluations than their
derivative-free counterparts.
Here are my opinions:
Multivariate minimization (the number pairs that follow are average function evaluations and gradient evaluations over a random, though repeated, set of initial conditions when minimizing the Rosenbrock function):
There is broad consensus that the BFGS Quasi-Newton algorithm
(FunctionMinimizerBFGS, 73.49/59.35) is the most efficient on real-world
problems. However, all Quasi-Newton algorithms require the storage of an
N-by-N matrix, where N is the size of the input space. If storing that
matrix is impractical, then use Liu-Storey Conjugate Gradient
(FunctionMinimizerLiuStorey, 92.91/44.58). In my experience, this CG
variant performs almost as well as BFGS but doesn't require the N-by-N
matrix.
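Switching between the two should only require swapping the concrete class, since both consume the same differentiable objective. A hedged sketch of the low-memory choice, where the constructor, setter, and generic bound of FunctionMinimizerLiuStorey are assumed to mirror the other minimizers:

```java
import gov.sandia.cognition.learning.algorithm.minimization.FunctionMinimizerLiuStorey;
import gov.sandia.cognition.learning.data.InputOutputPair;
import gov.sandia.cognition.math.DifferentiableEvaluator;
import gov.sandia.cognition.math.matrix.Vector;

public class LowMemoryMinimization
{
    /**
     * Minimizes a differentiable objective with Liu-Storey conjugate gradient,
     * which keeps only a few length-N vectors instead of an N-by-N matrix.
     */
    public static InputOutputPair<Vector, Double> minimize(
        final DifferentiableEvaluator<Vector, Double, Vector> objective,
        final Vector initialGuess)
    {
        FunctionMinimizerLiuStorey cg = new FunctionMinimizerLiuStorey();
        cg.setInitialGuess(initialGuess);
        return cg.learn(objective);
    }
}
```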
When gradient information isn't available, or gradients are expensive to
compute, then I would strongly suggest trying finite-difference
approximation to emulate first-order derivatives and then using one of the
algorithms mentioned above. If you are still not satisfied, then we have
implemented minimization algorithms that do not require derivative
information. The very clever direction-set minimization method of Smith,
Powell, and Brent (FunctionMinimizerDirectionSetPowell, 448.66/0.0) is my personal
method of choice for derivative-free minimization. Another option is the
brute-force downhill simplex algorithm of Nelder and Mead
(FunctionMinimizerNelderMead, 187.32/0.0), which can be quite zoomy on some
problems. Since they are both inherently heuristic and neither is uniformly
better than the other, try them both before settling on a particular
algorithm for a particular domain.
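In code, the derivative-free route differs only in the objective's type: a plain Evaluator is enough, and no gradient is ever requested. A minimal sketch using the downhill simplex minimizer on the same Rosenbrock surface, where the default constructor and the Evaluator<? super Vector, Double> bound are assumptions:

```java
import gov.sandia.cognition.evaluator.Evaluator;
import gov.sandia.cognition.learning.algorithm.minimization.FunctionMinimizerNelderMead;
import gov.sandia.cognition.learning.data.InputOutputPair;
import gov.sandia.cognition.math.matrix.Vector;
import gov.sandia.cognition.math.matrix.VectorFactory;

public class DerivativeFreeExample
{
    public static void main(String[] args)
    {
        // Objective defined by evaluations only; no gradient is provided.
        Evaluator<Vector, Double> objective = new Evaluator<Vector, Double>()
        {
            @Override
            public Double evaluate(Vector v)
            {
                double x = v.getElement(0), y = v.getElement(1);
                return (1.0 - x) * (1.0 - x) + 100.0 * (y - x * x) * (y - x * x);
            }
        };

        FunctionMinimizerNelderMead simplex = new FunctionMinimizerNelderMead();
        simplex.setInitialGuess(VectorFactory.getDefault().copyValues(-1.0, 1.0));

        InputOutputPair<Vector, Double> result = simplex.learn(objective);
        System.out.println("Simplex minimum: " + result.getInput());
    }
}
```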
Line minimization (the two number pairs that follow are function and gradient evaluations for minimizing a cosine and a nonlinear polynomial, respectively):
We have three line minimizers: Fletcher-type derivative-based, Brent-type
derivative-free, and Newton-type backtracking. I have yet to find an
instance when backtracking is beneficial, so I won't discuss it further.
An extremely important distinction between line search and
multivariate minimization is that derivative information appears to be much
less important, so it is not necessarily a given that a derivative-based
line minimizer is inherently superior to one that is derivative-free. With
that said, in my experience the Fletcher-type line minimizer has superior
performance, largely because it brackets a minimum far more effectively.
However, I would try both the Fletcher-type
(LineMinimizerDerivativeBased, 4.300/3.435, 7.982/4.987) and Brent-type
(LineMinimizerDerivativeFree, 7.705/0.0, 11.300/0.00)
algorithms before settling on one of them for a particular domain.
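Because a LineMinimizer is just a FunctionMinimizer whose input and output are both Double, the same learn() pattern applies to scalar functions. A sketch with the Brent-type minimizer on a cosine, where the minimization.line package location, default constructor, and Evaluator<Double, Double> bound are assumptions:

```java
import gov.sandia.cognition.evaluator.Evaluator;
import gov.sandia.cognition.learning.algorithm.minimization.line.LineMinimizerDerivativeFree;
import gov.sandia.cognition.learning.data.InputOutputPair;

public class LineMinimizationExample
{
    public static void main(String[] args)
    {
        // Scalar objective: cos(x) has a minimum of -1 at x = pi.
        Evaluator<Double, Double> cosine = new Evaluator<Double, Double>()
        {
            @Override
            public Double evaluate(Double x)
            {
                return Math.cos(x);
            }
        };

        LineMinimizerDerivativeFree brent = new LineMinimizerDerivativeFree();
        brent.setInitialGuess(2.0);

        InputOutputPair<Double, Double> result = brent.learn(cosine);
        System.out.println("x* = " + result.getInput()
            + ", f(x*) = " + result.getOutput());
    }
}
```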
- Since:
- 2.0
- Author:
- Kevin R. Dixon
- See Also:
- ParameterCostMinimizer