gov.sandia.cognition.statistics.method

## Class BernoulliConfidence

• All Implemented Interfaces:
ConfidenceIntervalEvaluator<java.util.Collection<java.lang.Boolean>>, CloneableSerializable, java.io.Serializable, java.lang.Cloneable

```public class BernoulliConfidence
extends AbstractCloneableSerializable
implements ConfidenceIntervalEvaluator<java.util.Collection<java.lang.Boolean>>```
Computes a confidence interval for the Bernoulli parameter. In other words, it estimates the range that contains the Bernoulli parameter, based on the given data and the desired level of confidence. This answers the question, "What is the true range of classification rates, given a collection of correct/incorrect guesses, at a given level of confidence?" For example, if my classifier gets { Correct, Wrong, Correct, Correct, Correct, Wrong, Correct, Correct }, then the true classification rate p of my classifier at 50% confidence satisfies Pr{ 0.5335 <= p <= 0.9665 } >= 0.5.
Since:
2.0
Author:
Kevin R. Dixon
• ### Field Summary

Fields
Modifier and Type Field and Description
`static BernoulliConfidence` `INSTANCE`
This class has no state, so a shared static instance is provided.
• ### Constructor Summary

Constructors
Constructor and Description
`BernoulliConfidence()`
Creates a new instance of BernoulliConfidence.
• ### Method Summary

All Methods
Modifier and Type Method and Description
`ConfidenceInterval` ```computeConfidenceInterval(java.util.Collection<java.lang.Boolean> data, double confidence)```
Computes the ConfidenceInterval for the Bernoulli parameter based on the given data and the desired level of confidence.
`ConfidenceInterval` ```computeConfidenceInterval(double mean, double variance, int numSamples, double confidence)```
Computes the confidence interval given the mean and variance of the samples, the number of samples, and the desired confidence level.
`static ConfidenceInterval` ```computeConfidenceInterval(double bernoulliParameter, int numSamples, double confidence)```
Computes the ConfidenceInterval for the Bernoulli parameter from an estimated parameter value, the number of samples, and the desired level of confidence.
`static int` ```computeSampleSize(double accuracy, double confidence)```
Computes the number of samples needed to estimate the Bernoulli parameter "p" (mean) within "accuracy" with probability at least "confidence".
• ### Methods inherited from class gov.sandia.cognition.util.AbstractCloneableSerializable

`clone`
• ### Methods inherited from class java.lang.Object

`equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`
• ### Field Detail

• #### INSTANCE

`public static final BernoulliConfidence INSTANCE`
This class has no state, so a shared static instance is provided.
• ### Constructor Detail

• #### BernoulliConfidence

`public BernoulliConfidence()`
Creates a new instance of BernoulliConfidence.
• ### Method Detail

• #### computeConfidenceInterval

```public ConfidenceInterval computeConfidenceInterval(java.util.Collection<java.lang.Boolean> data,
double confidence)```
Computes the ConfidenceInterval for the Bernoulli parameter based on the given data and the desired level of confidence. This answers the question, "What is the true range of classification rates, given a collection of correct/incorrect guesses, at a given level of confidence?" For example, if my classifier gets { Correct, Wrong, Correct, Correct, Correct, Wrong, Correct, Correct }, then the true classification rate p of my classifier at 50% confidence satisfies Pr{ 0.5335 <= p <= 0.9665 } >= 0.5.
Specified by:
`computeConfidenceInterval` in interface `ConfidenceIntervalEvaluator<java.util.Collection<java.lang.Boolean>>`
Parameters:
`data` - Correct/Wrong data
`confidence` - Confidence level to place on the confidence interval, must be in (0,1]
Returns:
Range of values for the accuracy of the classifier at the desired confidence
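The documented example can be reproduced with a short, self-contained sketch. It assumes a Chebyshev-style half-width, sqrt(p(1-p) / (n(1-confidence))), which is inferred from the example numbers above and is not confirmed by this page. The class `BernoulliIntervalSketch` and its methods are hypothetical helpers, not part of the library, and treating `true` as "Correct" is likewise an assumption.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical helper, NOT the library's implementation.
class BernoulliIntervalSketch {

    /** Fraction of true ("Correct") entries in the data. */
    static double successRate(Collection<Boolean> data) {
        if (data.isEmpty()) {
            throw new IllegalArgumentException("data must not be empty");
        }
        int correct = 0;
        for (boolean outcome : data) {
            if (outcome) {
                correct++;
            }
        }
        return (double) correct / data.size();
    }

    /**
     * Returns {lower, upper}.
     * ASSUMPTION: half-width = sqrt(p(1-p) / (n(1-confidence))).
     * As confidence approaches 1 the half-width grows without bound.
     */
    static double[] interval(Collection<Boolean> data, double confidence) {
        double p = successRate(data);
        int n = data.size();
        double halfWidth = Math.sqrt(p * (1.0 - p) / (n * (1.0 - confidence)));
        return new double[] { p - halfWidth, p + halfWidth };
    }

    public static void main(String[] args) {
        // { Correct, Wrong, Correct, Correct, Correct, Wrong, Correct, Correct }
        Collection<Boolean> guesses =
            Arrays.asList(true, false, true, true, true, false, true, true);
        double[] ci = interval(guesses, 0.5);
        System.out.println(ci[0] + " <= p <= " + ci[1]);
    }
}
```

With the eight outcomes from the example and a confidence of 0.5, the bounds agree with the documented 0.5335 and 0.9665 to four decimal places.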
• #### computeConfidenceInterval

```@PublicationReference(author="Wikipedia",
title="",
type=WebPage,
year=2009,
url="http://en.wikipedia.org/wiki/Margin_of_error")
public static ConfidenceInterval computeConfidenceInterval(double bernoulliParameter,
int numSamples,
double confidence)```
Computes the ConfidenceInterval for the Bernoulli parameter from an estimated parameter value, the number of samples behind that estimate, and the desired level of confidence. This answers the question, "What is the true range of classification rates, given a collection of correct/incorrect guesses, at a given level of confidence?" For example, if my classifier gets { Correct, Wrong, Correct, Correct, Correct, Wrong, Correct, Correct }, which is an estimated rate of 6/8 = 0.75 from 8 samples, then the true classification rate p of my classifier at 50% confidence satisfies Pr{ 0.5335 <= p <= 0.9665 } >= 0.5.
Parameters:
`bernoulliParameter` - Estimated Bernoulli parameter (classifier success rate), must be in [0,1]
`numSamples` - Number of samples used in the determination
`confidence` - Confidence level to place on the confidence interval, must be in (0,1]
Returns:
Range of values for the accuracy of the classifier at the desired confidence
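The example values are numerically consistent with a Chebyshev-style bound that uses the estimated variance of a Bernoulli variable. That formula is an inference from the documented numbers, not something this page states, so the worked arithmetic below is a sketch under that assumption. The symbols are introduced here for illustration: \hat{p} is `bernoulliParameter`, n is `numSamples`, c is `confidence`, and \varepsilon is the half-width of the interval.

```latex
\begin{align*}
\hat{p} &= \tfrac{6}{8} = 0.75, \qquad n = 8, \qquad c = 0.5 \\
\varepsilon &= \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n\,(1-c)}}
             = \sqrt{\frac{0.1875}{8 \times 0.5}}
             = \sqrt{0.046875} \approx 0.2165 \\
\hat{p} - \varepsilon &\approx 0.5335, \qquad \hat{p} + \varepsilon \approx 0.9665
\end{align*}
```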
• #### computeConfidenceInterval

```public ConfidenceInterval computeConfidenceInterval(double mean,
double variance,
int numSamples,
double confidence)```
Description copied from interface: `ConfidenceIntervalEvaluator`
Computes the confidence interval given the mean and variance of the samples, the number of samples, and the desired confidence level.
Specified by:
`computeConfidenceInterval` in interface `ConfidenceIntervalEvaluator<java.util.Collection<java.lang.Boolean>>`
Parameters:
`mean` - Mean of the distribution.
`variance` - Variance of the distribution.
`numSamples` - Number of samples in the underlying data
`confidence` - Confidence value to assume for the ConfidenceInterval
Returns:
ConfidenceInterval capturing the range of the mean of the data at the desired level of confidence
• #### computeSampleSize

```@PublicationReference(author="Wikipedia",
title="",
type=WebPage,
year=2009,
url="http://en.wikipedia.org/wiki/Margin_of_error")
public static int computeSampleSize(double accuracy,
double confidence)```
Computes the number of samples needed to estimate the Bernoulli parameter "p" (mean) within "accuracy" with probability at least "confidence". Answers the question, "How many people do I need to survey to estimate how many people would vote for Budweiser as the King of Beers within a desired accuracy and a set confidence?" For example, to estimate p to within 0.01 with confidence=0.95, we need up to 50000 samples.
Parameters:
`accuracy` - Desired accuracy to estimate, on the interval (0,1]
`confidence` - Desired confidence, on the interval (0,1]
Returns:
Maximum number of samples needed to achieve the accuracy with the level of confidence
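The figure of 50000 samples matches a Chebyshev-style bound evaluated at the worst-case Bernoulli variance of 1/4 (reached at p = 0.5). The sketch below assumes that formula; it is inferred from the example, not taken from the library source. `SampleSizeSketch` and `sampleSize` are hypothetical names, and unlike the documented range (0,1], this sketch rejects a confidence of exactly 1 because the bound would be infinite.

```java
// Hypothetical helper, NOT the library's implementation.
class SampleSizeSketch {

    /**
     * ASSUMPTION: n >= 1 / (4 * accuracy^2 * (1 - confidence)),
     * i.e. a Chebyshev-style bound with the worst-case variance 1/4.
     */
    static int sampleSize(double accuracy, double confidence) {
        if (accuracy <= 0.0 || accuracy > 1.0) {
            throw new IllegalArgumentException("accuracy must be in (0,1]");
        }
        if (confidence <= 0.0 || confidence >= 1.0) {
            // confidence == 1 would require infinitely many samples here
            throw new IllegalArgumentException("confidence must be in (0,1)");
        }
        double bound = 1.0 / (4.0 * accuracy * accuracy * (1.0 - confidence));
        // Round up: a fractional sample count is not meaningful.
        return (int) Math.ceil(bound);
    }

    public static void main(String[] args) {
        // The documented example: accuracy 0.01 at confidence 0.95.
        System.out.println(sampleSize(0.01, 0.95));
    }
}
```

Because the bound scales with 1/accuracy^2, halving the desired accuracy roughly quadruples the number of samples under this assumption.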