What is the formula (approximate or exact) for a prediction interval for a Binomial random variable?
Assume Y ∼ Binom(n, p), and we observe y (drawn from Y). n is known.
Our goal is to obtain a 95% prediction interval for a new draw from Y.
The point estimate is n·p̂, where p̂ = y/n. A confidence interval for p is straightforward, but I can't find a formula for a prediction interval for Y. If we knew p (rather than p̂), then a 95% prediction interval would just involve finding the quantiles of the Binomial distribution. Is there something obvious I'm missing?
confidence-interval
binomial
prediction-interval
Statseeker
Answers:
OK, let's try this. I'll give two answers: a Bayesian one, which I think is simple and natural, and one possible frequentist one.
Bayesian solution
We assume a Beta prior on p, i.e., p ∼ Beta(α, β). Since the Beta prior is conjugate to the Binomial model, the posterior distribution is also a Beta distribution, with parameters α̂ = α + k, β̂ = β + n − k (I use k to denote the number of successes in n trials, instead of y). Thus inference is greatly simplified. If you have some prior knowledge about the likely values of p, you could use it to set the values of α and β, i.e., to define your Beta prior; otherwise you could assume a uniform (noninformative) prior, with α = β = 1, or other noninformative priors (see for example here). In any case, your posterior is

p | k ∼ Beta(α + k, β + n − k)
In Bayesian inference, all that matters is the posterior distribution: once you know it, you can make inferences about all the other quantities in your model. You want to make inferences about the observables: in particular, about a vector of new results y = y1, …, ym, where m is not necessarily equal to n. Specifically, for each j = 0, …, m, we want to compute the probability of observing exactly j successes in the next m trials, given that we obtained k successes in the previous n trials; the posterior predictive mass function is

Pr(j | m, k) = ∫_0^1 Pr(j | m, p, k) p(p | k) dp
However, our Binomial model for Y means that, conditionally on p having a certain value, the probability of j successes in m trials doesn't depend on past results: it's simply

Pr(j | m, p, k) = Pr(j | m, p) = (m choose j) p^j (1 − p)^(m−j)
Thus the expression becomes

Pr(j | m, k) = ∫_0^1 (m choose j) p^j (1 − p)^(m−j) p(p | k) dp = (m choose j) · (1 / B(α + k, β + n − k)) · ∫_0^1 p^(α+k+j−1) (1 − p)^(β+n−k+m−j−1) dp
The result of this integral is a well-known distribution, called the Beta-Binomial distribution: skipping the passages, we get the horrible expression

Pr(j | m, k) = (m choose j) · B(α + k + j, β + n − k + m − j) / B(α + k, β + n − k)
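To make the expression concrete, here is a small Python sketch (not part of the original answer; the function names are mine) that evaluates this Beta-Binomial pmf, using log-gamma for numerical stability:

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    # ln B(a, b) = ln Γ(a) + ln Γ(b) − ln Γ(a + b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binom_pmf(j, m, a, b):
    # Pr(j | m) = (m choose j) · B(a + j, b + m − j) / B(a, b),
    # where a = alpha + k and b = beta + n − k are the posterior parameters
    return comb(m, j) * exp(log_beta(a + j, b + m - j) - log_beta(a, b))

# With a uniform prior (alpha = beta = 1), n = 100, k = 70, m = 20:
probs = [beta_binom_pmf(j, 20, 71, 31) for j in range(21)]
```

As a quick sanity check, the probabilities sum to 1 and their mean equals m(α + k)/(α + β + n).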
Our point estimate for j, given quadratic loss, is of course the mean of this distribution, i.e.,

E[j | m, k] = m (α + k) / (α + β + n)
Now, let's look for a prediction interval. Since this is a discrete distribution, we don't have a closed-form expression for [j1, j2] such that Pr(j1 ≤ j ≤ j2) = 0.95. The reason is that, depending on how you define a quantile, for a discrete distribution the quantile function is either not a function or is a discontinuous function. But this is not a big problem: for small m, you can just write down the m probabilities Pr(j = 0 | m, n, k), Pr(j ≤ 1 | m, n, k), …, Pr(j ≤ m − 1 | m, n, k) and from there find j1, j2 such that

Pr(j1 ≤ j ≤ j2 | m, n, k) ≥ 0.95
Of course you would find more than one such pair, so ideally you would look for the smallest [j1, j2] such that the above is satisfied. Note that the probabilities

p0 = Pr(j ≤ 0 | m, n, k), p1 = Pr(j ≤ 1 | m, n, k), …, p(m−1) = Pr(j ≤ m − 1 | m, n, k)
are just the values of the CMF (cumulative mass function) of the Beta-Binomial distribution. As such, there is a closed-form expression for them, but it is in terms of the generalized hypergeometric function and thus quite complicated. I'd rather just install the R package extraDistr and call pbbinom to compute the CMF of the Beta-Binomial distribution. Specifically, if you want to compute all the probabilities p0, …, p(m−1) in one go, just write

pbbinom(0:(m-1), m, alpha, beta)

where alpha and beta are the parameters of the posterior Beta distribution, i.e., α̂ = α + k and β̂ = β + n − k (thus 1 + k and 1 + n − k if you're using a uniform prior on p). Of course it would all be much simpler if R provided a quantile function for the Beta-Binomial distribution, but unfortunately it doesn't.

Practical example with the Bayesian solution
Let n = 100, k = 70 (thus we initially observed 70 successes in 100 trials). We want a point estimate and a 95% prediction interval for the number of successes j in the next m = 20 trials. Then

Pr(j | m = 20, k = 70) = (20 choose j) · B(71 + j, 51 − j) / B(71, 31)
where I assumed a uniform prior on p: depending on the prior knowledge for your specific application, this may or may not be a good prior. Thus

E[j | m, k] = 20 · 71 / 102 ≈ 13.92
Clearly a non-integer estimate for j doesn't make sense, so we can just round to the nearest integer (14). Then, for the prediction interval, we need the CMF values p0, …, p19, which in R are given by pbbinom(0:19, 20, 71, 31).
For an equal-tail-probability interval, we want the smallest j2 such that Pr(j ≤ j2 | m, n, k) ≥ 0.975 and the largest j1 such that Pr(j < j1 | m, n, k) = Pr(j ≤ j1 − 1 | m, n, k) ≤ 0.025. This way we will have

Pr(j1 ≤ j ≤ j2 | m, n, k) ≥ 0.95
Thus, by looking at the above probabilities, we see that j2 = 18 and j1 = 9. The probability of this Bayesian prediction interval is 0.9778494, which is larger than 0.95. We could find shorter intervals such that Pr(j1 ≤ j ≤ j2 | m, n, k) ≥ 0.95, but in that case at least one of the two inequalities for the tail probabilities wouldn't be satisfied.
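The whole worked example can be reproduced without extraDistr. Here is a Python sketch (stdlib only; variable and function names are mine, not from the original R workflow) that builds the Beta-Binomial CMF and finds the equal-tail interval:

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    # ln B(a, b) via log-gamma, for numerical stability
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binom_pmf(j, m, a, b):
    # Beta-Binomial pmf with posterior parameters a, b
    return comb(m, j) * exp(log_beta(a + j, b + m - j) - log_beta(a, b))

n, k, m = 100, 70, 20
a, b = 1 + k, 1 + n - k          # posterior parameters under a uniform prior

pmf = [beta_binom_pmf(j, m, a, b) for j in range(m + 1)]
cdf = [sum(pmf[:j + 1]) for j in range(m + 1)]

# smallest j2 with Pr(j <= j2) >= 0.975
j2 = next(j for j in range(m + 1) if cdf[j] >= 0.975)
# largest j1 with Pr(j <= j1 - 1) <= 0.025
j1 = max(j for j in range(m + 1) if j == 0 or cdf[j - 1] <= 0.025)
coverage = cdf[j2] - (cdf[j1 - 1] if j1 > 0 else 0.0)
```

This reproduces j1 = 9, j2 = 18 and the interval probability 0.9778494 quoted above.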
Frequentist solution
I'll follow the treatment of Krishnamoorthy and Peng, 2011. Let Y ∼ Binom(m, p) and X ∼ Binom(n, p) be independently Binomially distributed. We want a 1 − 2α prediction interval for Y, based on an observation of X. In other words, we look for I = [L(X; n, m, α), U(X; n, m, α)] such that

Pr(Y ∈ I) = Pr(L(X; n, m, α) ≤ Y ≤ U(X; n, m, α)) ≥ 1 − 2α for all p ∈ (0, 1)
The "≥ 1 − 2α" is due to the fact that we are dealing with a discrete random variable, and thus we cannot expect to get exact coverage... but we can look for an interval which always has at least the nominal coverage, i.e., a conservative interval. Now, it can be proved that the conditional distribution of X given X + Y = k + j = s is hypergeometric with sample size s, number of successes in the population n, and population size n + m. Thus the conditional pmf is

Pr(X = i | X + Y = s) = (n choose i) (m choose s − i) / ((n + m) choose s)
The conditional CDF of X given X + Y = s is thus

H(k; s, n, n + m) = Pr(X ≤ k | X + Y = s) = Σ_{i=0}^{k} (n choose i) (m choose s − i) / ((n + m) choose s)
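This conditional CDF can be computed exactly with integer arithmetic; a minimal Python sketch (function name mine):

```python
from math import comb

def hyper_cdf(k, s, n, N):
    # H(k; s, n, N) = Pr(X <= k | X + Y = s): hypergeometric CDF with
    # sample size s and n successes in a population of size N = n + m.
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so out-of-support terms vanish automatically.
    total = sum(comb(n, i) * comb(N - n, s - i) for i in range(min(k, s) + 1))
    return total / comb(N, s)
```

For instance, with N = 4, n = 2, s = 2 one can check by hand that Pr(X ≤ 0) = 1/6 and Pr(X ≤ 1) = 5/6.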
The first great thing about this CDF is that it doesn't depend on p, which we don't know. The second great thing is that it allows us to easily find our PI: as a matter of fact, if we observed a value k of X, then the 1 − α lower prediction limit is the smallest integer L such that

Pr(X ≥ k | X + Y = k + L) = 1 − H(k − 1; k + L, n, n + m) > α
correspondingly, the 1 − α upper prediction limit is the largest integer U such that

Pr(X ≤ k | X + Y = k + U) = H(k; k + U, n, n + m) > α
Thus, [L, U] is a prediction interval for Y with coverage at least 1 − 2α. Note that when p is close to 0 or 1, this interval is conservative even for large n, m, i.e., its coverage is quite a bit larger than 1 − 2α.
Practical example with the Frequentist solution
Same setting as before, but we don't need to specify α and β (there are no priors in the frequentist framework).
The point estimate is now obtained using the MLE for the probability of success, p̂ = k/n = 0.7, which in turn leads to the following estimate for the number of successes in m trials:

ĵ = m p̂ = 20 · 0.7 = 14
For the prediction interval, the procedure is a bit different. We look for the largest U such that Pr(X ≤ k | k + U, n, n + m) = H(k; k + U, n, n + m) > α, so let's compute the above expression for all U in [0, m]; in R this is phyper(70, 100, 20, 70 + (0:20)).
We can see that the largest U such that the probability is still larger than 0.025 is U = 18.
This is the same upper limit as for the Bayesian approach. The lower prediction bound L is the smallest integer such that Pr(X ≥ k | k + L, n, n + m) = 1 − H(k − 1; k + L, n, n + m) > α, thus L = 8.
Thus our frequentist "exact" prediction interval is [L, U] = [8, 18].
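As a cross-check on the upper limit, the search for U described above can be sketched in Python (stdlib only; function names mine, and the search for L is analogous):

```python
from math import comb

def hyper_cdf(k, s, n, N):
    # Pr(X <= k | X + Y = s) for the hypergeometric distribution above
    return sum(comb(n, i) * comb(N - n, s - i)
               for i in range(min(k, s) + 1)) / comb(N, s)

n, k, m, alpha = 100, 70, 20, 0.025
# largest U in [0, m] with H(k; k + U, n, n + m) > alpha
U = max(u for u in range(m + 1) if hyper_cdf(k, k + u, n, n + m) > alpha)
print(U)   # → 18
```

At U = 18 the tail probability is about 0.051, while at U = 19 it drops below 0.025, so the search stops at 18, in agreement with the interval above.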