This is a purely hypothetical question. A very common statement is that $H_0$ is never true, that it is merely a matter of sample size.
Suppose that in reality there is absolutely no measurable difference between two means ($\mu_1 = \mu_2$) drawn from a normally distributed population (with, say, $\mu \approx 0$ and $\sigma \approx 1$ for both). We assume $n = 16$ per group and use a t-test. This would mean that the p-value equals $1.0$, indicating that there is absolutely no discrepancy from $H_0$. This would mean that the test statistic equals $0$. The mean difference between the groups would be $0$. What would be the limits of the 95% confidence interval for the mean difference in this case? Would they be $[0, 0]$?
The main point of my question is: when can we really say that $H_0$ is true, i.e. that $\mu_1 = \mu_2$ in this case? Or, within the frequentist framework, when can we really say "no difference" when comparing two means?
Answers:
The confidence interval for the t-test is of the form $\bar{x}_1 - \bar{x}_2 \pm t_{crit,\alpha}\, s_{\bar{x}_1-\bar{x}_2}$, where $\bar{x}_1$ and $\bar{x}_2$ are the sample means, $t_{crit,\alpha}$ is the critical $t$ value at the given $\alpha$, and $s_{\bar{x}_1-\bar{x}_2}$ is the standard error of the mean difference. If $p = 1.0$, then $\bar{x}_1 - \bar{x}_2 = 0$. So the form is just $\pm t_{crit,\alpha}\, s_{\bar{x}_1-\bar{x}_2}$, and the limits are simply $\{-t_{crit,\alpha}\, s_{\bar{x}_1-\bar{x}_2},\; t_{crit,\alpha}\, s_{\bar{x}_1-\bar{x}_2}\}$.
I'm not sure why you would think the limits would be $\{0, 0\}$. The critical $t$ value is not zero and the standard error of the mean difference is not zero.
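To make this concrete, here is a minimal R sketch under the question's setup (n = 16 per group and sample SDs of 1 are assumed, as in the question):

n <- 16                            # per-group sample size from the question
se_diff <- sqrt(1/n + 1/n)         # standard error of the mean difference, assuming each sample SD is 1
t_crit <- qt(0.975, df = 2*n - 2)  # critical t value for a 95% CI
c(-1, 1) * t_crit * se_diff        # CI limits when the observed mean difference is 0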
Being super-lazy, using R to solve the problem numerically rather than doing the calculations by hand:
Define a function that will give normally distributed values with a mean of (almost!) exactly zero and a SD of exactly 1:
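A minimal sketch, assuming the helper is built on scale(), which centers to a mean of (almost) exactly zero and rescales to an SD of exactly one (the name near_null is mine):

near_null <- function(n) as.vector(scale(rnorm(n)))  # mean is 0 only up to floating point; SD is exactly 1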
Run a t-test:
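Something along these lines, assuming n = 16 per group to match the df = 30 used below:

set.seed(1)                     # assumed seed, for reproducibility
x <- near_null(16)
y <- near_null(16)
t.test(x, y, var.equal = TRUE)  # t is ~0 and p is ~1, up to floating point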
The means are not exactly zero because of floating-point imprecision.
More directly, the CIs are $\pm$ sqrt(1/8)*qt(0.975, df = 30); the variance of each mean is 1/16, so the variance of the mean difference is 1/8.
The CI can have any limits, but it is centered exactly around zero
For a two-sample t-test (testing for a difference in the means of two populations), a p-value of exactly one corresponds to the case where the observed sample means are exactly equal.† (The sample variances can take on any values.) To see this, note that the p-value function for the test is:

$$p(\bar{x}, \bar{y}, s_x, s_y) = \mathbb{P}\left( |T| \geqslant \frac{|\bar{x} - \bar{y}|}{\sqrt{s_x^2/n + s_y^2/m}} \right),$$

where $n$ and $m$ are the two sample sizes.
Thus, setting $\bar{x} = \bar{y}$ yields:

$$p = \mathbb{P}(|T| \geqslant 0) = 1.$$
Now, suppose you form the standard (approximate) confidence interval using the Welch–Satterthwaite approximation. In this case, assuming that $\bar{x} = \bar{y}$ (to give an exact p-value of one) gives the confidence interval:

$$\text{CI}(1-\alpha) = \left[ 0 \pm t_{DF, \alpha/2} \cdot \sqrt{\frac{s_x^2}{n} + \frac{s_y^2}{m}} \right],$$
where the degrees of freedom $DF$ are determined by the Welch–Satterthwaite approximation. Depending on the observed sample variances in the problem, the confidence interval can be any finite interval centered around zero. That is, the confidence interval can have any limits, so long as it is centered exactly around zero.
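A quick R illustration of this point (the sample values are assumed, purely for demonstration): two samples with identical means but very different variances give p = 1 and a Welch interval centered exactly at zero, whose width is set by the variances.

x <- c(-2, 0, 2)    # mean 0, small variance
y <- c(-30, 0, 30)  # mean 0, large variance
t.test(x, y)        # Welch test: p-value = 1, CI symmetric about 0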
It is difficult to have a cogent philosophical discussion about things that have 0 probability of happening. So I will show you some examples that relate to your question.
If you have two enormous independent samples from the same distribution, then both samples will still have some variability, the pooled 2-sample t statistic will be near, but not exactly, 0, the P-value will be distributed as $\mathrm{Unif}(0,1)$, and the 95% confidence interval will be very short and centered very near 0.
An example of one such dataset and t test:
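A sketch of what such a dataset and test might look like (the seed and sample size are assumed):

set.seed(2020)                    # assumed seed
x1 <- rnorm(10^5)                 # two enormous samples from the same N(0,1) population
x2 <- rnorm(10^5)
t.test(x1, x2, var.equal = TRUE)  # t near 0, P-value somewhere in (0,1), very short CI near 0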
Here are summarized results from 10,000 such situations: first the distribution of P-values, then the test statistic, and then the width of the CI.
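A minimal simulation along those lines (the seed and per-group size of 1000 are assumed, not taken from the original):

set.seed(101)  # assumed seed
sim <- replicate(10^4, {
  tt <- t.test(rnorm(1000), rnorm(1000), var.equal = TRUE)
  c(tt$p.value, unname(tt$statistic), diff(tt$conf.int))
})
hist(sim[1, ])  # P-values: approximately Unif(0,1)
hist(sim[2, ])  # test statistics: approximately t with 1998 df
hist(sim[3, ])  # CI widths: short and tightly concentrated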
It is almost impossible to get a P-value of unity doing an exact test with continuous data, where assumptions are met. So much so that a wise statistician will ponder what might have gone wrong upon seeing a P-value of 1.
For example, you might give the software two identical large samples. The programming will carry on as if these are two independent samples, and give strange results. But even then the CI will not be of 0 width.
The straightforward answer (+1 to Noah) will explain that the confidence interval for the mean difference may still be of nonzero length because it depends on the observed variation in the sample in a different way than the p-value does.
However, you might still wonder why that is, since it is not so strange to imagine that a high p-value also implies a small confidence interval. After all, both correspond to something close to a confirmation of the null hypothesis. So why is this thought not correct?
A high p-value is not the same as a small confidence interval.
The p-value is an indicator of how extreme a particular observation is (extreme given some hypothesis), by expressing how probable it is to observe a given deviation. It is an expression of the observed effect size in relation to the accuracy of the experiment (a large observed effect size might not mean very much when the experiment is so 'inaccurate' that such observations are not extreme from a statistical/probabilistic point of view). When you observe a p-value of 1, this (only) means that you observed zero effect, because the probability of observing such a result or a larger one is equal to 1 (but this is not the same as there being zero effect).
Sidenote: Why p-values? The p-value expresses the actually observed effect size in relation to the expected effect sizes (probabilities). This is relevant because experiments might, by design, generate observations of some relevant effect size by pure chance, due to common fluctuations in the data/observations. Requiring that an observation/experiment has a low p-value means that the experiment has high precision; that is, the observed effect size is less often/likely due to chance/fluctuations (and may well be due to a true effect).
Sidenote: for continuous variables, a p-value equal to 1 almost never occurs, because it is an event with measure zero (e.g. for a normally distributed variable $X \sim N(0,1)$ you have $P(X = 0) = 0$). But for a discrete variable, or a discretized continuous variable, it can be the case (at least, the probability is nonzero).
The confidence interval might be seen as the range of values for which an $\alpha$-level hypothesis test would succeed (for which the p-value is above $\alpha$).
You should note that a high p-value is not (necessarily) proof of, or support for, the null hypothesis. The high p-value only means that the observation is not remarkable/extreme for a given null hypothesis, but this might just as well be the case for the alternative hypothesis (i.e. the result is in accordance with both hypotheses: effect, or no effect). This typically occurs when the data does not carry much information (e.g. high noise or a small sample).
Example: imagine you have a bag of coins containing fair and unfair coins, and you want to classify a certain coin by flipping it 20 times (say the coin is a Bernoulli variable with $p \approx 0.5$ for fair coins and $p \sim U(0,1)$ for unfair coins). In this case, when you observe 10 heads and 10 tails, you might say the p-value is equal to 1, but it should be obvious that an unfair coin might just as well produce this result, and we should not rule out the possibility that the coin is unfair.
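For instance, R's exact binomial test of 10 heads in 20 flips returns a p-value of exactly 1, yet it clearly cannot rule out the unfair-coin alternative:

binom.test(10, 20, p = 0.5)  # exact two-sided test of a fair coin: p-value = 1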
No, because "absence of evidence is not evidence of absence." Probability can be thought of as an extension of logic, with added uncertainties, so imagine for a moment that instead of real numbers on the unit interval, the hypothesis test returned only binary values: 0 (false) or 1 (true). In such a case the basic rules of logic apply, as in the following example: from "$H_0$ implies that we observe no difference" and "we observed no difference" it does not follow that $H_0$ is true; concluding so would be the fallacy of affirming the consequent.
As for the confidence interval: if your sample is large and $\mu_1 - \mu_2 \to 0$, then the confidence interval for the difference will become extremely narrow, but not of zero width. As noticed by others, you could observe things like exact ones and zeros, but that would rather be because of floating-point precision limitations.
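A quick numeric check (the sample size of 10^6 is assumed, just for illustration):

t.test(rnorm(10^6), rnorm(10^6))$conf.int  # limits on the order of ±0.003: tiny, but not {0, 0}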
Even if you observed $p = 1$ and a $\pm 0$ confidence interval, you would still need to keep in mind that the test gives you only an approximate answer. When doing hypothesis testing, we not only assume that $H_0$ is true, but also make a number of other assumptions, e.g. that the samples are independent and come from a normal distribution, which is never exactly the case for real-world data. The test gives an approximate answer to an ill-posed question, so it cannot "prove" the hypothesis; it can only say "under those unreasonable assumptions, this would be unlikely".
Nothing stops you from using the standard t- or Gauss-formulae for computing the confidence interval; all the information needed is given in your question. $p = 1$ doesn't mean that there's anything wrong with that. Note that $p = 1$ does not mean that you can be particularly sure that $H_0$ is true. Random variation is still present, and if $\mu_0 = \mu_1$ can happen under $H_0$, it can also happen if the true value of $\mu_0$ is slightly different from the true $\mu_1$, so there will be more in the confidence interval than just equality.
"A very common statement is that $H_0$ is never true." Not among people who know what they're talking about, and are speaking precisely. Traditional hypothesis testing never concludes that the null is true, but whether the null is true or not is separate from whether the null is concluded to be true.
"This would mean that the p-value equals 1." For a two-tailed test, yes.
"What would be the limits of the 95% confidence interval?" To first approximation, the limits of a 95% confidence interval are about twice the applicable standard deviation. There is no discontinuity at zero. If you find a function $f(\epsilon)$ that gives the 95% confidence interval for a difference in means of $\epsilon$, you can simply take $\lim_{\epsilon \to 0} f(\epsilon)$ to find the confidence interval for a mean difference of zero.
"When can we really say that $H_0$ is true?" We can say whatever we want. However, saying that a test shows the null to be true is not consistent with traditional hypothesis testing, regardless of the results, and doing so is not well-founded from an evidentiary standpoint. The alternative hypothesis, that the means are not the same, encompasses all possible differences in means: the alternative hypothesis is "the difference in means is $1$, or $2$, or $3$, or $.5$, or $.1$, ...". We can posit an arbitrarily small difference in means, and that will be consistent with the alternative hypothesis; and with an arbitrarily small difference, the probability given that mean is arbitrarily close to the probability given the null. Also, the alternative hypothesis encompasses not only the possibility that the parameters of the distributions, such as the means, are different, but also that there is an entirely different distribution. For instance, the alternative hypothesis encompasses "the two samples will always have a difference in means that is either exactly $1$ or exactly $0$, with probability $.5$ for each". The results are more consistent with that than they are with the null.