This article is rated Start-class on Wikipedia's content assessment scale.
This article may be too technical for most readers to understand. (September 2010)
In the section "Bracketing the confidence interval" I don't think the statement "z is the 1-α/2 quantile of a standard normal distribution (i.e., the probit) corresponding to the target error rate α" is true for the Michael Short formula. It should just be the 1-α quantile, and should probably use a symbol other than z, since z is already defined above in the Wald formula. Documentation: Short's formulas are found in his cited article, pp. 4188 and 4189; z^2 is there called C5, i.e. C5 = (Φ^(-1)(R))^2, where R is just the confidence level. My claim is also supported by the example he goes through on p. 4189 using a confidence level of 95%, which yields C5 = 2.706; it would be C5 = 3.841 if using the 1-α/2 quantile. — Preceding unsigned comment added by 185.45.22.138 ( talk) 11:04, 20 September 2023 (UTC)
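The two candidate values quoted in the comment above can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and are not taken from Short's article.

```python
from statistics import NormalDist

confidence = 0.95  # R, the confidence level used in the quoted example

# Squared 1-alpha quantile of the standard normal (alpha = 0.05)
z_one_sided = NormalDist().inv_cdf(0.95)
# Squared 1-alpha/2 quantile of the standard normal
z_two_sided = NormalDist().inv_cdf(0.975)

print(round(z_one_sided ** 2, 3))  # 2.706
print(round(z_two_sided ** 2, 3))  # 3.841
```

The value 2.706 matches the square of the 1-α quantile, and 3.841 matches the square of the 1-α/2 quantile.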
Hi, is there a reason why Wald formula is explained before the Wilson one in this doc? Looks like it is strictly worse. As someone who referred to this article in a "give me some formula that works" mode, I expected that the top one would be most reasonable to try, and I spent a few days before realizing I should be using the second one to avoid obvious errors like LB < 0. 192.184.216.253 ( talk) 22:33, 26 June 2023 (UTC)
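The "LB < 0" problem mentioned above is easy to reproduce. Below is a minimal sketch, assuming the standard textbook forms of the Wald and Wilson intervals; the function names and the example of 1 success in 20 trials are chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) interval: p_hat +/- z * sqrt(p_hat(1-p_hat)/n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval (no continuity correction)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


wald = wald_interval(1, 20)
wilson = wilson_interval(1, 20)
print("Wald:  ", wald)    # lower bound is negative
print("Wilson:", wilson)  # both bounds stay inside [0, 1]
```

For small counts the Wald lower bound drops below zero, while the Wilson bounds remain in the unit interval.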
I wrote some of the material a few years ago. Maybe it is a bit too technical. I'll add an extra paragraph in the introduction that explains why there is more than one formula. Steve Simon ( talk) 15:00, 9 September 2008 (UTC)
The article has been labelled too technical, but I don't see it as being that much more technical than a lot of other mathematical articles. One could leave out detail to make things more succinct, but that might make it more difficult to follow. Some suggestions:
27 Nov 2006: I am not a statistician but I believe there may be an important error in the Wilson score interval. According to http://www.ppsw.rug.nl/~boomsma/confbin.pdf, the final term in the numerator under the square root sign should be (z squared)/(4n squared), not (z squared)/4n as is written. I don't have the mathematical capacity to determine which is correct, but for my data the former calculation makes a lot more sense than the latter, so I suspect that wikipedia's entry is wrong. I hope a statistician reviews this at some point!
Actually I think the Wilson score interval was right the first time. The formula in the cited article only looks different because the expression inside the square root was multiplied out. 131.111.8.104 15:34, 29 May 2007 (UTC)
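For reference, the two ways of writing the square-root term are algebraically identical. The identity below assumes the usual notation (observed proportion \hat{p}, sample size n, normal quantile z); which of the two forms each source actually printed is not stated in the thread.

```latex
z\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^{2}}{4n^{2}}}
  = z\sqrt{\frac{4n\,\hat{p}(1-\hat{p}) + z^{2}}{4n^{2}}}
  = \frac{z}{2n}\sqrt{4n\,\hat{p}(1-\hat{p}) + z^{2}}
```

So a z^2/(4n^2) term inside the root, and a bare z^2 term with a 1/(2n) factor pulled outside, describe the same interval.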
I don't understand the comment that the Clopper-Pearson intervals are conservative due to the discreteness of the Binomial distribution; they are based on the beta distribution which IS continuous and well behaved in the interval. So, in fact, I think the comment is wrong (Fredrik x nilsson).
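One way to see where the discreteness enters is to compute the actual coverage of the Clopper-Pearson interval. The sketch below finds the bounds by bisection on the binomial tail probabilities rather than through beta quantiles (the two definitions are equivalent); the function names and the choice n = 20 are only illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_increasing(f, target, iters=60):
    """Solve f(p) = target on [0, 1] for a function f that increases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact interval: each tail probability of the observed x equals alpha/2."""
    # Lower bound: P(X >= x | p) = alpha/2 (increasing in p)
    lower = 0.0 if x == 0 else bisect_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: P(X <= x | p) = alpha/2 (negated so it increases in p)
    upper = 1.0 if x == n else bisect_increasing(
        lambda p: -binom_cdf(x, n, p), -alpha / 2)
    return lower, upper


def coverage(p, n, alpha=0.05):
    """Probability, under the true p, that the interval contains p."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = clopper_pearson(x, n, alpha)
        if lo <= p <= hi:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


for p in (0.1, 0.3, 0.5):
    print(p, coverage(p, 20))  # each value is at least 0.95
```

The bounds themselves are continuous functions of the data, but only n + 1 outcomes are possible, so the coverage is a sum over a finite set of outcomes and generally lands above the nominal level rather than exactly on it.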
I don't agree with the last section in its summary of those papers. First, it makes no mention of the Wald interval, even though the papers flatly state that it generally does not produce coverage probabilities at the desired level. Moreover, I think the description of "better" is ambiguous and misleading at best. The exact intervals are inherently good, since they always guarantee that you reach at least the desired coverage. However, they may produce an interval that is wider than desired. The approximations are "better" in the sense that they don't over-estimate as much. In either case, however, both the exact and the non-Wald approximate intervals are generally better CI estimates for small n. —Preceding unsigned comment added by 134.174.140.216 ( talk) 21:01, 11 June 2010 (UTC)
The entry is technical, not too technical, in my opinion. However, for readers unfamiliar with all the formulas, -- myself included -- it would be helpful to see the calculation and result for an example case using each formula. — Preceding unsigned comment added by Tjrm ( talk • contribs) 21:36, 9 January 2011 (UTC)
I have made two amendments which I view as essential on this topic. The normal approximation section replicates a common error in explanations of the binomial proportion interval, one which inevitably leaves the reader totally confused. This error is to conflate the distribution of the likely position of an observation p about a population true value P (which is binomial) with the likely position of P about an observation p (which is not). Since the latter is what we wish to obtain, this is very serious indeed! Moreover, if you don't recognise this, the rest of the page makes no sense! Why should we bother with improvements if the actual interval is binomial (and almost-normal)? So, my suggested amendments are: (1) revise the explanation of the normal approximation to avoid this error and then (2) deal with it conclusively under the Wilson score interval section. Sean a wallis ( talk) 21:58, 5 July 2013 (UTC)
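The distinction drawn above can be illustrated numerically: the Wilson bounds are the population values P for which the observed p lies exactly z standard errors away, with the standard error evaluated at P rather than at p. The sketch below assumes the standard Wilson formula without continuity correction; the names and the example of 6 successes in 20 trials are illustrative.

```python
from math import isclose, sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval (no continuity correction)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


n, successes = 20, 6
p = successes / n
z = NormalDist().inv_cdf(0.975)
lower, upper = wilson_interval(successes, n)

# Distance from each bound P to the observation p, compared with z
# standard errors computed at P (not at p).
gap_lower = p - lower
gap_upper = upper - p
se_at_lower = z * sqrt(lower * (1 - lower) / n)
se_at_upper = z * sqrt(upper * (1 - upper) / n)
print(gap_lower, se_at_lower)
print(gap_upper, se_at_upper)
```

Each printed pair agrees up to floating-point error, which is what distinguishes this construction from the Wald interval, where the standard error is evaluated at the observed p.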
Further amendments, sharpened up the Normal approximation section to discourage casual and incorrect use of this problematic method, and added a new section on the Wilson interval with continuity correction. The latter could be a sub-section of the Wilson interval. Sean a wallis ( talk) 14:36, 12 July 2013 (UTC)
I came across the lower bound of the Wilson Score Interval being used as a 'confidence' metric for decision tree nodes [1]. Perhaps adding a subsection to the Wilson Interval about how it can be applied to decision trees would be useful. Gkarthik92 ( talk) 19:52, 29 December 2015 (UTC)
@Qwfp: Can you please elaborate on why you reverted my edit? The old version was certainly not clear enough for a layperson. I can see your objecting to my use of the word "likelihood" instead of "probability", though I was using it in what I thought was a layperson-friendly way; so I agree to changing it to "probability". But you also said "Added text contained incorrect interpretation of confidence interval" -- can you explain what's incorrect about it? The added text said, with "likelihood" replaced by "probability",
What's wrong with that? Duoduoduo ( talk) 14:40, 14 February 2011 (UTC)
The above discussion suggests that there may be a good reason to expand the scope of this article to become "Interval estimation for binomial proportions" and to allow an equal footing in the article for credible intervals, with distinctions being made where appropriate. But that might just get too confusing. Perhaps there could be a separate article called "Binomial proportion credible interval". Melcombe ( talk) 10:06, 15 February 2011 (UTC)
At the end of the section of the A-C interval, we read: "this is the "add 2 successes and 2 failures" interval in [7]."
The citation is superscripted. Because that quotation comes from the article, should the "[7]" be part of the sentence or should it be superscripted as in the present article?
Or one could use the full article quotation: …this is the "'add two successes and two failures' adjusted Wald interval," and then give the citation. (Also, it's at p. 122b.)
Hello fellow Wikipedians,
I have just modified one external link on Binomial proportion confidence interval. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).
This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 5 June 2024).
Cheers.— InternetArchiveBot ( Report bug) 21:05, 2 November 2016 (UTC)
The article states the Wilson score is asymmetric. This is sourced to an article from Journal of Quantitative Linguistics. This may be a stupid question but how exactly is the Wilson score asymmetric? The formula for the Wilson score confidence interval is:
Point +/- stuff
It’s “+/-” the same thing (“stuff”). Adding and subtracting the same thing to a point estimate gives a symmetric CI.
I think the word “asymmetric” is being misunderstood here and used in the wrong way. The Wilson Score CI may be asymmetric in some other sense (density within the interval? That doesn’t make sense either though) but here the wording appears to be confused.
Can anyone clarify? Volunteer Marek 00:24, 1 January 2022 (UTC)
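A numerical check may help with the question above. In the standard Wilson formula the "point" in "point +/- stuff" is a shifted centre, (p + z²/2n)/(1 + z²/n), not the observed proportion p itself. The sketch below assumes that standard form; the names and the example of 2 successes in 20 trials are illustrative.

```python
from math import isclose, sqrt
from statistics import NormalDist

n, successes = 20, 2
p_hat = successes / n
z = NormalDist().inv_cdf(0.975)

# Wilson score interval: centre +/- half_width, where centre != p_hat
denom = 1 + z * z / n
centre = (p_hat + z * z / (2 * n)) / denom
half_width = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
lower, upper = centre - half_width, centre + half_width

print("distance below p_hat:", p_hat - lower)
print("distance above p_hat:", upper - p_hat)
print("centre vs p_hat:     ", centre, p_hat)
```

The interval is symmetric about the shifted centre but not about the observed proportion: for p below 0.5 the centre is pulled towards 0.5, so the interval extends further above p than below it. That is one reading under which "asymmetric" is accurate.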