This page is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
A variety of mergers has been proposed. Please add to the discussion at Talk:Linear regression.
An automated Wikipedia link suggester has some possible wiki link suggestions for the Analysis_of_variance article:
Additionally, there are some other articles which may be able to be linked to this one (also known as "backlinks"):
Notes: The article text has not been changed in any way; Some of these suggestions may be wrong, some may be right.
Feedback:
I like it,
I hate it,
Please don't link to —
LinkBot 11:28, 1 Dec 2004 (UTC)
Frankly, I don't think this page is very clear. It would be nice to add a few words about the types of problems ANOVA is applied to. The second bullet describing classes ("Random-effects models assume that the data describe a hierarchy of different populations whose differences are constrained by the hierarchy.") is hard to understand for the non-expert. The meaning of SS is not explained (I guess it means "sum of squares"?). Is there anybody with a background in statistics who could improve upon this article? -- Agd 20:28, 9 November 2005 (UTC)
Agree that the page is not clear. Needs major revision. 212.179.209.45 01:01, 16 January 2006 (UTC)
Agreed this page lacks many anova tests. I was going to use this as a reference before I got my stats book for class, and to my dismay there is not a single formula. NormDor 11:56, 17 March 2006 (UTC)
I came here after reading the biographical entry on R. A. Fisher. I agree with the foregoing comments that this page doesn't do a good job of explaining what Analysis of variance is about, or used for, that is useful for the numerate reader who is not already familiar with the topic. 198.161.198.74 ( talk) 19:40, 15 January 2010 (UTC)
The sections labeled "Example of [...]" are not actually examples of the ANOVA procedure, they are examples of study designs on which analysis by ANOVA might be useful. A fine distinction, but an important one I think. Should revise either the section headings or the sections' contents.
Sorry, but this page is really completely useless. It's a bit like an "About" box for ANOVA. Gives you a hint about what it might be. But without any actual worked examples, showing you how to get the results you need, and then how to interpret those results, nothing useful is learned. Barfly42 15:24, 17 December 2006 (UTC)
I changed the definition of ANOVA (the first paragraph). The old definition was not general enough. ANOVA can be used to compare several distributions, but that is just one example of its applications. Some work should be devoted to changing the definitions of fixed and random effects. Gideon Fell 10:33, 14 March 2007 (UTC)
In the examples section it says: "one-way ANOVA with repeated measures". Shouldn't it be "two way ANOVA" since the experiment of using the same subjects with repeated measurements matches the description of "two way ANOVA" from the Overview section. —Preceding unsigned comment added by 190.64.28.235 ( talk) 15:21, 27 October 2009 (UTC)
According to "Statistics for experimenters" by Box, Hunter and Hunter the use of ANOVA makes no sense for factorial experiments (section 5.10 "Misuse of the ANOVA for 2^k factorial experiments"). This appears to be because each combination of factors only has a single degree of freedom, so the value actually calculated is equivalent to referral to a t table. This seems to be a common misconception. —The preceding unsigned comment was added by 129.215.37.22 ( talk) 20:46, 14 March 2007 (UTC).
There is a link to "Factorial ANOVA" in the 'Overview' section, which links back to the ANOVA page (this same article!). This is stupid. Either the link should be removed, or a "Factorial ANOVA" page should be added, or it should be an anchor going to a "Factorial ANOVA" section. —Preceding unsigned comment added by Mr.maddamsetti ( talk • contribs) 20:10, 23 March 2010 (UTC)
Could do with at least one example of an ANOVA table here, either with numbers in or showing notation for sums of squares, mean squares etc. Maybe also worth mentioning skeleton ANOVA tables, i.e. showing with entries only for df. I may add these myself at some point... Qwfp ( talk) 18:18, 19 January 2008 (UTC)
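As a rough illustration of the kind of table requested here, the following sketch computes a one-way ANOVA table (sums of squares, degrees of freedom, mean squares and the F ratio) in plain Python. The data and the names `groups` and `one_way_anova_table` are hypothetical and are not taken from the article.

```python
# Hypothetical data: three treatment groups, four observations each.
groups = {
    "A": [4.0, 5.0, 6.0, 5.0],
    "B": [6.0, 7.0, 8.0, 7.0],
    "C": [9.0, 8.0, 10.0, 9.0],
}


def mean(xs):
    return sum(xs) / len(xs)


def one_way_anova_table(groups):
    """Return rows (source, SS, df, MS) plus the F ratio for a one-way layout."""
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = mean(all_values)

    # Between-treatment SS: group size times the squared deviation of each
    # group mean from the grand mean.
    ss_treatment = sum(
        len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values()
    )
    # Error SS: squared deviations of each observation from its own group mean.
    ss_error = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)

    df_treatment = len(groups) - 1
    df_error = len(all_values) - len(groups)
    df_total = len(all_values) - 1

    ms_treatment = ss_treatment / df_treatment
    ms_error = ss_error / df_error
    rows = [
        ("Treatment", ss_treatment, df_treatment, ms_treatment),
        ("Error", ss_error, df_error, ms_error),
        ("Total", ss_total, df_total, None),
    ]
    return rows, ms_treatment / ms_error


rows, f_ratio = one_way_anova_table(groups)
for source, ss, df, ms in rows:
    ms_text = "" if ms is None else f"{ms:10.3f}"
    print(f"{source:<10}{ss:10.3f}{df:4d}{ms_text}")
print(f"F = {f_ratio:.3f}")
```

For these made-up numbers the sums of squares are 32 (treatment), 6 (error) and 38 (total) on 2, 9 and 11 degrees of freedom, giving F = 24. A skeleton table would keep only the source and df columns.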
It might be well worth the effort to add a section describing visualization techniques for ANOVA, through plots such as boxplots and others. I am willing to have a go at it, but I don't know who is responsible for this article and don't want to step on anyone's toes... (P.S.: I am currently doing my second degree in biostatistics.) Talgalili ( talk) 11:49, 1 December 2008 (UTC)
In the article there are four different notations, which unnecessarily confuse the reader:
Which notation should Wikipedia / the editors opt for?
Might I suggest:
where
A is treatment (factor A)
T is total
E is error
What are your thoughts on this?
Ostracon ( talk) 15:57, 16 August 2009 (UTC)
ANOVA assumes neither
nor
What is true is that the p-values of the randomization test of the ANOVA null-hypothesis are well approximated by the p-values of the F test using the F-distribution (Chapter 6, Hinkelmann & Kempthorne).
(Of course, it is easier to teach the mechanics of ANOVA testing by assuming a so-called "normal" linear model and using the F-distribution.)
Therefore, the "Assumptions" section needs revision, imho. Kiefer.Wolfowitz ( talk) 17:10, 24 November 2009 (UTC)
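The randomization argument above can be sketched in a few lines: compute the usual F ratio, then re-compute it over random re-assignments of the observations to groups and report the share of re-assignments with an F at least as large. This is a minimal Monte Carlo sketch with hypothetical data and function names, not the procedure from Hinkelmann & Kempthorne; a full treatment would enumerate the design's actual randomization set.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F ratio for a list of groups (each a list of numbers)."""
    values = [y for g in groups for y in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def randomization_p_value(groups, n_perm=2000, seed=0):
    """Monte Carlo estimate of the randomization-test p-value of the F ratio."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [y for g in groups for y in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Cut the shuffled pool back into groups of the original sizes.
        reassigned, start = [], 0
        for size in sizes:
            reassigned.append(pooled[start:start + size])
            start += size
        if f_statistic(reassigned) >= observed * (1 - 1e-12):
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical, clearly separated groups.
data = [
    [4.1, 5.0, 5.9, 5.3],
    [6.2, 7.0, 8.1, 7.4],
    [9.0, 8.6, 10.2, 9.5],
]
p_value = randomization_p_value(data)
print(f"F = {f_statistic(data):.2f}, randomization p = {p_value:.4f}")
```

No normality assumption enters this calculation; the F-distribution only appears if one chooses to approximate the resulting p-value with it.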
There is a well known saying in mathematics and statistics: "A mathematician is only given credit for his discoveries that his colleagues agree to give him." Quoting an expository article by Seaman a decade after the discovery work of Sawilowsky on the rank transform is not only unfair, but is typical of the shallow scholarship that is becoming legendary on Wiki. So, I've decided to jump right in and set the record straight, although by now I know that scholarship and Wiki warriors rarely peacefully coexist.
I also took the opportunity to move the references misplaced in the middle of the article to the end, and put them in alpha order. —Preceding unsigned comment added by 141.217.105.21 ( talk) 15:09, 4 January 2010 (UTC)
Hettmansperger and McKean's book "Robust Nonparametric Statistical Methods" (MR 1604954) states that the "R transform" works well, even for small samples (McKean and Seavers), pages 254-258. Distinguishing the R-transform from the rank-transform is difficult for the public. Maybe the article should discuss the R-transform (following Hettmansperger & McKean, the best authority known to this amateur) first and foremost. Then the article can continue to discuss the rank-transform, and mention that its use seems to have been deprecated (for some time). Would that be agreeable? Sincerely, Kiefer.Wolfowitz ( talk) 21:35, 19 January 2010 (UTC)
I looked in the recent SAS Linear Models and Mixed Linear Models books, and they contain no references to rank (as far as I can see). Would an editor please either provide a current reference or delete/modify the statements about SAS? Again, Statistical Science in 2004 had a lot of papers on rank-based procedures in its special issue on nonparametrics, so it doesn't seem useful to include a reference to rank-based methods from the 1980s. Kiefer.Wolfowitz ( talk) 05:04, 14 January 2010 (UTC)
This section is confusing two different effect size measures when it states the following:
"The generally-accepted regression benchmark for effect size comes from (Cohen, 1992; 1988): 0.20 is a minimal solution (but significant in social science research); 0.50 is a medium effect; anything equal to or greater than 0.80 is a large effect size (Keppel & Wickens, 2004; Cohen, 1992)."
"Nevertheless, alternative rules of thumb have emerged in certain disciplines: Small = 0.01; medium = 0.06; large = 0.14 (Kittler, Menard & Phillips, 2007)."
The first paragraph refers to rule of thumb guidelines for categorizing Cohen's d. The second paragraph refers to rule of thumb guidelines for categorizing eta-squared.
68.54.107.114 ( talk) 02:17, 11 January 2010 (UTC)AmateurStatistician
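To make the distinction concrete, here is a small sketch that computes both measures on the same two-group data set. The data and function names are hypothetical; the formulas are the usual pooled-standard-deviation form of Cohen's d and eta-squared as the between-group share of the total sum of squares.

```python
from math import sqrt


def cohens_d(g1, g2):
    """Standardized mean difference using the pooled standard deviation."""
    m1, m2 = sum(g1) / len(g1), sum(g2) / len(g2)
    ss1 = sum((y - m1) ** 2 for y in g1)
    ss2 = sum((y - m2) ** 2 for y in g2)
    pooled_sd = sqrt((ss1 + ss2) / (len(g1) + len(g2) - 2))
    return (m2 - m1) / pooled_sd


def eta_squared(groups):
    """Between-group sum of squares divided by the total sum of squares."""
    values = [y for g in groups for y in g]
    grand_mean = sum(values) / len(values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    ss_total = sum((y - grand_mean) ** 2 for y in values)
    return ss_between / ss_total


# Hypothetical scores for two groups.
control = [10.0, 12.0, 11.0, 13.0, 14.0]
treated = [12.0, 14.0, 13.0, 15.0, 16.0]

d = cohens_d(control, treated)
eta_sq = eta_squared([control, treated])
print(f"Cohen's d = {d:.3f}, eta-squared = {eta_sq:.3f}")
```

Here d is about 1.26 while eta-squared is about 0.33: the two statistics live on different scales (d is unbounded, eta-squared lies between 0 and 1), which is why benchmarks for one cannot be read as benchmarks for the other.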
Generally there seems to be a bit of a disorder in ANOVA information. There is a One-way ANOVA page on wikipedia but the two way ANOVA page redirects here without giving any reasonable comparison or differentiation between the two. Since this is one of the most widely used tests in social sciences it should be clear what are the distinctions in clear and simple terms. JakubHampl ( talk) 17:31, 17 April 2010 (UTC)
Perhaps something should be said about ANOVAs not always having explanatory power wrt causality (in observational studies). This is perhaps most controversial in heritability estimates, particularly in human subjects. From Stoltenberg, S. F. (1997). "Coming to terms with heritability". Genetica. 99 (2–3): 89–96. doi: 10.1023/A:1018366705281. PMID 9463077.
“However, the language that surrounds the partitioning of variance is prone to misunderstanding in its own right (Lewontin, 1974; Kempthorne, 1978), therefore I avoid using terms such as ‘due to’ or ‘caused by’ when referring to the statistical relations between an independent variable and a dependent variable (e.g., in an analysis of variance [ANOVA]), but instead use terms such as ‘associated with’ to avoid deterministic implications.”
The papers cited which go into more detail on this are: Lewontin and Kempthorne. Note that "due to" is used here right in the lead. Tijfo098 ( talk) 05:06, 26 October 2010 (UTC)
I added the cleanup tag because of the section's sheer size and unclear usefulness to the article. Ideally, these should be references directly used in the article, through footnotes. Here, they seem not to be attached to footnotes and are more of a "further reading" section (see WP:CITE and WP:LAY), but it still contains duplications. For example, SAS user guides from 25 and 23 years ago are listed. Same book, different editions: why are they both listed? Is there content that is in one but not the other, and are both relevant to ANOVA? Why not cite a more recent edition, which would be more up to date and accessible, instead? It seems to me quality should be picked over quantity.-- 137.122.49.102 ( talk) 19:04, 3 November 2010 (UTC)
Following the anonymous editor's concerns, I removed this section, but include it here for archival purposes and to facilitate its use in a stand-alone article:
When the data do not meet the assumptions of normality, the suggestion has arisen to replace each original data value by its rank (from 1 for the smallest to N for the largest), then run a standard ANOVA calculation on the rank-transformed data. Conover and Iman (1981) provided a review of the four main types of rank transformations. Commercial statistical software packages (e.g., SAS, 1985, 1987, 2008) followed with recommendations to data analysts to run their data sets through a ranking procedure (e.g., PROC RANK) prior to conducting standard analyses using parametric procedures.
This rank-based procedure has been recommended as being robust to non-normal errors, resistant to outliers, and highly efficient for many distributions. It may result in a known statistic (e.g., Wilcoxon Rank-Sum / Mann-Whitney U), and indeed provide the desired robustness and increased statistical power that is sought. For example, Monte Carlo studies have shown that the rank transformation in the two independent samples t test layout can be successfully extended to the one-way independent samples ANOVA, as well as the two independent samples multivariate Hotelling's $T^2$ layouts (Nanna, 2002).
Conducting factorial ANOVA on the ranks of original scores has also been suggested (Conover & Iman, 1976, Iman, 1974, and Iman & Conover, 1976). However, Monte Carlo studies by Sawilowsky (1985a; 1989 et al.; 1990) and Blair, Sawilowsky, and Higgins (1987), and subsequent asymptotic studies (e.g. Thompson & Ammann, 1989; "there exist values for the main effects such that, under the null hypothesis of no interaction, the expected value of the rank transform test statistic goes to infinity as the sample size increases," Thompson, 1991, p. 697), found that the rank transformation is inappropriate for testing interaction effects in a 4x3 and a 2x2x2 factorial design. As the number of effects (i.e., main, interaction) become non-null, and as the magnitude of the non-null effects increase, there is an increase in Type I error, resulting in a complete failure of the statistic with as high as a 100% probability of making a false positive decision. Similarly, Blair and Higgins (1985) found that the rank transformation increasingly fails in the two dependent samples layout as the correlation between pretest and posttest scores increase. Headrick (1997) discovered the Type I error rate problem was exacerbated in the context of Analysis of Covariance, particularly as the correlation between the covariate and the dependent variable increased. For a review of the properties of the rank transformation in designed experiments see Sawilowsky (2000).
A variant of rank-transformation is 'quantile normalization' in which a further transformation is applied to the ranks such that the resulting values have some defined distribution (often a normal distribution with a specified mean and variance). Further analyses of quantile-normalized data may then assume that distribution to compute significance values. However, two specific types of secondary transformations, the random normal scores and expected normal scores transformation, have been shown to greatly inflate Type I errors and severely reduce statistical power (Sawilowsky, 1985a, 1985b).
According to Hettmansperger and McKean [1] "Sawilowsky (1990) [2] provides an excellent review of nonparametric approaches to testing for interaction" in ANOVA.
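The rank transformation described in the archived section can be sketched as follows: pool the observations, replace each by its rank (averaging ranks over ties), and run the ordinary one-way F computation on the ranks. The data and helper names are hypothetical, and the sketch covers only the one-way layout; given the Type I error problems reported above, it should not be read as an endorsement for testing interactions in factorial designs.

```python
def midranks(values):
    """Ranks from 1 (smallest) to N (largest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_transform(groups):
    """Replace every observation by its rank in the pooled sample."""
    pooled_ranks = midranks([y for g in groups for y in g])
    ranked, start = [], 0
    for g in groups:
        ranked.append(pooled_ranks[start:start + len(g)])
        start += len(g)
    return ranked


def f_statistic(groups):
    """One-way ANOVA F ratio for a list of groups."""
    values = [y for g in groups for y in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (ss_between / (len(groups) - 1)) / (ss_within / (len(values) - len(groups)))


# Hypothetical data with one gross outlier in the second group.
raw = [[1.2, 1.5, 1.1, 1.4], [2.3, 2.1, 2.6, 40.0]]
ranked = rank_transform(raw)
f_ranks = f_statistic(ranked)
print("ranks:", ranked)
print(f"F on raw data = {f_statistic(raw):.2f}, F on ranks = {f_ranks:.2f}")
```

The outlier inflates the within-group variance of the raw data, whereas on the rank scale it is simply the largest of eight ranks.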
I believe that most of these books and articles are related to Sawilowsky's publications or unpublished writings, and were added in excellent faith by Edstat, I add in good faith (having just removed many references that were zealously added by me, when I was evangelizing for generalized randomized block designs!). I'll come back and look for references to them in other sections. Again, they would be very useful in an article about academics closely associated with Sawilowsky (not necessarily on Wikipedia) or in a stand-alone article on rank-transforms, if that is a notable topic (e.g. is it covered in statistical encyclopedias or recent surveys in notable reliable journals?). Thanks Kiefer.Wolfowitz ( talk) 19:43, 3 November 2010 (UTC)
I note that Hettmansperger and McKean is notable and reliable, given that the writers were asked to be head editors of e.g. the Statistical Science special issue on nonparametrics and to write the JASA 2000 article reviewing nonparametrics and robust statistics. (I am happy that, as first noted in the article on Sawilowsky, H & McK have nice comments in a few pages about Professor Sawilowsky.) I don't see why the other articles should stay in an article on ANOVA here, unless they are cited by reliable books on ANOVA. Thanks, Kiefer.Wolfowitz ( talk) 19:47, 3 November 2010 (UTC)
Let the discussion begin! Kiefer.Wolfowitz ( talk) 19:32, 3 November 2010 (UTC)
This example has no randomized assignment of treatment to subjects. It seems that group-status is perfectly confounded with treatment, so this is a worthless "experiment". Kiefer.Wolfowitz ( talk) 20:48, 3 November 2010 (UTC)
In a first experiment, Group A is given vodka, Group B is given gin, and Group C is given a placebo. All groups are then tested with a memory task. A one-way ANOVA can be used to assess the effect of the various treatments (that is, the vodka, gin, and placebo).
In a second experiment, Group A is given vodka and tested on a memory task. The same group is allowed a rest period of five days and then the experiment is repeated with gin. The procedure is repeated using a placebo. A one-way ANOVA with repeated measures can be used to assess the effect of the vodka versus the impact of the placebo.
In a third experiment testing the effects of expectations, subjects are randomly assigned to four groups:
Each group is then tested on a memory task. The advantage of this design is that multiple variables can be tested at the same time instead of running two different experiments. Also, the experiment can determine whether one variable affects the other variable (known as interaction effects). A factorial ANOVA (2×2) can be used to assess the effect of expecting vodka or the placebo and the actual reception of either.
In a balanced design, the factors induce an orthogonal decomposition of a Euclidean space; and the converse holds (see Bailey). First project the data onto the mean-value subspace, and then consider that subspace's orthogonal complement, which then needs to be intersected with the treatment and block subspaces (which may have further decompositions). The squared Euclidean norm of each projection is the corresponding sum of squares (the residual sum of squares for the projected residuals). The degrees of freedom are the dimensions of the subspaces.
With this orthogonality (orthomodularity), the sums of squares add nicely, regardless of any normality of the residuals.
This geometric account of Anova is given in friendlier fashion in Bailey, in Christensen, and in the very friendly Saville & Woods (in 2 volumes) for example. It should be given here. Kiefer.Wolfowitz ( talk) 21:11, 3 November 2010 (UTC)
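The geometric description above can be checked numerically in a balanced one-way layout: split the data vector into its projection onto the mean subspace, its projection onto the treatment subspace (orthogonal to the mean), and the residual. The three pieces are mutually orthogonal, so their squared norms add up to the squared norm of the data. The data and variable names below are hypothetical; this is a sketch of the idea, not the presentation in Bailey, Christensen, or Saville & Woods.

```python
# Hypothetical balanced one-way layout: 3 treatments, 4 units each.
labels = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
y = [4.0, 5.0, 6.0, 5.0, 6.0, 7.0, 8.0, 7.0, 9.0, 8.0, 10.0, 9.0]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


grand_mean = sum(y) / len(y)
group_mean = {
    g: sum(yi for yi, gi in zip(y, labels) if gi == g) / labels.count(g)
    for g in set(labels)
}

# Projection onto the one-dimensional mean subspace (constant vectors).
mean_part = [grand_mean] * len(y)
# Projection onto the treatment subspace after removing the mean (dimension 2).
treatment_part = [group_mean[g] - grand_mean for g in labels]
# What is left over: the residual vector (dimension 12 - 3 = 9).
residual_part = [yi - group_mean[g] for yi, g in zip(y, labels)]

# Mutual orthogonality of the three components.
print(dot(mean_part, treatment_part), dot(mean_part, residual_part),
      dot(treatment_part, residual_part))

# Pythagoras: squared norms (sums of squares) add up to the squared norm of y.
ss_mean = dot(mean_part, mean_part)
ss_treatment = dot(treatment_part, treatment_part)
ss_residual = dot(residual_part, residual_part)
print(ss_mean, ss_treatment, ss_residual, dot(y, y))
```

For these numbers the squared norms are 588, 32 and 6, which sum to 626, the squared norm of the data vector. Nothing in the calculation refers to a distribution for the residuals.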
This article suffers from obtuse pedagogy (it's essentially useless) to downright inaccurate information about ANOVA, its assumptions, and its small sample robustness and power properties. (The ANOVA F test of difference in means is robust to departures from independence, homoscedasticity, and/or normality? Tell that to the hundreds of Monte Carlo studies published since 1980!) A thorough reading of the Monte Carlo literature after 1980 would benefit this article greatly. My suggestion is that the current editors step back and ask for some help, preferably not from the asymptotic maths lobby, but from qualified applied statisticians who have read the literature post 1980 (but for starters, read Glass, Peckham, & Sanders, 1972; Bradley, 1969, 1972, etc.; Blair, 1980, 1981, 1985, etc.; Sawilowsky, 1990, 1992, etc.) It's just a suggestion - don't reach for the aspirin or saltines. Edstat ( talk) 03:49, 15 November 2010 (UTC)
Editor Edstat raised concerns about non-normality, and about heteroscedasticity (alternatively, differing variances, or a failure of homoscedasticity!), etc. In the section on the randomization analysis, references to Cox and to Kempthorne are given to support the statement that a proper randomization procedure and unit-treatment additivity imply constant variance. This result explains why both Cox & Kempthorne (and Rosenbaum, Rubin, Imbens, Abadie, Angrist, etc.) emphasize proper randomization and why they emphasize the unit-treatment additivity assumption. When this unit-treatment additivity is implausible, the analysis is more difficult (although local average unit-treatment additivity saves much of the standard analysis). While the article's few paragraphs are not a substitute for a textbook, they at least sketch the central issues, and reference the most reliable sources. Edstat's claim that normality is so important is not supported by the analysis by these authors, who are usually regarded as the most reliable sources. Kiefer.Wolfowitz ( talk) 17:25, 19 November 2010 (UTC)
After reading this article, I am still left with absolutely no idea how this technique is actually employed. There are many references to "treatments" -- is it used exclusively in medical research? A fully-worked example (including computation) would be a great boon. 121a0012 ( talk) 05:59, 5 January 2011 (UTC)
Please see the following section (copied below):
"Though, considering that $\eta^2$ are comparable to $r^2$ when df of the numerator equals 1 (both measures proportion of variance accounted for), these guidelines may overestimate the size of the effect. If going by the r guidelines (0.1 is a small effect, 0.3 a medium effect and 0.5 a large effect) then the equivalent guidelines for eta-squared would be the squareroot of these, i.e. 01 is a small effect, 0.09 a medium effect and 0.25 a large effect, and these should also be applicable to eta-squared. When the df of the numerator exceeds 1, eta-squared is comparable to R-squared (Levine & Hullett, 2002)."
Note that it is self-contradictory. First it says "$\eta^2$ are comparable to $r^2$ when df of the numerator equals 1" and later says "When the df of the numerator exceeds 1, eta-squared is comparable to R-squared". Any suggestions on which is correct?
I also suggest that this section is removed until consensus is reached.
Trevorzink ( talk) 02:28, 6 April 2011 (UTC)
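One part of the quoted passage can be checked directly: with two groups (numerator df of 1), eta-squared equals the squared correlation between the response and a 0/1 group indicator. The sketch below verifies this on hypothetical data and also squares the r benchmarks, since converting r to a proportion of variance means squaring rather than taking a square root. Function and variable names are illustrative only.

```python
def pearson_r(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def eta_squared(groups):
    """Between-group sum of squares divided by the total sum of squares."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    return ss_between / ss_total


# Hypothetical two-group data; the indicator codes group membership as 0 or 1.
group0 = [10.0, 12.0, 11.0, 13.0, 14.0]
group1 = [12.0, 14.0, 13.0, 15.0, 16.0]
indicator = [0.0] * len(group0) + [1.0] * len(group1)
response = group0 + group1

r_squared = pearson_r(indicator, response) ** 2
eta_sq = eta_squared([group0, group1])
print(f"r^2 = {r_squared:.4f}, eta^2 = {eta_sq:.4f}")

# Squaring the r benchmarks gives the corresponding variance-explained values.
squared_benchmarks = [round(r ** 2, 2) for r in (0.1, 0.3, 0.5)]
print(squared_benchmarks)
```

Both quantities come out as one third here, and the squared benchmarks are 0.01, 0.09 and 0.25, which matches the numbers in the quoted text even though it calls them square roots.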
On this day Kiefer.Wolfowitz and I worked in mild opposition. He was removing references that I was strengthening. I will wait a few days to let the dust settle. 159.83.196.1 ( talk) 01:51, 14 March 2012 (UTC)
Under-cited: Cox - mentioned often in text, but no specific references cited. Two references in list. Freedman - no specific reference cited. Two references in list. Kempthorne - often mentioned without citation. Two references in list. 159.83.196.1 ( talk) 01:54, 14 March 2012 (UTC)
Cox (1958) is an odd and interesting reference for ANOVA. Analysis of variance does not appear in the index. Page 12: "...methods of statistical analysis will not be described in this book." He talks some about models, specifically additivity (on page 15), which is applicable to this article. Is Cox (cited repeatedly) one of the best references on a subject admittedly not discussed half a century ago?
I suspect a flawed citation. Cox makes frequent mention (30 references) of Cochran & Cox (1957, 2e) Experimental Designs. Page 292: "Cochran and Cox (1957) have given numerous detailed plans as well as worked numerical examples of the analysis of biological experiments."
Gertrude Mary Cox might be a better reference than Sir David Roxbee Cox for ANOVA.
Both books are statistical classics, available as reprints. Both authors are mentioned favorably in Design of experiments. 159.83.196.1 ( talk) 23:22, 1 May 2012 (UTC)
On April 4, 2012 readers of ANOVA assigned the following scores: Trustworthy 3.3 of 5; Objective 3.3 of 5; Complete 3.0 of 5; Well-written 2.3 of 5
Mathematicians assigned a "start" grade (less than a C). Statisticians assigned a grade of C. Both groups claimed ANOVA to be important. The currency of these grades is unknown.
The article was recently locked (no edits) for a few days over vandalism concerns. 159.83.196.1 ( talk) 21:04, 5 April 2012 (UTC)
(The following, or similar, needs to be added. It's only hinted at, in the article, and all the links substantiate it. Cites, of course, can be added. 72.37.249.60 ( talk) 19:06, 18 April 2012 (UTC) )
Consistent with ANOVA's purpose to analyze a variable's components attributable to different sources of variation, it is possible to view most any general linear model as a regression.
where the $x_i$, $i = 1, 2, \ldots, p$, are quantitative variables, in some cases merely 0 or 1 representing the absence or presence, respectively, of different levels of qualitative variables, and where their multiple degrees of freedom are distributed with df 1 assigned to each level as a separate variable. From this model, consistent with standard formulas for expectation and variance,
and $b_i^2 \sigma_i^2$ is the variance component of $y$ attributable to $x_i$. The size of the estimated variance component, usually relative to mse, the estimated $\sigma^2_{\text{error}}$, determines $x_i$'s significance in the model of $y$. At least asymptotic normality of the estimators is a fundamental assumption, allowing F-tests or t-tests of "$H_0$: $x_i$'s contribution is insignificant.", but there are instances when the assumption of normality is unjustified and nonparametric alternatives to ANOVA are needed.
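One way to write a model and variance decomposition consistent with the symbols used in this proposal is sketched below. It assumes an intercept $b_0$, an error term $\varepsilon$ with variance $\sigma^2_{\text{error}}$, and regressors $x_i$ with variances $\sigma_i^2$ that are uncorrelated with each other and with $\varepsilon$; none of these assumptions is stated in the comment itself.

```latex
% Regression form of the linear model (b_0 and \varepsilon are assumed names)
y = b_0 + \sum_{i=1}^{p} b_i x_i + \varepsilon

% Variance decomposition under the uncorrelatedness assumption
\operatorname{Var}(y) = \sum_{i=1}^{p} b_i^2 \sigma_i^2 + \sigma^2_{\text{error}}
```

Without the uncorrelatedness assumption, covariance terms $2 b_i b_j \operatorname{Cov}(x_i, x_j)$ would also appear, and the variance would no longer split cleanly into one component per regressor.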
But "tests" are not basic to ANOVA, and neither are models involving distributions for model errors: read the label "analysis of variance". The basis is "explained variance" and a comparison of the explained variance of a sequence of models. Then for a sequence of nested models this becomes the question of how much more variance is explained by the extra complexity of one model expanded from another. Of course, in general ANOVA there is not necessarily a unique way of sequencing a nested set of models. But the initial steps are getting the increase in explained variance in terms of the increase in the sum of squares of the predictions and then converting these to a mean square. Basic information at an intuitive level (the relative importance of components in a model) can be gained simply by comparing the numerical values of these mean squares. All of this is at the level of "least squares" and does not involve modelling using either type of model for random components. Of course, once the "explained SS due to a model component" have been defined by least squares, these can be interpreted in the context of either or both types of sources of random variation. At this stage the models provide, firstly, an extra guide as to how to interpret the mean squares (in terms of their expected values) and only then provide the possibility of formal hypothesis testing. ANOVA procedures are a tool of practical statistics, not something derived ab initio to be optimal under one pre-specified model in theoretical statistics, and it would be misleading to present it as such.
Melcombe ( talk) 08:25, 20 April 2012 (UTC)
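The "sequence of nested models" view can be illustrated with a small least-squares calculation: fit a mean-only model, then add factor A, then add factor B, and record how much the residual sum of squares drops at each step. The sketch uses a hypothetical balanced two-way table with one observation per cell, where the least-squares fits have the closed forms used below; in unbalanced data the fits (and the dependence on the order of the sequence) would need a general least-squares solver.

```python
# Hypothetical balanced two-way layout, one observation per cell:
# rows are levels of factor A, columns are levels of factor B.
y = [
    [1.0, 2.0, 3.0],
    [3.0, 5.0, 4.0],
    [5.0, 6.0, 7.0],
]
h, k = len(y), len(y[0])

grand = sum(map(sum, y)) / (h * k)
row_mean = [sum(row) / k for row in y]
col_mean = [sum(y[i][j] for i in range(h)) / h for j in range(k)]


def rss(fitted):
    """Residual sum of squares for a model given as a function fitted(i, j)."""
    return sum((y[i][j] - fitted(i, j)) ** 2 for i in range(h) for j in range(k))


# Nested sequence: mean only -> mean + A -> mean + A + B (additive).
rss_mean_only = rss(lambda i, j: grand)
rss_with_a = rss(lambda i, j: row_mean[i])
rss_with_a_and_b = rss(lambda i, j: row_mean[i] + col_mean[j] - grand)

# Extra sum of squares explained at each step, then converted to mean squares.
ss_a = rss_mean_only - rss_with_a
ss_b = rss_with_a - rss_with_a_and_b
ms_a = ss_a / (h - 1)
ms_b = ss_b / (k - 1)
ms_residual = rss_with_a_and_b / ((h - 1) * (k - 1))

print(f"SS_A = {ss_a:.3f}, SS_B = {ss_b:.3f}, residual SS = {rss_with_a_and_b:.3f}")
print(f"MS_A = {ms_a:.3f}, MS_B = {ms_b:.3f}, MS_residual = {ms_residual:.3f}")
```

Comparing the mean squares (12, about 2.33 and about 0.33 for these numbers) already indicates the relative importance of the two factors, before any distributional model or test is introduced.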
This article is opaque to those innocent of the terminology of the design of experiments. Defining terminology offers some help. Many technical terms remain undefined. Editors should feel particularly free to remove terms unused or already defined in the text. 159.83.196.1 ( talk) 23:31, 20 June 2012 (UTC)
While I improved the consistency of the math symbols used, the result was a rich mixture of fonts. A typesetter could substantially improve on my effort. 159.83.196.1 ( talk) 20:54, 7 July 2012 (UTC)
Derived linear model section: "However, there are differences. For example, the randomization-based analysis results in a small but (strictly) negative correlation between the observations." The statement is supported by two citations, both broken. Bailey's book does not have a section 1.14 (sorry). I found support in the equations (not the prose) of H&K. The structure of the covariance matrix equations implies that total errors are independent but that observational errors are not. This requires more explanation than the section justifies. I removed the weakly supported sentences. 159.83.196.1 ( talk) 23:30, 13 September 2012 (UTC)
Texts use different conventions regarding subscripts. Howell (2002) uses i as an index into experimental units and j as an index into treatment groups. Howell's convention is used here to define additivity. Montgomery (2001) follows a long tradition of reversing the roles of i & j. Montgomery's convention is hinted at here by defining I as the number of treatments. I propose to adopt Montgomery's convention regarding subscripts throughout. Objections? 159.83.196.1 ( talk) 21:34, 12 October 2012 (UTC)
The F-test is used in planning the experiment and the anova, because the non-centrality parameter shifts the F-distribution to the right. Using t-tests to plan experiments, as Bailey does in an otherwise fine book, results in larger numbers of subjects than needed, in many cases. This is not discussed, despite it being the main motivation. (Non-central t-distributions are less readily accessible, and don't appear in textbooks on Anova.)
03:22, 9 April 2013 (UTC)
A major enhancement, deserving expert review was made to the sister article: One-way analysis of variance 159.83.196.1 ( talk) 21:35, 30 October 2012 (UTC)
The weird formatting of this article seems to point to some copy-and-paste issues. I've addressed some of them, but this diff shows major differences in the formatting. I couldn't find any evidence of a copyright violation myself, but the book source might have been used. [3]
The section "ANOVA for multiple factors" points to the main article Two-way analysis of variance which is now a stub. Editors do not wish Wikipedia to contain lengthy examples. 159.83.196.1 ( talk) 20:03, 12 July 2013 (UTC)
The example in the introduction is excellent, but some elaboration is needed of the statement: "An attempt to explain the weight distribution by dividing the dog population into groups (young vs old)(short-haired vs long-haired) would probably be a failure (no fit at all)."
Someone thinking of the weight distribution as an empirical histogram can object that any histogram can be written as a sum of histograms corresponding to subgroups of the population. The phrase "no fit at all" might be interpreted as a claim that the blue histograms do not actually add up to the yellow one.
What the sentence is trying to convey is that "success" at dividing up the population into categories means that if you are given the category of a dog, you can use the corresponding histogram to estimate the dog's weight well. Hence "no fit at all" refers to the fact that if you are told a dog is (for example) young, then you can't make a good guess of the dog's weight by using the weight histogram of young dogs.
Tashiro ( talk) 06:18, 22 March 2014 (UTC)
The term "treatment" is apparently central in this exposition of ANOVA, it appears early in the "Background and terminology" section, but ... before it is defined!
The later definition of "treatment" in the "Design-of-experiments terms" section says it is "a combination of factor levels." What kind of combination of levels? A sum of the level numbers? Look up "factor" to find out about factor levels: a factor is an investigator-manipulated process that causes a change in output. What kind of process might this mean? Adding and removing data? Why would you do that? Output of what? Might this "output" refer to how variance changes when the investigator manipulates the data like this? How can a process have "levels?"
Please clarify. — Preceding unsigned comment added by Randallbsmith ( talk • contribs) 19:13, 1 May 2014 (UTC)
Hello fellow Wikipedians,
I have just modified 2 external links on Analysis of variance. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).
Cheers.— InternetArchiveBot ( Report bug) 08:52, 12 October 2016 (UTC)
I think this is either wrong, or needs some clarification. Surely if the multiple t-tests are taken at face value and no correction is applied, then it is they that are less conservative than ANOVA, and not the other way around, as they will result in more type I errors. If 'conservative' here simply means 'producing fewer type I errors', then ANOVA is more conservative than uncorrected multiple t-tests, not less. Right? L T T H U ( talk) 11:30, 5 February 2016 (UTC)
I also noticed this and I agree. By definition, any statistical test should give type I errors with a frequency equal to the significance level -- so it does not depend on the test. I have marked this as requiring a citation. -- Denziloe ( talk) 16:37, 16 August 2017 (UTC)
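The point about uncorrected multiple t-tests can be illustrated with a short calculation. This is only a sketch: it assumes the individual tests are independent, which pairwise t-tests sharing the same groups are not, so the numbers are an approximation rather than exact error rates. The function names are hypothetical and chosen for illustration.

```python
from math import comb


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one type I error across n_tests tests,
    each run at significance level alpha, ASSUMING the tests are independent."""
    return 1.0 - (1.0 - alpha) ** n_tests


def n_pairwise_comparisons(n_groups: int) -> int:
    """Number of distinct pairwise comparisons among n_groups groups."""
    return comb(n_groups, 2)


if __name__ == "__main__":
    alpha = 0.05
    for k in (3, 4, 5):
        m = n_pairwise_comparisons(k)
        rate = familywise_error_rate(alpha, m)
        print(f"{k} groups, {m} t-tests: familywise rate about {rate:.3f}")
```

Under that independence assumption, the chance of at least one false positive grows with the number of comparisons, whereas a single omnibus F-test is run once at the stated level. That is the sense in which the uncorrected t-tests are the less conservative procedure.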
Under "Design of Experiments Terms" for the Two Way ANOVA Table, the error DF reads (h-1)*(k-1). I believe this is incorrect. For a two way with no interaction term it should be N-h-k+1. Is that correct? — Preceding unsigned comment added by 2601:8C:C000:F9B3:C1CF:1AF3:44D:CA67 ( talk) 01:20, 11 December 2018 (UTC)
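A short degrees-of-freedom count may help settle this. It is a sketch assuming an additive two-way model (no interaction term) with h levels of one factor, k levels of the other, and N observations in total; the labels "rows" and "columns" are introduced here only for illustration.

```latex
\begin{align*}
df_{\text{error}} &= df_{\text{total}} - df_{\text{rows}} - df_{\text{columns}} \\
                  &= (N - 1) - (h - 1) - (k - 1) \\
                  &= N - h - k + 1 .
\end{align*}
% Special case: exactly one observation per cell, so N = hk
\begin{align*}
N - h - k + 1 = hk - h - k + 1 = (h - 1)(k - 1) .
\end{align*}
```

So N-h-k+1 is the general form, and (h-1)*(k-1) agrees with it only when there is a single observation per cell (N = hk).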
Article has bold phrases within quotations:
As a result: ANOVA "has long enjoyed the status of being the most used (some would say abused) statistical technique in psychological research."[14] ANOVA "is probably the most useful technique in the field of statistical inference."[15]
I suspect that this formatting doesn't appear in the original sources, and so should either be removed or else noted as an editorial modification introduced here.
—DIV ( 120.17.160.228 ( talk) 05:55, 21 January 2019 (UTC))
I tried to learn the basics of ANOVA from this article, but was forced to find better sources. Half the article goes by before the calculations involved are even defined, which may be fine for a textbook, but as an encyclopedia article this should get to the point much faster. In my opinion, most of the terminology, types of models, and characteristics should be moved later. The assumptions should also be moved later, but maybe with a couple of sentences of summary added to the lede or where relevant. The background section should be shortened significantly: it's not necessary to explain every detail about statistical testing in this section. For example, let's take the following paragraph: "By construction, hypothesis testing limits the rate of Type I errors (false positives) to a significance level. Experimenters also wish to limit Type II errors (false negatives). The rate of Type II errors depends largely on sample size (the rate is larger for smaller samples), significance level (when the standard of proof is high, the chances of overlooking a discovery are also high) and effect size (a smaller effect size is more prone to Type II error)." This is basically the third time that hypothesis testing has been defined in this section (each time slightly differently) and so the whole thing can probably just be omitted. Readers can follow the link to statistical hypothesis testing if they want a detailed explanation.
In particular, the "background and terminology" section is a rambling mess that doesn't actually provide any "background" and doesn't actually explain any "terminology" (except for sloppy definitions of a few general concepts in null hypothesis testing, such as statistical significance). Most of the section is unsourced, and the rest is unclearly sourced (e.g., what is Gelman, 2005?). I'm removing the section, since it likely adds more confusion than useful information. 23.242.195.76 ( talk) 08:22, 24 July 2021 (UTC)
Sorry, but this page really completely useless. It's a bit like an "About" box for ANOVA. Gives you a hint about what it might be. But without any actual worked examples, showing you how to get the results you need, and then how to interpret those results, nothing useful is learned. Barfly42 15:24, 17 December 2006 (UTC)
I changed the definition of ANOVA (the first paragraph). The old definitions was not generale enough. ANOVA can be used to compare several distribution, but it is just an example of its applications. Some work should be devoted to change the definitions of fixed and random effects. Gideon Fell 10:33, 14 March 2007 (UTC)
In the examples section it says: "one-way ANOVA with repeated measures". Shouldn't it be "two way ANOVA" since the experiment of using the same subjects with repeated measurements matches the description of "two way ANOVA" from the Overview section. —Preceding unsigned comment added by 190.64.28.235 ( talk) 15:21, 27 October 2009 (UTC)
According to "Statistics for experimenters" by Box, Hunter and Hunter the use of ANOVA makes no sense for factorial experiments (section 5.10 "Misuse of the ANOVA for 2^k factorial experiments"). This appears to be because each combination of factors only has a single degree of freedom, so the value actually calculated is equivalent to referral to a t table. This seems to be a common misconception. —The preceding unsigned comment was added by 129.215.37.22 ( talk) 20:46, 14 March 2007 (UTC).
There is a link to "Factorial ANOVA" in the 'Overview' section, which links back to the ANOVA page(this same article!). This is stupid. Either the link should be removed, or a "Factorial ANOVA" page should be added, or it should be an anchor going to a "Factorial ANOVA" section. —Preceding unsigned comment added by Mr.maddamsetti ( talk • contribs) 20:10, 23 March 2010 (UTC)
Could do with at least one example of an ANOVA table here, either with numbers in or showing notation for sums of squares, mean squares etc. Maybe also worth mentioning skeleton ANOVA tables, i.e. showing with entries only for df. I may add these myself at some point... Qwfp ( talk) 18:18, 19 January 2008 (UTC)
it might be well worth the effort to add a section describing visualization techniques of ANOVA. through plots such as boxplot and others. I am willing to have a go at it, but I don't know who is responsible to this article and don't want to step anyone's tows... (p.s: I am currently doing my second degree in biostatistics) Talgalili ( talk) 11:49, 1 December 2008 (UTC)
In the article there are four different notations, which unnecessarily confuse the reader:
Which notation should Wikipedia / the editors opt for?
Might I suggest:
where
A is treatment (factor A)
T is total
E is error
What are your thoughts on this?
Ostracon (
talk)
15:57, 16 August 2009 (UTC)
ANOVA assumes neither
nor
What is true is that the p-values of the randomization test of the ANOVA null-hypothesis are well approximated by the p-values of the F test using the F-distribution (Chapter 6, Hinkelmann & Kempthorne).
(Of course, it is easier to teach the mechanics of ANOVA testing by assuming a so-called "normal" linear model and using the F-distribution.)
Therefore, the "Assumptions" sections needs revision, imho. Kiefer.Wolfowitz ( talk) 17:10, 24 November 2009 (UTC)
There is a well known saying in mathematics and statistics: "A mathematician is only given credit for his discoveries that his colleagues agree to give him." Quoting an expository article by Seaman a decade after the discovery work of Sawilowsky on the rank transform is not only unfair, but is typical of the shallow scholarship that is becoming legendary on Wiki. So, I've decided to jump right in and set the record straight, although by now I know that scholarship and Wiki warriors rarely peacefully coexist.
I also took the opportunity to move the references misplaced in the middle of the article to the end, and put them in alpha order. —Preceding unsigned comment added by 141.217.105.21 ( talk) 15:09, 4 January 2010 (UTC)
Hettmansperger and McKean's book "Robust Nonparametric Statistical Methods"
{{
cite book}}
: Unknown parameter |location2=
ignored (
help); Unknown parameter |publisher2=
ignored (
help)
MR
1604954state that the "R transform" works well, even for small samples (McKean and Seavers), pages 254-258. Distinguishing the R-transform from the Rank-transform is difficult for the public . Maybe the article should discuss the R-transform (following Hetmansperger & McKean, the best authority known to this amateur) first and foremost. Then the article can continue to discuss the rank-transform, and mention that its use seems to be deprecated (for some time). Would that be agreeable? Sincerely, Kiefer.Wolfowitz ( talk) 21:35, 19 January 2010 (UTC)
I looked in the recent SAS Linear Models and Mixed Linear Models books, and they contain no references to rank (as far as I can see). Would an editor please either provide a current reference or please delete/modify the statements about SAS? Again, Statistical Science in 2004 had a lot of papers on rank-based procedurs in its special issue on nonparametrics´, so it doesn't seem useful to include a reference to rank-based methods in the 1980s. Kiefer.Wolfowitz ( talk) 05:04, 14 January 2010 (UTC)
This section is confusing two different effect size measures when it states the following:
"The generally-accepted regression benchmark for effect size comes from (Cohen, 1992; 1988): 0.20 is a minimal solution (but significant in social science research); 0.50 is a medium effect; anything equal to or greater than 0.80 is a large effect size (Keppel & Wickens, 2004; Cohen, 1992)."
"Nevertheless, alternative rules of thumb have emerged in certain disciplines: Small = 0.01; medium = 0.06; large = 0.14 (Kittler, Menard & Phillips, 2007)."
The first paragraph refers to rule of thumb guidelines for categorizing Cohen's d. The second paragraph refers to rule of thumb guidelines for categorizing eta-squared.
68.54.107.114 ( talk) 02:17, 11 January 2010 (UTC)AmateurStatistician
Generally there seems to be a bit of a disorder in ANOVA information. There is a One-way ANOVA page on wikipedia but the two way ANOVA page redirects here without giving any reasonable comparison or differentiation between the two. Since this is one of the most widely used tests in social sciences it should be clear what are the distinctions in clear and simple terms. JakubHampl ( talk) 17:31, 17 April 2010 (UTC)
Perhaps something should be said about ANOVAs not always having explanatory power wrt causality (in observational studies). This is perhaps most controversial in heritability estimates, particularly in human subjects. From Stoltenberg, S. F. (1997). "Coming to terms with heritability". Genetica. 99 (2–3): 89–96. doi: 10.1023/A:1018366705281. PMID 9463077.
“ | However, the language that surrounds the partitioning of variance is prone to misunderstanding in its own right (Lewontin, 1974; Kempthorne, 1978), therefore I avoid using terms such as ‘due to’ or ‘caused by’ when referring to the statistical relations between an independent variable and a dependent variable (e.g., in an analysis of variance [ANOVA]), but instead use terms such as ‘associated with’ to avoid deterministic implications. | ” |
The papers cited which go into more detail on this are: Lewontin and Kempthorne. Note that "due to" is used here right in the lead. Tijfo098 ( talk) 05:06, 26 October 2010 (UTC)
I put the cleanup tag due to sheer size and unclear usefulness to the article. Ideally, they should be references directly used in the article, through footnotes. Here, they seem not to be attached to footnotes and are more of a "further reading" section (See WP:CITE and WP:LAY), but it still contain duplications. For examples, SAS user guides from 25 and 23 years ago are listed. Same book, different editions: why are are they both listed? Is it content that is in one but not the other and both are relevant to Anova? Why not cite a more recent edition which would be more up to date and accessible instead? It seems to me quality should be picked over quantity.-- 137.122.49.102 ( talk) 19:04, 3 November 2010 (UTC)
Following the anonymous editor's concerns, I removed this section, but include it here for archival purposes and to facilitate its use in a stand-alone article:
When the data do not meet the assumptions of normality, the suggestion has arisen to replace each original data value by its rank (from 1 for the smallest to N for the largest), then run a standard ANOVA calculation on the rank-transformed data. Conover and Iman (1981) provided a review of the four main types of rank transformations. Commercial statistical software packages (e.g., SAS, 1985, 1987, 2008) followed with recommendations to data analysts to run their data sets through a ranking procedure (e.g., PROC RANK) prior to conducting standard analyses using parametric procedures.
This rank-based procedure has been recommended as being robust to non-normal errors, resistant to outliers, and highly efficient for many distributions. It may result in a known statistic (e.g., Wilcoxon Rank-Sum / Mann-Whitney U), and indeed provide the desired robustness and increased statistical power that is sought. For example, Monte Carlo studies have shown that the rank transformation in the two independent samples t test layout can be successfully extended to the one-way independent samples ANOVA, as well as the two independent samples multivariate Hotelling's T2 layouts (Nanna, 2002).
Conducting factorial ANOVA on the ranks of original scores has also been suggested (Conover & Iman, 1976, Iman, 1974, and Iman & Conover, 1976). However, Monte Carlo studies by Sawilowsky (1985a; 1989 et al.; 1990) and Blair, Sawilowsky, and Higgins (1987), and subsequent asymptotic studies (e.g. Thompson & Ammann, 1989; "there exist values for the main effects such that, under the null hypothesis of no interaction, the expected value of the rank transform test statistic goes to infinity as the sample size increases," Thompson, 1991, p. 697), found that the rank transformation is inappropriate for testing interaction effects in a 4x3 and a 2x2x2 factorial design. As the number of effects (i.e., main, interaction) become non-null, and as the magnitude of the non-null effects increase, there is an increase in Type I error, resulting in a complete failure of the statistic with as high as a 100% probability of making a false positive decision. Similarly, Blair and Higgins (1985) found that the rank transformation increasingly fails in the two dependent samples layout as the correlation between pretest and posttest scores increase. Headrick (1997) discovered the Type I error rate problem was exacerbated in the context of Analysis of Covariance, particularly as the correlation between the covariate and the dependent variable increased. For a review of the properties of the rank transformation in designed experiments see Sawilowsky (2000).
A variant of rank-transformation is 'quantile normalization' in which a further transformation is applied to the ranks such that the resulting values have some defined distribution (often a normal distribution with a specified mean and variance). Further analyses of quantile-normalized data may then assume that distribution to compute significance values. However, two specific types of secondary transformations, the random normal scores and expected normal scores transformation, have been shown to greatly inflate Type I errors and severely reduce statistical power (Sawilowsky, 1985a, 1985b).
According to Hettmansperger and McKean [1] "Sawilowsky (1990) [2] provides an excellent review of nonparametric approaches to testing for interaction" in ANOVA.
I believe that most of these books and articles are related to Sawilowski's publications or unpublished writings, and were added in excellent faith by Edstat, I add in good faith (having just removed many references that were zealously added by me, when I was evangelizing for generalized randomized block designs!). I'll come back and look for references to them in other sections. Again, they would be very useful in an article about academics closely associated with Sawilowski (not necessarily on Wikipedia) or in a stand alone article on rank-transforms, if that is a notable topic (e.g. is it covered in statistical encyclopedias or recent surveys in notable reliable journals?). Thanks Kiefer.Wolfowitz ( talk) 19:43, 3 November 2010 (UTC)
{{
cite journal}}
: CS1 maint: multiple names: authors list (
link){{
cite journal}}
: CS1 maint: multiple names: authors list (
link){{
cite journal}}
: CS1 maint: multiple names: authors list (
link){{
cite journal}}
: CS1 maint: multiple names: authors list (
link){{
cite book}}
: Unknown parameter |location2=
ignored (
help); Unknown parameter |publisher2=
ignored (
help)
MR
1604954{{
cite journal}}
: CS1 maint: multiple names: authors list (
link){{
cite journal}}
: CS1 maint: multiple names: authors list (
link){{
cite journal}}
: CS1 maint: multiple names: authors list (
link){{
cite journal}}
: CS1 maint: multiple names: authors list (
link)I note that Hettsmansberger and McKean is notable and reliable, given the writers' being asked to be head editors of e.g. the Statistical Science special issue on nonparametrics or to write the JASA 2000 article reviewing nonparametrics and robust statistics. (I am happy that, as first noted in the article on Sawilowski, that H & McK have nice comments in a few pages about Professor Sawilowski.) I don't see why the other articles should stay in an article on Anova here, unless they are cited by reliable books on ANOVA. Thanks, Kiefer.Wolfowitz ( talk) 19:47, 3 November 2010 (UTC)
Let the discussion begin! Kiefer.Wolfowitz ( talk) 19:32, 3 November 2010 (UTC)
This example has no randomized assignment of treatment to subjects. It seems that group-status is perfectly confounded with treatment, so this is a worthless "experiment". Kiefer.Wolfowitz ( talk) 20:48, 3 November 2010 (UTC)
In a first experiment, Group A is given vodka, Group B is given gin, and Group C is given a placebo. All groups are then tested with a memory task. A one-way ANOVA can be used to assess the effect of the various treatments (that is, the vodka, gin, and placebo).
In a second experiment, Group A is given vodka and tested on a memory task. The same group is allowed a rest period of five days and then the experiment is repeated with gin. The procedure is repeated using a placebo. A one-way ANOVA with repeated measures can be used to assess the effect of the vodka versus the impact of the placebo.
In a third experiment testing the effects of expectations, subjects are randomly assigned to four groups:
Each group is then tested on a memory task. The advantage of this design is that multiple variables can be tested at the same time instead of running two different experiments. Also, the experiment can determine whether one variable affects the other variable (known as interaction effects). A factorial ANOVA (2×2) can be used to assess the effect of expecting vodka or the placebo and the actual reception of either.
In a balanced design, the factors's induce an orthogonal decomposition of a Euclidean space; and the converse holds (see Bailey). First project the data onto the mean-value subspace, and then consider that subspace's orthogonal complement, which then needs be intersected with the subspaces of treatment & block subspaces (which may have further decompositions). The squared Euclidean norm of the projected residuals is the sum of squares. The degrees of freedom are the dimensions of the subspace.
With this orthogonality (orthomodularity), the sums of squares add nicely, regardless of any normality of the residuals.
This geometric account of Anova is given in friendlier fashion in Bailey, in Christensen, and in the very friendly Saville & Woods (in 2 volumes) for example. It should be given here. Kiefer.Wolfowitz ( talk) 21:11, 3 November 2010 (UTC)
This article suffers from obtuse pedagogy (it's essentially useless) to downright inaccurate information about ANOVA, its assumptions, and its small sample robustness and power properties. (The ANOVA F test of difference in means is robust to departures from independence, homoscedasticity, and/or normality? Tell that to the hundreds of Monte Carlo studies published since 1980!) A thorough reading of the Monte Carlo literature after 1980 would benefit this article greatly. My suggestion is that the current editors step back and ask for some help, preferably not from the asymptotic maths lobby, but from qualified applied statisticians who have read the literature post 1980 (but for starters, read Glass, Peckham, & Sanders, 1972; Bradley, 1969, 1972, etc.; Blair, 1980, 1981, 1985, etc.; Sawilowsky, 1990, 1992, etc.) It's just a suggestion - don't reach for the aspirin or saltines. Edstat ( talk) 03:49, 15 November 2010 (UTC)
Editor Edstat raised concerns about a non-normality, and about heteroscedacity (alternatively, differing variances, or a failure of homoscedacity!), etc. In the section on the randomization analysis, references to Cox and to Kempthorne are given to support the statement that a proper randomization procedure and unit-treatment additivity imply constant variance. Thus result explains why both Cox & Kempthorne (and Rosenbaum, Rubin, Imbens, Abadie, Angchrist, etc.) emphasize proper randomization and why they emphasize the unit-treatment additivity assumption. When this unit-treatment additivity is implausible, the analysis is more difficult (although local average unit-treatment additivity saves much of the standard analysis). While the article's few paragraphs are not a substitute for a textbook, they at least sketch the central issues, and reference the most reliable sources. Edstat's claim that normality is so important is not supported by the analysis by these authors, who are usually regarded as the most reliable sources. Kiefer.Wolfowitz ( talk) 17:25, 19 November 2010 (UTC)
After reading this article, I am still left with absolutely no idea how this technique is actually employed. There are many references to "treatments" -- is it used exclusively in medical research? A fully-worked example (including computation) would be a great boon. 121a0012 ( talk) 05:59, 5 January 2011 (UTC)
Please see the following section (copied below):
"Though, considering that η2 are comparable to r2 when df of the numerator equals 1 (both measures proportion of variance accounted for), these guidelines may overestimate the size of the effect. If going by the r guidelines (0.1 is a small effect, 0.3 a medium effect and 0.5 a large effect) then the equivalent guidelines for eta-squared would be the squareroot of these, i.e. 01 is a small effect, 0.09 a medium effect and 0.25 a large effect, and these should also be applicable to eta-squared. When the df of the numerator exceeds 1, eta-squared is comparable to R-squared (Levine & Hullett, 2002)."
Note that it is self-contradictory. First it says "η2 are comparable to r2 when df of the numerator equals 1" and later says "When the df of the numerator exceeds 1, eta-squared is comparable to R-squared". Any suggestions on which is correct?
I also suggest that this section is removed until consensus is reached.
Trevorzink ( talk) 02:28, 6 April 2011 (UTC)
On this day Kiefer.Wolfowitz and I worked in mild opposition. He was removing references that I was strengthening. I will wait a few days to let the dust settle. 159.83.196.1 ( talk) 01:51, 14 March 2012 (UTC)
Under-cited: Cox - mentioned often in text, but no specific references cited. Two references in list. Freedman - no specific reference cited. Two references in list. Kempthorne - often mentioned without citation. Two references in list. 159.83.196.1 ( talk) 01:54, 14 March 2012 (UTC)
Cox (1958) is an odd and interesting reference for ANOVA. Analysis of
variance does not appear in the index. Page 12: "...methods of
statistical analysis will not be described in this book." He talks
some about models, specifically additivity (on page 15) which is
applicable to this article. Is Cox (cited repeatedly) one of the best
references on a subject admittedly not discussed half a century ago?
I suspect a flawed citation. Cox makes frequent mention
(30 references) to Cochran & Cox (1957, 2e) Experimental Designs.
Page 292: "Cochran and Cox (1957) have given numerous detailed plans
as well as worked numerical examples of the analysis of biological
experiments."
Gertrude Mary Cox might be a better reference
than
Sir David Roxbee Cox for ANOVA.
Both books are statistical classics, available as reprints. Both
authors are mentioned favorably in
Design of experiments.
159.83.196.1 (
talk)
23:22, 1 May 2012 (UTC)
On April 4, 2012 readers of ANOVA assigned the following scores: Trustworthy 3.3 of 5; Objective 3.3 of 5; Complete 3.0 of 5; Well-written 2.3 of 5
Mathematicians assigned a "start" grade (less than a C). Statisticians assigned a grade of C. Both groups claimed ANOVA to be important. The currency of these grades is unknown.
The article was recently locked (no edits) for a few days over vandalism concerns. 159.83.196.1 ( talk) 21:04, 5 April 2012 (UTC)
(The following, or similar, needs to be added. It's only hinted at, in the article, and all the links substantiate it. Cites, of course, can be added. 72.37.249.60 ( talk) 19:06, 18 April 2012 (UTC) )
Consistent with ANOVA's purpose to analyze a variable's components attributable to different sources of variation, it is possible to view most any general linear model as a regression.
where the xi, i=1,2,...,p, are quantitative variables, in some cases merely 0 or 1 representing the absence or presence, respectively, of different levels of qualitative variables, and where their multiple degrees of freedom are distributed with df 1 assigned to each level as a separate variable. From this model, consistent with standard formulas for expectation and variance,
and bi2σi2 is the variance component of y attributable to xi. The size of the estimated variance component, usually relative to mse, the estimated σ2error, determines xi's significance in the model of y. At least asymptotic normality of the estimators is a fundamental assumption, allowing F-tests or t-tests of "H0:xi's contribution is insignificant.", but there are instances when the assumption of normality is unjustified and nonparametric alternatives to ANOVA are needed.
But "tests" are not basic to ANOVA, and neither are models involving distributions for models errors: read the label "analysis of variance". The basis is "explained variance" and a comparison of the explained variance of a sequence of models. Then for a sequence of nested models this becomes the question of how much more variance is explained by the extra complexity of one model expanded from another. Of course in general ANOVA there is not necessarily a unique way of sequencing a nested set of models. But the initial steps are getting the increase in explained variance in terms of increase in the sum of squares of the predictions and then converting these to a mean square. Basic information at an intuitive level (the relative importance of components in a model) can be gained simply by comparing the numerical values of these mean squares. All of this is at the level of "least squares" and does not involve modelling using either type of model for random components. Of course, once the "explained SS due to a model component" have been defined by least squares these can be interpreted in the context of either or both types of sources of random variation. At this stage the models provide, firstly, an extra guide as to how to interpret the mean squares (in terms of their expected values) and only then provide the possibility of formal hypothesis testing. ANOVA procedures are a tool of practical statistics, not something derived ab initio to be optimal under one pre-specified model in theoretical statistics and it would be misleading to present it as such.
Melcombe (
talk)
08:25, 20 April 2012 (UTC)
This article is opaque to those innocent of the terminology of the design of experiments. Defining terminology offers some help. Many technical terms remain undefined. Editors should feel particularly free to remove terms unused or already defined in the text. 159.83.196.1 ( talk) 23:31, 20 June 2012 (UTC)
While I improved the consistency of the math symbols used, the result was a rich mixture of fonts. A typesetter could substantially improve on my effort. 159.83.196.1 ( talk) 20:54, 7 July 2012 (UTC)
Derived linear model section: "However, there are differences. For example, the randomization-based analysis results in a small but (strictly) negative correlation between the observations." The statement is supported by two citations, both broken. Bailey's book does not have a section 1.14 (sorry). I found support in the equations (not the prose) of H&K. The structure of the covariance matrix equations implies that total errors are independent but that observational errors are not. This requires more explanation than the section justifies. I removed the weakly supported sentences. 159.83.196.1 ( talk) 23:30, 13 September 2012 (UTC)
Texts use different conventions regarding subscripts. Howell (2002) uses i as an index into experimental units and j as an index into treatment groups. Howell's convention is used here to define additivity. Montgomery (2001) follows a long tradition of reversing the roles of i & j. Montgomery's convention is hinted at here by defining I as the number of treatments. I propose to adopt Montgomery's convention regarding subscripts throughout. Objections? 159.83.196.1 ( talk) 21:34, 12 October 2012 (UTC)
The F-test is used in planning the experiment and the anova, because the non-centrality parameter shifts the F-distribution to the right. Using t-tests to plan experiments, as Bailey does in an otherwise fine book, results in larger numbers of subjects than needed, in many cases. This is not discussed, despite it being the main motivation. (Non-central t-distributions are less readily accessible, and don't appear in textbooks on Anova.)
03:22, 9 April 2013 (UTC)
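For reference, the role of the non-centrality parameter in planning can be written out. This is a sketch for a balanced one-way layout. The symbols are assumed here rather than taken from the article: $I$ treatments, $n$ units per treatment, treatment effects $\tau_i$, and error variance $\sigma^2$.

```latex
F = \frac{MS_{\text{treatments}}}{MS_{\text{error}}} \sim F'_{I-1,\; I(n-1)}(\lambda),
\qquad
\lambda = \frac{n \sum_{i=1}^{I} \tau_i^{2}}{\sigma^{2}},
\qquad
\text{power} = \Pr\!\left( F > F_{\alpha;\, I-1,\, I(n-1)} \right).
```

Under the null hypothesis $\lambda = 0$ and $F$ follows the central F-distribution. A larger $n$ raises $\lambda$, which shifts the distribution to the right and increases power.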
A major enhancement, deserving expert review was made to the sister article: One-way analysis of variance 159.83.196.1 ( talk) 21:35, 30 October 2012 (UTC)
The weird nature of this article's formatting seems to point to some copy-and-paste issues. I've addressed some of them, but this diff shows major issues in the formatting. I couldn't find any evidence of a CV myself, but the book source might be used. [3]
The section "ANOVA for multiple factors" points to the main article Two-way analysis of variance which is now a stub. Editors do not wish Wikipedia to contain lengthy examples. 159.83.196.1 ( talk) 20:03, 12 July 2013 (UTC)
The example in the introduction is excellent, but some elaboration is needed of the statement: "An attempt to explain the weight distribution by dividing the dog population into groups (young vs old)(short-haired vs long-haired) would probably be a failure (no fit at all)."
Someone thinking of the weight distribution as an empirical histogram can object that any histogram can be written as a sum of histograms corresponding to subgroups of the population. The phrase "no fit at all" might be interpreted as a claim that the blue histograms do not actually add up to the yellow one.
What the sentence is trying to convey is that "success" at dividing up the population into categories means that, if you are given the category of a dog, you can use the corresponding histogram to estimate the dog's weight well. Hence "no fit at all" refers to the fact that if you are told that a dog is (for example) young, then you can't make a good guess of the dog's weight by using the weight histogram of young dogs.
Tashiro ( talk) 06:18, 22 March 2014 (UTC)
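One way to make this notion of "fit" concrete is to compare the spread of weights within each category against the spread of the whole population: a grouping fits well when knowing the category shrinks the uncertainty about weight. The sketch below uses invented weights (hypothetical numbers, not data from the article) and a hypothetical helper `fraction_explained`. The breed-like grouping is an assumption added for contrast, and none of this is the article's own calculation.

```python
from statistics import fmean

def fraction_explained(groups):
    """Share of the total sum of squares accounted for by the group means."""
    all_values = [w for g in groups.values() for w in g]
    grand_mean = fmean(all_values)
    ss_total = sum((w - grand_mean) ** 2 for w in all_values)
    ss_within = sum((w - fmean(g)) ** 2 for g in groups.values() for w in g)
    return 1 - ss_within / ss_total

# Hypothetical weights in kg, invented purely for illustration.
by_age = {"young": [5, 20, 40, 60], "old": [6, 22, 38, 62]}
by_size_class = {
    "toy": [5, 6, 7, 6],
    "working": [38, 40, 42, 40],
    "giant": [60, 62, 61, 63],
}

print(fraction_explained(by_age))         # close to 0: age says little about weight
print(fraction_explained(by_size_class))  # close to 1: the category predicts weight
```

With these made-up numbers the age grouping explains almost none of the variation, which is the sense of "no fit at all" described above, even though the subgroup histograms still add up to the overall one.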
The term "treatment" is apparently central to this exposition of ANOVA. It appears early in the "Background and terminology" section, but ... before it is defined!
The later definition of "treatment" in the "Design-of-experiments terms" section says it is "a combination of factor levels." What kind of combination of levels? A sum of the level numbers? Look up "factor" to find out about factor levels: a factor is an investigator-manipulated process that causes a change in output. What kind of process might this mean? Adding and removing data? Why would you do that? Output of what? Might this "output" refer to how variance changes when the investigator manipulates the data like this? How can a process have "levels?"
Please clarify. — Preceding unsigned comment added by Randallbsmith ( talk • contribs) 19:13, 1 May 2014 (UTC)
Hello fellow Wikipedians,
I have just modified 2 external links on Analysis of variance. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).
This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 5 June 2024).
Cheers.— InternetArchiveBot ( Report bug) 08:52, 12 October 2016 (UTC)
I think this is either wrong, or needs some clarification. Surely if the multiple t-tests are taken at face value and no correction is applied, then it is they that are less conservative than ANOVA, and not the other way around, as they will result in more type I errors. If 'conservative' here simply means 'producing fewer type I errors', then ANOVA is more conservative than uncorrected multiple t-tests, not less. Right? L T T H U ( talk) 11:30, 5 February 2016 (UTC)
I also noticed this and I agree. By definition, any statistical test should give type I errors with a frequency equal to the significance level -- so it does not depend on the test. I have marked this as requiring a citation. -- Denziloe ( talk) 16:37, 16 August 2017 (UTC)
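The point about uncorrected t-tests can be checked with a little arithmetic. This is a rough sketch that assumes the pairwise tests are independent. They are not exactly independent, since they share groups, so the figures are approximations. The helper name is hypothetical.

```python
from math import comb

def familywise_error_rate(num_groups, alpha=0.05):
    """Approximate chance of at least one false positive across all
    uncorrected pairwise tests, assuming the tests are independent."""
    num_tests = comb(num_groups, 2)  # one t-test per pair of groups
    return 1 - (1 - alpha) ** num_tests

for k in (2, 3, 5):
    print(k, round(familywise_error_rate(k), 3))
# 2 0.05
# 3 0.143
# 5 0.401
```

A single F-test at the 0.05 level keeps the type I error rate at 0.05 when the null hypothesis holds, so under this reading the uncorrected t-tests are the less conservative procedure.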
Under "Design of Experiments Terms" for the Two Way ANOVA Table, the error DF reads (h-1)*(k-1). I believe this is incorrect. For a two way with no interaction term it should be N-h-k+1. Is that correct? — Preceding unsigned comment added by 2601:8C:C000:F9B3:C1CF:1AF3:44D:CA67 ( talk) 01:20, 11 December 2018 (UTC)
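The two expressions can be compared directly. This is a minimal sketch that assumes h and k are the numbers of levels of the two factors, N is the total number of observations, and the model has no interaction term. The function names are hypothetical.

```python
def error_df_additive(n_obs, h, k):
    """Error degrees of freedom for a two-way model without interaction:
    total (N - 1) minus factor A (h - 1) minus factor B (k - 1)."""
    return (n_obs - 1) - (h - 1) - (k - 1)  # equals N - h - k + 1

def error_df_table(h, k):
    """The (h - 1)(k - 1) entry quoted from the article's table."""
    return (h - 1) * (k - 1)

h, k = 3, 4
# One observation per cell: N = h * k, and the two formulas agree.
print(error_df_additive(h * k, h, k), error_df_table(h, k))      # 6 6
# Two observations per cell: N = 2 * h * k, and they differ.
print(error_df_additive(2 * h * k, h, k), error_df_table(h, k))  # 18 6
```

So the table entry matches N - h - k + 1 only when each cell holds a single observation (N = hk). With replication, N - h - k + 1 is the more general form.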
Article has bold phrases within quotations:
As a result: ANOVA "has long enjoyed the status of being the most used (some would say abused) statistical technique in psychological research."[14] ANOVA "is probably the most useful technique in the field of statistical inference."[15]
I suspect that this formatting doesn't appear in the original resources, and so should be either removed, or else noted as an editorial modification introduced here.
—DIV ( 120.17.160.228 ( talk) 05:55, 21 January 2019 (UTC))
I tried to learn the basics of ANOVA from this article, but was forced to find better sources. Half the article goes by before the calculations involved are even defined, which may be fine for a textbook, but as an encyclopedia article this should get to the point much faster. In my opinion, most of the terminology, types of models, and characteristics should be moved later. The assumptions should also be moved later, perhaps with a couple of sentences of summary added to the lede or where relevant. The background section should be shortened significantly: it's not necessary to explain every detail about statistical testing in this section. For example, let's take the following paragraph: "By construction, hypothesis testing limits the rate of Type I errors (false positives) to a significance level. Experimenters also wish to limit Type II errors (false negatives). The rate of Type II errors depends largely on sample size (the rate is larger for smaller samples), significance level (when the standard of proof is high, the chances of overlooking a discovery are also high) and effect size (a smaller effect size is more prone to Type II error)." This is basically the third time that hypothesis testing has been defined in this section (each time slightly differently), so the whole thing can probably just be omitted. Readers can follow the link to statistical hypothesis testing if they want a detailed explanation.
In particular, the "background and terminology" section is a rambling mess that doesn't actually provide any "background" and doesn't actually explain any "terminology" (except for sloppy definitions of a few general concepts in null hypothesis testing, such as statistical significance). Most of the section is unsourced, and the rest is unclearly sourced (e.g., what is Gelman, 2005?). I'm removing the section, since it likely adds more confusion than useful information. 23.242.195.76 ( talk) 08:22, 24 July 2021 (UTC)