This user subpage is currently inactive and is retained for historical reference. If you want to revive discussion regarding the subject, you might try contacting the user in question or seeking broader input via a forum such as the village pump.
The datasets below are old (2006-7), tiny, and not useful except as a historical reference.
I was bored waiting for my very slow program to run, so I clicked "random article" 250 times and kept track of what kinds of articles popped up. 48 articles (19.2%) were stubs or had at least one cleanup tag. (I tried to count "citation needed" as a cleanup tag but may have missed a few.) The results as of 11 Nov 2006:
Type of article | Number | Percent of sample |
---|---|---|
Biography | 60 | 24% |
Places/geographical locations | 34 | 13.6% |
TV shows/movies | 17 | 6.8% |
Disambiguation | 15 | 6% |
Music/bands/albums | 14 | 5.6% |
Company/product/service | 13 | 5.2% |
History/war | 12 | 4.8% |
Politics/government | 9 | 3.6% |
Sports | 8 | 3.2% |
Organisms | 8 | 3.2% |
Definitions/common phrases/common objects | 7 | 2.8% |
Architecture/buildings | 7 | 2.8% |
Mythology/religion | 5 | 2% |
Astronomy/physics/space science | 5 | 2% |
Software/computing | 5 | 2% |
Games (including video) | 4 | 1.6% |
Literature/publications | 4 | 1.6% |
Biology/medicine | 3 | 1.2% |
Food/drink | 3 | 1.2% |
Schools | 3 | 1.2% |
Math | 2 | 0.8% |
Nonsense/unclassifiable | 2 | 0.8% |
Visual arts | 2 | 0.8% |
Philosophy/ethics | 2 | 0.8% |
Linguistics/languages | 2 | 0.8% |
Charities/nonprofit organizations | 2 | 0.8% |
Economics/finance | 1 | 0.4% |
Deleted and protected | 1 | 0.4% |
"Biography" is probably a bit inflated because I classified every article about an individual real person as a biography, including historical figures. Articles about fictional characters went in the category of the corresponding fiction (TV, myth, etc.).
Obviously this is a lousy way to determine Wikipedia coverage - 250 articles is a tiny sample. But the advantage over, say, counting category populations is that this avoids double-counting articles that sit in multiple categories, and it can find articles that are uncategorized or miscategorized. Special:Random also (as far as I know) excludes recently created articles that haven't yet been indexed, which filters out lots of nonsense speedy-deletion candidates. I don't think Special:Random would exclude deletion candidates, but none of the articles in this sample had prod or AfD templates.
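To put a rough number on how tiny a 250-article sample is, here is a minimal sketch that attaches a 95% margin of error to a few rows of the table. It assumes the clicks behave like a simple random sample and uses the normal approximation for a proportion; neither assumption comes from the original notes, and the function name `margin_of_error` is just illustrative.

```python
import math

SAMPLE_SIZE = 250  # number of "random article" clicks in the sample


def margin_of_error(count, n, z=1.96):
    """Half-width of a normal-approximation confidence interval for count/n.

    z = 1.96 corresponds to roughly 95% confidence. The approximation is
    poor when count is very small, so treat small-category results as a
    warning sign rather than a precise interval.
    """
    p = count / n
    return z * math.sqrt(p * (1 - p) / n)


# A few rows taken from the table above: (category, number of articles)
rows = [
    ("Biography", 60),
    ("Places/geographical locations", 34),
    ("Math", 2),
]

for label, count in rows:
    share = count / SAMPLE_SIZE
    moe = margin_of_error(count, SAMPLE_SIZE)
    print(f"{label}: {share:.1%} +/- {moe:.1%}")
```

For the largest category the uncertainty is already around five percentage points, and for the two-article categories the margin is larger than the estimate itself, so the bottom half of the table says very little about relative coverage.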
First-glance observations:
Inspired by Wikipedia:Wikipedia is failing and User:Worldtraveller/Wikipedia is failing (NB: leaving the redlink, in case further moves occur), I looked at a sample of 250 mainspace edits covering a time span of 04:43 to 04:46 UTC on 18 Feb 2007. (It would be interesting to gather these statistics again at a time when US schools are in session.) In this sample there were 159 edits by registered users, 89 edits by anonymous users, and 2 edits to a subsequently deleted image description page. Thus the percentages below take 248 edits as the total sample.
Change type | Percent of total sample (n = 248) | Registered edits as percent of total (n = 248) | Anonymous edits as percent of total (n = 248) | Percent of all registered edits (n = 159) | Percent of all anonymous edits (n = 89) |
---|---|---|---|---|---|
Substantial content changes | 5.2% | 4.0% | 1.2% | 6.3% | 3.4% |
Minor content changes | 28.6% | 17.3% | 11.3% | 27.0% | 31.5% |
Copyediting/formatting/wikilinking | 40.7% | 27.4% | 13.3% | 42.8% | 37.1% |
Tagging/maintenance | 8.5% | 6.5% | 2.0% | 10.1% | 5.6% |
Vandalism reversion | 8.9% | 7.3% | 1.6% | 11.3% | 4.5% |
Vandalism | 8.1% | 1.6% | 6.5% | 2.5% | 18.0% |
Other than determining whether an edit was vandalism, I did not make any value judgments. Thus, 'minor content changes' contains considerable amounts of unsourced material and original research that will certainly be reverted.
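The table mixes two denominators: the first three percentage columns divide by all 248 edits, while the last two divide by the 159 registered or 89 anonymous edits. The sketch below shows that arithmetic. The per-row edit counts are not given in the original; they are back-calculated here from the published percentages (they are the only whole numbers that reproduce every cell and sum to 159 and 89), so treat them as an inference rather than as recorded data.

```python
TOTAL, REGISTERED, ANONYMOUS = 248, 159, 89

# (registered edits, anonymous edits) per change type.
# ASSUMPTION: counts inferred from the table's percentages, not from the source.
counts = {
    "Substantial content changes": (10, 3),
    "Minor content changes": (43, 28),
    "Copyediting/formatting/wikilinking": (68, 33),
    "Tagging/maintenance": (16, 5),
    "Vandalism reversion": (18, 4),
    "Vandalism": (4, 16),
}


def pct(part, whole):
    """Percentage rounded to one decimal place, as in the table."""
    return round(100 * part / whole, 1)


def table_row(reg, anon):
    """Return the five percentage cells for one change type."""
    return (
        pct(reg + anon, TOTAL),   # share of the whole sample
        pct(reg, TOTAL),          # registered edits, share of the whole sample
        pct(anon, TOTAL),         # anonymous edits, share of the whole sample
        pct(reg, REGISTERED),     # share of registered edits only
        pct(anon, ANONYMOUS),     # share of anonymous edits only
    )


for change_type, (reg, anon) in counts.items():
    print(change_type, table_row(reg, anon))
```

Reading the last two columns side by side is what makes the comparison between registered and anonymous editors fair, since the two groups contributed very different numbers of edits.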
Other observations:
General thoughts: