This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page. |
This edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request.
I would like to inform you that the Hindi Wikipedia has much more than 50,000 articles due to which I think that it should be included on the main page by adding it to this template. -- Tow Trucker talk 07:57, 27 May 2012 (UTC)
This is inconsistent with Template:Wikipedias. — Preceding unsigned comment added by Ibicdlcod ( talk • contribs) 03:19, 24 August 2012 (UTC)
Could we add the Uzbek Wikipedia to the list of Wikipedias with more than 50,000 entries? The Uzbek Wikipedia is currently blocked in the territory of Uzbekistan. Listing uzwiki on the main page of enwiki would help us spread the word about the blockage. Nataev ( talk) 10:32, 15 November 2012 (UTC)
Salam / Hello. Hopefully someone can change the link for Malay Wikipedia from "More than 50,000 articles" to "More than 150,000 articles" section. Thank you - 26 Ramadan ( talk) 14:27, 14 December 2012 (UTC)
Per WP:HLIST, shouldn't this template use {{ flatlist}}? Gorobay ( talk) 20:06, 14 December 2012 (UTC)
hlist. Gorobay ( talk) 02:28, 26 January 2013 (UTC)
Please add another row, above the 750,000 line, with:
formatted like the existing lines, and remove those from the 750,000 line; per meta:List of Wikipedias#1 000 000+ articles. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:39, 25 January 2013 (UTC)
Hello there, sorry, I couldn't find out how to start a new topic/request. I'd like to point out that several languages are missing from the 50,000-200,000 category, for example Georgian, which now has 69,000 articles. (Or is this intentional?) Thank you! — Preceding unsigned comment added by 176.241.48.244 ( talk) 21:46, 31 January 2013 (UTC)
Please move the Spanish Wikipedia (español) to the top line; they now have a million+ articles; see es:Especial:Estadísticas. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:55, 17 May 2013 (UTC)
Should our mainpage include an interwiki link to the Hindi Wikipedia? ThaddeusB ( talk) 02:33, 19 May 2013 (UTC)
Language | Language (local) | Wiki | Articles | Edits | Active Users | Depth | Native Speakers (million) |
Malay | Bahasa Melayu | ms | 218667 | 3549558 | 346 | 18 | 77 |
Bulgarian | Български | bg | 147582 | 6078163 | 756 | 29 | 6.8 |
Estonian | Eesti | et | 111292 | 3744442 | 486 | 31 | 1.05 |
Hindi | हिन्दी | hi | 105420 | 2277175 | 210 | 45 | 180 |
Serbo-Croatian | Srpskohrvatski / Српскохрватски | sh | 81325 | 1661259 | 216 | 20 | 19 |
Tamil | தமிழ் | ta | 53465 | 1485352 | 253 | 32 | 70 |
I used each Wikipedia's random article button to view 50 articles and tabulated the results. (All are listed Wikipedias except Hindi.) I counted the number of articles with pictures, since adding a picture takes somewhat more effort and is less likely to be done by a bot. Geographic stubs (including 1-2 sentence articles) were counted since these are especially easy targets for rapid stubbing. -- ThaddeusB ( talk) 18:23, 19 May 2013 (UTC)
Language | Full | Short | Stub | 1-2 sentences or just headers | Dab | List | w/Pics | w/Maint. tags | Geo stubs |
Malay | 2 | 2 | 10 | 35 | 1 | | 16 | 2 | 35 |
Bulgarian | 4 | 5 | 32 | 5 | 4 | | 20 | | 9 |
Estonian | 1 | 5 | 32 | 10 | 1 | 1 | 11 | 4 | 12 |
Hindi | 1 | 3 | 20 | 26 | | | 19 | 1 | 18 |
Serbo-Croatian | 3 | 7 | 24 | 14 | 2 | | 20 | | 7 |
Tamil | 1* | 6 | 17 | 23 | 2 | 1 | 18 | 2 | 3 |
* Tamil's sole long article was tagged as a machine translation. Tamil avoided the geo stub temptation, but made up for it with tons of 1-2 line articles on cricket players (in my test).
To be taken with a grain of salt, as the margin of error on a sample size of 50 is ~13.8%.
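The comment does not say how the ~13.8% figure was derived. A minimal sketch, assuming it is the usual 95% normal-approximation margin for a proportion at the worst case p = 0.5 (the function name `margin_of_error` is hypothetical), reproduces a value of that size and shows how slowly it shrinks with larger samples:

```python
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of the normal-approximation interval for a proportion.

    Assumptions (not stated in the discussion): 95% confidence (z = 1.96)
    and the worst-case proportion p = 0.5.
    """
    return z * math.sqrt(p * (1.0 - p) / n)

# Compare the 50-article sample with a few larger sample sizes.
for n in (50, 100, 200, 400):
    print(f"n={n}: +/-{margin_of_error(n):.1%}")
```

Under these assumptions, halving the margin requires four times as many sampled articles.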
Language | Decent articles | Stubs | articles+stubs | article+stub+1-2, excl. Geo |
Malay | 17500 | 43700 | 61200 | 61200 |
Bulgarian | 26600 | 94000 | 121000 | 109000 |
Estonian | 13300 | 71000 | 84600 | 80100 |
Hindi | 8400 | 42000 | 50600 | 67500 |
Serbo-Croatian | 16300 | 39000 | 55300 | 66700 |
Tamil | 7500 | 18000 | 25700 | 47000 |
-- ThaddeusB ( talk) 18:23, 19 May 2013 (UTC)
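The estimates in the table above appear to be simple scale-ups of the 50-article tallies to each wiki's total article count. A rough sketch of that arithmetic, assuming this is how the figures were obtained (the helper `extrapolate` is hypothetical, and the posted table rounds its results):

```python
def extrapolate(total_articles: int, hits: int, sample_size: int = 50) -> int:
    """Scale a count observed in a random sample up to the whole wiki."""
    return round(total_articles * hits / sample_size)

# Malay: 218667 articles; the sample held 2 full + 2 short articles and 10 stubs.
malay_decent = extrapolate(218667, 2 + 2)
malay_stubs = extrapolate(218667, 10)

# Hindi: 105420 articles; 50 sampled articles minus 18 geo stubs leaves 32.
hindi_excl_geo = extrapolate(105420, 50 - 18)

print(malay_decent, malay_stubs, hindi_excl_geo)
```

Because each tally is a handful of articles out of 50, every extrapolated figure inherits the sample's wide margin of error.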
This is regarding the need for a better set of criteria for Wikipedias to be listed on the main page and to be judged in general (to aid in improving the quality of Wikipedia). I compared a set of articles in Simple English (~90,000 articles and listed) and Hindi (~100,000 and unlisted). The method I used was to check the 10001st, 20001st, 30001st, 40001st, 50001st, 60001st and 70001st longest articles of the two Wikipedias. Beyond the 70,000th article, most of the articles were very short in both Wikipedias, so they were not compared. Here are the articles that I got
Based on this, I think that there is very little difference in length of articles between the two.
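The rank-based comparison described above can be sketched as follows. This is only an illustration on made-up numbers: the function name `lengths_at_ranks` and the toy byte counts are hypothetical, and real lengths would have to come from each wiki's list of long pages.

```python
def lengths_at_ranks(article_lengths, ranks):
    """Return the length of the article at each 1-based rank, longest first.

    Ranks beyond the number of articles are skipped.
    """
    ordered = sorted(article_lengths, reverse=True)
    return {r: ordered[r - 1] for r in ranks if r <= len(ordered)}

# Toy data standing in for two wikis' article sizes in bytes (hypothetical).
wiki_a = [12000, 800, 450, 9000, 300, 150, 2000]
wiki_b = [30000, 500, 400, 350, 300, 250, 200]

for name, lengths in (("A", wiki_a), ("B", wiki_b)):
    print(name, lengths_at_ranks(lengths, ranks=(1, 3, 5, 7)))
```

Comparing the two dictionaries rank by rank gives the same kind of side-by-side view as checking the 10001st, 20001st, ... longest articles.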
Also, out of the 500 longest articles in each, there were 158 "list of" articles in Simple (and numerous articles like simple:2012 in movies, which are non-featured lists with very little prose content) and only 8 "सूची" (list-related) articles in Hindi (please correct me if there are more lists among the longest 500 articles in Hindi). This implies that there are more content-based articles among the longest 500 articles in Hindi than in Simple.
About the quality of a Wikipedia, regarding stubs and placeholders, I think there is essentially no difference whether they are generated by bots or not. Much more than that, I think a better judgement can be reached by comparing the quality of the list of articles every Wikipedia should have. What is the use of an encyclopedia with pages this long for a country and comparatively this long for a dance?
At this point, I would also like to request the community here to adopt objective written standards based on reproducible numbers (featured articles, good articles) and statistics for judging the quality of a Wikipedia, rather than just 50 random articles (from which some Wikipedias are exempt and others not, based on some other arbitrary criteria), which, in all fairness, looks like bias. Thank you.-- Eukesh ( talk) 19:20, 18 December 2012 (UTC)
On the rather thin evidence of a score of random button presses, I see only 30% that are more than a one-liner (85% for en.wiki, 60% for gl.wiki, 50% for simple.wiki, 40% for gd.wiki, 35% for lij.wiki, 25% for ang.wiki), so it is clearly at a very early stage, especially considering the potential readership. So the question is probably: What is the purpose of listing at the Main Page? To bring attention and thereby fuel growth, or to reward the work done already? The Main Page already features both approaches (Featured articles and DYK). I'd opt for drawing attention so that it might grow in this case, especially as there will be much English-Hindi bilingualism. Kevin McE ( talk) 09:24, 19 May 2013 (UTC)
So the question is probably: What is the purpose of listing at the Main Page? To bring attention and thereby fuel growth, or to reward the work done already?
I did a bunch of random articles on Simple and they were mostly stubs too.
As you note, pretty much any stat can be "faked". However, the number of speakers is certainly something that can't be faked and I think it should be considered.
The placeholders are a concern; let's say they are 10% of Hindi.
Should they be penalized because some idiot(s) used a bot to create a bunch of useless articles?
As to the last point, the "50 article test" implicitly penalizes Wikipedias for having stubs.
If you have 200k articles and 90% are stubs, that is 20k "better" articles, but 5/50 in a test will be such articles. If you have 50k articles and 60% are stubs, that is same 20k better quality articles with 20/50 showing up in a 50 article test.
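The arithmetic in this example can be checked with a short sketch (the function name `expected_hits` is hypothetical; the two wikis are the ones posited in the comment, not real data):

```python
def expected_hits(total: int, stub_percent: int, sample_size: int = 50):
    """Return (number of non-stub articles, expected non-stubs in a random sample)."""
    good = total * (100 - stub_percent) // 100
    return good, sample_size * good / total

large = expected_hits(200_000, 90)  # 200k articles, 90% stubs
small = expected_hits(50_000, 60)   # 50k articles, 60% stubs

print(large)
print(small)
```

Both hypothetical wikis hold the same 20,000 better articles, yet the expected count in a 50-article sample differs by a factor of four, which is the point being made about the test favouring smaller Wikipedias.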
There are two Wikipedias with over 300k articles (ceb & war) that aren't listed because they are exceptionally poor quality, so clearly quality considerations still apply when 200k is hit (as well it should).
The intent may be to judge absolute # of quality articles, but if you are gauging that only on the %age found in a 50 sample you are preferencing Wikipedias closer to 50k articles than 200k. A 100k Wikipedia needs twice as many decent articles as a 50k Wikipedia to get the same percentage; A 200k Wikipedia needs 4x as many decent articles. You are penalizing Wikipedias for having stubs, at least if they are between 50k and 200k total articles.
You are the one who called the 50 article test the standard way of deciding, not me.
Hindi is not "obviously" in poor shape any more than many of the listed Wikipedias - all 6 in my test are in the 80-90% stubs and below range.
I do agree that there are concerns for "penalizing" Wikipedias that have aggressively created a ton of stubs, but the way around that is simply to take a larger sample of the random article test. I just did an independent 51 random article sample, of which there appeared to be 1 full article ( hi:गुप्त ऊर्जा), perhaps 2 short but decent-looking articles, 47 placeholders / stubs that had only a sentence or two and often empty headings, and 1 piece of apparent spam that had been ignored ( hi:अध्याय २ साख्यंयोग). That doesn't fill me with confidence that there are in fact 20K "real" articles hiding somewhere in Hindi WP. SnowFire ( talk) 05:27, 19 May 2013 (UTC)
Actually, given the margin of error, the differences are insignificant.
We either need a better test,
or your statement that only Wikipedias that are obviously inferior are denied is false.
By that measure, the following changes would be made: Apteva ( talk) 23:27, 20 May 2013 (UTC)
lc | language | speakers | views | articles | action |
---|---|---|---|---|---|
eo | Esperanto | 1 M | 7363 | 178052 | remove |
eu | Basque | 1 M | 6446 | 150260 | remove |
nn | Nynorsk | 5 M | 6050 | 99830 | remove |
gl | Galician | 4 M | 5280 | 101161 | remove |
kk | Kazakh | 12 M | 12730 | 199976 | add |
ka | Georgian | 4 M | 9957 | 74020 | add |
hi | Hindi | 550 M | 8231 | 97151 | add |
bs | Bosnian | 3 M | 8127 | 32466 | add |
az | Azeri | 27 M | 7986 | 94755 | add |
lv | Latvian | 2 M | 7974 | 47756 | add |
Hi all, this link may give an overall idea of the Indian-language Wikipedias: http://shijualex.in/analysis-of-the-indic-language-statistical-report-2012/ — Preceding unsigned comment added by 112.135.223.2 ( talk) 07:03, 22 May 2013 (UTC)
The method that I used to compare the Hindi and Simple English Wikipedias was roughly based on a percentile system, to measure the central tendency of articles based on their sizes. It showed that the hypotheses generated on the basis of 50 random articles are inadequate, as the sample size is too small. Per my findings, one can clearly see that the articles in Hindi are longer than the corresponding articles in Simple English. I firmly believe that better statistical methods must be used to measure the central tendency and dispersion of the articles. At least something better than 50 random articles. One can delve into subjective criteria of "quality", which are more prone to observer bias. Even after a study of central tendency shows that the Hindi wiki is superior to Simple English when it comes to article size, people are still not willing to accept it. This clearly shows an observer bias, and the assumption behind it is unscientific. Hypothesis-testing methods such as the 50 random articles test should be reported with their p-value and other statistical parameters, which clearly has not been done. Also, why were 50 articles chosen in the first place? Why not 75 or 100 or 10? Is there any statistical relevance to 50? I don't think the adamant proponents of the test have any statistically relevant answer to that. I think the central tendency and dispersion of all articles can be used to generate a Gaussian curve, which would be the most accurate test for length of articles. In its absence, we can still use the list of articles ordered by length to generate percentile-based data. Random sampling is the only alternative when none of these methods is available. Just imagine a scenario where one determines the height of a country of 100,000 people using 50 random people in the street as the standard method, when one has the height data for all the people of the country!
Once again, I am just delving into the quantitative dimension of the wikipedias as such. This has nothing to do with the qualitative aspect of it. Thank you. -- Eukesh ( talk) 00:48, 12 June 2013 (UTC)
I see a lot of interesting discussion going on regarding the inclusion of Hindi. At the same time, please evaluate Telugu Wikipedia for inclusion in the 50,000+ section. te wikipedia has considerably improved since its last evaluation. Thanks -- వైజాసత్య ( talk) 08:49, 2 June 2013 (UTC)
Please move the Swedish Wikipedia (svenska) to the top line, as they/we now have a million+ articles. See article Swedish Wikipedia for update and links to sources. Many thanks in advance.-- Paracel63 ( talk) 12:33, 16 June 2013 (UTC)
-- — Preceding unsigned comment added by Zlobny ( talk • contribs) 18:28, 18 July 2013 (UTC)
...As https://pl.wikipedia.org/wiki/ bumped past one million articles on Tuesday, or thereabouts. Rwessel ( talk) 00:51, 27 September 2013 (UTC)
This edit request to Template:Wikipedia_languages has been answered. Set the |answered= or |ans= parameter to no to reactivate your request.
Polish Wikipedia has already more than 1 000 000 articles. -- Brateevsky ( talk to me) 18:54, 25 September 2013 (UTC)
copied from Talk:Main_Page
The Wikipedia in Serbian has well over 200000 articles, but it has been listed in the "over 50000" category instead of the correct one for at least a few days now. Possibly the same goes for some of the others. — Preceding unsigned comment added by Spa ( talk • contribs) 12:23, 29 August 2013
Hello. Georgian wikipedia has now more than 75,000 articles and it should be included in the list. Can you please add the Georgian wiki page to the list? georgianJORJADZE 15:04, 20 August 2013 (UTC)
This edit request to Template:Wikipedia languages/core has been answered. Set the |answered= or |ans= parameter to no to reactivate your request.
Update to use the language parser function for the title as well. Change is in the sandbox. Test cases here and here. Raw output diff here. Lfdder ( talk) 09:46, 29 September 2013 (UTC)
A somewhat arbitrary cutoff for quality based mostly on the depth column (>=~18) on the Meta page produces the following (200, 400, and 1000k are the same as they are currently):
This Wikipedia is written in English. Started in 2001, it currently contains 6,850,293 articles. Many other Wikipedias are available; some of the largest are listed below.
If no one has any objections, I will make the change in a few days. Thingg ⊕ ⊗ 14:01, 11 October 2013 (UTC)
We no longer rely on "Depth" as a useful criterion to gauge anyone's quality, but it doesn't mean that we should rely on David Levy's previous findings that lack factual accuracy.
That is, if one Wikipedia was omitted several times in the past, it has to be taken with grain of salt in the future and stripped for good from the right to be included again.
A very fruitful discussion was opened on this page earlier this year, with some users complaining about the use of the 50 random-article test and at the same time proposing new solutions to replace this evaluation method, but it was unfortunately archived with only a promise to take action, which didn't yield any serious results at all.
I see that the template has been recently updated with the inclusion of links to Georgian and Latvian Wikipedia, so it seems to be on time to demand inclusion of the Macedonian Wikipedia which now has more than 73,000 articles or 46% above the minimum threshold for inclusion.-- Kiril Simeonovski ( talk) 20:43, 8 October 2013 (UTC)
Sure it was discussed in the past, but things may rapidly change.
Your premature conclusion loosely based on your empirical evidence you've collected by using your own evaluation methods
demonstrates that the omission of one edition is for good and it will never change your personal opinion regardless of one Wikipedia's quality.
But please don't forget that the first time it was when the Macedonian Wikipedia reached 50,000 articles, the second time when it counted app. 62,000 and now it has more than 73,000.
Please also note that there has been a discussion (in which you were one of the fellow users who have commented) with different proposals to replace the old-fashioned random-article test with other techniques to evaluate one Wikipedia's quality.
Albanian Wikipedia has recently surpassed 50,000 articles and needs to be included in the 50k category. Thank you. Visi90 ( talk) 13:14, 2 November 2013 (UTC)
Welsh Wikipedia has passed the 50,000 article threshold, with a depth of 43. As I'm not sure what the criteria are nowadays, I won't add it myself, but could someone look into this please? Optimist on the run ( talk) 22:25, 12 November 2013 (UTC)
What I see is that most language Wikipedias are now fixated on passing this magical figure of 50,000 by whatever means available, including mechanized automatic creation of articles (mostly on villages and small localities, which are easy to prepare from exhaustive lists and launch as articles with minimal changes) or creating one- or two-line article pages for sportsmen or music artists, and then applying for inclusion. It is high time we went back to the earlier threshold of 100,000 or more. This will take out some of the less productive language Wikipedias. Second: as for the new, higher 100,000 threshold, it is high time that very clear guidelines and unified acceptance rules are set once and for all for all language Wikipedias, and that they are well informed about the rules, so that they know how to comply with the acceptance criteria and when they can actually apply. As of now, I feel the process by which English Wikipedia accepts languages is quite arbitrary. Was a quality comparison test made, and are all those we see included 100% legit full-content Wikipedias? Can we reopen the files for some of the languages, or are the ones there now final and set in stone? I guess the most urgent priority for now is raising the threshold to 100,000. How many of the languages we now have would qualify? werldwayd ( talk) 21:57, 14 December 2013 (UTC)
Latin Wikipedia reached 100,000 today, so should surely be here somewhere. StevenJ81 ( talk) 21:42, 18 December 2013 (UTC)
Please update the list. For example, the Kazakh Wikipedia now has more than 200,000 articles. Arystanbek ( talk) 07:01, 23 February 2014 (UTC)
Given that the Dutch Wikipedia has expanded from ~800K entries to 1.8M entries over the past few years primarily through a couple of bots adding large numbers of stubs, I'm wondering if it should be demoted from the list in this template. FWIW, I did a quick 50 article survey, and hit 35 stubs, 10 very short articles (counted generously), four articles with significant content (again, being somewhat generous), and a DAB page. Rwessel ( talk) 03:10, 3 May 2014 (UTC)
User:Amire80 has added Urdu Wikipedia to the list. But after performing a random 50-article sample, it appears that most of the articles on urwiki are stubs, so it fails the "stubs and placeholders" criterion. — Bill william compton Talk 19:22, 24 April 2014 (UTC)
Persian Wikipedia has reached 400k and can be moved to the upper level :) – ebraminio talk 15:14, 18 July 2014 (UTC)
To check the length of articles, there is a much better way than 50 random articles. Fifty random articles is a very poor sampling method for checking article length. A "random count" might display the 50 shortest or longest articles, and the results are not uniform across repeated tests. Besides, it is a tedious task, often subject to bias. I would like to request that users here replace it with quartiles or deciles, as these provide a better picture of the distribution of article lengths.
Let's say there is a wiki with 1000 articles, check Q1 (25%-75% split of length of articles), Q2 (50%-50% split) and Q3 (75%-25% split) article. For this,
This will give a better picture of the range of page lengths than 50 random articles. For an even better view of distribution of length, decile can be used. Thank you-- Eukesh ( talk) 11:36, 6 July 2014 (UTC)
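The quartile and decile idea can be sketched with Python's standard library. The article lengths below are made up for illustration, and the helper names `length_quartiles` and `length_deciles` are hypothetical; the steps the comment originally listed are not preserved in this archive, so this is not a reconstruction of them.

```python
import statistics

def length_quartiles(lengths):
    """Return (Q1, Q2, Q3) of article lengths; Q2 is the median."""
    q1, q2, q3 = statistics.quantiles(lengths, n=4)
    return q1, q2, q3

def length_deciles(lengths):
    """Return the nine cut points that split article lengths into tenths."""
    return statistics.quantiles(lengths, n=10)

# Hypothetical wiki with 1000 articles: many short stubs, a few long pages.
toy_lengths = [200] * 700 + [1500] * 200 + [20000] * 100

print(length_quartiles(toy_lengths))
print(length_deciles(toy_lengths))
```

Unlike a 50-article random sample, these figures are computed over every article, so repeating the calculation gives the same result each time.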
The discussions here are always based on "I performed a 50 random article sample...", which is not very accurate. For example, Hindi Wikipedia has far longer articles than Simple English at almost every decile (which I found using a modified decile method and posted here in 2012). I checked the article length at each "decile" myself, as well as the size of the articles in bytes. However, the users here were still not ready to believe it.
Coming to quality, 50 random articles method does not determine quality of articles at all! How does one assess the quality of articles in languages that one can not even read? There should be better method for quality assessment and judging the wikipedias by the basic articles and their status (featured, good, long etc) would provide a qualitative assessment for the same. Thank you.-- Eukesh ( talk) 12:34, 8 July 2014 (UTC)
Let's assume there's a Wikipedia with 2,000 really good (and long) articles and about 1,000,000 bot generated stubs. Among the 50 random articles there might be 48 to 50 stubs. But: Why should this Wikipedia be mentioned in the box with more than 1 million articles when it really has only 2,000 of them? The statistical method isn't really the problem here. -- 32X ( talk) 14:40, 14 August 2014 (UTC)
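The "48 to 50 stubs" estimate for this hypothetical wiki can be checked with a small calculation. This is a sketch under a simplifying assumption: draws are treated as independent, which is a close approximation when 50 articles are drawn from about a million. The function name `prob_all_stubs` is hypothetical.

```python
def prob_all_stubs(good: int, stubs: int, sample_size: int = 50) -> float:
    """Probability that every article in a random sample is a stub.

    Treats draws as independent (sampling with replacement), which is a
    close approximation when the sample is tiny relative to the wiki.
    """
    p_stub = stubs / (good + stubs)
    return p_stub ** sample_size

p = prob_all_stubs(good=2_000, stubs=1_000_000)
expected_good = 50 * 2_000 / (2_000 + 1_000_000)

print(f"P(all 50 are stubs) = {p:.3f}")
print(f"Expected good articles in sample = {expected_good:.2f}")
```

Under these assumptions a 50-article sample would usually contain no good articles at all, consistent with the comment's estimate.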
Hi,
The Esperanto Wikipedia just reached 200,000 articles. Could you edit the list accordingly?
Thank you very much! :) Thomas Guibal ( talk) 06:00, 14 August 2014 (UTC)
Hi. Could someone please add the Bosnian wikipedia to the specified list? The number of 50k articles has been passed some time ago. I did the random 50 article test, and based on these results it seems that this wikipedia is worth including. Regards, -- Edinwiki ( talk) 15:35, 6 June 2014 (UTC)
On the Main Page, the following Wikipedia editions are missing from the 1,000,000+ articles section: Cebuano and Waray-Waray. On the List of Wikipedias statistics page, these two editions are clearly listed as having over one million articles. Administrators should kindly correct the information on the Main Page. I have read some studies on the foundation of the Cebuano-language Wikipedia and used them in my thesis; it is really amazing that it has grown so fast. By the way, if these language editions consist mostly of stubs, shouldn't they at least be placed under the 400,000+ or 200,000+ section? I see them on this page (the template above), but do not see them on the Main Page. — Preceding unsigned comment added by 94.123.205.101 ( talk) 13:57, 9 September 2014 (UTC)
Basque (Euskara) currently shows 202057 at meta:List of Wikipedias. Talk:Main Page#Basque Wikipedia discusses whether this is supposed to move it to "More than 200,000". PrimeHunter ( talk) 01:22, 25 September 2014 (UTC)
Serbo-Croatian Wiki has passed 260k articles according to http://meta.wikimedia.org/wiki/List_of_Wikipedias#100_000.2B_articles but it's still listed as only 50k+. Could that be fixed? 93.136.48.197 ( talk) 06:24, 29 September 2014 (UTC)
Please update some other languages to 1.5 million articles. Qwertyxp2000 ( talk) 04:51, 22 December 2014 (UTC)
Please move Slovak Wikipedia (sk) to the 200,000 category. 213.151.215.195 ( talk) 18:39, 15 February 2015 (UTC)
Please move Urdu Wikipedia to the 50,000+ category.
-- Obaid Raza ( talk) 15:31, 17 February 2015 (UTC)
Add the Macedonian Wikipedia to 50,000+ per this. Also Georgian, Occitan, Chechen, Newar / Nepal Bhasa, Urdu, Tamil, etc. Fauzan ✆ talk ✉ mail 14:02, 15 March 2015 (UTC)
Could you please add Macedonian (mk.wiki) as it has passed the 80.000 mark some time ago? Cheers! -- B. Jankuloski ( talk) 11:14, 5 April 2015 (UTC)
Comment: You are obviously under the impression that our wiki mostly consists of very small articles, and that our situation is somehow similar to what it was years ago when we made the request. I can assure you that this is not the case at all. In the years that have passed since our last suggestion, we have created many articles of very good size through painstaking labour, and it is very untoward to dismiss it. What I am talking about can best be illustrated by this list of long pages. On it, this article ranks at no. 50.000 by length. As can be seen, there are 49.999 articles larger than it, and a good number of them considerably so. I am sure that we more than meet the relevant criteria for inclusion. I expect that whoever decides will take an objective look at our wiki in accordance with the relevant criteria and conclude what I have just expounded. -- B. Jankuloski ( talk) 18:43, 9 April 2015 (UTC)
This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page. |
Archive 1 | ← | Archive 4 | Archive 5 | Archive 6 | Archive 7 | Archive 8 |
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
I would like to inform you that the Hindi Wikipedia has much more than 50,000 articles due to which I think that it should be included on the main page by adding it to this template. -- Tow Trucker talk 07:57, 27 May 2012 (UTC)
This is inconsistent with Template:Wikipedias. — Preceding unsigned comment added by Ibicdlcod ( talk • contribs) 03:19, 24 August 2012 (UTC)
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
Could we add the Uzbek Wikipedia to the list of Wikipedias with more than 50,000 entries? The Uzbek Wikipedia is currently blocked in the territory of Uzbekistan. Listing uzwiki on the main page of enwiki would help us spread the word about the blockage. Nataev ( talk) 10:32, 15 November 2012 (UTC)
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
Salam / Hello. Hopefully someone can change the link for Malay Wikipedia from "More than 50,000 articles" to "More than 150,000 articles" section. Thank you - 26 Ramadan ( talk) 14:27, 14 December 2012 (UTC)
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
Per WP:HLIST, shouldn't this template use {{ flatlist}}? Gorobay ( talk) 20:06, 14 December 2012 (UTC)
hlist
.
Gorobay (
talk)
02:28, 26 January 2013 (UTC)
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
Please add another row, above the 750,000 line, with:
formatted like the existing lines, and remove those from the 750,000 line; per meta:List of Wikipedias#1 000 000+ articles. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:39, 25 January 2013 (UTC)
Hello there, sorry, I couldn't find out how to start a new topic/request. I'd like to point out that several languages are missing from the 50,000-200,000 category, for example Georgian, which now has 69,000 articles. (Or is this intentional?) Thank you! — Preceding unsigned comment added by 176.241.48.244 ( talk) 21:46, 31 January 2013 (UTC)
This
edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
Please move the Spanish Wikipedia (español) to the top line; they now have a million+ articles; see es:Especial:Estadísticas. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:55, 17 May 2013 (UTC)
Should our mainpage include an interwiki link to the Hindi Wikipedia? ThaddeusB ( talk) 02:33, 19 May 2013 (UTC)
Language | Language (local) | Wiki | Articles | Edits | Active Users | Depth | Native Speakers (million) |
Malay | Bahasa Melayu | ms | 218667 | 3549558 | 346 | 18 | 77 |
Bulgarian | Български | bg | 147582 | 6078163 | 756 | 29 | 6.8 |
Estonian | Eesti | et | 111292 | 3744442 | 486 | 31 | 1.05 |
Hindi | हिन्दी | hi | 105420 | 2277175 | 210 | 45 | 180 |
Serbo-Croatian | Srpskohrvatski / Српскохрватски | sh | 81325 | 1661259 | 216 | 20 | 19 |
Tamil | தமிழ் | ta | 53465 | 1485352 | 253 | 32 | 70 |
I used each Wikipedia's random article button to view 50 articles and tabulated the results. (All are listed Wikipedias except Hindi.) I counted # of articles with pictures since it is somewhat more effort to add a picture and less likely to do done with a bot. Geographic stubs (including 1-2 sentence articles) were counted since these are especially easy targets for rapid stubbing. -- ThaddeusB ( talk) 18:23, 19 May 2013 (UTC)
Language | Full | Short | Stub | 1-2 sentences or just headers |
Dab | List | w/Pics | w/Maint. tags | Geo stubs | |
Malay | 2 | 2 | 10 | 35 | 1 | 16 | 2 | 35 | ||
Bulgarian | 4 | 5 | 32 | 5 | 4 | 20 | 9 | |||
Estonian | 1 | 5 | 32 | 10 | 1 | 1 | 11 | 4 | 12 | |
Hindi | 1 | 3 | 20 | 26 | 19 | 1 | 18 | |||
Serbo-Croatian | 3 | 7 | 24 | 14 | 2 | 20 | 7 | |||
Tamil | 1* | 6 | 17 | 23 | 2 | 1 | 18 | 2 | 3 |
Tamil's sole long article was tagged as a machine translation. Tamil avoided the geo stub temptation, but made up for it with tons of 1-2 lines on cricket players (in my test).
To be taken with a grain of salt as margin of error on a sample size of 50 is ~13.8%
Language | Decent articles | Stubs | articles+stubs | article+stub+1-2, excl. Geo |
Malay | 17500 | 43700 | 61200 | 61200 |
Bulgarian | 26600 | 94000 | 121000 | 109000 |
Estonian | 13300 | 71000 | 84600 | 80100 |
Hindi | 8400 | 42000 | 50600 | 67500 |
Serbo-Croatian | 16300 | 39000 | 55300 | 66700 |
Tamil | 7500 | 18000 | 25700 | 47000 |
-- ThaddeusB ( talk) 18:23, 19 May 2013 (UTC)
This is regarding the need of a better set of criteria for wikipedias to be listed in the main page and be judged in general (to aid in improvement of quality of wikipedia). I compared a set of articles in simple English (~90,000 articles and listed) and Hindi (~100,000 and unlisted). The method I used was to check the 10001st, 20001st, 30001st, 40001st, 50001st, 60001st and 70001st longest articles of the two wikipedias. After 70,000th articles, most of the articles were very short in both wikipedias. So, they were not compared. Here are the articles that I got
Based on this, I think that there is very less difference in length of articles between the two.
Also, out of the first 500 longest articles in the two, there were 158 "list of" articles (and numerous articles like simple:2012 in movies which are non-featured lists with very less paragraphic contents) in simple and only 8 "सूची" (list related) articles in Hindi (Please correct me if there are more lists in longest 500 articles in Hindi) which infers that there are more content based articles in longest 500 articles in Hindi than Simple.
About quality of wikipedia, regarding stubs and placeholders, I think there is essentially no difference whether they are generated by bots or not. Much more than that, I think a better judgement can be reached by comparing the quality of list of articles every Wikipedia should have. What is the use of an encyclopedia with pages this long for a country and comparatively this long for a dance?
At this point, I would also like to request the community here to have objective written standards based on reproducible numbers (featured articles, good articles) and statistics for judging the quality of wikipedia than just random 50 articles (of which some wikipedias are exempt and others not based on some other arbitrary criteria) which, in all fairness, looks like bias. Thank you.-- Eukesh ( talk) 19:20, 18 December 2012 (UTC)
On the rather thin evidence of a score of random button presses, I see only 30% that are more than a one-liner (85% for en.wiki, 60% for gl.wiki, 50% for simple.wiki, 40% for gd.wiki, 35% for lij.wiki, 25% for ang.wiki) so it is clearly at a very early stage, especially considering the potential readership. So the question is probably: What is the purpose of listing at the Main Page? To bring attention and thereby fuel growth, or to reward the work done already? The Main Page already features both approaches (Featured articles and DYK). I'd opt for drawing attention so that it might grow in this case, especially as there will be much English-Hindi bilingualism. Kevin McE ( talk) 09:24, 19 May 2013 (UTC)
So the question is probably: What is the purpose of listing at the Main Page? To bring attention and thereby fuel growth, or to reward the work done already?
I did a bunch of random articles on Simple and they were mostly stubs too.
As you note, pretty much any stat can be "faked". However, the number of speakers is certainly something that can't be faked and I think it should be considered.
The placeholders are a concern; let's say they are 10% of Hindi.
Should they be penalized because some idiot(s) used a bot to create a bunch of useless articles?
As to the last point, the "50 article test" implicitly penalizes Wikipedias for having stubs.
If you have 200k articles and 90% are stubs, that is 20k "better" articles, but 5/50 in a test will be such articles. If you have 50k articles and 60% are stubs, that is same 20k better quality articles with 20/50 showing up in a 50 article test.
There are two Wikipedias with over 300k articles (ceb & war) that aren't listed because they are exceptionally poor quality, so clearly quality considerations still apply when 200k is hit (as well it should).
The intent may be to judge absolute # of quality articles, but if you are gauging that only on the %age found in a 50 sample you are preferencing Wikipedias closer to 50k articles than 200k. A 100k Wikipedia needs twice as many decent articles as a 50k Wikipedia to get the same percentage; A 200k Wikipedia needs 4x as many decent articles. You are penalizing Wikipedias for having stubs, at least if they are between 50k and 200k total articles.
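The arithmetic in this exchange can be sketched as follows. This is only an illustration of the two hypothetical Wikipedias described above (200k articles at 90% stubs, 50k at 60% stubs); the function and its parameter names are made up for the example.

```python
def decent_articles_and_expected_hits(total_articles, stub_percent, sample_size=50):
    """Return (absolute number of non-stub articles, expected non-stubs in a random sample).

    Integer percentages are used so the absolute count stays exact.
    """
    decent_percent = 100 - stub_percent
    decent_total = total_articles * decent_percent // 100
    expected_hits = sample_size * decent_percent / 100
    return decent_total, expected_hits


big_wiki = decent_articles_and_expected_hits(200_000, 90)
small_wiki = decent_articles_and_expected_hits(50_000, 60)

print("200k wiki, 90% stubs:", big_wiki)
print("50k wiki, 60% stubs: ", small_wiki)
```

Both hypothetical wikis have the same absolute number of non-stub articles, yet the expected count in a 50-article sample differs by a factor of four, which is the point being argued.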
You are the one who called the 50 article test the standard way of deciding, not me.
Hindi is not "obviously" in poor shape any more than many of the listed Wikipedias - all 6 in my test are in the 80-90% stubs and below range.
I do agree that there are concerns for "penalizing" Wikipedias that have aggressively created a ton of stubs, but the way around that is simply to take a larger sample of the random article test. I just did an independent 51 random article sample, of which there appeared to be 1 full article ( hi:गुप्त ऊर्जा), perhaps 2 short but decent-looking articles, 47 placeholders / stubs that had only a sentence or two and often empty headings, and 1 piece of apparent spam that had been ignored ( hi:अध्याय २ साख्यंयोग). That doesn't fill me with confidence that there are in fact 20K "real" articles hiding somewhere in Hindi WP. SnowFire ( talk) 05:27, 19 May 2013 (UTC)
Actually, given the margin of error, the differences are insignificant.
We either need a better test,
or your statement that only Wikipedias that are obviously inferior are denied is false.
By that measure, the following changes would be made: Apteva ( talk) 23:27, 20 May 2013 (UTC)
lc | language | speakers | views | articles | action |
---|---|---|---|---|---|
eo | Esperanto | 1 M | 7363 | 178052 | remove |
eu | Basque | 1 M | 6446 | 150260 | remove |
nn | Nynorsk | 5 M | 6050 | 99830 | remove |
gl | Galician | 4 M | 5280 | 101161 | remove |
kk | Kazakh | 12 M | 12730 | 199976 | add |
ka | Georgian | 4 M | 9957 | 74020 | add |
hi | Hindi | 550 M | 8231 | 97151 | add |
bs | Bosnian | 3 M | 8127 | 32466 | add |
az | Azeri | 27 M | 7986 | 94755 | add |
lv | Latvian | 2 M | 7974 | 47756 | add |
Hi all, this link may give an overall idea of the Indian-language Wikipedias: http://shijualex.in/analysis-of-the-indic-language-statistical-report-2012/ — Preceding unsigned comment added by 112.135.223.2 ( talk) 07:03, 22 May 2013 (UTC)
The method that I used to compare the Hindi and Simple English Wikipedias was roughly based on a percentile system to measure the central tendency of articles based on their sizes. It showed that hypotheses generated on the basis of 50 random articles are inadequate, as the sample size is too small. One can clearly see that the articles in Hindi (and also the length of articles) are better than the corresponding articles in Simple English, as per my findings. I strongly believe that better statistical methods must be used to measure the central tendency and dispersion of the articles. At least something better than 50 random articles. One can delve into subjective criteria of "quality", which are more prone to observer bias. Even after a study of central tendency shows that Hindi wiki is superior to Simple English when it comes to article size (and length of articles), people are still not willing to accept it. This clearly shows an observer bias. It is clearly an unscientific assumption as such. Hypothesis testing methods such as 50 random articles should be reported with their p-value and other statistical parameters, which clearly has not been done. Also, why were 50 articles chosen in the first place? Why not 75 or 100 or 10? Is there any statistical relevance to 50? I don't think the adamant proponents of the test have any statistically relevant answer to that. I think the central tendency and dispersion of all articles can be used to generate a Gaussian curve, which would be the most accurate test for length of articles. In its absence, we can still use the list of articles by length to generate percentile-based data. Random sampling is the only alternative when we do not have any of these methods available. Just imagine a scenario where one is determining the height of a country of 100,000 people using 50 random people in the street as a standard method when you have the height data of all the people of the country!
Once again, I am just delving into the quantitative dimension of the wikipedias as such. This has nothing to do with the qualitative aspect of it. Thank you. -- Eukesh ( talk) 00:48, 12 June 2013 (UTC)
I see a lot of interesting discussion going on regarding inclusion of Hindi. At the same time, please evaluate Telugu Wikipedia for inclusion in the 50,000+ section. te wikipedia has considerably improved since its last evaluation. Thanks -- వైజాసత్య ( talk) 08:49, 2 June 2013 (UTC)
Please move the Swedish Wikipedia (svenska) to the top line, as they/we now have a million+ articles. See article Swedish Wikipedia for update and links to sources. Many thanks in advance.-- Paracel63 ( talk) 12:33, 16 June 2013 (UTC)
-- — Preceding unsigned comment added by Zlobny ( talk • contribs) 18:28, 18 July 2013 (UTC)
...As https://pl.wikipedia.org/wiki/ bumped past one million articles on Tuesday, or thereabouts. Rwessel ( talk) 00:51, 27 September 2013 (UTC)
This edit request to Template:Wikipedia_languages has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
Polish Wikipedia has already more than 1 000 000 articles. -- Brateevsky ( talk to me) 18:54, 25 September 2013 (UTC)
copied from Talk:Main_Page
The Wikipedia in Serbian has well over 200000 articles, but it's listed in the "over 50000" category instead of the correct one for at least a few days now. Possibly the same goes for some of the others. — Preceding unsigned comment added by Spa ( talk • contribs) 12:23, 29 August 2013
Hello. Georgian wikipedia has now more than 75,000 articles and it should be included in the list. Can you please add the Georgian wiki page to the list? georgianJORJADZE 15:04, 20 August 2013 (UTC)
This edit request to Template:Wikipedia languages/core has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
Update to use the language parser function for the title as well. Change is in the sandbox. Test cases here and here. Raw output diff here. Lfdder ( talk) 09:46, 29 September 2013 (UTC)
A somewhat arbitrary cutoff for quality based mostly on the depth column (>=~18) on the Meta page produces the following (200, 400, and 1000k are the same as they are currently):
This Wikipedia is written in English. Started in 2001, it currently contains 6,850,293 articles. Many other Wikipedias are available; some of the largest are listed below.
If no one has any objections, I will make the change in a few days. Thingg ⊕ ⊗ 14:01, 11 October 2013 (UTC)
We no longer rely on "Depth" as a useful criterion to gauge anyone's quality, but it doesn't mean that we should rely on David Levy's previous findings that lack factual accuracy.
That is, if one Wikipedia was omitted several times in the past, it has to be taken with a grain of salt in the future and stripped for good of the right to be included again.
A very fruitful discussion was opened on this page earlier this year, with some users complaining about the use of the 50 random-article test and at the same time coming up with new solutions to replace this evaluation method, but unfortunately it was archived with only a promise to take action, which didn't yield any serious results at all.
I see that the template has recently been updated with the inclusion of links to the Georgian and Latvian Wikipedias, so it seems timely to request inclusion of the Macedonian Wikipedia, which now has more than 73,000 articles, or 46% above the minimum threshold for inclusion.-- Kiril Simeonovski ( talk) 20:43, 8 October 2013 (UTC)
Sure it was discussed in the past, but things may rapidly change.
Your premature conclusion loosely based on your empirical evidence you've collected by using your own evaluation methods
demonstrates that the omission of one edition is for good and it will never change your personal opinion regardless of one Wikipedia's quality.
But please don't forget that the first time was when the Macedonian Wikipedia reached 50,000 articles, the second time was when it had approx. 62,000, and now it has more than 73,000.
Please also note that there has been a discussion (in which you were one of the fellow users who have commented) with different proposals to replace the old-fashioned random-article test with other techniques to evaluate one Wikipedia's quality.
Recently the Albanian Wikipedia has surpassed 50,000 articles and it needs to be included in the 50k category. Thank You. Visi90 ( talk) 13:14, 2 November 2013 (UTC)
Welsh Wikipedia has passed the 50,000 article threshold, with a depth of 43. As I'm not sure what the criteria are nowadays, I won't add it myself, but could someone look into this please? Optimist on the run ( talk) 22:25, 12 November 2013 (UTC)
What I see is most language pages are now fixated on passing this 50,000 magical figure with whatever means available and yes resorting to mechanized automatic creation of articles (mostly of villages and small localities which is easy to prepare from exhaustive lists and add minimum of changes and launch as articles) or created one or two line article pages for sportsmen or music artists and then apply for inclusion. It is high time we went back to the earlier 100,000 and more threshold. This will take out some of the less productive language Wikipedias. Second: As for the new higher 100,000 threshold, it is high time very clear guidelines and unified acceptance rules are set once and for all for all language Wikipedias and that they are well-informed about the rules to be able to know how they can comply with acceptance criteria, and to know when they can actually apply. As of now, I feel it is quite an arbitrary process which languages English Wikipedia is accepting. Was there a quality comparison test made and all those we see included are 100% legit full content Wikipedias? Can we reopen the files for some of the languages or are the ones there now final and set in stone? I guess the most urgent for now is raising the threshold to 100,000 as a priority. How many languages would qualify there from the ones we now have werldwayd ( talk) 21:57, 14 December 2013 (UTC)
Latin Wikipedia reached 100,000 today, so should surely be here somewhere. StevenJ81 ( talk) 21:42, 18 December 2013 (UTC)
Please update the list. For example, the Kazakh Wikipedia now has more than 200,000 articles. Arystanbek ( talk) 07:01, 23 February 2014 (UTC)
Given that the Dutch Wikipedia has expanded from ~800K entries to 1.8M entries over the past few years primarily through a couple of bots adding large numbers of stubs, I'm wondering if it should be demoted from the list in this template. FWIW, I did a quick 50 article survey, and hit 35 stubs, 10 very short articles (counted generously), four articles with significant content (again, being somewhat generous), and a DAB page. Rwessel ( talk) 03:10, 3 May 2014 (UTC)
User:Amire80 has added Urdu Wikipedia to the list. But after performing a random 50-article sample, it appears most of the articles on urwiki are stubs, thus failing the "stubs and placeholders" criterion. — Bill william compton Talk 19:22, 24 April 2014 (UTC)
Persian Wikipedia has reached 400k articles and can be moved to the upper level :) – ebraminio talk 15:14, 18 July 2014 (UTC)
To check the length of articles, there is a much better way than 50 random articles. 50 random articles is a very bad sampling method to check the length of articles. A "random count" might display the 50 shortest or longest articles, and the results are not uniform across repeated tests. Besides, it is a tedious task, often subject to bias. I would like to request users here to replace it with quartiles or deciles, as they provide a better picture of the distribution of article lengths.
Let's say there is a wiki with 1000 articles, check Q1 (25%-75% split of length of articles), Q2 (50%-50% split) and Q3 (75%-25% split) article. For this,
This will give a better picture of the range of page lengths than 50 random articles. For an even better view of distribution of length, decile can be used. Thank you-- Eukesh ( talk) 11:36, 6 July 2014 (UTC)
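The steps for the 1000-article example did not survive in this archive, so the following is only a guess at the kind of procedure meant: sort articles by length and read off the article at the 25%, 50% and 75% rank. It uses the nearest-rank convention and made-up article lengths; both are assumptions, not part of the original proposal.

```python
def lengths_at_percentiles(lengths, percents=(25, 50, 75)):
    """Nearest-rank percentiles of article lengths.

    For each percent p, pick the article at rank ceil(p * n / 100) in the
    list sorted from shortest to longest (ranks start at 1).
    """
    ordered = sorted(lengths)
    n = len(ordered)
    picks = {}
    for p in percents:
        rank = max(1, (p * n + 99) // 100)  # integer ceiling of p * n / 100
        picks[p] = ordered[rank - 1]
    return picks


# Hypothetical wiki with 1000 articles whose lengths in bytes are 100, 110, 120, ...
example_lengths = [100 + 10 * i for i in range(1000)]

quartiles = lengths_at_percentiles(example_lengths)
deciles = lengths_at_percentiles(example_lengths, percents=range(10, 100, 10))

print("Quartiles (Q1, Q2, Q3):", quartiles)
print("Deciles:", deciles)
```

Unlike a random sample, this gives the same answer every time it is run on the same list of page lengths, which is the reproducibility being asked for.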
The discussions here are always based on "I performed 50 random article sample..." which is not very accurate. E.g., Hindi Wikipedia has far longer articles than Simple English for almost every decile (which I found from a modified decile method and posted here in 2012). I checked the article length at each decile myself, as well as the bytes of the articles. However, the users here were still not ready to believe it.
Coming to quality, 50 random articles method does not determine quality of articles at all! How does one assess the quality of articles in languages that one can not even read? There should be better method for quality assessment and judging the wikipedias by the basic articles and their status (featured, good, long etc) would provide a qualitative assessment for the same. Thank you.-- Eukesh ( talk) 12:34, 8 July 2014 (UTC)
Let's assume there's a Wikipedia with 2,000 really good (and long) articles and about 1,000,000 bot generated stubs. Among the 50 random articles there might be 48 to 50 stubs. But: Why should this Wikipedia be mentioned in the box with more than 1 million articles when it really has only 2,000 of them? The statistical method isn't really the problem here. -- 32X ( talk) 14:40, 14 August 2014 (UTC)
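The scenario above can be put in numbers. A minimal sketch, assuming each random-article draw is independent and equally likely to land on any page (an idealisation of the random-article button, not a description of how it is implemented); the function name is made up for the example.

```python
def chance_of_no_good_article(good, stubs, sample_size=50):
    """Probability that a random sample contains only stubs.

    Assumes independent, uniform draws with replacement.
    """
    p_good = good / (good + stubs)
    return (1 - p_good) ** sample_size


# The hypothetical wiki from the comment: 2,000 good articles, 1,000,000 bot stubs
p_none = chance_of_no_good_article(2_000, 1_000_000)
expected_good = 50 * 2_000 / (2_000 + 1_000_000)

print(f"Chance that all 50 samples are stubs: {p_none:.1%}")
print(f"Expected good articles in the sample: {expected_good:.2f}")
```

Under these assumptions a 50-article sample would usually show nothing but stubs, which matches the "48 to 50 stubs" estimate in the comment.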
Hi,
The Esperanto Wikipedia just reached 200,000 articles. Could you edit the list accordingly ?
Thank you very much ! :) Thomas Guibal ( talk) 06:00, 14 August 2014 (UTC)
Hi. Could someone please add the Bosnian wikipedia to the specified list? The number of 50k articles has been passed some time ago. I did the random 50 article test, and based on these results it seems that this wikipedia is worth including. Regards, -- Edinwiki ( talk) 15:35, 6 June 2014 (UTC)
On the Main Page, the following Wikipedia editions are missing from the 1000000+ articles section: Cebuano and Waray-Waray. In fact, on the List of Wikipedias statistics page, these two editions are clearly listed as having over one million articles. Administrators should kindly correct the information on the Main Page. I have read some studies on the foundation of the Cebuano-language Wikipedia and used them in my thesis; it is really amazing that they have grown so fast. By the way, if these language editions consist mostly of stubs, shouldn't they at least be placed under the 400000+ or 200000+ section? I see them on this page (the template above), but do not see them on the Main Page. — Preceding unsigned comment added by 94.123.205.101 ( talk) 13:57, 9 September 2014 (UTC)
Basque (Euskara) currently shows 202057 at meta:List of Wikipedias. Talk:Main Page#Basque Wikipedia discusses whether this is supposed to move it to "More than 200,000". PrimeHunter ( talk) 01:22, 25 September 2014 (UTC)
Serbo-Croatian Wiki has passed 260k articles according to http://meta.wikimedia.org/wiki/List_of_Wikipedias#100_000.2B_articles but it's still listed as only 50k+. Could that be fixed? 93.136.48.197 ( talk) 06:24, 29 September 2014 (UTC)
Please update some other languages to 1.5 million articles. Qwertyxp2000 ( talk) 04:51, 22 December 2014 (UTC)
This edit request to Template:Wikipedia languages has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
please move Slovak Wikipedia (sk) to the 200,000 category 213.151.215.195 ( talk) 18:39, 15 February 2015 (UTC)
Please move Urdu Wikipedia to the 50,000+ category.
-- Obaid Raza ( talk) 15:31, 17 February 2015 (UTC)
This edit request to Template:Wikipedia languages has been answered. Set the |answered= or |ans= parameter to no to reactivate your request. |
Add the Macedonian Wikipedia to 50,000+ per this. Also Georgian, Occitan, Chechen, Newar / Nepal Bhasa, Urdu, Tamil, etc. Fauzan ✆ talk ✉ mail 14:02, 15 March 2015 (UTC)
Could you please add Macedonian (mk.wiki) as it has passed the 80.000 mark some time ago? Cheers! -- B. Jankuloski ( talk) 11:14, 5 April 2015 (UTC)
Comment: You are obviously under the impression that our wiki mostly consists of very small articles, and that our situation is somehow similar to what it was years ago when we made the request. I can assure you that now this is not the case at all. In the years that have passed since our last suggestion, we have created many articles of very good size; this was painstaking labour, and it is very untoward to dismiss it. What I am talking about can best be illustrated by this list of long pages. On it, this article ranks at no. 50.000 by length. As can be seen, there are 49.999 articles larger than it, and a good number of them considerably so. I am sure that we more than meet the relevant criteria for inclusion. I expect that whoever decides will take an objective look at our wiki in accordance with the relevant criteria and conclude what I have just expounded. -- B. Jankuloski ( talk) 18:43, 9 April 2015 (UTC)