This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Wikipedia:Article size → Wikipedia:Manual of Style (article size) — Consolidating naming per Wikipedia_talk:Manual_of_Style#Poll Gnevin ( talk) 16:28, 24 May 2010 (UTC)
It's getting toward 2011 now, and I'm feeling more and more that we need to revisit the technical side of this. At some point WP has to stop catering to broken, obsolete technology like browsers from the 1990s and early 2000s. While "reader fatigue" is a real issue, and WP:SUMMARY provides a way to solve that problem, not all articles are intended to be read from top to bottom, and some will become user-unfriendly and downright editor-hateful if split into multiple pages. I have anecdotal but to me rather strong evidence that the technical concerns are essentially obsolete: a long and very, very linked-to glossary list article has had absolutely zero length-related problems reported in over 4.5 years.
The most obvious cases are list articles of various sorts, including glossaries. People (other than really bored people with way too much time on their hands) do not usually try to read such articles from top to bottom; they load the page and search for the term they are interested in, if a #-link didn't bring them directly to it from another page. Splitting such pages makes in-page search more difficult, and actually frustrates readers' ability to find information. E.g. if I search for "foo" in a long one-page list, I may find a "foo" entry, and/or various mentions of "foo" as applied in several contexts in various other entries. If the article is split, I might not find a "foo" entry because it's on another page, and assume there's no information on the topic, and/or I may miss a lot of contextual information about "foo" because I have not realized there is more of it in another article in the split series.
I have put off converting Glossary of cue sports terms, one of the articles I have worked on the most, into a split article for some time (it's been tagged with {{Longish}} for 2.5+ years), because I have yet to see one single case of someone's browser accidentally truncating the page, a user reporting a crash or other technical problem, or any reader suggesting that the document is too long for simple human usability reasons. This despite the article being over 240K, being linked to (usually many times at different entries) in almost all cue sports articles, and being edited nearly two-thousand times, by registered and anon users from all over the world, with greatly varying levels of technological currency/obsolescence. I've also resisted splitting because {{Cuegloss}} would have to be redone in a very complex way that would so complicate use that most editors who bother to use it now to create helpful glossary links for non-billiards-expert users would surely abandon it. I can no longer see any good reason for (and do see several good reasons against) splitting this or any similar article, even though until recently I have long been tinkering with test code for splitting the article and adapting the templates that work with it, to comply with this guideline's article length advice.
At any rate, if after 1900+ edits by hundreds of users the page has never been truncated by a browser that can't handle long textedit fields, this strongly suggests that the truncation concern is no longer a valid one in anything near significant numbers of incidents; such browsers are today so rare that the odds of it happening are now so low that it need not be even mentioned here, and if it does happen, it will be obvious and someone will fix it.
I propose that a partial rewrite is also in order to strongly suggest that most types of list articles remain unsplit, either regardless of length, or unless longer than X, where this variable is some number we arrive at that is very much more than the current number, like maybe 1MB. Lists that are easily divided into clear sub-topical sections, each with numerous entries, could be given as a clear exception, something that perhaps should be split after 100K or so (such as events relating to some topic in the 1700s, 1800s, 1900s, or vehicles manufactured by Ford, BMW, Toyota, etc.). But most list articles, including glossaries, are not divisible logically this way, only arbitrarily, and WP:SUMMARY cannot logically apply to them. They are not intended for start-to-end reading, but for in-page searching. Meanwhile, splitting them not only greatly impedes such searching, it makes creation and use of tools that work with such articles (e.g. Template:Cuegloss) much more difficult.
— SMcCandlish Talk⇒ ʕ(Õلō)ˀ Contribs. 10:03, 17 September 2010 (UTC)
I keep these two as redirects to "Wikipedia talk:Article size" because I had significant trouble finding the article in the first place.-- Jax 0677 ( talk) 12:15, 9 January 2011 (UTC)
I tried following the instructions at Wikipedia:Article size#Measuring "readable prose" size with the Time travel article, but I didn't see any "page size warning" in the preview. Did I miss it, or do you only get such a warning if the size is above a certain limit? Either way I think the specifics of where and when such a warning appears should be mentioned in this section. Hypnosifl ( talk) 04:33, 3 March 2011 (UTC)
The guide appears to be referring to standalone lists - "Lists, tables, and articles summarizing certain fields are exceptions." Though, surely, the advice relating to not splitting certain lists would also apply to embedded lists. If a list is constructed in such a way that splitting it or summarising it would be inappropriate, then it doesn't really matter if the list is a standalone or is embedded, if it shouldn't be split, then it shouldn't be split.
What does "articles summarizing certain fields" refer to?
Possible new wording: Lists, tables, and material summarizing certain fields are exceptions. If there is no "natural" way to split long lists or tables, it may be best to leave them intact. They act as summaries and starting points and in the case of some broad subjects or lists either do not have a natural division point or are more easily word-searched as a single set. This is especially the case when buttressing cites are repeated throughout the list or table. In such cases, the list or table should nonetheless be kept as short as feasible.
Does the paragraph regarding "Major subsections..." belong in the Exceptions: Lists, Tables section? SilkTork * Tea time 18:49, 26 June 2011 (UTC)
As far as your suggestion goes, I don't have an opinion yet. I find this to be one of the most puzzling guidelines on Wikipedia, particularly since it's been labeled an "editing guideline". Good raise 08:29, 27 June 2011 (UTC)
The "How to find articles by size" section says "You can find the size of a page including the markup in kilobytes [kb] from the page history". I have tried this for several long articles (examples [1] [2]) and I do not see anything on the history pages indicating article size. Am I just missing it, or was this feature removed? Does one have to use one of the "external tools" to find an article's size now? Should this guideline be updated? -- IllaZilla ( talk) 17:31, 1 September 2011 (UTC)
On 6 September I introduced the WP:SIZEGUIDE shortcut as a replacement for WP:SIZERULE, on the grounds that the criteria are not a hard-and-fast rule, and because "readable prose size" is all too often mistaken for "article size". It was reverted on 19 September without a refutation of my reason for doing so. Per WP:BRD, I am starting this discussion to see whether that was one editor, or whether there is a wider consensus to continue referring to it as a "rule". — WFC— 09:51, 20 September 2011 (UTC)
In discussions about the articles iOS version history, Android version history, and History of iOS jailbreaking, I've noticed this guideline being trotted out as a reason that, despite failing policy (the former two flagrantly and inherently violate WP:NOT#DIRECTORY; in the latter, systemic violations of WP:V throughout the parent article led to blind application of the guideline), the articles should remain separate if split due to SIZE. This leads to an interesting question: if split articles fall below 32-40KB of readable prose, should they be merged back? And on the point of the latter article, should cleanup be recommended before splitting? Sceptre ( talk) 23:38, 22 October 2011 (UTC)
Include redirect WP:AS in "Shortcuts" box. 71.146.20.62 ( talk) 03:44, 28 November 2011 (UTC)
Images have been discussed here before with regard to the total size of an article. Even though an image may only have a few tens of characters of text in the edit box, the thumbnail of that image will require a few 100 kb of download bandwidth. The TOOLONG guideline should give some indication of the upper limit of acceptable image use. For instance, this version of the article List of American Civil War Generals (Confederate) contains 237 thumbnail images, each requiring about 162 kb in my 1024x768 browser window. This makes for a very unwieldy article of more than 38 Mb! Let's add a paragraph about the TOOLONG problems associated with too much bandwidth taken by images. Binksternet ( talk) 20:25, 31 March 2012 (UTC)
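The arithmetic behind the 38 Mb figure can be checked with a short sketch. It assumes the "kb" and "Mb" in the comment above mean kilobytes and megabytes (with 1 MB taken as 1000 kB), and it takes the 237 thumbnails and 162 kb per thumbnail as quoted; the function name is made up for illustration.

```python
# Rough check of the image payload quoted above.
# Assumptions: "kb" means kilobytes, 1 MB = 1000 kB,
# and every thumbnail costs the same 162 kB.
THUMBNAILS = 237
KB_PER_THUMBNAIL = 162


def image_payload_mb(count, kb_each):
    """Total download size of `count` thumbnails, in megabytes."""
    return count * kb_each / 1000


total_mb = image_payload_mb(THUMBNAILS, KB_PER_THUMBNAIL)
print(f"{THUMBNAILS} thumbnails x {KB_PER_THUMBNAIL} kB = {total_mb:.1f} MB")
```

Under those assumptions the product is 38,394 kB, which agrees with the "more than 38 Mb" in the comment.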
Apologies if this is the wrong venue to ask such a question, if somewhere in meta might be better to ask, but, sometimes, for very long pages, like the RfC for Mohammad images, or some old archives, it significantly slows down my browser when viewing them. If only there was some kind of feature that could allow me to set it so that when viewing pages or page histories, it would automatically break off the page at a user-specified amount (like, break off to "page 2" or "page 3" etc. if the next section makes the page exceed 200kb). This is especially a problem in very old archive pages where it would be inconvenient to break apart a page despite being long. Does such a feature exist?-- New questions? 18:21, 10 April 2012 (UTC)
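As a purely hypothetical illustration of the behaviour asked about above (this is not an existing MediaWiki feature, and the function name and section data are invented), a viewer-side tool could group sections greedily, starting a new "page" whenever the next section would push the running total past a user-chosen limit:

```python
def paginate(sections, limit_kb):
    """Group (title, size_kb) pairs into pages of at most limit_kb each.

    A page is closed as soon as adding the next section would exceed the
    limit. A single section larger than the limit gets a page to itself.
    """
    pages, current, used = [], [], 0
    for title, size_kb in sections:
        if current and used + size_kb > limit_kb:
            pages.append(current)
            current, used = [], 0
        current.append(title)
        used += size_kb
    if current:
        pages.append(current)
    return pages


# Invented section sizes, with the 200 kB limit from the comment above.
example = [("A", 120), ("B", 90), ("C", 50), ("D", 210), ("E", 10)]
for number, page in enumerate(paginate(example, 200), start=1):
    print(f"page {number}: {', '.join(page)}")
```

Breaking only at section boundaries keeps each section intact, at the cost of uneven page sizes.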
I notice the article does much to address questions concerning antique (turn of the century) computers and browsers, but doesn't mention something new, namely Help:Mobile access. I do nearly all my editing on five-year-old hardware with a nice big screen and DSL connection, but much of my reading is away from home, on my palm-sized smartphone or my hand-sized Android tablet. These automatically go to the .m. mobile page, which shows only the lead and the top section titles until I tap a title.
Alas, such accommodation becomes inadequate when the article is long. Either the list of sections is too long, or it inadequately guides me to the desired information, or each section upon opening overwhelms my ability to understand it on the little screen, or all the above. And where there's no Wi-Fi and 3g coverage is poor, it takes a long time to load the page. Surely I'm not unusual among readers in facing these problems, and the number of affected users will only increase with the popularity of smartphones and smaller tablet computers (even with the relatively large iPad it's somewhat a problem). Do we need a new section? Jim.henderson ( talk) 17:42, 27 April 2012 (UTC)
I have raised a query at Wikipedia:Village_pump_(policy)#Splitting_articles_arbitrarily about the bit in WP:SIZE#Very large articles where it says very large articles may be split arbitrarily. I think this is okay for very large lists but not articles, and I see no sense in this end run round notability. I believe that if an article is large enough to require splitting, there will always be subtopics which satisfy notability. Dmcq ( talk) 00:52, 24 May 2012 (UTC)
The first issue here is that Dmcq is arguing against 5 years of consensus on the verbatim phrasing "If necessary, split the article arbitrarily", and that this text should be preserved on that ground alone. Many many editors have read that sentence and understood that "arbitrarily" means, not "randomly" or "at an arbitrary point", but basically "according to some clear local-consensus method". Therefore there is no need to change the wording now.
The second issue is that Dmcq has affirmed the POV that the many many subarticles split according to some clear local-consensus method are actually "notable", such as Later life of Isaac Newton (arbitrarily starting 1693), House and Senate career of John McCain, until 2000 (arbitrarily 1981-2000), and Cultural impact of the Guitar Hero series (arbitrary subset of notable topic). If more examples are needed I can oblige. Another POV is that these are not notable but widely accepted because they are spinouts. Given Dmcq's POV, if the word "arbitrarily" is understood as it has been for 5 years, there is no reason to change it, because the articles will by the POV's definition be "notable"; and given the alternate (stricter) POV of notability, the word "arbitrarily" is very necessary to permit articles that fail this strict N standard.
Now, as to Alan's wording (assuming we remove the stray "them" in the last line), the last sentence seems to be chopped up for no reason; the rest assumes that it is always possible to split the content into logically separate articles. Dmcq has rejected all counterexamples, but they show that it is at least sometimes arguable that the content is not split logically and notably. I don't think this assumption is necessary, in case a consensus should arise that there is no notable way to split a very long article.
Further, Dmcq has opened the same discussion on two talk pages for some reason; I have invited the VPP to centralize here.
I think the whole problem is that Dmcq is reading "arbitrarily" as "at an arbitrary point", which is a novel or original reading of the guideline. There is no evidence the guideline needs adjustment. JJB 02:58, 24 May 2012 (UTC)
Alan, sorry I didn't realize that was your wording. Longevity of a guideline is a silent consensus that it works for many editors. Yes, the word "arbitrarily" could be clarified based on your concerns, but the assumption that it's always possible to split logically should not be suddenly added to the text. Possible proposed change: For instance, the meaning of the sentence could be clarified as, "If this is not possible, split the article according to local consensus." But the rest should stand for the reasons above.
Dmcq, the discussion I previously linked shows that there is no supermajority consensus on notability of spinoffs, so this will not be decided at VPP today. The fact is that we have many nonnotable spinoffs, and they often survive AFD (or more commonly are never nommed). The stated rationales vary (one reason for unclear consensus): sometimes it's SNG shoehorning, sometimes it's a relatively loose N affirmation like your own, sometimes it's recognized as a spinoff, sometimes it's recognized as a pointy nom that would imbalance a set, sometimes it's a merge that affirms the spinoff principle. Since you seem to define "notable" as including most of the adhoc local-consensus splits as well as allowing the various nonnotable large-list splits (although that would include the list of poker events), I really don't know that there is an issue for VPP besides your finding the word "arbitrarily" to be ambiguous. JJB 04:36, 24 May 2012 (UTC)
There may or may not be technical reasons for splitting an article (not all access is via machines with large memories, virtual or otherwise), but there are definitely economic and other reasons for doing so. Many people have to pay for every byte they download, either because they live in a country where the telecoms use that model for charging for broadband access, or because their mobile service provider charges that way for mobile handheld devices (phones, tablets etc). There is also the case where a person accessing the net is connected via a free wifi service with a data limit on downloading (e.g. at a library). -- PBS ( talk) 11:36, 31 May 2012 (UTC)
I'll try just deleting the whole technical issues section. The 400k limit in it is far larger than the recommended size limits anyway. We could add a bit in the size section about larger sizes causing problems with slow connections as well as causing readability problems but that's about it I think. Dmcq ( talk) 22:53, 7 June 2012 (UTC)
We could bring it back, replacing the tersely ambiguous "arbitrary" with something clumsier and more precise such as "regardless of other considerations" and the overly precise "400K" with something longer and vaguer such as "hundreds of kilobytes". Jim.henderson ( talk) 13:40, 10 June 2012 (UTC)
You are invited to join the discussion at Wikipedia talk:Summary style#RfC: Should the summary style guideline quote WP:Notability and if so in what place.
This RfC is to decide the specific changes discussed at Wikipedia:VPP#Splitting_articles_arbitrarily. This may affect the notability of subarticles and is related to the RfC above. Dmcq ( talk) 19:09, 1 June 2012 (UTC)
I would like to start a discussion about how to split the Biggest Loser South Africa article. In my opinion, it should either be a minimum number of tables, or should be split into several smaller articles, as 600 kB is ridiculous. Thoughts?-- Jax 0677 ( talk) 18:51, 10 June 2012 (UTC)
{{citation needed}}. Dmcq ( talk) 20:45, 10 June 2012 (UTC)
With recent events, such as deletion of Ashton Kutcher on Twitter, Personal life of Jennifer Lopez, and Rihanna on Twitter, bad splitting (and awful transclusion) of List of Codename: Kids Next Door episodes, and bad use of Template:very long, is there something generally wrong with this guideline? Is it consistent with other policies and guidelines? -- George Ho ( talk) 00:00, 4 July 2012 (UTC)
As Dmcq said, I must respond. Writing about one topic can result in a big article. Nevertheless, writing about a subtopic must be consistent with applicable policies and guidelines; otherwise, a subtopic article may be at risk of deletion, like Ashton Kutcher on Twitter. I'll rephrase the "consistency" part: Does this guideline have to mention WP:What Wikipedia is not when it comes to articles on topics and subtopics? Why can't this guideline mention any other policies and guidelines? -- George Ho ( talk) 14:24, 4 July 2012 (UTC)
I've changed the rule of thumb to refer to the markup size as given in the history. It is pretty obvious that people have used this normally and mean this. There would be no point talking about limits on sortable tables otherwise as readable prose doesn't include tables according to the bit at the beginning. Dmcq ( talk) 14:07, 4 July 2012 (UTC)
The overwhelming consensus here is to go with readable prose size. The Blade of the Northern Lights ( 話して下さい) 21:22, 4 August 2012 (UTC)
Should WP:SIZERULE, a rule of thumb guideline for saying an article is too large, refer to the 'markup size' of an article given by the size in its history - the number of bytes downloaded when an edit is done, or should it refer to the 'readable prose size' given by the text excluding any footnotes and reference sections ("see also", "external links", bibliography, etc.); diagrams and images; tables and lists; Wikilinks and external URLs; and formatting and mark-up as given by the script User:Dr_pda/prosesize? There is a discussion above in Wikipedia talk:Article size#Rule of thumb. Dmcq ( talk) 14:52, 4 July 2012 (UTC)
Is this the right way to describe articles, or must this nutshell be reworded? -- George Ho ( talk) 03:25, 6 July 2012 (UTC)
How is one supposed to calculate the readable prose size? Can the method be included in the article somewhere? Op47 ( talk) 13:00, 6 July 2012 (UTC)
The major thing one should do if things are too long is see if large subtopics can be cut out as detailed in WP:SS. If there are no major subtopics and the article is very long then it is worth wondering whether there is sufficient weight for the inclusion of some of the contents. Personally I haven't seen any very long articles not filled with trivia or where large bits shouldn't be cut out into notable subtopics. The only ones that aren't like that are list articles which have special provisions for splitting.
I don't think the prose text argument for the rule of thumb holds any weight, because WP:Article size#Readability issues says 50k is a limit and yet the rule of thumb is being taken as meaning there are no such issues till the prose text goes over 100k. People should be thinking about size before getting to 50k and should have it definitely in mind over 50k; 100k is a very negative aspect of a featured article.
Wikitext size tends to be about twice the size of prose text for featured articles though it can be more if citations have a lot of extra text in them, and it can be much more if there are tables since prose text ignores table sizes. The advantage of wikitext is that it is directly supported by Special:Longpages and the article history whereas prose text is not well supported. Thus if it is a reasonable measure it is a better measure as a rule of thumb. As to the current rule of thumb if 100k was interpreted as wikitext it would put prose size at about 50k before one should definitely consider splitting which is about right by the readability issues section.
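The conversion in the paragraph above can be written out as a one-line estimate. The factor of two is the rough ratio claimed in that comment for featured articles, not a measured constant, and as the comment notes it will understate the gap for pages with heavy citations or large tables; the names are illustrative only.

```python
# Ratio of wikitext size to readable prose size. The value 2.0 is the rough
# figure given in the comment above; it is an assumption, not a measurement.
WIKITEXT_TO_PROSE_RATIO = 2.0


def estimated_prose_kb(wikitext_kb, ratio=WIKITEXT_TO_PROSE_RATIO):
    """Estimate readable prose size (kB) from wikitext size (kB)."""
    return wikitext_kb / ratio


print(estimated_prose_kb(100))  # a 100k wikitext page is about 50k of prose
```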
The other point in wikitext's favour is a consideration of a major problem in Wikipedia. People are dumping databases of sports results, election results, manga characters etc etc into Wikipedia. These articles come out as only a few hundred bytes as far as prose text is concerned, but often involve downloading over a megabyte and quite long times to display, even ignoring mobile browsing. A rule of thumb is supposed to be something easy. If featured article people want to polish an article, that is another business and they can take time over it, but dealing with the great mass of rubbish being stuck into Wikipedia requires more everyday tools.
For straightforward articles which are okay, wikitext is as good as prose text as a rule of thumb, and for everyday use, when talking about articles which are too large because they consist of huge tables, prose text is simply useless. As to lists, they have their own rules and can be split fairly arbitrarily, but their rules stop them being misused as database dumps in quite the same way as normal articles. Dmcq ( talk) 15:26, 7 July 2012 (UTC)
By the way, I have also just set up WP:VPT#Section viewing to start thinking about coping better with long articles on mobile devices. Probably somebody else has been at this sort of thing before, but getting changes into the wiki software isn't that easy. Dmcq ( talk) 15:44, 7 July 2012 (UTC)
Ok, so I did some research. I read this article about load times: Loading today's sites over dialup.
I also played around with this web analyser tool: [5]. I pointed it at a few article pages, the main page, and a couple of featured articles.
It looks like, roughly, an article that is about 1 megabyte of load size takes about 2-2.5 minutes on a 56K modem. The main page is about 80K and takes about 20 seconds. Yesterday's Calgary Stampede FA was 222K and would load in about 51 seconds, and the rocket page is 500K, 1:52 on a 56K modem (note that long articles tend to load a bit quicker than you would expect because of queue latency at the webserver that hurts short articles more).
I don't have any reason to think that those articles are particularly large, but I tentatively suggest we write down 1 megabyte as the maximum size, and have a rule of thumb that it's all good up to 250K (a load time of under a minute).
To put this in perspective, according to that article the average page size on the wider web is a bit over 1M and the load time on a 56K modem is about 2 minutes 30, so although 2 minutes is a long load, it's still better than average.
I mean in an ideal world I would prefer everything to load in 5 seconds on 56K modem, but that's not going to happen, I don't think people want an encyclopedia web page that looks like it's 1993. So we have to be a bit reasonable.
webpage load size | Assessment |
---|---|
<250K | good |
250K-500K | acceptable |
500K-1M | consider shrinking or splitting |
>1M | should be split |
So I would suggest that we add that, in addition to the rule of thumb on prose length. Does that sound OK? GliderMaven ( talk) 16:12, 7 July 2012 (UTC)
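For reference, the nominal transfer time behind these figures can be sketched as follows. This assumes a 56K modem delivers a full 56,000 bits per second with no latency or per-request overhead, which a real connection will not, so the results are lower bounds; the times quoted in the comment above (for example about 51 seconds for 222K) are correspondingly longer. The function name is made up for illustration.

```python
def nominal_seconds(size_bytes, link_bits_per_second=56_000):
    """Best-case transfer time: payload bits divided by the link rate.

    Ignores latency, protocol overhead and the number of HTTP requests,
    so real load times will be longer.
    """
    return size_bytes * 8 / link_bits_per_second


# Sizes taken from the comment above (80K, 222K, 500K, 1 megabyte).
for label, size in [
    ("main page", 80_000),
    ("Calgary Stampede", 222_714),
    ("Rocket", 500_598),
    ("1 MB page", 1_000_000),
]:
    print(f"{label}: at least {nominal_seconds(size):.0f} s")
```

On this idealised model a 1 MB page needs about 143 seconds, which sits at the lower end of the "2-2.5 minutes" estimate.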
wikitext size | What to do |
---|---|
< 30k | normally too small to split |
< 60k | good readability |
60k - 120k | acceptable but splitting may be helpful |
120k - 250k | can have readability issues, consider shrinking or splitting |
> 250k | almost certainly should be split |
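The bands in this table can be expressed as a small lookup. The table leaves the boundary values ambiguous (120k appears in two rows, and exactly 250k is not clearly assigned), so the sketch below assumes that a size exactly on a boundary belongs to the higher band; the names are illustrative.

```python
from bisect import bisect_right

# Upper edges of the first four bands, in kB of wikitext, from the table above.
BAND_EDGES_KB = [30, 60, 120, 250]
ADVICE = [
    "normally too small to split",
    "good readability",
    "acceptable but splitting may be helpful",
    "can have readability issues, consider shrinking or splitting",
    "almost certainly should be split",
]


def advice_for(wikitext_kb):
    """Return the table's advice for a given wikitext size in kB.

    Assumption: a size exactly equal to a band edge falls in the higher band.
    """
    return ADVICE[bisect_right(BAND_EDGES_KB, wikitext_kb)]


for size in (12, 45, 100, 240, 600):
    print(f"{size}k: {advice_for(size)}")
```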
URL: http://en.wikipedia.org/wiki/Calgary_Stampede
Title: Calgary Stampede - Wikipedia, the free encyclopedia
Date: Report run on Sun Jul 8 09:07:20 EDT 2012
Diagnosis
Global Statistics
Total HTTP Requests: 33
Total Size: 222714 bytes
Object Size Totals
Object type | Size (bytes) | Download @ 56K (seconds) | Download @ T1 (seconds) |
---|---|---|---|
HTML | 52324 | 10.63 | 0.48 |
HTML Images | 124925 | 29.50 | 5.26 |
CSS Images | 15941 | 3.78 | 0.68 |
Total Images | 140866 | 33.28 | 5.94 |
Javascript | 23997 | 5.78 | 1.13 |
CSS | 5527 | 1.30 | 0.23 |
Multimedia | 0 | 0.00 | 0.00 |
Other | 0 | 0.00 | 0.00 |
URL: http://en.wikipedia.org/wiki/Rocket
Title: Rocket - Wikipedia, the free encyclopedia
Date: Report run on Sun Jul 8 09:30:45 EDT 2012
Diagnosis
Global Statistics
Total HTTP Requests: 79
Total Size: 500598 bytes
Object Size Totals
Object type | Size (bytes) | Download @ 56K (seconds) | Download @ T1 (seconds) |
---|---|---|---|
HTML | 81986 | 16.54 | 0.63 |
HTML Images | 367824 | 86.91 | 15.55 |
CSS Images | 15941 | 3.78 | 0.68 |
Total Images | 383765 | 90.69 | 16.23 |
Javascript | 29320 | 7.04 | 1.36 |
CSS | 5527 | 1.30 | 0.23 |
Multimedia | 0 | 0.00 | 0.00 |
Other | 0 | 0.00 | 0.00 |
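The two reports can be cross-checked by summing the per-type rows. The sketch below uses only the figures shown above (with "Total Images" standing in for its two component rows) and derives the effective 56K throughput the tool appears to assume; the dictionary layout and function name are illustrative, not part of the tool's output.

```python
# (bytes, seconds at 56K) per object type, copied from the reports above.
# "Images" is the "Total Images" row, i.e. HTML Images plus CSS Images.
REPORTS = {
    "Calgary Stampede": {
        "HTML": (52_324, 10.63),
        "Images": (140_866, 33.28),
        "Javascript": (23_997, 5.78),
        "CSS": (5_527, 1.30),
    },
    "Rocket": {
        "HTML": (81_986, 16.54),
        "Images": (383_765, 90.69),
        "Javascript": (29_320, 7.04),
        "CSS": (5_527, 1.30),
    },
}


def summarize(objects):
    """Return (total bytes, total seconds at 56K, effective bytes per second)."""
    total_bytes = sum(size for size, _ in objects.values())
    total_seconds = sum(seconds for _, seconds in objects.values())
    return total_bytes, total_seconds, total_bytes / total_seconds


for page, objects in REPORTS.items():
    size, seconds, rate = summarize(objects)
    print(f"{page}: {size} bytes, {seconds:.2f} s at 56K, about {rate:.0f} bytes/s")
```

The byte sums reproduce the reported totals (222714 and 500598). The summed 56K times come to 50.99 s and 115.57 s, so the "51 seconds" quoted earlier matches the first report, while the "1:52" quoted for the rocket page is a few seconds below the second.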
Note that the example given in a previous thread, Wikipedia talk:Article size#Images as part of the total download of an article to a browser, is 38MB (!!) worth of page load because of the hundreds of thumbnail images. Pages like that must be cut down by taking away the images or by splitting. Our guideline must recommend an upper limit for that kind of silliness. 1MB seems reasonable. Binksternet ( talk) 01:19, 9 July 2012 (UTC)
Seems to me all three major ways of measuring size are relevant to different users. The majority of users are presumably readers who found their way via Web search and are ignorant of the article's topic. For those who use a desktop or large laptop screen and fast wired connection, quick comprehensibility is the main design consideration, which makes readable prose size the proper measure. For those using their mobile phone screen as I often do when merely reading, or a small tablet, small prose size is even more important for avoiding getting lost in an article with sections either too large or too numerous, and download size also becomes important for those of us with slow mobile radio connections.
Editors ignorant of the fine points discussed here, who will remain the majority of editors for an indefinite period, only know markup size, because that's what's in the watchlist entry. Those who edit on a small screen or slow connection are again even more interested in markup size. And of course many readers who use small mobile screens, including me often, will be reading one or another of the "Mobile Web" versions and sometimes the official Wikipedia mobile app or an unofficial one, most of which will present pictures with a smaller thumbnail than the "Desktop version" that the majority of deskbound readers use. So, yeah, all these methods of setting limits ought to be taken into consideration, but markup size is the only one the majority of editors will use until the others are as easily reported as that one. Jim.henderson ( talk) 03:13, 19 July 2012 (UTC)
Originally the only measure for size was the byte count. The words "readable prose" were introduced in 2004 to point out that tables, lists, and markup were not to be included, but the suggested counts were not adjusted to match. [7] Prior to that it was clear that the only count that was used was the byte count - see [8] and [9] to see that this article is 15 kb (that was before you could just click history to find out the current size). In fact the suggested sizes have increased, not decreased, while using a measure that gives a smaller size. This has compounded the problem of pages being too long. Apteva ( talk) 20:06, 19 September 2012 (UTC)
This section is now disputed because changes have been proposed, and because information on a subtopic in an article dedicated to the main topic may be either decent or excessive. To set up a straw poll, you can create a subheading below and add an RFC tag. -- George Ho ( talk) 16:52, 19 July 2012 (UTC)
Marimba is a musical instrument made and played by the Lozi people of the western province of Zambia — Preceding unsigned comment added by 101.119.24.76 ( talk) 01:35, 11 August 2012 (UTC)
Am I using Template:Size correctly, on the massively oversized List of historic places in Quebec? Seeing the equally massive logo it places atop, I'm unsure. Should this go on the Talk page? The template documentation is unclear, at least, to me. Shawn in Montreal ( talk) 15:42, 23 August 2012 (UTC)
{{#ifexpr: <!---1---> {{PAGESIZE:{{FULLPAGENAME}}|R}} >= 102400 |<!---2---> [[File:Ui Yellowexclamation.png| {{#ifeq: {{{big|no}}} | yes | 70px |35px}} link= Template:Longish]]
This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page. |
Archive 1 | ← | Archive 3 | Archive 4 | Archive 5 | Archive 6 |
Wikipedia:Article size → Wikipedia:Manual of Style (article size) — Consolidating naming per Wikipedia_talk:Manual_of_Style#Poll Gnevin ( talk) 16:28, 24 May 2010 (UTC)
It's getting toward 2011 now, and I'm feeling more and more that we need to revisit the technical side of this. At some point WP has to stop catering to broken, obsolete technology like browsers from the 1990s and early 2000s. While "reader fatigue" is a real issue, and WP:SUMMARY provides a way to solve that problem, not all articles are intended to be read from top to bottom, and will become user-unfriendly and downright editor-hateful if split into multiple pages. I have anecdotal but to me rather strong evidence that the technical aspects are essentially an obsolete – a long and very, very linked-to glossary list article, has reported absolutely zero length-related problems in over 4.5 years.
The most obvious cases are list articles of various sorts, including glossaries. People (other than really bored people with way too much time on their hands) do not usually try to read such articles from top to bottom; they load the page and search for the term they are interested in, if a #-link didn't bring them directly to it from another page. Splitting such pages makes in-page search more difficult, and actually frustrates readers' ability to find information. E.g. if I search for "foo" in a long one-page list, I may find a "foo" entry, and/or various mentions of "foo" as applied in several contexts, in various other entries, while if the article is split, I might not find a "foo" entry because it's in another page, and assume there's no information on the topic, and/or I may miss a lot of contextual information about "foo" because I have not realized there is more of it in another article in the split series).
I have put off converting Glossary of cue sports terms, one of the articles I have worked on the most, into a split article for some time (it's been tagged with {{Longish}} for 2.5+ years), because I have yet to see one single case of someone's browser accidentally truncating the page, a user reporting a crash or other technical problem, or any reader suggesting that the document is too long for simple human usability reasons. This despite the article being over 240K, being linked to (usually many times at different entries) in almost all cue sports articles, and being edited nearly two thousand times, by registered and anon users from all over the world, with greatly varying levels of technological currency/obsolescence. I've also resisted splitting because {{Cuegloss}} would have to be redone in a very complex way that would so complicate use that most editors who bother to use it now to create helpful glossary links for non-billiards-expert users would surely abandon it. I can no longer see any good reason for (and do see several good reasons against) splitting this or any similar article, even though until recently I have long been tinkering with test code for splitting the article and adapting the templates that work with it, to comply with this guideline's article length advice.
At any rate, if after 1900+ edits by hundreds of users the page has never been truncated by a browser that can't handle long textedit fields, this strongly suggests that the truncation concern is no longer a valid one in anything near significant numbers of incidents; such browsers are today so rare that the odds of it happening are now so low that it need not be even mentioned here, and if it does happen, it will be obvious and someone will fix it.
I propose that a partial rewrite is also in order to strongly suggest that most types of list articles remain unsplit, either regardless of length, or unless longer than X, where this variable is some number we arrive at that is very much more than the current number, like maybe 1MB. Lists that are easily divided into clear sub-topical sections each with numerous entries could be given as a clear exception, something that perhaps should be split after 100K or so (such as events relating to some topic in the 1700s, 1800s, 1900s, or vehicles manufactured by Ford, BMW, Toyota, etc.). But most list articles, including glossaries, are not divisible logically this way, only arbitrarily, and WP:SUMMARY cannot logically apply to them. They are not intended for start-to-end reading, but for in-page searching. Meanwhile, splitting them not only greatly impedes such searching, it makes creation and use of tools that work with such articles (e.g. Template:Cuegloss) much more difficult.
— SMcCandlish Talk⇒ ʕ(Õلō)ˀ Contribs. 10:03, 17 September 2010 (UTC)
I keep these two as redirects to "Wikipedia talk:Article size" because I had significant trouble finding the article in the first place.-- Jax 0677 ( talk) 12:15, 9 January 2011 (UTC)
I tried following the instructions at Wikipedia:Article size#Measuring "readable prose" size with the Time travel article, but I didn't see any "page size warning" in the preview. Did I miss it, or do you only get such a warning if the size is above a certain limit? Either way I think the specifics of where and when such a warning appears should be mentioned in this section. Hypnosifl ( talk) 04:33, 3 March 2011 (UTC)
The guide appears to be referring to standalone lists - "Lists, tables, and articles summarizing certain fields are exceptions." Though, surely, the advice relating to not splitting certain lists would also apply to embedded lists. If a list is constructed in such a way that splitting it or summarising it would be inappropriate, then it doesn't really matter if the list is a standalone or is embedded, if it shouldn't be split, then it shouldn't be split.
What does "articles summarizing certain fields" refer to?
Possible new wording: Lists, tables, and material summarizing certain fields are exceptions. If there is no "natural" way to split long lists or tables, it may be best to leave them intact. They act as summaries and starting points and in the case of some broad subjects or lists either do not have a natural division point or are more easily word-searched as a single set. This is especially the case when buttressing cites are repeated throughout the list or table. In such cases, the list or table should nonetheless be kept as short as feasible.
Does the paragraph regarding "Major subsections..." belong in the Exceptions: Lists, Tables section? SilkTork * Tea time 18:49, 26 June 2011 (UTC)
As far as your suggestion goes, I don't have an opinion yet. I find this to be one of the most puzzling guidelines on Wikipedia, particularly since it's been labeled an "editing guideline". Good raise 08:29, 27 June 2011 (UTC)
The "How to find articles by size" section says "You can find the size of a page including the markup in kilobytes [kb] from the page history". I have tried this for several long articles (examples [1] [2]) and I do not see anything on the history pages indicating article size. Am I just missing it, or was this feature removed? Does one have to use one of the "external tools" to find an article's size now? Should this guideline be updated? -- IllaZilla ( talk) 17:31, 1 September 2011 (UTC)
On 6 September I introduced the WP:SIZEGUIDE shortcut as a replacement for WP:SIZERULE, on the grounds that the criteria are not a hard-and-fast rule, and because "readable prose size" is all too often mistaken for "article size". It was reverted on 19 September without a refutal of my reason for doing so. Per WP:BRD, I am starting this discussion to see whether that was one editor, or whether there is a wider consensus to continue referring to it as a "rule". — WFC— 09:51, 20 September 2011 (UTC)
In discussions about the articles iOS version history, Android version history, and History of iOS jailbreaking, I've noticed this guideline being trotted out as a reason that, despite failing policy (in the former two, flagrantly and inherently violating WP:NOT#DIRECTORY; in the latter, systemic violations of WP:V throughout the parent article leading to blind application of the guideline), the articles should remain separate if split due to SIZE. This leads to an interesting question: if split articles fall below 32-40KB of readable prose, should they be merged back? And on the point of the latter article, should cleanup be recommended before splitting? Sceptre ( talk) 23:38, 22 October 2011 (UTC)
Include redirect WP:AS in "Shortcuts" box. 71.146.20.62 ( talk) 03:44, 28 November 2011 (UTC)
Images have been discussed here before with regard to the total size of an article. Even though an image may only have a few tens of characters of text in the edit box, the thumbnail of that image will require a few 100 kb of download bandwidth. The TOOLONG guideline should give some indication of the upper limit of acceptable image use. For instance, this version of the article List of American Civil War Generals (Confederate) contains 237 thumbnail images, each requiring about 162 kb in my 1024x768 browser window. This makes for a very unwieldy article of more than 38 Mb! Let's add a paragraph about the TOOLONG problems associated with too much bandwidth taken by images. Binksternet ( talk) 20:25, 31 March 2012 (UTC)
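The arithmetic behind the 38 MB figure above can be checked with a minimal sketch. The two input numbers are the ones quoted in the comment (237 thumbnails at about 162 kB each); the function name and the choice of decimal megabytes (1 MB = 1000 kB) are assumptions made only for illustration.

```python
# Figures quoted in the comment above; they are not measured here.
THUMBNAIL_COUNT = 237
KB_PER_THUMBNAIL = 162  # approximate download size per thumbnail, in kilobytes


def total_image_download_kb(count: int, kb_each: float) -> float:
    """Total image payload in kilobytes for `count` thumbnails of `kb_each` kB."""
    return count * kb_each


total_kb = total_image_download_kb(THUMBNAIL_COUNT, KB_PER_THUMBNAIL)
total_mb = total_kb / 1000  # assumption: decimal megabytes (1 MB = 1000 kB)

print(f"{total_kb:.0f} kB is about {total_mb:.1f} MB")
```

This only covers the images; the HTML, CSS, and scripts of the page would add to the total.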
Apologies if this is the wrong venue to ask such a question, if somewhere in meta might be better to ask, but, sometimes, for very long pages, like the RfC for Mohammad images, or some old archives, it significantly slows down my browser when viewing them. If only there was some kind of feature that could allow me to set it so that when viewing pages or page histories, it would automatically break off the page at a user-specified amount (like, break off to "page 2" or "page 3" etc. if the next section makes the page exceed 200kb). This is especially a problem in very old archive pages where it would be inconvenient to break apart a page despite being long. Does such a feature exist?-- New questions? 18:21, 10 April 2012 (UTC)
I notice the article does much to address questions concerning antique (turn of the century) computers and browsers, but doesn't mention something new, namely Help:Mobile access. I do nearly all my editing on five-year-old hardware with a nice big screen and DSL connection, but much of my reading is away from home, on my palm sized smartphone or my hand sized Android tablet. These automatically go to the .m. mobile page which shows only the lead and the top section titles until I tap the title.
Alas, such accommodation becomes inadequate when the article is long. Either the list of sections is too long, or it inadequately guides me to the desired information, or each section upon opening overwhelms my ability to understand it on the little screen, or all the above. And where there's no Wi-Fi and 3g coverage is poor, it takes a long time to load the page. Surely I'm not unusual among readers in facing these problems, and the number of affected users will only increase with the popularity of smartphones and smaller tablet computers (even with the relatively large iPad it's somewhat a problem). Do we need a new section? Jim.henderson ( talk) 17:42, 27 April 2012 (UTC)
I have raised a query at Wikipedia:Village_pump_(policy)#Splitting_articles_arbitrarily about the bit in WP:SIZE#Very large articles where it says very large articles may be split arbitrarily. I think this is okay for very large lists but not articles and see no sense in this end run round notability. I believe if an article is large enough to require splitting there will always be subtopics which satisfy notability. Dmcq ( talk) 00:52, 24 May 2012 (UTC)
The first issue here is that Dmcq is arguing against 5 years of consensus on the verbatim phrasing "If necessary, split the article arbitrarily", and that this text should be preserved on that ground alone. Many many editors have read that sentence and understood that "arbitrarily" means, not "randomly" or "at an arbitrary point", but basically "according to some clear local-consensus method". Therefore there is no need to change the wording now.
The second issue is that Dmcq has affirmed the POV that the many many subarticles split according to some clear local-consensus method are actually "notable", such as Later life of Isaac Newton (arbitrarily starting 1693), House and Senate career of John McCain, until 2000 (arbitrarily 1981-2000), and Cultural impact of the Guitar Hero series (arbitrary subset of notable topic). If more examples are needed I can oblige. Another POV is that these are not notable but widely accepted because they are spinouts. Given Dmcq's POV, if the word "arbitrarily" is understood as it has been for 5 years, there is no reason to change it, because the articles will by the POV's definition be "notable"; and given the alternate (stricter) POV of notability, the word "arbitrarily" is very necessary to permit articles that fail this strict N standard.
Now, as to Alan's wording (assuming we remove the stray "them" in the last line), the last sentence seems to be chopped up for no reason; the rest assumes that it is always possible to split the content into logically separate articles. Dmcq has rejected all counterexamples, but they at least prove that it is at least sometimes colorable that the content is not split logically and notably. I don't think this assumption necessary in case there should arise a consensus that there is no notable way to split a very long article.
Further, Dmcq has opened the same discussion on two talk pages for some reason; I have invited the VPP to centralize here.
I think the whole problem is that Dmcq is reading "arbitrarily" as "at an arbitrary point", which is a novel or original reading of the guideline. There is no evidence the guideline needs adjustment. JJB 02:58, 24 May 2012 (UTC)
Alan, sorry I didn't realize that was your wording. Longevity of a guideline is a silent consensus that it works for many editors. Yes, the word "arbitrarily" could be clarified based on your concerns, but the assumption that it's always possible to split logically should not be suddenly added to the text. Possible proposed change: For instance, the meaning of the sentence could be clarified as, "If this is not possible, split the article according to local consensus." But the rest should stand for the reasons above.
Dmcq, the discussion I previously linked shows that there is no supermajority consensus on notability of spinoffs, so this will not be decided at VPP today. The fact is that we have many nonnotable spinoffs, and they often survive AFD (or more commonly are never nommed). The stated rationales vary (one reason for unclear consensus): sometimes it's SNG shoehorning, sometimes it's a relatively loose N affirmation like your own, sometimes it's recognized as a spinoff, sometimes it's recognized as a pointy nom that would imbalance a set, sometimes it's a merge that affirms the spinoff principle. Since you seem to define "notable" as including most of the adhoc local-consensus splits as well as allowing the various nonnotable large-list splits (although that would include the list of poker events), I really don't know that there is an issue for VPP besides your finding the word "arbitrarily" to be ambiguous. JJB 04:36, 24 May 2012 (UTC)
There may or may not be technical reasons for splitting an article (not all access is via machines with large memories, virtual or otherwise), but there are definitely economic and other reasons for doing so. Many people have to pay for every byte they download, either because they live in a country where the telecoms use that model for charging for broadband access, or because their mobile service provider charges that way for mobile handheld devices (phones, tablets etc). There is also the case when a person accessing the net is connected via a free wifi service with a data limit on downloading (e.g. at a library). -- PBS ( talk) 11:36, 31 May 2012 (UTC)
I'll try just deleting the whole technical issues section. The 400k limit in it is far larger than the recommended size limits anyway. We could add a bit in the size section about larger sizes causing problems with slow connections as well as causing readability problems but that's about it I think. Dmcq ( talk) 22:53, 7 June 2012 (UTC)
We could bring it back, replacing the tersely ambiguous "arbitrary" with something clumsier and more precise such as "regardless of other considerations" and the overly precise "400K" with something longer and vaguer such as "hundreds of kilobytes". Jim.henderson ( talk) 13:40, 10 June 2012 (UTC)
You are invited to join the discussion at Wikipedia talk:Summary style#RfC: Should the summary style guideline quote WP:Notability and if so in what place.
This RfC is to decide the specific changes discussed at Wikipedia:VPP#Splitting_articles_arbitrarily. This may affect the notability of subarticles and is related to the RfC above. Dmcq ( talk) 19:09, 1 June 2012 (UTC)
I would like to start a discussion about how to split the Biggest Loser South Africa article. In my opinion, it should either be a minimum number of tables, or should be split into several smaller articles, as 600 kB is ridiculous. Thoughts?-- Jax 0677 ( talk) 18:51, 10 June 2012 (UTC)
{{citation needed}}. Dmcq ( talk) 20:45, 10 June 2012 (UTC)
With recent events, such as deletion of Ashton Kutcher on Twitter, Personal life of Jennifer Lopez, and Rihanna on Twitter, bad splitting (and awful transclusion) of List of Codename: Kids Next Door episodes, and bad use of Template:very long, is there something generally wrong with this guideline? Is it consistent with other policies and guidelines? -- George Ho ( talk) 00:00, 4 July 2012 (UTC)
As Dmcq said, I must respond. Writing about one topic can result in a big article. Nevertheless, writing about a subtopic must be consistent with applicable policies and guidelines; otherwise, a subtopic article may be at risk of deletion, like Ashton Kutcher on Twitter. I'll rephrase the "consistency" part: Does this guideline have to mention WP:What Wikipedia is not when it comes to articles on topics and subtopics? Why can't this guideline mention any other policies and guidelines? -- George Ho ( talk) 14:24, 4 July 2012 (UTC)
I've changed the rule of thumb to refer to the markup size as given in the history. It is pretty obvious that people have used this normally and mean this. There would be no point talking about limits on sortable tables otherwise as readable prose doesn't include tables according to the bit at the beginning. Dmcq ( talk) 14:07, 4 July 2012 (UTC)
The overwhelming consensus here is to go with readable prose size. The Blade of the Northern Lights ( 話して下さい) 21:22, 4 August 2012 (UTC)
Should WP:SIZERULE, a rule of thumb guideline for saying an article is too large, refer to the 'markup size' of an article given by the size in its history - the number of bytes downloaded when an edit is done, or should it refer to the 'readable prose size' given by the text excluding any footnotes and reference sections ("see also", "external links", bibliography, etc.); diagrams and images; tables and lists; Wikilinks and external URLs; and formatting and mark-up as given by the script User:Dr_pda/prosesize? There is a discussion above in Wikipedia talk:Article size#Rule of thumb. Dmcq ( talk) 14:52, 4 July 2012 (UTC)
Is this the right way to describe articles, or must this nutshell be reworded? -- George Ho ( talk) 03:25, 6 July 2012 (UTC)
How is one supposed to calculate the readable prose size? Can the method be included in the article somewhere? Op47 ( talk) 13:00, 6 July 2012 (UTC)
The major thing one should do if things are too long is see if large subtopics can be cut out as detailed in WP:SS. If there are no major subtopics and the article is very long then it is worth wondering whether there is sufficient weight for the inclusion of some of the contents. Personally I haven't seen any very long articles not filled with trivia or where large bits shouldn't be cut out into notable subtopics. The only ones that aren't like that are list articles which have special provisions for splitting.
I don't think the prose text argument for the rule of thumb holds any weight, because WP:Article size#Readability issues says 50k is a limit and yet the rule of thumb is being taken as meaning there are no such issues till the prose text goes over 100k. People should be thinking about size before getting to 50k and should have it definitely in mind over 50k; 100k is a very negative aspect of a featured article.
Wikitext size tends to be about twice the size of prose text for featured articles though it can be more if citations have a lot of extra text in them, and it can be much more if there are tables since prose text ignores table sizes. The advantage of wikitext is that it is directly supported by Special:Longpages and the article history whereas prose text is not well supported. Thus if it is a reasonable measure it is a better measure as a rule of thumb. As to the current rule of thumb if 100k was interpreted as wikitext it would put prose size at about 50k before one should definitely consider splitting which is about right by the readability issues section.
The other point in wikitext's favour is a consideration of a major problem in Wikipedia. People are dumping databases of sports results, election results, manga characters, etc. into Wikipedia. These articles come out as only a few hundred bytes as far as prose text is concerned but often involve downloading over a megabyte and quite long times to display, even ignoring mobile browsing. A rule of thumb is supposed to be something easy. If featured article people want to polish an article, that is another business and they can take time over it, but dealing with the great mass of rubbish being stuck into Wikipedia requires more everyday tools.
For straightforward articles which are okay, wikitext is as good as prose text as a rule of thumb, and for everyday use in talking about articles which are too large because they consist of huge tables, prose text is simply useless. As to lists, they have their own rules and can be split fairly arbitrarily, but their rules stop them being misused as database dumps in quite the same way as normal articles. Dmcq ( talk) 15:26, 7 July 2012 (UTC)
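The relationship described above (wikitext roughly twice the prose size for featured articles, checked against the 50k readability figure) can be written down as a rough sketch. The ratio of 2 and the 50k limit are the figures given in the comment; the function names are hypothetical, and the estimate is an assumption that would not hold for table-heavy pages, where prose size ignores the tables.

```python
# Rough figures taken from the comment above, not from any tool.
WIKITEXT_TO_PROSE_RATIO = 2.0  # wikitext is "about twice" prose for featured articles
READABILITY_LIMIT_KB = 50.0    # limit attributed above to the readability issues section


def estimated_prose_kb(wikitext_kb: float, ratio: float = WIKITEXT_TO_PROSE_RATIO) -> float:
    """Estimate readable prose size from wikitext size using a fixed ratio."""
    return wikitext_kb / ratio


def over_readability_limit(wikitext_kb: float) -> bool:
    """True if the estimated prose size exceeds the readability limit."""
    return estimated_prose_kb(wikitext_kb) > READABILITY_LIMIT_KB


for size in (60, 100, 120, 250):
    print(size, estimated_prose_kb(size), over_readability_limit(size))
```

Under these assumptions a page with 100k of wikitext sits right at the 50k prose figure, which is the point made in the comment.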
By the way I have also just set up WP:VPT#Section viewing to start thinking about coping better with long articles on mobile devices. Probably somebody else has been at this sort of thing before but getting changes into the wiki software isn't that easy. Dmcq ( talk) 15:44, 7 July 2012 (UTC)
Ok, so I did some research, I read this article: Loading today's sites over dialup about load times.
I also played around with this web analyser tool: [5], I pointed it at a few article pages, the main page, and a couple of featured articles.
It looks like, roughly, an article that is about 1 megabyte of load size takes about 2-2.5 minutes on a 56K modem. The main page is about 80K and takes about 20 seconds. Yesterday's Calgary Stampede FA was 222K and would load in about 51 seconds, and the rocket page is 500K, 1:52 on a 56K modem (note that long articles tend to load a bit quicker than you would expect because of queue latency at the webserver that hurts short articles more).
I don't have any reason to think that those articles are particularly large, but I tentatively suggest we write down 1 megabyte as the maximum size, and have a rule of thumb that it's all good up to 250K (a load time of under a minute).
To put this in perspective, according to that article the average page size on the wider web is a bit over 1M and the load time on a 56K modem is about 2 minutes 30, so although 2 minutes is a long load, it's still better than average.
I mean in an ideal world I would prefer everything to load in 5 seconds on 56K modem, but that's not going to happen, I don't think people want an encyclopedia web page that looks like it's 1993. So we have to be a bit reasonable.
webpage load size | Assessment |
---|---|
<250K | good |
250K-500K | acceptable |
500K-1M | consider shrinking or splitting |
>1M | should be split |
So I would suggest that we add that, in addition to the rule of thumb on prose length. Does that sound OK? GliderMaven ( talk) 16:12, 7 July 2012 (UTC)
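As an illustration only, the proposed tiers can be expressed as a small function, together with the effective dial-up throughput implied by the load times quoted above. The function names are hypothetical, K is assumed to mean 1000 bytes, and the treatment of the exact boundaries (250K, 500K, 1M) is an assumption, since the table does not say which tier they fall in.

```python
def classify_load_size(total_bytes: int) -> str:
    """Map a total page-load size to the tiers proposed in the table above.

    Assumptions: K = 1000 bytes, and a size exactly on a boundary falls
    into the higher tier.
    """
    kb = total_bytes / 1000
    if kb < 250:
        return "good"
    if kb < 500:
        return "acceptable"
    if kb < 1000:
        return "consider shrinking or splitting"
    return "should be split"


def implied_throughput(total_bytes: int, seconds: float) -> float:
    """Effective bytes per second implied by a quoted size and load time."""
    return total_bytes / seconds


# Sizes and 56K load times quoted in this thread.
print(classify_load_size(222714), implied_throughput(222714, 51))
print(classify_load_size(500598), implied_throughput(500598, 112))
```

Both quoted pages imply an effective rate of a little over 4 kB per second. With K read as 1024 bytes instead, the 500598-byte page would land just under the 500K boundary, so the unit convention would need to be stated if tiers like these were adopted.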
wikitext size | What to do |
---|---|
< 30k | normally too small to split |
< 60k | good readability |
60k - 120k | acceptable but splitting may be helpful |
120k - 250k | can have readability issues, consider shrinking or splitting |
> 250k | almost certainly should be split |
URL: http://en.wikipedia.org/wiki/Calgary_Stampede
Title: Calgary Stampede - Wikipedia, the free encyclopedia
Date: Report run on Sun Jul 8 09:07:20 EDT 2012
Diagnosis
Global Statistics
Total HTTP Requests: 33
Total Size: 222714 bytes
Object Size Totals
Object type | Size (bytes) | Download @ 56K (seconds) | Download @ T1 (seconds) |
---|---|---|---|
HTML | 52324 | 10.63 | 0.48 |
HTML Images | 124925 | 29.50 | 5.26 |
CSS Images | 15941 | 3.78 | 0.68 |
Total Images | 140866 | 33.28 | 5.94 |
Javascript | 23997 | 5.78 | 1.13 |
CSS | 5527 | 1.30 | 0.23 |
Multimedia | 0 | 0.00 | 0.00 |
Other | 0 | 0.00 | 0.00 |
URL: http://en.wikipedia.org/wiki/Rocket
Title: Rocket - Wikipedia, the free encyclopedia
Date: Report run on Sun Jul 8 09:30:45 EDT 2012
Diagnosis
Global Statistics
Total HTTP Requests: 79
Total Size: 500598 bytes
Object Size Totals
Object type | Size (bytes) | Download @ 56K (seconds) | Download @ T1 (seconds) |
---|---|---|---|
HTML | 81986 | 16.54 | 0.63 |
HTML Images | 367824 | 86.91 | 15.55 |
CSS Images | 15941 | 3.78 | 0.68 |
Total Images | 383765 | 90.69 | 16.23 |
Javascript | 29320 | 7.04 | 1.36 |
CSS | 5527 | 1.30 | 0.23 |
Multimedia | 0 | 0.00 | 0.00 |
Other | 0 | 0.00 | 0.00 |
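A quick consistency check on the two reports above: the sketch below adds up the per-type sizes and 56K download times and compares them with each report's stated total. All numbers are copied from the reports; the data layout and the `summarize` helper are hypothetical. "Total Images" is used in place of the separate HTML and CSS image rows to avoid counting images twice.

```python
# Values copied from the two reports above: name -> (bytes, seconds at 56K).
REPORTS = {
    "Calgary Stampede": {
        "total_bytes": 222714,
        "components": {
            "HTML": (52324, 10.63),
            "Total Images": (140866, 33.28),
            "Javascript": (23997, 5.78),
            "CSS": (5527, 1.30),
        },
    },
    "Rocket": {
        "total_bytes": 500598,
        "components": {
            "HTML": (81986, 16.54),
            "Total Images": (383765, 90.69),
            "Javascript": (29320, 7.04),
            "CSS": (5527, 1.30),
        },
    },
}


def summarize(report: dict) -> tuple[int, float]:
    """Return (summed bytes, summed 56K seconds) over a report's components."""
    parts = list(report["components"].values())
    total_bytes = sum(size for size, _ in parts)
    total_seconds = round(sum(secs for _, secs in parts), 2)
    return total_bytes, total_seconds


for name, report in REPORTS.items():
    size, seconds = summarize(report)
    print(name, size, size == report["total_bytes"], seconds)
```

The summed sizes match the stated totals in both reports. The summed 56K times come to about 51 seconds for Calgary Stampede and about 115.6 seconds for Rocket, the latter a few seconds more than the 1:52 mentioned earlier in the thread.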
Note that the example given in a previous thread, Wikipedia talk:Article size#Images as part of the total download of an article to a browser, is 38MB (!!) worth of page load because of the hundreds of thumbnail images. Pages like that must be cut down by taking away the images or by splitting. Our guideline must recommend an upper limit for that kind of silliness. 1MB seems reasonable. Binksternet ( talk) 01:19, 9 July 2012 (UTC)
Seems to me all three major ways of measuring size are relevant to different users. The majority of users are presumably readers who found their way via Web search and are ignorant of the article's topic. For those who use a desktop or large laptop screen and fast wired connection, quick comprehensibility is the main design consideration, which makes readable prose size the proper measure. For those using their mobile phone screen as I often do when merely reading, or a small tablet, small prose size is even more important for avoiding getting lost in an article with sections either too large or too numerous, and download size also becomes important for those of us with slow mobile radio connections.
Editors ignorant of the fine points discussed here, who will remain the majority of editors for an indefinite period, only know markup size, because that's what's in the watchlist entry. Those who edit on a small screen or slow connection are again even more interested in markup size. And of course many readers who use small mobile screens, including me often, will be reading one or another of the "Mobile Web" versions and sometimes the official Wikipedia mobile app or an unofficial one, most of which will present pictures with a smaller thumbnail than the "Desktop version" that the majority of deskbound readers use. So, yeah, all these methods of setting limits ought to be taken into consideration, but markup size is the only one the majority of editors will use until the others are as easily reported as that one. Jim.henderson ( talk) 03:13, 19 July 2012 (UTC)
Originally the only measure for size was the byte count. The words "readable prose" were introduced in 2004 to point out that tables, lists, and markup were not to be included, but the suggested counts involved were not adjusted. [7] Prior to that it was clear that the only count that was used was the byte count - see [8] and [9] to see that this article is 15 kb (that was before you could just click history to find out the current size). In fact the suggested sizes have increased, not decreased, while using a measure that gives a smaller size. This has compounded the problem of pages being too long. Apteva ( talk) 20:06, 19 September 2012 (UTC)
This section is now disputed because changes have been proposed and because the amount of subtopic information in an article dedicated to a main topic may be either adequate or excessive. To set up a straw poll, you can create a subheading below and add an RFC tag. -- George Ho ( talk) 16:52, 19 July 2012 (UTC)
Marimba is a musical instrument made and played by the Lozi people of the western province of Zambia — Preceding unsigned comment added by 101.119.24.76 ( talk) 01:35, 11 August 2012 (UTC)
Am I using Template:Size correctly, on the massively oversized List of historic places in Quebec? Seeing the equally massive logo it places atop, I'm unsure. Should this go on the Talk page? The template documentation is unclear, at least, to me. Shawn in Montreal ( talk) 15:42, 23 August 2012 (UTC)
{{#ifexpr: <!---1---> {{PAGESIZE:{{FULLPAGENAME}}|R}} >= 102400 |<!---2---> [[File:Ui Yellowexclamation.png| {{#ifeq: {{{big|no}}} | yes | 70px |35px}} link= Template:Longish]]