This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
I think it would be cool if we could see 1) which entries weren't included in the previous cycle 2) how many 10 day cycles they've been on the top 5000 and 3) how much they changed from the previous cycle (up or down). Biosthmors ( talk) 14:41, 3 December 2012 (UTC)
Could we get a line that says how many days each count includes? I'm guessing it is 10 plus or minus. Thanks! Biosthmors ( talk) 23:48, 14 December 2012 (UTC)
I was wondering if we could get a gold star by the articles on this list that are featured (and perhaps good ones too?). (I posted on a related subject at WT:FA.) It would also be great to see month-by-month statistics to identify trends in these counts. Maybe even a count of how many peer reviews each article in this list has had. Thanks. Biosthmors ( talk) 20:44, 2 December 2012 (UTC)
Hi again Biosthmors. Due to rapidly evolving real life considerations, this isn't a feature I'll be able to integrate any time in the immediate future. Fortunately, however, it isn't one that really needs to be addressed with serious back-end processing. Notice that the Mediawiki parser does most of the work for us when coloring links red/blue. This is annotated in the source code by the presence of "redlink=1", and one would just need to count those instances. Surely the "find" functionality of some text editors would make immediate work of such a sum. Alternatively, I wonder if this is something that could be handled with template coding (not my expertise)? Thanks, West.andrew.g ( talk) 05:02, 22 December 2012 (UTC)
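The counting approach described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the rendered HTML of the report page is already available as a string; the function name `count_redlinks` and the sample markup are made up for the example and are not part of any existing tool.

```python
def count_redlinks(page_html: str) -> int:
    """Count links to non-existent pages in rendered HTML.

    Assumption: each red link carries "redlink=1" in its href,
    as noted in the comment above.
    """
    return page_html.count("redlink=1")


# Hypothetical sample: one ordinary (blue) link and one red link.
sample_html = (
    '<a href="/wiki/Computer_virus">Computer virus</a> '
    '<a href="/w/index.php?title=Kevin_Gates&amp;action=edit&amp;redlink=1" '
    'class="new">Kevin Gates</a>'
)

print(count_redlinks(sample_html))
```

A plain substring count is the same thing a text editor's "find" would report; it counts link occurrences, so a title linked twice would be counted twice.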
Andrew, when you get a chance, could you add a counter for featured lists? There are at least 100 hits for "List of" in the list. It would be great to keep track of this too. Best. Biosthmors ( talk) 21:01, 15 January 2013 (UTC)
Could you also add a counter for the sum of the 5000 hit count, when you get a chance? Thanks. Biosthmors ( talk) 21:01, 15 January 2013 (UTC)
I showed your beta list to some admins at the London Wikipedia meetup and they say that if you could find a way to adjust your algorithm to get rid of the "undefined" and multiple main page listings, it could be incorporated into the site. :) Serendi pod ous 20:33, 15 January 2013 (UTC)
As of your writing, the version which is displayed covers the week of Jan. 6--12. In the next several minutes/hours a new version will appear that will cover Jan. 13--19. The last day included is always a Saturday. Reports are generated Sunday morning (EST, in case there is any time zone confusion between us). The new report is currently in the processing pipeline. Hopefully this also answers your question regarding the specific article count? Thanks, West.andrew.g ( talk) 05:55, 20 January 2013 (UTC)
We will be drafting this at User:West.andrew.g/Popular_pages/Signpost. Thanks, West.andrew.g ( talk) 17:58, 24 January 2013 (UTC)
> grep "en Computer_virus" *
pagecounts-20130117-000000:en Computer_virus 134
pagecounts-20130117-010000:en Computer_virus 167
pagecounts-20130117-020000:en Computer_virus 190
pagecounts-20130117-030001:en Computer_virus 171
pagecounts-20130117-040000:en Computer_virus 667
pagecounts-20130117-050000:en Computer_virus 603
pagecounts-20130117-060000:en Computer_virus 302
pagecounts-20130117-070000:en Computer_virus 2630
pagecounts-20130117-080000:en Computer_virus 612
pagecounts-20130117-090000:en Computer_virus 1263
pagecounts-20130117-100000:en Computer_virus 2548
pagecounts-20130117-110000:en Computer_virus 965
pagecounts-20130117-120000:en Computer_virus 478
pagecounts-20130117-130001:en Computer_virus 820
pagecounts-20130117-140000:en Computer_virus 2518
pagecounts-20130117-150000:en Computer_virus 1135
pagecounts-20130117-160000:en Computer_virus 775
pagecounts-20130117-170000:en Computer_virus 381
pagecounts-20130117-180000:en Computer_virus 474
pagecounts-20130117-190000:en Computer_virus 295
pagecounts-20130117-200001:en Computer_virus 325
pagecounts-20130117-210000:en Computer_virus 284
pagecounts-20130117-220000:en Computer_virus 259
pagecounts-20130117-230000:en Computer_virus 281
Interesting. I see that the other tool gives the same results regardless of capitalization, so it presumably sums the results for all the redirects that differ only in capitalization. (This is not the case when there are redirects to or from accented versions of names; in that case they are counted separately and so need to be manually aggregated.) LittleBen ( talk) 09:34, 25 January 2013 (UTC)
I'll do this one more time just to make it painfully clear. That tool is broken in some way:
grep "en Comparison_of_Android_devices" *
pagecounts-20130121-000000:en Comparison_of_Android_devices 41
pagecounts-20130121-000000:en Comparison_of_Android_devices%091055702 1
pagecounts-20130121-010000:en Comparison_of_Android_devices 2144
pagecounts-20130121-020000:en Comparison_of_Android_devices 82
pagecounts-20130121-030000:en Comparison_of_Android_devices 55
pagecounts-20130121-030000:en Comparison_of_Android_devices%23Smartphones 1
pagecounts-20130121-030000:en Comparison_of_Android_devices%23Tablet_computers 1
pagecounts-20130121-040000:en Comparison_of_Android_devices 2151
pagecounts-20130121-050001:en Comparison_of_Android_devices 113
pagecounts-20130121-060000:en Comparison_of_Android_devices 40
pagecounts-20130121-070000:en Comparison_of_Android_devices 2011
pagecounts-20130121-080000:en Comparison_of_Android_devices 94
pagecounts-20130121-090000:en Comparison_of_Android_devices 61
pagecounts-20130121-100000:en Comparison_of_Android_devices 2561
pagecounts-20130121-110000:en Comparison_of_Android_devices 105
pagecounts-20130121-130000:en Comparison_of_Android_devices 2514
pagecounts-20130121-140000:en Comparison_of_Android_devices 108
pagecounts-20130121-150000:en Comparison_of_Android_devices 77
pagecounts-20130121-160000:en Comparison_of_Android_devices 2134
pagecounts-20130121-170001:en Comparison_of_Android_devices 157
pagecounts-20130121-180000:en Comparison_of_Android_devices 92
pagecounts-20130121-190000:en Comparison_of_Android_devices 2350
pagecounts-20130121-200000:en Comparison_of_Android_devices 112
pagecounts-20130121-210000:en Comparison_of_Android_devices 77
pagecounts-20130121-210000:en Comparison_of_Android_devices& 1
pagecounts-20130121-220000:en Comparison_of_Android_devices 2418
pagecounts-20130121-230000:en Comparison_of_Android_devices 127
Once again, the sum comes nowhere near the total. Milowent already debunked the redirection theory. If anything, I would pay attention to the "Comparison_of_Android_devices%091055702" line. Someone tried to use this section link (or whatever it is) to access the page. It would seem Henrik's tool mis-parses this line, treating "%09" as a character code and then taking the remainder of the number (~1 million) to be the number of article views. This is incorrect; that section/item only got 1 view. Matter closed. Take this up with the other tool maintainers. Thanks, West.andrew.g ( talk) 01:10, 28 January 2013 (UTC)
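For anyone who wants to reproduce the check, here is a rough Python sketch of both points: summing the hourly rows that match the title exactly, and what could happen if a parser percent-decodes a row before splitting it into fields. The function names are made up for illustration, the row layout is the three-field form shown in the grep output above, and `decode_then_split` only mimics the mis-parse hypothesized in this thread; it is not taken from the other tool's code.

```python
from urllib.parse import unquote


def sum_exact_title(grep_lines, project, title):
    """Sum hourly counts for rows whose title matches exactly."""
    total = 0
    for line in grep_lines:
        # Drop the "pagecounts-YYYYMMDD-HHMMSS:" prefix that grep adds.
        _, _, record = line.partition(":")
        fields = record.split()
        if len(fields) != 3:
            continue  # skip anything not shaped like "project title count"
        row_project, row_title, row_count = fields
        if row_project == project and row_title == title:
            total += int(row_count)
    return total


def decode_then_split(record):
    """Hypothetical faulty parse: percent-decode BEFORE splitting fields."""
    return unquote(record).split()


# A few rows copied from the grep output above.
sample = [
    "pagecounts-20130121-000000:en Comparison_of_Android_devices 41",
    "pagecounts-20130121-000000:en Comparison_of_Android_devices%091055702 1",
    "pagecounts-20130121-010000:en Comparison_of_Android_devices 2144",
    "pagecounts-20130121-030000:en Comparison_of_Android_devices%23Smartphones 1",
]

print(sum_exact_title(sample, "en", "Comparison_of_Android_devices"))
print(decode_then_split("en Comparison_of_Android_devices%091055702 1"))
```

Because "%09" decodes to a tab, decoding first turns the odd row into four whitespace-separated fields, and a parser that reads the third field as the count would see 1055702 instead of 1. Splitting the raw row first and comparing the undecoded title avoids that.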
I mentioned your good work here; hope you don't mind. LittleBen ( talk) 01:43, 29 January 2013 (UTC)
Wikipedia:Village_pump_(technical)#Format_Change_of_Page_View_Stats -- Why things change around here with no notification is incredibly frustrating. Regardless, I caught and fixed the error last night, and this week's fresh report is now in the pipeline. Thanks, West.andrew.g ( talk) 18:08, 3 February 2013 (UTC)
The biggest hour for this year's Super Bowl entertainment was 02:00 UTC on Feb. 4. Note that the raw statistics do not resolve redirects, so we have: Beyoncé Knowles at 378,923, Beyonce at 12,877, and Beyonce_knowles at 4,149. This means that during that hour, the article was averaging 100-110 hits per second. These redirect cases also suggest something about traffic sources. It's safe to assume that no one would actually accent the "e" in Beyoncé Knowles when doing a casual search. Thus, those arriving directly at that page (i.e., the vast majority of visitors) likely came via Google or another search engine that handles the redirect logic. Meanwhile, Beyonce and Beyonce_knowles are more plausibly direct Mediawiki searches. Thanks, West.andrew.g ( talk) 17:25, 4 February 2013 (UTC)
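The per-second figure can be checked with a quick calculation. This is just the arithmetic from the comment above written out in Python; the variable names are made up, and the three counts are the ones quoted for that hour.

```python
SECONDS_PER_HOUR = 3600

# Hourly counts quoted above; redirects are listed separately in the raw data.
hourly_views = {
    "Beyoncé Knowles": 378923,
    "Beyonce": 12877,
    "Beyonce_knowles": 4149,
}

# Aggregate the article and its redirects by hand, since the raw
# statistics do not resolve redirects.
total_views = sum(hourly_views.values())
hits_per_second = total_views / SECONDS_PER_HOUR

print(total_views)
print(round(hits_per_second, 1))
```

The main title alone works out to roughly 105 hits per second, and adding the two redirects brings it to about 110, which matches the 100-110 range given.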
I would like to know (along the lines of this) if there is a way to pick out which articles belong to which WikiProject. Specifically, I am interested in finding out which WP:ANAT articles are listed here. Any ideas? I thought I'd post here before taking it to WP:VPT. Thanks. Biosthmors ( talk) 19:06, 6 February 2013 (UTC)
So why is the undefined disambiguation drawing such a massive number of hits? And where are they being dabbed to? The articles linked from that dab don't seem to be drawing much traffic. Doesn't make any sense. Wbm1058 ( talk) 23:57, 7 February 2013 (UTC)
Just wanted to say thanks for making this list -- it's really interesting & useful. -- phoebe / ( talk to me) 03:55, 8 February 2013 (UTC)
It would be useful to add the following:
Wow, one of the top 5000 articles (#4999) is one that doesn't even exist, but I did a google news search and he's a rapper. Wow. Looks like we aren't "serving our customers" as well as we could. I assume Kevin Gates is notable. We need to track more of these "most desired" not existent articles. Can we? We have to adapt to reader demand to stay relevant. Biosthmors ( talk) 21:49, 10 February 2013 (UTC)
I just created it. He had been getting hundreds of hits daily until a recent spike, I guess in line with a recent release of his. http://stats.grok.se/en/latest/Kevin_Gates How can we find these sooner? Biosthmors ( talk) 22:00, 10 February 2013 (UTC)
Maybe we can get a top-500 list for the most popular non-existent pages? Biosthmors ( talk) 22:01, 10 February 2013 (UTC)
As a matter of personal policy, I don't get too involved in content disputes (nor am I terribly familiar with the subtleties of the processes). However, I'm curious if you could push back on the deletion process using our statistical evidence? I mean, don't thousands of attempted views somehow imply "notability"? Thanks, West.andrew.g ( talk) 05:40, 11 February 2013 (UTC)
In the future, I think it would be helpful to also add page ratings (A-class, B-class, etc.) to the chart as well, but I don't know how difficult the coding for this would be. Remember ( talk) 14:48, 8 February 2013 (UTC)
I know how wishlists can grow, and I reserve the right to delay/ignore accordingly :-). This is a pretty large bandwidth penalty (I need to actually go and parse 5000 articles that are several kb in size... but it may be do-able). You'll need to list any/all templates that would be of use and how they might be represented in the list (I am thinking a dedicated column might be needed if we keep adding stuff). Could you also help with some of my above questions regarding "A/B/C" class and how I would go about doing that? Thanks, West.andrew.g ( talk) 20:19, 8 February 2013 (UTC)
Okay, so the report will soon notate the A/B/C/stub/start/unassessed classifications. Could someone please update the lede/header section so it has the description for all of this? Once you do so, report here, and revert it back to normal -- and then I will restore your version once I actually run the report. Note that a page can have multiple icons next to its name, e.g., "A and B class". An icon simply means that "1 or more projects have classified this article at level "x"".
I am going to temporarily hold off on Biosthmors' suggestion. If you continue to want it, Biosthmors, you need to come up with a list of category memberships of interest (it must be categories, even if hidden ones, as I don't want to obtain and parse actual page content. Note that virtually all clean-up and maintenance templates implicitly add some form of category). You also need to come up with an icon or other terse system for representing membership. We could imagine a second version of the top-5000 with tons of clean-up details -- and that is fine -- but I am quite busy and need you to handle all the non-technical portions. Thanks, West.andrew.g ( talk) 01:36, 10 February 2013 (UTC)
New WP:TOP25 list should be up late evening (Pacific time) when I get back to the Internet. One odd highly-placed entry I can't explain at the moment is Ernst Litfaß - anybody have any ideas about that one? Mary Leakey, subject of a Google Doodle, will be #1, with a whopping 2.7 million views. And Illuminati will be an amusing entry, with popularity fueled by the ridiculous tabloid claim that the fabled group was related to the Super Bowl power outage and Beyonce.-- Milowent • has spoken 14:09, 10 February 2013 (UTC)
Martin Rycak, who was posting last-24 hour trend charts on the German Wikipedia for a few months I believe, has expanded to wikipedia.trending.eu/en/index.html, which covers 11 different language wikipedias, allowing you to see the most popular articles on those wikipedias over the past day (and smaller slices). It also has a twitter feed @trending_eu.-- Milowent • has spoken 06:35, 11 February 2013 (UTC)
Due to work travel, I will likely not be able to get the next WP:TOP25 report live until Wednesday morning UTC. If someone wants to help do it, I am welcoming volunteers! Essentially it's done manually at this point: I create the new version using the latest version as my template, and I move the old version to its archive URL.-- Milowent • has spoken 14:49, 14 February 2013 (UTC)
Fixed. There were basically two entries for "Valentine's Day" in the raw statistics; one where the apostrophe was ASCII encoded and a second where it was not (and why the latter escaped the encoding is a bit puzzling). Regardless, the far less popular latter case was overwriting the former when it was encountered in processing. The software now accommodates the possibility of a title appearing twice by summing the views. An updated report will appear in a couple of minutes. Thanks, West.andrew.g ( talk) 13:00, 21 February 2013 (UTC)
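The fix described above, summing views when a title shows up more than once, can be sketched roughly as follows. This is not the report's actual code: the function name is hypothetical, the view counts in the sample are invented, and it assumes the encoded apostrophe appears as "%27" in the raw statistics.

```python
from collections import Counter
from urllib.parse import unquote


def merge_duplicate_titles(rows):
    """Sum views per title after percent-decoding, so that an encoded
    and an unencoded spelling of the same title end up in one entry."""
    totals = Counter()
    for raw_title, views in rows:
        # Add to the running total instead of overwriting the earlier entry.
        totals[unquote(raw_title)] += views
    return totals


# Invented counts, for illustration only.
rows = [
    ("Valentine%27s_Day", 900),  # apostrophe percent-encoded (assumed form)
    ("Valentine's_Day", 100),    # apostrophe left as-is
    ("Computer_virus", 50),
]

merged = merge_duplicate_titles(rows)
print(merged["Valentine's_Day"])
```

With plain assignment instead of `+=`, the second, less popular row would replace the first, which is the overwriting behavior described in the comment above.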
The exciting new annotated format is up at WP:5000 -- it's explained in the header. We need to be watchful that this report gets done early Sunday mornings (per UTC). While everything is fine on my end, Mediawiki and its API seem to get a little moody at times over having to parse a 5000-line table that includes 10,000+ images. I've tried doing it as one big edit (as I am now) -- and putting it together piece-wise -- but it seems to get a little testy regardless.
There is also a page up at User:West.andrew.g/Popular_redlinks that lists the most popular redlinks (showing any that had 1000+ views in the past week). Lots of scripts and spam bot requests it would seem. I'll try to add some heuristics so we can toss out the things that are obviously script/computationally based. I'd also appreciate if someone could author a simple header that describes the page and maybe give it a shortcut from the Wikipedia namespace.
I am going to reject/decline/waitlist all new feature requests for the next two months or so. I apologize, but I have a dissertation to complete. I will entertain them after that point in time. Thanks, West.andrew.g ( talk) 04:44, 15 February 2013 (UTC)
If anyone is wondering why cum shot and a number of similar porn articles are in the Top 100 this week, I think it's due to an article on cracked.com, The 6 Most Terrifying Sex Illustrations on Wikipedia (nsfw).-- Milowent • has spoken 15:45, 4 March 2013 (UTC)
Why the massive number of hits on Aho–Corasick string matching algorithm, currently at #3 with 1,105,039 hits? Over a million hits and third place (!) in the past week on a small article on a subject that few people other than programmers and computer scientists would be interested in? It's not a new article either, having been around for over 10 years. — Lowellian ( reply) 06:30, 4 April 2013 (UTC)
These are scripts or bots that in all likelihood have some sort of bug or mis-configuration. When these crazy spikes occur without cause, I don't think we should be so quick to assume there is malice involved, or that the targeting of that particular page was intentional. West.andrew.g ( talk) 21:58, 7 April 2013 (UTC)
Hello. The title is made in jest, but if you had to look twice at it, you can imagine that I did the same when reading "These 5000 pages were the most heavily trafficked on the English Wikipedia". Trafficking is always said when referring to "trade or business, especially of an illicit kind", for example, trafficking in drugs or human beings. The use of the word in your sentence is not standard English, as far as I am aware. SomeFreakOnTheInternet ( talk) 23:07, 21 April 2013 (UTC)
Any idea why Mark Linn-Baker spiked this week? I find it hard to believe that a relatively minor actor got 4x the views of the Boston Marathon or Chechnya articles. -- phoebe / ( talk to me) 04:18, 23 April 2013 (UTC)
Greetings fellow wiki statistic fans,
I am writing to inform you that I have submitted a proposal to Wikimania 2013 entitled Examining the Popularity of Wikipedia Articles: Catalysts, Trends, and Applications which is based on the earlier Signpost article of the same name. I am hoping this opportunity will provide the impetus to dig even deeper into statistical data, find some more fascinating examples, and make even more community members aware of our contributions at WP:5000 and WP:5000/Top25Report.
I don't want to unnecessarily canvass, but I will note that page does provide space for community members to indicate their interest in the proposal (and past Wikimanias have streamed/recorded the event to provide access to those not in attendance). Regardless of the outcome of that submission, I will be in Hong Kong and would love to meet those who I interact with on this page -- or anyone who has ideas for how these statistics might be re-purposed to improve the project(s). Thanks, West.andrew.g ( talk) 21:40, 27 April 2013 (UTC)
It appears the English Wikipedia might be getting less popular, judging from the trend in views of the last article on the list over the past several months (it has gone down from around 28K to 24K). Or maybe we're just receiving fewer non-human views, or maybe the construction of the list -- by excluding more non-results -- has also reduced the number. But an approximate 4K difference is fairly large.
I've also noticed an article I've worked on used to never get into the WP:5000 but now it does. It's a medical article and I wonder to what extent some topics are increasing/decreasing in popularity in relation to each other. Biosthmors ( talk) 04:59, 4 June 2013 (UTC)
Thanks for your efforts here. Hope you don't mind I fixed a link on one of your user pages.
I'm wondering if you know of any source of statistics about Wikipedia articles (number of articles and typical traffic) that is categorized, for example: entertainment, sports, politics, science, finance, technology, etc. etc. The categorization method doesn't matter so much to me as long as it gives some kind of indication of what it is that people are interested in and looking for when they come here. Thank you. RenniePet ( talk) 01:52, 5 June 2013 (UTC)
This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page. |
Archive 1 | Archive 2 | Archive 3 |
I think it would be cool if we could see 1) which entries weren't included in the previous cycle 2) how many 10 day cycles they've been on the top 5000 and 3) how much they changed from the previous cycle (up or down). Biosthmors ( talk) 14:41, 3 December 2012 (UTC)
Could we get a line that says how many days each count includes? I'm guessing it is 10 plus or minus. Thanks! Biosthmors ( talk) 23:48, 14 December 2012 (UTC)
I was wondering if we could get a gold star by the articles on this list that are featured (and perhaps good ones too?). (I posted on a related subject at WT:FA.) It would also be great to see month by month statistics to identify trends in these counts month by month. Maybe even a number of how many peer reviews each article in this list has had. Thanks. Biosthmors ( talk) 20:44, 2 December 2012 (UTC)
Hi again Biosthmors. Due to rapidly evolving real life considerations, this isn't a feature I'll be able to integrate any time in the immediate future. Fortunately, however, it isn't one that really needs to be addressed with serious back-end processing. Notice that the Mediawiki parser does most of the work for us when coloring links red/blue. This is annotated in the source code by the presence of "redlink=1", and one would just need to count those instances. Surely the "find" functionality of some text editors would make immediate work of such a sum. Alternatively, I wonder if this is something that could be handled with template coding (not my expertise)? Thanks, West.andrew.g ( talk) 05:02, 22 December 2012 (UTC)
Andrew, when you get a chance, could you add a counter for featured lists? There are at least 100 hits for "List of" in the list. It would be great to keep track of this too. Best. Biosthmors ( talk) 21:01, 15 January 2013 (UTC)
Could you also add a counter for the sum of the 5000 hit count, when you get a chance? Thanks. Biosthmors ( talk) 21:01, 15 January 2013 (UTC)
I showed your beta list to some admins at the London Wikipedia meetup and they say that if you could find a way to adjust your algorithm to get rid of the "undefined" and multiple main page listings, it could be incorporated into the site. :) Serendi pod ous 20:33, 15 January 2013 (UTC)
As of your writing, the version which is displayed covers the week of Jan. 6--12. In the next several minutes/hours a new version will appear that will cover Jan. 13--19. The last day included is always a Saturday. Reports are generated Sunday morning (EST, in case there is any time zone confusion between us). The new report is currently in the processing pipeline. Hopefully this also answers your question regarding the specific article count? Thanks, West.andrew.g ( talk) 05:55, 20 January 2013 (UTC)
We will be drafting this at User:West.andrew.g/Popular_pages/Signpost. Thanks, West.andrew.g ( talk) 17:58, 24 January 2013 (UTC)
> grep "en Computer_virus" * pagecounts-20130117-000000:en Computer_virus 134 pagecounts-20130117-010000:en Computer_virus 167 pagecounts-20130117-020000:en Computer_virus 190 pagecounts-20130117-030001:en Computer_virus 171 pagecounts-20130117-040000:en Computer_virus 667 pagecounts-20130117-050000:en Computer_virus 603 pagecounts-20130117-060000:en Computer_virus 302 pagecounts-20130117-070000:en Computer_virus 2630 pagecounts-20130117-080000:en Computer_virus 612 pagecounts-20130117-090000:en Computer_virus 1263 pagecounts-20130117-100000:en Computer_virus 2548 pagecounts-20130117-110000:en Computer_virus 965 pagecounts-20130117-120000:en Computer_virus 478 pagecounts-20130117-130001:en Computer_virus 820 pagecounts-20130117-140000:en Computer_virus 2518 pagecounts-20130117-150000:en Computer_virus 1135 pagecounts-20130117-160000:en Computer_virus 775 pagecounts-20130117-170000:en Computer_virus 381 pagecounts-20130117-180000:en Computer_virus 474 pagecounts-20130117-190000:en Computer_virus 295 pagecounts-20130117-200001:en Computer_virus 325 pagecounts-20130117-210000:en Computer_virus 284 pagecounts-20130117-220000:en Computer_virus 259 pagecounts-20130117-230000:en Computer_virus 281
Interesting. I see that the other tool gives the same results regardless of capitalization, so it presumably sums the results for all the redirects that differ only in capitalization. (This is not the case when there are redirects to or from accented version of names; in that case they are counted separately and so need to be manually aggregated.) LittleBen ( talk) 09:34, 25 January 2013 (UTC)
I'll do this one more time just to make it painfully clear. That tool is broken in some way:
grep "en Comparison_of_Android_devices" * pagecounts-20130121-000000:en Comparison_of_Android_devices 41 pagecounts-20130121-000000:en Comparison_of_Android_devices%091055702 1 pagecounts-20130121-010000:en Comparison_of_Android_devices 2144 pagecounts-20130121-020000:en Comparison_of_Android_devices 82 pagecounts-20130121-030000:en Comparison_of_Android_devices 55 pagecounts-20130121-030000:en Comparison_of_Android_devices%23Smartphones 1 pagecounts-20130121-030000:en Comparison_of_Android_devices%23Tablet_computers 1 pagecounts-20130121-040000:en Comparison_of_Android_devices 2151 pagecounts-20130121-050001:en Comparison_of_Android_devices 113 pagecounts-20130121-060000:en Comparison_of_Android_devices 40 pagecounts-20130121-070000:en Comparison_of_Android_devices 2011 pagecounts-20130121-080000:en Comparison_of_Android_devices 94 pagecounts-20130121-090000:en Comparison_of_Android_devices 61 pagecounts-20130121-100000:en Comparison_of_Android_devices 2561 pagecounts-20130121-110000:en Comparison_of_Android_devices 105 pagecounts-20130121-130000:en Comparison_of_Android_devices 2514 pagecounts-20130121-140000:en Comparison_of_Android_devices 108 pagecounts-20130121-150000:en Comparison_of_Android_devices 77 pagecounts-20130121-160000:en Comparison_of_Android_devices 2134 pagecounts-20130121-170001:en Comparison_of_Android_devices 157 pagecounts-20130121-180000:en Comparison_of_Android_devices 92 pagecounts-20130121-190000:en Comparison_of_Android_devices 2350 pagecounts-20130121-200000:en Comparison_of_Android_devices 112 pagecounts-20130121-210000:en Comparison_of_Android_devices 77 pagecounts-20130121-210000:en Comparison_of_Android_devices& 1 pagecounts-20130121-220000:en Comparison_of_Android_devices 2418 pagecounts-20130121-230000:en Comparison_of_Android_devices 127
Once again, the sum nowhere approaches the total. Milowent already de-bunked the redirection theory. If anything I would pay attention to the "Comparison_of_Android_devices%091055702" line. Someone tried to use this section link (or whatever it is) to access the page. It would seem Henrik's tool mis-parses this line with "%09" as a character code, and then believes the remainder of the number (~1 million) to be the number of article views. This is incorrect, that section/item only got 1 view. Matter closed. Take this up with the other tool maintainers. Thanks, West.andrew.g ( talk) 01:10, 28 January 2013 (UTC)
I mentioned your good work here; hope you don't mind. LittleBen ( talk) 01:43, 29 January 2013 (UTC)
Wikipedia:Village_pump_(technical)#Format_Change_of_Page_View_Stats -- Why things change around here with no notification is incredibly frustrating. Regardless, I caught and fixed the error last night, and this week's fresh report is now in the pipeline. Thanks, West.andrew.g ( talk) 18:08, 3 February 2013 (UTC)
The biggest hour for this year's Super Bowl entertainment was 02:00UTC of Feb. 4. Note that the raw statistics do not resolve redirects, therefore we have: Beyoncé Knowles at 378,923, Beyonce at 12,877, and Beyonce_knowles at 4,149. This means that during that hour, the article was averaging 100-110 hits per second. These redirect cases also make some suggestions about traffic sources. It's safe to assume that no one would actually accent the "e" in Beyoncé Knowles when doing a casual search. Thus, those arriving directly at that page are likely via Google or another search engine that handles the redirect logic (i.e., the vast majority of visitors). Meanwhile, Beyonce and Beyonce_knowles are more plausible to be direct Mediawiki searches. Thanks, West.andrew.g ( talk) 17:25, 4 February 2013 (UTC)
I would like to know (along the lines of this) if there is a way to pick out which articles belong to which WikiProject. Specifically, I am interested in finding out which WP:ANAT articles are listed here. Any ideas? I thought I'd post here before taking it to WP:VPT. Thanks. Biosthmors ( talk) 19:06, 6 February 2013 (UTC)
So why is the undefined disambiguation drawing such a massive number of hits? And where are they being dabbed to? The articles linked from that dab don't seem to be drawing much traffic. Doesn't make any sense. Wbm1058 ( talk) 23:57, 7 February 2013 (UTC)
Just wanted to say thanks for making this list -- it's really interesting & useful. -- phoebe / ( talk to me) 03:55, 8 February 2013 (UTC)
It would be useful to add the following:
Wow, one of the top 5000 articles (#4999) is one that doesn't even exist, but I did a google news search and he's a rapper. Wow. Looks like we aren't "serving our customers" as well as we could. I assume Kevin Gates is notable. We need to track more of these "most desired" not existent articles. Can we? We have to adapt to reader demand to stay relevant. Biosthmors ( talk) 21:49, 10 February 2013 (UTC)
I just created it. He had been getting hundreds of hits daily until a recent spike, I guess in line with a recent release of his. http://stats.grok.se/en/latest/Kevin_Gates How can we find these sooner? Biosthmors ( talk) 22:00, 10 February 2013 (UTC)
Maybe we can get a top-500 list for the most popular non-existent pages? Biosthmors ( talk) 22:01, 10 February 2013 (UTC)
As a matter of personal policy, I don't get too involved in content disputes (nor am I terribly familiar with the subtleties of the processes). However, I'm curious if you could push back on the deletion process using our statistical evidence? I mean, don't thousands of attempted views somehow imply "notability"? Thanks, West.andrew.g ( talk) 05:40, 11 February 2013 (UTC)
In the future, I think it would be helpful to also add page ratings (A-class, B-class, etc.) to the chart as well, but I don't know how difficult the coding for this would be. Remember ( talk) 14:48, 8 February 2013 (UTC)
I know how wishlists can grow, and I reserve the right to delay/ignore accordingly :-). This is a pretty large bandwidth penalty (I need to actually go and parse 5000 articles that are several kb in size... but it may be do-able). You'll need to list any/all templates that would be of use and how they might be represented in the list (I am thinking a dedicated column might be needed if we keep adding stuff). Could you also help with some of my above questions regarding "A/B/C" class and how I would go about doing that? Thanks, West.andrew.g ( talk) 20:19, 8 February 2013 (UTC)
Okay, so the report will soon notate the A/B/C/stub/start/unassessed classifications. Could someone please update the lede/header section so it has the description for all of this? Once you do so, report here, and revert it back to normal -- and then I will restore your version once I actually run the report. Note that a page can have multiple icons next to its name i.e., "A and B class". An icon simply means that "1 of more projects have classified this article at level "x"".
I am going to temporarily hold off on Biosthmors suggestion. If you continue to want it, Biosthmors, you need to come up with a list of category memberships of interest (it must be categories (even if hidden ones) -- as I don't want to obtain and parse actual page content. Note that virtually all clean-up and maintenance templates implicitly add some form of category). You also need to come up with an icon or other terse system for representing membership. We could imagine a second version of the top-5000 with tons of clean-up details -- and that is fine -- but I am quite busy and need you to handle all the non-technical portions. Thanks, West.andrew.g ( talk) 01:36, 10 February 2013 (UTC)
New WP:TOP25 list should be up late evening (pacific time) when I get back to the Internet. One odd highly-placed entry I can't explain at the moment is Ernst Litfaß - anybody have any ideas about that one? Mary Leakey, subject of a Google Doogle, will be #1, with a whopping 2.7 million views. And Illuminati will be an amusing entry, with popularity fueled by the ridiculous tabloid claim that the fabled group was related to the Superbowl power outage and Beyonce.-- Milowent • has spoken 14:09, 10 February 2013 (UTC)
Martin Rycak, who was posting last-24 hour trend charts on the German Wikipedia for a few months I believe, has expanded to wikipedia.trending.eu/en/index.html, which covers 11 different language wikipedias, allowing you to see the most popular articles on those wikipedias over the past day (and smaller slices). It also has a twitter feed @trending_eu.-- Milowent • has spoken 06:35, 11 February 2013 (UTC)
Due to work travel, I will likely not be able to get the next WP:TOP25 report live until Wednesday morning UTC. If someone wants to help do it, I am welcoming volunteers! Essentially it's done manually at this point: I create the new version using the latest version as my template, and I move the old version to its archive URL.-- Milowent • has spoken 14:49, 14 February 2013 (UTC)
Fixed. There were basically two entries for "Valentine's Day" in the raw statistics; one where the apostrophe was ASCII encoded and a second where it was not (and why the latter escaped the encoding is a bit puzzling). Regardless, the far less popular latter case was overwriting the former when it was encountered in processing. The software now accommodates the possibility of a title appearing twice by summing the views. An updated report will appear in a couple of minutes. Thanks, West.andrew.g ( talk) 13:00, 21 February 2013 (UTC)
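For anyone curious what the fix described above might look like, here is a minimal sketch. It is not the actual report software: the function name `merge_title_views` is hypothetical, the view counts are made up for illustration, and it assumes the "encoded" apostrophe was percent-encoded (`%27`), which the comment above does not confirm. The idea is to normalize each title to one canonical form and sum the views, so that a later duplicate no longer overwrites an earlier one.

```python
from collections import defaultdict
from urllib.parse import unquote


def merge_title_views(raw_rows):
    """Sum view counts for titles that differ only in encoding.

    raw_rows: iterable of (title, views) pairs from the raw statistics.
    Assumption: variants differ by percent-encoding and underscores only.
    """
    totals = defaultdict(int)
    for title, views in raw_rows:
        # Decode percent-escapes and treat underscores as spaces so both
        # spellings of the same title map to a single key.
        canonical = unquote(title).replace("_", " ")
        # Accumulate instead of assigning, so duplicates are summed.
        totals[canonical] += views
    return dict(totals)


# Illustrative numbers only, not real page-view data.
rows = [
    ("Valentine%27s_Day", 900),
    ("Valentine's_Day", 100),
    ("Illuminati", 50),
]
merged = merge_title_views(rows)
```

With a plain assignment (`totals[canonical] = views`) the second, less popular row would replace the first, which matches the bug as described; accumulating avoids that.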
The exciting new annotated format is up at WP:5000 -- it's explained in the header. We need to be watchful that this report gets done early Sunday mornings (per UTC). While everything is fine on my end, Mediawiki and its API seem to get a little moody at times over having to parse a 5000-line table that includes 10,000+ images. I've tried doing it as one big edit (as I am now) -- and putting it together piece-wise -- but it seems to get a little testy regardless.
There is also a page up at User:West.andrew.g/Popular_redlinks that lists the most popular redlinks (showing any that had 1000+ views in the past week). Lots of scripts and spam bot requests it would seem. I'll try to add some heuristics so we can toss out the things that are obviously script/computationally based. I'd also appreciate if someone could author a simple header that describes the page and maybe give it a shortcut from the Wikipedia namespace.
I am going to reject/decline/waitlist all new feature requests for the next two months or so. I apologize, but I have a dissertation to complete. I will entertain them after that point in time. Thanks, West.andrew.g ( talk) 04:44, 15 February 2013 (UTC)
If anyone is wondering why cum shot and a number of similar porn articles are in the Top 100 this week, I think it's due to an article on cracked.com, The 6 Most Terrifying Sex Illustrations on Wikipedia (nsfw).-- Milowent • has spoken 15:45, 4 March 2013 (UTC)
Why the massive number of hits on Aho–Corasick string matching algorithm, currently at #3 with 1,105,039 hits? Over a million hits and third place (!) in the past week on a small article on a subject that few people other than programmers and computer scientists would be interested in? It's not a new article either, having been around for over 10 years. — Lowellian ( reply) 06:30, 4 April 2013 (UTC)
These are scripts or bots that in all likelihood have some sort of bug or mis-configuration. When these crazy spikes occur without cause, I don't think we should be so quick to assume there is malice involved, or that the targeting of that particular page was intentional. West.andrew.g ( talk) 21:58, 7 April 2013 (UTC)
Hello. The title is made in jest, but if you had to look twice at it, you can imagine that I did the same when reading "These 5000 pages were the most heavily trafficked on the English Wikipedia". Trafficking is always said when referring to "trade or business, especially of an illicit kind", for example, trafficking in drugs or human beings. The use of the word in your sentence is not standard English, as far as I am aware. SomeFreakOnTheInternet ( talk) 23:07, 21 April 2013 (UTC)
Any idea why Mark Linn-Baker spiked this week? I find it hard to believe that a relatively minor actor got 4x the views of the Boston Marathon or Chechnya articles. -- phoebe / ( talk to me) 04:18, 23 April 2013 (UTC)
Greetings fellow wiki statistic fans,
I am writing to inform you that I have submitted a proposal to Wikimania 2013 entitled Examining the Popularity of Wikipedia Articles: Catalysts, Trends, and Applications which is based on the earlier Signpost article of the same name. I am hoping this opportunity will provide the impetus to dig even deeper into statistical data, find some more fascinating examples, and make even more community members aware of our contributions at WP:5000 and WP:5000/Top25Report.
I don't want to unnecessarily canvass, but I will note that page does provide space for community members to indicate their interest in the proposal (and past Wikimanias have streamed/recorded the event to provide access to those not in attendance). Regardless of the outcome of that submission, I will be in Hong Kong and would love to meet those who I interact with on this page -- or anyone who has ideas for how these statistics might be re-purposed to improve the project(s). Thanks, West.andrew.g ( talk) 21:40, 27 April 2013 (UTC)
It appears the English Wikipedia might be getting less popular, from a trend of the article views of the last article on the list for the past several months (it has gone down from around 28K to 24K). Or maybe we're just receiving fewer non-human views, or maybe the construction of the list -- by excluding more non-results -- has also reduced the number. But an approximate 4K difference is fairly large.
I've also noticed an article I've worked on used to never get into the WP:5000 but now it does. It's a medical article and I wonder to what extent some topics are increasing/decreasing in popularity in relation to each other. Biosthmors ( talk) 04:59, 4 June 2013 (UTC)
Thanks for your efforts here. Hope you don't mind I fixed a link on one of your user pages.
I'm wondering if you know of any source of statistics about Wikipedia articles (number of articles and typical traffic) that is categorized, for example: entertainment, sports, politics, science, finance, technology, etc. etc. The categorization method doesn't matter so much to me as long as it gives some kind of indication of what it is that people are interested in and looking for when they come here. Thank you. RenniePet ( talk) 01:52, 5 June 2013 (UTC)