This page is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Hi everyone - just to let you know that we have an update to this page coming soon. We'll be publishing some statistics which outline per-page metrics on revisions under Pending Changes. Nimish Gautam and Devin Finzer (Devin is an intern who is working for the Wikimedia Foundation this summer) are working on them.
Here are the per-article stats they are gathering:
The output format they're producing is a .csv file. I'm not sure what the preferred way of publishing that sort of thing is here. Ideas? -- RobLa (talk) 17:22, 28 July 2010 (UTC)
As I indicated above, a sortable table on a wiki page is the ideal presentation format for the final results, in my view. Users shouldn't be expected or required to read through .csv files in order to see the results. Though, a "top sheet" approach might be the best of both worlds; i.e., a final presentation on a wiki page that outlines everything and links to the raw data and a detailed explanation of the methodology used. -- MZMcBride ( talk) 23:42, 28 July 2010 (UTC)
What is the meta-information, such as (but not limited to) start time and end time for each of the statistics? BrainMarble ( talk) 19:33, 28 July 2010 (UTC)
Here's a list of the metrics we've collected so far:
These links are now on Wikipedia:Pending changes/Metrics as well. -- RobLa-WMF ( talk) 01:41, 4 August 2010 (UTC)
Metrics, like statistics, are most useful when they pertain directly to the goal of the study. The statistics listed so far (above by RobLa on 17:22, 28 July 2010; and by RobLa-WMF on 01:41, 4 August 2010) are descriptive statistics, useful in themselves for giving a sense of the environment or scope of the study.
However, during the end-of-trial evaluation of the pending changes policy, we would likely be interested first in how the metrics reveal answers to the following two questions:
Both "how" questions imply a comparison between actions during the trial period and actions prior to the trial period. The trial period is about two months, overall. The pre-trial period used for comparison should be long enough to account for cyclic trends, perhaps up to a year before the start of the trial. (So far, none of the metrics appear to account for cyclic trends before the trial period.)
The duration of time intervals affects the quality of the evaluation. Shorter durations (hours, days) yield more detail but require greater effort. Longer durations (weeks, months) require less effort but yield less detail. Experimentation will help determine which duration to use in the evaluation. (So far, there appear to be only two time intervals in the metrics: "under pending" and "not under pending".)
The effectiveness question involves counts of groups of editing actions per time interval throughout both pre-trial and trial periods; reduction of the influence of cyclic trends; and comparison ratios for each of the groups of editing actions. (So far, only a set of raw counts appears in the metrics.)
The efficiency question implies calculating the time delay involved in taking corrective editing actions (such as revert, undo, unaccept) for the same time intervals as in the effectiveness question; and comparison ratios to reveal an increase or decrease in efficiency over time. (So far, none of the metrics are capable of revealing efficiency.) BrainMarble ( talk) 18:55, 5 August 2010 (UTC)
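As a rough illustration of the two calculations described above, here is a minimal Python sketch. It is not the code used for the trial metrics. It assumes a hypothetical input of `(edit_time, reverted_at)` pairs, where `reverted_at` is `None` for edits that were never reverted, and the names `bucket_start` and `interval_metrics` are invented for this example. It groups edits into fixed-width time intervals, then reports a non-reverted ratio per interval (effectiveness) and the mean delay before a corrective action (efficiency).

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def bucket_start(ts, start, width):
    """Return the start of the fixed-width interval that contains ts."""
    n = (ts - start) // width  # timedelta // timedelta gives an int
    return start + n * width


def interval_metrics(edits, start, width):
    """Summarise edits per interval.

    edits: iterable of (edit_time, reverted_at) pairs (hypothetical format);
           reverted_at is None when the edit was never reverted.
    Returns a dict keyed by interval start.
    """
    buckets = defaultdict(lambda: {"total": 0, "reverted": 0, "delays": []})
    for ts, reverted_at in edits:
        b = buckets[bucket_start(ts, start, width)]
        b["total"] += 1
        if reverted_at is not None:
            b["reverted"] += 1
            b["delays"].append((reverted_at - ts).total_seconds())

    out = {}
    for key in sorted(buckets):
        b = buckets[key]
        out[key] = {
            "total": b["total"],
            # effectiveness: share of edits in the interval that survived
            "nonreverted_ratio": (b["total"] - b["reverted"]) / b["total"],
            # efficiency: average seconds until the corrective action
            "mean_revert_delay_s": mean(b["delays"]) if b["delays"] else None,
        }
    return out
```

Running the same function over a pre-trial window and the trial window, with the same interval width, would give the comparison ratios; choosing `width` (hours, days, weeks) is the detail-versus-effort trade-off mentioned above.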
Hi everyone, Devin provided the following raw data before leaving WMF for the summer:
I'm traveling right now, so I'm not in a good spot to provide a lot of commentary on what is in these files, but the gist is that these are every edit to every article while it was under Pending Changes. I hope these are useful for anyone who wants to do some additional number crunching. -- RobLa-WMF ( talk) 05:22, 15 August 2010 (UTC)
Sorry if this analysis is rough, but I am heading on vacation for 2 weeks, so do not have time to provide additional analysis and/or commentary. Please feel free to contribute to the analysis, either on the talk page or directly on the article page. I'm hoping the skeleton I put together will engender discussion about the trial. Howief ( talk) 00:13, 21 August 2010 (UTC)
Could the basic numbers be provided - the total number of anonymous edits (in total, and with unreverted, separately), and the average number of anonymous edits per article per day? Comparison to a control sample of the same number of articles over the same time periods that weren't protected by semi/full-protection (or the same articles that were protected by pending changes at an earlier time) would be good. I'm specifically interested in getting a feel for the number of edits that pending changes enabled/'saved' compared to semi-protection, and to the articles' unprotected states. Thanks. Mike Peel ( talk) 20:12, 24 August 2010 (UTC)
I found this analysis together with the Metrics page helpful, but not many have read them: [1] and [2] show Wikipedia:Pending changes/Metrics/Preliminary Analysis and Wikipedia:Pending changes/Metrics were each viewed about 10 to 20 times daily, compared to several hundred times for Wikipedia:Pending_changes/Closure. The closure article does have links to the others, but maybe not prominently enough. - 84user ( talk) 12:54, 25 August 2010 (UTC)
Thanks, Howief - very helpful!
You say "One could potentially argue that an article with under a certain level of edits per day is simply not worth putting under pending changes." But since there is little cost to listing something as "Pending changes" I don't see why that would be true.
What I'd really like to see is a comparison of the rate of approved changes to articles before and after being put under "pending changes". If there is a significant increase, and if volunteers don't tire of reviewing changes, then it seems like it is helpful. -- NealMcB (talk) 20:09, 22 August 2010 (UTC)
The final graph shows that a smaller percent of IP edits were accepted during the trial than before it. Is this success, or failure? To me, it is total failure. Pending Changes is no good if it discourages good IPs more than it discourages bad IPs. The fact that the percent accepted went down on every article in the table should be very concerning. Either the criteria for acceptance went up, or good IPs were put off more than bad IPs. Or good IPs got alienated and turned into bad IPs. So, what happened over time during the trial period? Did the daily or weekly percent of IP edits that were accepted go up or down? 69.3.72.249 (talk) 04:44, 15 September 2010 (UTC)
Here is a snapshot of the graph I am referring to. 69.3.72.249 ( talk) 04:46, 15 September 2010 (UTC)
Page title | Pending Changes after | Total IP edits before Pending | Reverted (of these IP edits) | Before Pending: nonreverted/total IP edits | Pending Changes: nonreverted/total IP edits
Spain national football team | 21:16, 11 July 2010 | 143 (10 days) | 15 | 89.5% | 65.8%
Toy Story 3 | 14:49, 23 July 2010 | 156 (10 days) | 30 | 80.8% | 51.4%
Australia's Next Top Model, Cycle 6 | 21:01, 21 July 2010 | 200 (10 days) | 68 | 66.0% | 65.7%
SummerSlam (2010) | 18:11, 21 July 2010 | 167 (10 days) | 45 | 73.1% | 27.2%
The Twilight Saga: Eclipse | 21:24, 25 April 2010 | 32 (10 days) | 17 | 46.9% | 42.0%
Jason Leopold | 10:57, 28 July 2010 | 7 (10 days) | 2 | 71.4% | 17.9%
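For anyone checking the arithmetic in the table above, the pre-trial percentage can be reproduced from the two count columns as (total - reverted) / total. The short Python sketch below does this for the six rows; the tuple layout and the function name `nonreverted_pct` are made up for this example, and the "under PC" percentages are simply copied from the table because their underlying counts are not given here.

```python
# (page, total IP edits before PC, reverted of these, % nonreverted under PC)
# Counts and percentages are copied from the table; the layout is illustrative.
rows = [
    ("Spain national football team", 143, 15, 65.8),
    ("Toy Story 3", 156, 30, 51.4),
    ("Australia's Next Top Model, Cycle 6", 200, 68, 65.7),
    ("SummerSlam (2010)", 167, 45, 27.2),
    ("The Twilight Saga: Eclipse", 32, 17, 42.0),
    ("Jason Leopold", 7, 2, 17.9),
]


def nonreverted_pct(total, reverted):
    """Percentage of edits that were not reverted, to one decimal place."""
    return round(100 * (total - reverted) / total, 1)


for page, total, reverted, pc_pct in rows:
    before = nonreverted_pct(total, reverted)
    # change in percentage points from the pre-trial window to the PC period
    change = round(pc_pct - before, 1)
    print(f"{page}: before {before}%, under PC {pc_pct}%, change {change} points")
```

Note that the sample sizes differ a lot between rows (7 edits for Jason Leopold versus 200 for Australia's Next Top Model), so the percentages are not equally reliable.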
Wrong diagnosis. IP vandals go up after the Pending Changes period begins... :( -- Chris.urs-o ( talk) 18:46, 17 September 2010 (UTC)
PC affects more than only anonymous edits. Editors who are logged in but do not have reviewer privileges are also put in the "pending" queue. What do the results look like when non-anonymous users make edits? Cliff ( talk) 16:41, 17 March 2011 (UTC)
This page is great. But, it needs a last column which gives the unreverted/total anonymous edits % for each page which looks at the same amount of time the page was under PC, immediately prior to being under PC. We also need to identify whether that overlapped with another kind of page protection. Otherwise, we have data, internally comparable, but untethered to the past. Ocaasi ( talk) 11:22, 7 August 2010 (UTC)
Here is an obvious experiment: try PC on Today's Featured Article for 1 day. There is already tons of experience with vandalism on TFA under the old practice of leaving it unprotected as much as possible, and reverting vandalism on it. 69.111.194.167 ( talk) 16:31, 15 April 2011 (UTC)
First, NICE TABLE! I was a bit surprised at the range of accepted edits. For articles with more than a few edits, the range is from 100% to 0%. I don't know what I might have expected, but it wasn't such a range. Going forward, I might suggest that articles with <20% or so accepted edits, and with an edit volume of more than a few a week, might be better on semi-protection. Otherwise I'm very happy with the way the test plot has come out. -- Rocksanddirt (talk) 16:11, 25 August 2010 (UTC)
We find this quite non-operative for certain cases. See the Antonio Arnaiz-Villena (false) biography:
1- A group of apparent linguists (Trigaranus, Akerbeltz, User:Dumu Eduba, Kwamikagami) started fighting somewhere else in Wikipedia to remove "Iberian-Guanche Inscriptions". The page was removed, and they particularly disagree with the word "usko-mediterraneans".
2- After many months, Dumu Eduba (only interested in linguistics) brought up false accusations against Arnaiz-Villena from doubtful newspapers written ten years ago that he found on the Internet (June 9th 2010). These accusations were published within 2-3 weeks' time and nothing more was said.
3- Arnaiz-Villena and his group have published quite a lot of papers in the last ten years and some books [3]
4- The accusations were shown to be induced and judges invalidated them.
5- Now at least User:Arnaiz1 (Arnaiz-Villena himself) and User:Virginal6, who were both involved in the accusations and have all the documents, tried to write details about how the judges made the false and induced accusations disappear: names of judges, dates, sentence numbers, etc. (see the Arnaiz-Villena discussion).
6- They have silenced Arnaiz1 and Virginal6 because of sock puppets (it is not true).
7- The false biography is in Wikipedia and the living interested people are silenced.
8- I would ask you, in the name of the Arnaiz-Villena group, to a) push Dumu Eduba to finish the biography section on litigations (he has the data in the Discussion); b) remove this part of the biography until it is finished (or leave this part as it was before June 9th 2010).
9- We are not allowed to give away court sentences to nicknames. This has not been seen in Wikipedia yet. We will only give sentences to Wikipedia California Administrators.
Please contact Antonio Arnaiz-Villena at or at aarnaiz@med.ucm.es or tel +34913941632. Symbio04 (talk) 10:11, 1 September 2010 (UTC) Symbio04 (talk) 10:58, 1 September 2010 (UTC)
In the interests of accuracy and future research, could someone state what precisely "reverted" means in this context, and what algorithm [a link to the actual source code would be ideal!] was used to determine which revisions were reverted.-- greenrd ( talk) 12:09, 14 September 2010 (UTC)
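To make the question concrete: one common definition, which is not confirmed anywhere on this page to be the one used for these metrics, is the "identity revert". Under it, a revision is a revert if its full text is identical to an earlier revision, and every revision in between counts as reverted. The Python sketch below illustrates that definition only. The function name `find_reverted`, the `(rev_id, text)` input format and the `radius` look-back limit are assumptions made for this example.

```python
import hashlib


def find_reverted(revisions, radius=15):
    """Identity-revert detection (illustrative definition, see lead-in).

    revisions: list of (rev_id, text) in chronological order.
    radius:    how many revisions to look back for identical content
               (an arbitrary choice for this sketch).
    Returns the set of rev_ids that a later revision undid by restoring
    the exact content of an earlier revision.
    """
    checksums = [
        hashlib.sha1(text.encode("utf-8")).hexdigest() for _, text in revisions
    ]
    reverted = set()
    for i, digest in enumerate(checksums):
        lowest = max(0, i - radius)
        # Start at i - 2: matching the immediate predecessor is a null edit,
        # with nothing in between to revert.
        for j in range(i - 2, lowest - 1, -1):
            if checksums[j] == digest:
                for k in range(j + 1, i):
                    # skip in-between revisions that already had this content
                    if checksums[k] != digest:
                        reverted.add(revisions[k][0])
                break
    return reverted
```

A definition like this misses partial reverts and manual clean-ups that do not restore an exact earlier version, which is one reason the precise algorithm matters for interpreting the percentages.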
Are the totals for each of those columns available somewhere? -- Anthonyhcole ( talk) 18:11, 19 February 2011 (UTC)