This page discusses evaluations of changes in Wikipedia policies.
To evaluate policy changes, we need to first develop a database of them. Possible structures for this database:
Some policy changes would be easy to represent in any of these formats: there is nothing unclear about when certain articles were protected and unprotected. The representations of some other policy changes could be more subtle, and community members should perhaps be consulted about them.
It might be possible to partially automate construction of the policy change database by stepping through the edit histories of the 43 sub pages of the Wikipedia list of policies.
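As a rough sketch of what stepping through an edit history could look like, the snippet below only builds a request URL for the MediaWiki revisions API; it does not fetch anything. The helper name revisions_query_url, the choice of rvprop fields, and the example page title are assumptions made for illustration, not part of the original proposal.

```python
from urllib.parse import urlencode

# Public MediaWiki API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def revisions_query_url(title, limit=50):
    """Build a URL that lists revisions of one page, oldest first.

    Hypothetical helper: it only assembles the query string, so the
    caller still has to fetch the URL and follow continuation tokens.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user|comment",  # fields assumed useful here
        "rvlimit": limit,
        "rvdir": "newer",  # start from the oldest revision
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


url = revisions_query_url("Wikipedia:List of policies")
```

Each policy subpage would need its own query, and deciding which revisions count as genuine rule changes would probably still require human judgment.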
Once we have any of these representations of rule changes, to the extent that the "treatments" of these changes are randomly assigned (a questionable assumption in many cases), we can evaluate them the way economists often evaluate the effects of natural experiments.
What were the costs and benefits of the no-anon-page-creation policy, which was implemented on Monday, December 5, 2005, at 19:00 UTC?
One consequence of the no-anon-page-creation policy may have been a reduction in the number of articles that are deleted very quickly after their creation. To test this, first define these two variables:
SD_a^T = Speed of deletion of article a, if it was deleted within T days of its creation (i.e., this equals time-of-deletion minus time-of-creation if deleted before time-of-creation plus T, and is missing otherwise).
D_a^T = Dummy variable for whether article a was deleted within T days of its creation.
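The two article-level variables can be computed directly from creation and deletion timestamps. The following is a minimal sketch; the function name deletion_vars, the choice of days as the unit for SD, and the use of None for "missing" are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def deletion_vars(created: datetime,
                  deleted: Optional[datetime],
                  T: float) -> Tuple[Optional[float], int]:
    """Return (SD_a^T, D_a^T) for a single article.

    SD is measured in days and is None (missing) unless the article was
    deleted before time-of-creation plus T days. D is the 0/1 dummy.
    """
    if deleted is not None and deleted < created + timedelta(days=T):
        speed_in_days = (deleted - created) / timedelta(days=1)
        return speed_in_days, 1
    return None, 0


created = datetime(2005, 12, 1, 12, 0)
quick = deletion_vars(created, datetime(2005, 12, 2, 0, 0), T=7)
slow = deletion_vars(created, datetime(2005, 12, 20, 0, 0), T=7)
never = deletion_vars(created, None, T=7)
```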
These variables can be aggregated to the date of article creation:
SD_t^T = mean(SD_a^T | article a was created on date t)
D_t^T = mean(D_a^T | article a was created on date t)
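The aggregation to creation dates could be sketched as follows. The input layout (tuples of creation date, SD, D) and the function name aggregate_by_creation_date are assumptions; missing SD values are skipped when averaging, so SD_t^T is itself missing on days when no article was deleted within T days.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def aggregate_by_creation_date(rows):
    """Map each creation date t to (SD_t^T, D_t^T).

    rows: iterable of (creation_date, sd, d), where sd may be None.
    """
    sd_by_day = defaultdict(list)
    d_by_day = defaultdict(list)
    for day, sd, d in rows:
        d_by_day[day].append(d)
        if sd is not None:  # SD is only defined for quickly deleted articles
            sd_by_day[day].append(sd)
    return {
        day: (mean(sd_by_day[day]) if sd_by_day[day] else None,
              mean(d_by_day[day]))
        for day in d_by_day
    }


# Made-up illustrative rows, not real Wikipedia data.
rows = [
    (date(2005, 12, 4), 0.5, 1),
    (date(2005, 12, 4), 1.5, 1),
    (date(2005, 12, 4), None, 0),
    (date(2005, 12, 4), None, 0),
    (date(2005, 12, 6), None, 0),
]
daily = aggregate_by_creation_date(rows)
```

The resulting daily series are what would be plotted against time.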
Those two variables can be plotted as a function of time. In addition, we can run the following regression specifications:
SD_a^T = b0 + b1*AnonCreated_a + b2*Post05Dec2005_a + g1*DTimeofDay_a + f(t_a) + eps_a
D_a^T = b0 + b1*AnonCreated_a + b2*Post05Dec2005_a + g1*DTimeofDay_a + f(t_a) + eps_a,
where b0 is a constant, AnonCreated_a is an indicator for whether article a was created by an anonymous user, Post05Dec2005_a is an indicator for whether article a was created after the no-anon-page-creation policy was implemented, DTimeofDay_a is a full set of hour dummies to capture cyclicality during each day, f(t_a) is a smooth function of day-of-creation, and eps_a is an error term.
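A bare-bones sketch of estimating such a specification by ordinary least squares is given below. It makes several assumptions not in the text: f(t_a) is approximated by a linear trend in day-of-creation, one hour dummy (hour 0) is omitted so that the dummies are not collinear with the constant, and the helper names design_row and ols are hypothetical. It returns point estimates only; a statistics package would be needed for standard errors.

```python
def design_row(anon, post, hour, day):
    """Regressors for one article: constant, AnonCreated, Post05Dec2005,
    23 hour-of-day dummies (hour 0 omitted), and a linear trend for f(t)."""
    hour_dummies = [1.0 if hour == h else 0.0 for h in range(1, 24)]
    return [1.0, float(anon), float(post)] + hour_dummies + [float(day)]


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Tiny made-up check: y is an exact linear function of two regressors.
X_demo = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]]
y_demo = [1 + 2 * x1 - 3 * x2 for _, x1, x2 in X_demo]
coefs = ols(X_demo, y_demo)
row = design_row(anon=True, post=False, hour=5, day=3)
```

With real data, X would be built by calling design_row once per article and y would hold SD_a^T (on the subsample where it is defined) or D_a^T.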
Here we would expect b2 not to be statistically significant, because any effect of the policy should be picked up by the AnonCreated dummy. We could additionally run those regressions dropping the AnonCreated dummy or the Post05Dec2005 dummy in turn.
Last, we could run
SD_a^T = b0 + b1*AnonCreated_a + g1*DTimeofDay_a + f(t_a) + eps_a
D_a^T = b0 + b1*AnonCreated_a + g1*DTimeofDay_a + f(t_a) + eps_a,
and instrument for AnonCreated_a with Post05Dec2005_a.
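With a single binary instrument and no other controls, the instrumental-variables estimate reduces to the Wald ratio: the change in the mean outcome across the policy date divided by the change in the share of anonymously created articles. The sketch below illustrates only that simplified case; it leaves out the hour dummies and f(t_a), and the function name wald_iv and the toy data are assumptions.

```python
from statistics import mean


def wald_iv(y, x, z):
    """Wald estimate of b1 with binary instrument z.

    y: outcome (e.g. D_a^T), x: AnonCreated_a, z: Post05Dec2005_a.
    """
    y_post = [yi for yi, zi in zip(y, z) if zi]
    y_pre = [yi for yi, zi in zip(y, z) if not zi]
    x_post = [xi for xi, zi in zip(x, z) if zi]
    x_pre = [xi for xi, zi in zip(x, z) if not zi]
    reduced_form = mean(y_post) - mean(y_pre)  # effect of z on y
    first_stage = mean(x_post) - mean(x_pre)   # effect of z on x
    if first_stage == 0:
        raise ValueError("instrument does not shift AnonCreated")
    return reduced_form / first_stage


# Made-up data: half of pre-policy articles are anonymous, none afterwards.
z = [0, 0, 0, 0, 1, 1, 1, 1]
x = [1, 1, 0, 0, 0, 0, 0, 0]
y = [1, 1, 1, 0, 1, 0, 0, 0]
b1_iv = wald_iv(y, x, z)
```

The estimate is only as credible as the assumption that nothing else affecting deletions changed at the policy date, which is the same caveat about random assignment raised earlier.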
Protection and unprotection of articles.
Blocking and unblocking of users.