A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
A paper presented at last month's CSCW conference, titled "Keeping eyes on the prize: officially sanctioned rule breaking in mass collaboration systems" [1] observes that "Mass collaboration systems are often characterized as unstructured organizations lacking rule and order", yet Wikipedia has a well-developed body of policies to support it as an organization. Because rule breaking in bureaucracies is a slippery slope that quickly leads to potentially dangerous exceptions, Wikipedia has a mechanism called "Ignore all rules" (WP:IAR) for officially sanctioned rule breaking. The researchers examined IAR's impact within the scope of deletion requests. The results show that the IAR policy meaningfully influences deliberation outcomes and, rather than wreaking havoc, provides a positive, functional governance mechanism.
This paper is another welcome addition to the growing literature on AfD (Articles for deletion), examining the effectiveness of rule breaking via WP:IAR within these discussions. It starts with an in-depth examination of rule breaking within collaborative environments. Six hypotheses are then postulated:
To test these, the researchers scoured AfD discussions from April 2006 to October 2008, collecting those in which WP:IAR had been invoked. These were supplemented by a control group drawn randomly from non-IAR AfD discussions from the same period. The resulting dataset contained 555 AfD discussions, which were coded for outcome (keep/delete), IAR usage in keep or delete votes, policy match, and category match. Each hypothesis and the control were then tested with a linear regression model. The results were as follows:
H1 was supported only in cases where IAR is used in a keep vote; it showed no significant impact as a delete argument. H2, H3, and H4 look for conditions under which IAR's impact on the ultimate decision would be strengthened. H2 was only marginally supported; H3 was not supported; H4 was not supported either, and actually indicated that when a keep voter invokes IAR together with another policy, this increases the chance of a delete outcome! H5 and H6 consider whether IAR fares better when pitted against increasingly contradictory or complicated policies; both are supported. Overall, the authors conclude that IAR plays a significant role in Wikipedia's policies and recommend its use to other communities. They point out that IAR is also an indicator of where policy is weak in addressing the community's needs.
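The coding-and-regression setup described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' actual dataset or code: the predictor names, the toy data, and the use of ordinary least squares on a binary outcome (a linear probability model, consistent with the "linear regression" the review mentions) are all assumptions.

```python
# Illustrative sketch of regressing AfD outcomes on coded discussion
# features. Data and variable names are invented for demonstration.
import numpy as np

# Each row codes one AfD discussion:
# [iar_in_keep_vote, iar_in_delete_vote, policy_match]
X = np.array([
    [1, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
], dtype=float)
# Outcome coding: 1 = keep, 0 = delete
y = np.array([1, 1, 0, 0, 1, 0, 1, 0], dtype=float)

# Add an intercept column and fit ordinary least squares.
A = np.hstack([np.ones((len(X), 1)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coef.round(2))  # intercept followed by one weight per predictor
```

In the paper's setting each hypothesis corresponds to testing whether a particular coefficient (or interaction) is significantly different from zero.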
Another CSCW paper, titled ""Could someone please translate this?": activity analysis of Wikipedia article translation by non-experts" [2] analyzes the work of a volunteer translator of Wikipedia articles. It goes into great detail, breaking the overall translation task down into many sub-activities, such as looking up difficult words in the source language, choosing the right translation, and using editing software. It presents these activities within the framework of activity theory. Though other papers deal with the translation of Wikipedia content, this is the first to examine the actual activity of a volunteer translator.
Interestingly, the paper notes the importance of the Simple English Wikipedia several times, as a tool that may help people translate content, on the assumption that the language of the main English Wikipedia is frequently complex and challenging (an assumption based on another paper, which compared the English and Simple English Wikipedias). It relies on the Simple English Wikipedia a bit too much, though: for example, it cites its main page as a source for some statistics that would better be obtained directly from stats.wikimedia.org, Wikimedia's main statistics site.
It has some shortcomings, which should be addressed in future work on the subject:
Despite these shortcomings, this paper is valuable for several reasons:
Finally, the article promises further research and suggestions about building tools for translator support, which would be very interesting to read.
A preprint titled "Has OpenStreetMap a role in Digital Earth Applications?" [3] studies OpenStreetMap, the wiki-based collaboratively editable map, as a predominant example of Volunteered Geographical Information projects. The paper addresses two main research questions: 1) How successful is the OSM project in providing spatial data, and to what extent can it be compared to Wikipedia in this respect? 2) What are the main characteristics of OSM stemming from its crowd-sourced nature? The paper gives a comprehensive overview of the OSM workflow, reviews the main characteristics of its collaborative mapping process, and compares these characteristics with those of Wikipedia. In contrast to Wikipedia, the administrative structure of OSM is not well defined within its community of editors; however, both platforms show the same Zipfian characteristics among their editors: a few editors are responsible for large numbers of contributions, while many editors make only a few. Although the criteria differ considerably between the two platforms, the paper finds that the relative population of OSM's featured objects is evidently larger than that of Wikipedia's featured articles. In the conclusion, the authors state that they "believe that OSM will continue its growth for the foreseeable future"; however, the route to this conclusion is not well described in the manuscript.
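The Zipfian pattern mentioned above (a few editors accounting for most contributions) is easy to illustrate. The edit counts below are synthetic, not from either project:

```python
# Synthetic per-editor contribution counts, sorted or not, to show the
# heavy-tailed pattern: the top slice of editors dominates total output.
edit_counts = [1200, 400, 150, 60, 30, 12, 8, 5, 3, 2, 1, 1]

total = sum(edit_counts)
# Share of all edits made by the top 10% of editors (at least one editor).
top = sorted(edit_counts, reverse=True)[: max(1, len(edit_counts) // 10)]
share = sum(top) / total
print(f"top 10% of editors made {share:.0%} of edits")
```

With real OSM or Wikipedia dumps, one would compute the same rank-share statistic over the full editor population.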
In "MJ no more: Using Concurrent Wikipedia Edit Spikes with Social Network Plausibility Checks for Breaking News Detection" [4] by Thomas Steiner, Seth van Hooland, and Ed Summers, the controversial (per WP:Recentism and WP:RS) field of breaking news articles is investigated. Motivated by the overloading of Wikipedia when the news of Michael Jackson's death broke, researcher Thomas Steiner created an open-source exploratory tool called the Wikipedia Live Monitor. The tool allowed his team to examine clusters of related activity, based on edit spikes within a five-minute window, across multiple streams fed by Wikipedia's recent changes, Twitter feeds, Google+, and Facebook. The main research question posed is: are edit spikes in Wikipedia, clustered with related social network activity, useful indicators for identifying breaking news events, and with what delay? By considering activity along multiple streams, the researchers can cross-check the plausibility of information disseminated by many less reliable sources.
Their approach builds on prior work by S. Petrović, M. Osborne, and V. Lavrenko in "Streaming First Story Detection with Application to Twitter", which used the document vector space model from classic information retrieval to cluster Twitter feeds. In this case, however, the researchers cluster multiple streams, which can potentially hold far more information when a story breaks, and can therefore detect breaking stories very quickly. While they could locate breaking news, more work may be needed to optimize the timing parameters of the algorithm. Further research is planned into automating the classification of edits, which could reform future use of non-reliable sources.
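The core edit-spike idea can be sketched as a toy sliding-window counter. This is not the Wikipedia Live Monitor's actual implementation; the window size, threshold, timestamps, and article titles are all invented for illustration:

```python
# Toy edit-spike detector: flag articles that receive at least THRESHOLD
# edits within a sliding five-minute window of recent-changes events.
from collections import defaultdict

WINDOW = 5 * 60   # window length in seconds
THRESHOLD = 3     # edits within the window that count as a spike

def find_spikes(events):
    """events: list of (timestamp_seconds, article_title), sorted by time."""
    recent = defaultdict(list)  # per-article timestamps inside the window
    spikes = set()
    for ts, title in events:
        # Drop timestamps that have fallen out of the window, then add ts.
        recent[title] = [t for t in recent[title] if ts - t < WINDOW] + [ts]
        if len(recent[title]) >= THRESHOLD:
            spikes.add(title)
    return spikes

events = [
    (0, "Michael Jackson"), (0, "Quiet article"),
    (60, "Michael Jackson"), (120, "Michael Jackson"),
    (4000, "Quiet article"),
]
print(find_spikes(events))  # only the article with three edits in five minutes
```

In the paper's setting, spikes like this would then be clustered with concurrent social-network activity before a breaking-news label is assigned.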
A WikiSym 2012 paper titled "Staying in the Loop: Structure and Dynamics of Wikipedia's Breaking News Collaborations" [5] looked at the trajectory of article construction, which captures the collaboration structure embedded in the creation of breaking news stories. The authors show that these stories, fueled by mass media and social networks, tend to create a social melting pot around the editing of these events. A social network analysis of the relations between editors of breaking news stories identified editors in diverse social roles, such as creators, early contributors, the highly centralized activity coordinators (admins), and the marginal vandals and their tireless opponents, the spam-fighting bots and recent changes patrollers. Another result is that most articles (those which are not breaking news stories) lack the dense creation trajectories found in breaking news stories.
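A minimal sketch of the kind of social network analysis described above: build a co-editing graph (editors linked when they edit the same breaking-news article) and use degree centrality as a crude proxy for the "coordinator" role. The editor names, article assignments, and the choice of degree centrality are illustrative assumptions, not the paper's actual method or data:

```python
# Build a co-editing graph from article -> editors and rank editors by
# degree (number of co-editing ties), a simple centrality proxy.
from itertools import combinations
from collections import Counter

article_editors = {
    "Storm": ["admin1", "newbie1", "bot1"],
    "Election": ["admin1", "newbie2", "bot1"],
    "Quake": ["admin1", "newbie3"],
}

degree = Counter()
for editors in article_editors.values():
    for a, b in combinations(set(editors), 2):
        degree[a] += 1
        degree[b] += 1

most_central = degree.most_common(1)[0][0]
print(degree.most_common())  # coordinator-like accounts sit at the centre
```

The paper's richer role taxonomy (creators, vandals, patrollers, bots) would require additional features such as edit timing and content, beyond pure graph position.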
As Ward Cunningham once observed, one important feature by which Wikipedians improved his invention, the wiki, was the introduction of "a talk page or a discussion page behind every page, so you don't actually have to see the discussion and it makes a much more finished product". Surfacing this deliberation could engender trust if the process appears fair, well-reasoned, and thorough; alternatively, it could encourage doubts about content quality, especially if the process appears messy or biased. In a CSCW '13 paper titled "Your process is showing: controversy management and perceived quality in wikipedia", [6] the researchers report on an experiment which found that exposing discussions generally led to a drop in the perceived quality of the related article, especially if the discussion revealed conflict.
Motivated by how university students learn to assess the reliability of controversial articles, such as those on Supreme Court decisions or on individuals like Pope Pius XII and Yasser Arafat, the researchers considered how beneficial it would be to reveal the process of article creation. In wikis, the discussions used to produce articles are kept out of view on talk pages and other coordination spaces. The researchers believed that deliberations that appear fair, well-reasoned, and thorough should engender trust in the reader, while a process that appears biased or chaotic should diminish confidence in the article's quality. The paper outlines the issues involved in assessing the credibility of online information sources. It first considers prior work on article quality, but reframes the issues based on an idea presented in the recent best seller Thinking, Fast and Slow by economics Nobel laureate Daniel Kahneman. The research questions posed are:
These questions are then interpreted using Kahneman's System 1 (fast, associative thinking) and System 2 (slower, deliberative thinking). The questions were investigated in an experiment run on Amazon's Mechanical Turk, a crowdsourcing platform allowing micropayments. Beginning with 3,500 controversial articles, the researchers selected featured articles and discarded newsworthy items, leaving 50 articles. Elite Turkers were then shown ten brief vignettes illustrating talk page discussion about a selected controversy, each meant to display one of ten forms of editor coordination or conflict activity. They then answered a questionnaire and completed two reading comprehension tasks. The researchers found that exposing Wikipedia readers to such discussions with any type of conflict generally led to a drop in the perceived quality of the related article, and that the magnitude of the reader's negative perception depends on the type of editor interaction. Finally, they note that while participants may have suffered a confidence crisis with respect to specific articles, they gained respect for Wikipedia in general at the same time. A final conclusion is that while the experiment, especially the comprehension task, was designed to engage readers in System 1 thinking, watching the discussions may well have triggered a System 2 critical response.
Discuss this story
Thanks
Really appreciated the Talk page and In brief bits this month, very illuminating. Thanks. Killer Chihuahua 21:19, 4 April 2013 (UTC)
IAR Study
And they call economics the dismal science. Did the study control for repeat voters? With the way AfD is, if some idiot regularly invokes IAR in a stupid way, they will be discredited and will skew the eventual outcome of the AfDs they participate in. There's also a huge causality problem as well, the correlations here may imply more about the situations that cause people to invoke IAR, rather than the effects of invoking IAR. Gigs (talk) 17:47, 8 April 2013 (UTC)